Browser automation skill using browser-pilot CLI. Use this when you need to control a web browser, inspect a page, capture a workflow, trace a realtime issue, or exercise voice and environment conditions.
Route the task before choosing commands.
For local Chrome on Chrome 144+, try plain bp connect first after enabling remote debugging in chrome://inspect/#remote-debugging. Only narrow with --channel or --user-data-dir if auto-discovery is ambiguous.
bp snapshot, bp page, bp forms, bp text, bp diagnosebp exec, bp runbp review, outcome conditions in bp execbp recordbp tracebp audio, bp envbp snapshot -ibp textbp pagebp reviewbp diagnosebp evalbp connect --name dev
bp exec -s dev '{"action":"goto","url":"https://example.com"}'
bp snapshot -i -s dev
bp exec -s dev '[
{"action":"fill","selector":"ref:e5","value":"[email protected]"},
{"action":"click","selector":"ref:e7"}
]'
If multiple Chrome profiles are eligible, use bp connect --channel beta or bp connect --user-data-dir <path>.
When you need to verify that an action actually worked (not just clicked):
bp exec -s dev '[
{"action":"click","selector":"#save-btn",
"expectAny":[
{"kind":"textAppears","text":"Changes saved"},
{"kind":"elementVisible","selector":"#success-toast"}
],
"failIf":[{"kind":"textAppears","text":"Error"}],
"dangerous":true}
]'
The result includes outcomeStatus (success/failed/ambiguous/unsafe_to_retry), matchedConditions, and retrySafe.
When you need structured business state (not raw snapshot):
bp review -s dev --json
Returns headings, forms, alerts, tables, key-value pairs, and status labels. Useful after form submissions, checkout flows, or any page with business data.
Rules:
bp snapshot -ibp page caches the refs it shows, but it is a compact overview, not a full target inventorybp text --selector main or another containerbp review for confirmations, detail pages, tables, alerts, and key-values, not dense catalog gridsbp text for readable copy and bp review for structured verificationbp evalbp diagnose before dropping to raw JSwaitFor: "networkIdle" only means transport quiet; on hydrated apps follow it with bp snapshot -i, bp text, bp review, or an explicit assertionUse record when the workflow is being demonstrated manually.
bp record -s demo --profile automation -f ./artifacts/demo.recording.json
bp record summary ./artifacts/demo.recording.json
bp record derive ./artifacts/demo.recording.json -o workflow.json
bp run workflow.json
Do not start by reading the raw artifact.
Use trace when the question spans time, websocket traffic, console failures, permission state, media, or voice.
bp trace start -s dev --timeout 20000
bp trace summary -s dev --view session
bp trace summary -s dev --view ws
bp trace watch -s dev --view console --assert no-console-errors --timeout 5000
bp listen ... is compatibility only. Prefer bp trace tail ....
Voice control:
bp audio setup -s vt
bp exec -s vt '{"action":"goto","url":"https://my-voice-app.com"}'
bp audio check -s vt
bp audio roundtrip -s vt -i prompt.wav --transcribe
bp trace summary -s vt --view voice
Browser-state controls:
bp env permissions grant -s vt microphone
bp env network offline -s vt --duration 5000
bp env visibility hidden -s vt
Useful steps for realtime and voice apps:
waitForWsMessageassertNoConsoleErrorsassertTextChangedassertPermissionassertMediaTrackLiveExample:
bp exec -s vt '[
{"action":"waitForWsMessage","match":"*realtime*","where":{"type":"session.ready"}},
{"action":"assertTextChanged","selector":"#status","from":"Connecting","to":"Live"},
{"action":"assertNoConsoleErrors","windowMs":500}
]'
Any action step can include outcome conditions:
expectAny: success if any condition matchesexpectAll: success only if all conditions matchfailIf: failure if any condition matches (checked first)dangerous: never auto-retry on ambiguous outcomeCondition kinds: urlMatches, elementVisible, elementHidden, textAppears, textChanges, networkResponse, stateSignatureChanges.
bp snapshot -ibp pagebp execbp runbp recordbp trace summarybp reviewbp audiobp env