Name: Browser Automation with agent-browser
Author: CurtisNewbie

Core Workflow

Every browser automation follows this pattern:

1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
3. Interact: Use refs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh refs

```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "[email protected]"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
```
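The same four steps can be wrapped in a small script. The following is a minimal sketch, not part of agent-browser itself: the function names `ab`, `extract_refs`, and `submit_form` and the `AB` environment variable are hypothetical helpers introduced here for illustration. It uses only the subcommands shown above and assumes the page yields the same refs (@e1, @e2, @e3) as the sample snapshot, which will not hold for other pages.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the AB variable are assumptions, not part of the CLI.

# Run the CLI, or a stand-in when AB is set (e.g. AB=echo for a dry run).
ab() {
  "${AB:-agent-browser}" "$@"
}

# Read snapshot text on stdin and print each element ref (@e1, @e2, ...) on its own line.
extract_refs() {
  grep -oE '@e[0-9]+'
}

# Navigate, snapshot, interact, then re-snapshot. Stops at the first failing step.
# Assumes @e1 = email input, @e2 = password input, @e3 = submit button.
submit_form() {
  local url="$1" email="$2" password="$3"
  ab open "$url" &&
    ab snapshot -i &&
    ab fill @e1 "$email" &&
    ab fill @e2 "$password" &&
    ab click @e3 &&
    ab wait --load networkidle &&
    ab snapshot -i   # refs from the first snapshot are stale after the page changes
}
```

Calling `submit_form` with `AB=echo` set prints the commands instead of running them, which is a cheap way to review the sequence before pointing it at a real page. In real use, check the first snapshot's refs (for example with `extract_refs`) before hard-coding them.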

| Reference | When to Use |
|---|---|
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |

| Template | Description |
|---|---|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |

Essential Commands

Navigation

Snapshot

Interaction (use @refs from snapshot)

Get information

Wait

Capture

Common Patterns

Form Submission

Authentication with State Persistence

Login once and save state

Reuse in future sessions

Data Extraction

JSON output for parsing

Parallel Sessions

Visual Browser (Debugging)

Local Files (PDFs, HTML)

Open local files with file:// URLs

iOS Simulator (Mobile Safari)

List available iOS simulators

Launch Safari on a specific device

Same workflow as desktop - snapshot, interact, re-snapshot

Take screenshot

Close session (shuts down simulator)

Ref Lifecycle (Important)

Semantic Locators (Alternative to Refs)

Deep-Dive Documentation

Ready-to-Use Templates
