Skill file

Browser

Name: Browser
Author: devtodollars

Automate browser interactions with the agent-browser CLI: navigate, click, fill forms, scrape content, take screenshots, and more.

devtodollars · 976 stars · March 5, 2026

Occupation: Web Developer
Category: Frontend

Skill content

Browser Automation with agent-browser

Control a headless Chromium browser via CLI commands. All commands are run with `npx agent-browser <command>`.

Setup (first time only)

npx agent-browser install

Navigation

Command	Description
`open <url>`	Navigate to a URL
`close`	Close browser
`tab`	List open tabs
`tab new [url]`	Open new tab
`tab <n>`	Switch to tab n
`tab close [n]`	Close a tab

Page Inspection (use these to understand page structure)

Command	Description
`snapshot`	Get accessibility tree with `@e1`, `@e2` element refs (best for AI)
`screenshot [path]`	Capture screenshot
`screenshot --full`	Full page screenshot
`screenshot --annotate`	Screenshot with numbered element labels
`get text <sel>`	Extract text content
`get html <sel>`	Get innerHTML
`get value <sel>`	Get input value
`get attr <sel> <attr>`	Get element attribute
`get title`	Get page title
`get url`	Get current URL
`get count <sel>`	Count matching elements
`get box <sel>`	Get bounding box
`get styles <sel>`	Get computed styles
`is visible <sel>`	Check if element is visible
`is enabled <sel>`	Check if element is enabled
`is checked <sel>`	Check if checkbox is checked
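
The inspection commands above suggest a two-step loop: take a snapshot, read the `@eN` refs from its output, then act on one of them. A minimal sketch follows. The helper names `inspect_page` and `click_ref` are hypothetical, and the sketch assumes that a ref such as `@e2` is accepted wherever a `<sel>` argument is expected.

```shell
# Step 1 (hypothetical helper): open the page given as $1 and print its
# accessibility tree, which labels elements with refs such as @e1, @e2.
inspect_page() {
  npx agent-browser open "$1" && npx agent-browser snapshot
}

# Step 2 (hypothetical helper): click a ref read from the snapshot output.
# Assumption: refs can be passed in place of a <sel> argument.
click_ref() {
  npx agent-browser click "$1"
}
```

For example, run `inspect_page http://localhost:3000`, pick a ref from the printed tree, then run `click_ref @e2`.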

Clicking & Interaction

Command	Description
`click <sel>`	Click an element
`click <sel> --new-tab`	Click, opening in new tab
`dblclick <sel>`	Double-click
`hover <sel>`	Hover over element
`focus <sel>`	Focus element
`scroll <dir> [px]`	Scroll up/down/left/right
`scrollintoview <sel>`	Scroll element into view
`drag <src> <tgt>`	Drag and drop

Text Input

Command	Description
`fill <sel> <text>`	Clear field and type text
`type <sel> <text>`	Type into focused element (appends)
`keyboard type <text>`	Type with real keystrokes
`keyboard inserttext <text>`	Insert text without key events
`press <key>`	Press a key (Enter, Tab, Control+a, etc.)
`keydown <key>`	Hold key down
`keyup <key>`	Release key

Semantic Find (locate elements by meaning, not CSS)

find role <role> <action> [value]         # By ARIA role
find text <text> <action>                 # By visible text
find label <label> <action> [value]       # By label
find placeholder <ph> <action> [value]    # By placeholder
find alt <text> <action>                  # By alt text
find title <text> <action>                # By title attribute
find testid <id> <action> [value]         # By data-testid
find first <sel> <action> [value]         # First match
find last <sel> <action> [value]          # Last match
find nth <n> <sel> <action> [value]       # Nth match

npx agent-browser find role button click --name "Submit"
npx agent-browser find text "Sign In" click
npx agent-browser find label "Email" fill "user@example.com"
npx agent-browser find placeholder "Search..." fill "vinyl records"

Waiting

Command	Description
`wait <selector>`	Wait for element to appear
`wait <ms>`	Wait N milliseconds
`wait --text "text"`	Wait for text to appear
`wait --url "**/path"`	Wait for URL pattern
`wait --load networkidle`	Wait for network idle
`wait --fn "window.ready"`	Wait for JS condition
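
Waits are most useful when chained between navigation and extraction, so that a read does not race the page load. The sketch below combines commands from the tables above. The helper name `open_and_read` is hypothetical, and the `h1` selector in the usage line is only an illustration.

```shell
# Hypothetical helper: open the URL in $1, wait for the network to go idle,
# wait for the selector in $2 to appear, then print that element's text.
open_and_read() {
  npx agent-browser open "$1" &&
    npx agent-browser wait --load networkidle &&
    npx agent-browser wait "$2" &&
    npx agent-browser get text "$2"
}
```

Usage: `open_and_read http://localhost:3000 "h1"`. Each step runs only if the previous one succeeded, so a timed-out wait stops the chain before the read.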

Cookies & Storage

Command	Description
`cookies`	Get all cookies
`cookies set <name> <val>`	Set a cookie
`cookies clear`	Clear cookies
`storage local`	Get all localStorage
`storage local <key>`	Get localStorage key
`storage local set <k> <v>`	Set localStorage value
`storage local clear`	Clear localStorage
`storage session`	Same commands for sessionStorage

Network Interception

Command	Description
`network route <url>`	Intercept requests matching URL
`network route <url> --abort`	Block matching requests
`network route <url> --body <json>`	Mock response body
`network unroute [url]`	Remove intercept routes
`network requests`	View tracked requests
`network requests --filter <pat>`	Filter requests by pattern
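
As an illustration of the routing commands, the following sketch mocks a JSON response and later removes the mock. The helper names `mock_json` and `unmock`, the URL pattern, and the JSON body are all hypothetical. The sketch assumes `network route` accepts glob patterns like the one shown for `wait --url`.

```shell
# Hypothetical helper: answer requests matching the pattern in $1
# with the fixed JSON body in $2.
mock_json() {
  npx agent-browser network route "$1" --body "$2"
}

# Hypothetical helper: remove the intercept route for the pattern in $1.
unmock() {
  npx agent-browser network unroute "$1"
}
```

Usage: `mock_json "**/api/plans" '{"plans":[]}'`, reload the page to observe the mocked state, then `unmock "**/api/plans"`. Quoting the pattern keeps the shell from expanding the `**` glob.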

Browser Configuration

Command	Description
`set viewport <w> <h>`	Set viewport size
`set device <name>`	Emulate device (e.g. "iPhone 14")
`set media dark`	Emulate dark color scheme
`set media light`	Emulate light color scheme
`set offline on`	Enable offline mode
`set offline off`	Disable offline mode
`set geo <lat> <lng>`	Set geolocation
`set headers <json>`	Set extra HTTP headers
`set credentials <user> <pass>`	Set HTTP basic auth

Debugging

Command	Description
`console`	View console messages
`errors`	View uncaught exceptions
`highlight <sel>`	Visually highlight element
`trace start [path]`	Start trace recording
`trace stop [path]`	Stop and save trace
`profiler start`	Start CPU profiling
`profiler stop [path]`	Save profile
`pdf <path>`	Save page as PDF

Comparison & Diffing

Command	Description
`diff snapshot`	Compare current vs previous snapshot
`diff snapshot --baseline <file>`	Compare against saved file
`diff screenshot --baseline <file>`	Visual pixel comparison
`diff url <a> <b>`	Compare two URLs

Project-Specific Pages

Page	URL	Notes
Landing	`http://localhost:3000`	Sections: Navbar, Hero, Logos, Items, Stats, Pricing, FAQ, Cta, Footer
Sign In / Sign Up	`http://localhost:3000/auth`	Supabase Auth UI
Account	`http://localhost:3000/account`	Authenticated - user profile & subscription
Pricing	`http://localhost:3000/#pricing`	Stripe checkout integration

Testing Dark Mode

# Toggle dark mode via the ModeToggle button in the navbar
npx agent-browser find role button click --name "Toggle theme"

# Or force via browser emulation
npx agent-browser set media dark
npx agent-browser set media light

Testing Responsive Design

# Mobile
npx agent-browser set viewport 375 812

# Tablet
npx agent-browser set viewport 768 1024

# Desktop
npx agent-browser set viewport 1440 900
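
The three viewport presets above can be captured in one pass by looping over them and taking a screenshot at each size. This is a sketch only. The helper name `responsive_shots` and the output file names are hypothetical, and it assumes `screenshot [path]` writes to the given path.

```shell
# Hypothetical helper: open the URL in $1 and save one screenshot per
# viewport preset into the directory in $2.
responsive_shots() {
  url="$1"
  outdir="$2"
  mkdir -p "$outdir" || return 1
  npx agent-browser open "$url" || return 1
  for size in "mobile 375 812" "tablet 768 1024" "desktop 1440 900"; do
    # Split "name width height" into $1, $2 and $3.
    set -- $size
    npx agent-browser set viewport "$2" "$3" &&
      npx agent-browser screenshot "$outdir/$1.png" || return 1
  done
}
```

Usage: `responsive_shots http://localhost:3000 ./shots` should leave `mobile.png`, `tablet.png`, and `desktop.png` in `./shots`, under the assumptions above.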

Form Controls

Command	Description
`check <sel>`	Check a checkbox
`uncheck <sel>`	Uncheck a checkbox
`select <sel> <val>`	Select dropdown option
`upload <sel> <files>`	Upload file(s) to input

JavaScript Execution

Command	Description
`eval <js>`	Run JavaScript in page context
`eval -b <base64>`	Run base64-encoded JS
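
The `-b` form is handy when a script contains quotes or newlines that are awkward to escape on the command line. A minimal sketch, assuming `eval -b` expects standard base64 on a single line. The helper name `eval_b64` is hypothetical.

```shell
# Hypothetical helper: base64-encode the JavaScript in $1 and run it
# in the page via `eval -b`.
eval_b64() {
  # tr strips the line wrapping that some base64 implementations add.
  encoded="$(printf '%s' "$1" | base64 | tr -d '\n')"
  npx agent-browser eval -b "$encoded"
}
```

Usage: `eval_b64 'document.querySelectorAll("a").length'`.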

Mouse Control

Command	Description
`mouse move <x> <y>`	Move cursor to coordinates
`mouse down [button]`	Press mouse button
`mouse up [button]`	Release mouse button
`mouse wheel <dy> [dx]`	Scroll wheel

State Persistence

Command	Description
`state save <path>`	Save auth/session state
`state load <path>`	Load saved state
`state list`	List saved states
`state clear [name]`	Clear a saved state
`state clear --all`	Clear all saved states
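
One way to use these commands is to sign in once, save the session, and reload it on later runs instead of repeating the login. The sketch below assumes `state save <path>` writes a file at that path. The helper name `use_saved_state` and the file name in the usage line are hypothetical.

```shell
# Hypothetical helper: load the state file in $1 if it exists;
# otherwise save the current session to it.
use_saved_state() {
  if [ -f "$1" ]; then
    npx agent-browser state load "$1"
  else
    # Assumes you have already signed in during this browser session.
    npx agent-browser state save "$1"
  fi
}
```

Usage: after signing in at `http://localhost:3000/auth`, run `use_saved_state ./auth-state.json` to save. Later runs of the same command load the saved session.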
