Use this skill when a task needs browser automation through PinchTab: open a website, inspect interactive elements, click through flows, fill out forms, scrape page text, log into sites with a persistent profile, export screenshots or PDFs, manage multiple browser instances, or fall back to the HTTP API when the CLI is unavailable. Prefer this skill for token-efficient browser work driven by stable accessibility refs such as `e5` and `e12`.
PinchTab gives agents a browser they can drive through stable accessibility refs, low-token text extraction, and persistent profiles or instances. Treat it as a CLI-first browser skill; use the HTTP API only when the CLI is unavailable or you need profile-management routes that do not exist in the CLI yet.
Preferred tool surface:
- `pinchtab` CLI commands first.
- `curl` for profile-management routes or non-shell/API fallback flows.
- `jq` only when you need structured parsing from JSON responses.
- `http://localhost` targets. Only use a remote PinchTab server when the user explicitly provides it and, if needed, a token.
- `text`, `snap -i -c`, `snap -d`, `find`, `click`, `fill`, `type`, `press`, `select`, `hover`, `scroll`.

Every PinchTab automation follows this pattern:
1. Navigate with `pinchtab nav <url>` or `pinchtab instance navigate <instance-id> <url>`.
2. Observe with `pinchtab snap -i -c`, `pinchtab snap --text`, or `pinchtab text`, then collect the current refs such as `e5`.
3. Act with `click`, `fill`, `type`, `press`, `select`, `hover`, or `scroll`.

Rules:
- Use `pinchtab text` when you need content, not layout.
- Use `pinchtab snap -i -c` when you need actionable elements.

PinchTab uses a unified selector system. Any command that targets an element accepts these formats:
| Selector | Example | Resolves via |
|---|---|---|
| Ref | e5 | Snapshot cache (fastest) |
| CSS | #login, .btn, [data-testid="x"] | document.querySelector |
| XPath | xpath://button[@id="submit"] | CDP search |
| Text | text:Sign In | Visible text match |
| Semantic | find:login button | Natural language query via /find |
Auto-detection: bare e5 -> ref, #id / .class / [attr] -> CSS, //path -> XPath. Use explicit prefixes (css:, xpath:, text:, find:) when auto-detection is ambiguous.
pinchtab click e5 # ref
pinchtab click "#submit" # CSS (auto-detected)
pinchtab click "text:Sign In" # text match
pinchtab click "xpath://button[@type]" # XPath
pinchtab fill "#email" "user@example.com" # CSS
pinchtab fill e3 "user@example.com" # ref
The same syntax works in the HTTP API via the selector field:
{"kind": "click", "selector": "text:Sign In"}
{"kind": "fill", "selector": "#email", "text": "user@example.com"}
{"kind": "click", "selector": "e5"}
The legacy `ref` field is still accepted for backward compatibility.
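The auto-detection rules in this section can be sketched as a small shell function, which may help when deciding whether a selector needs an explicit prefix. This only illustrates the documented rules and is not PinchTab's actual resolver. `classify_selector` is a hypothetical helper name, and the ref pattern (`e` followed only by digits) is an assumption drawn from examples such as `e5` and `e12`.

```shell
# Hypothetical helper: mirrors the documented auto-detection rules.
# It is NOT PinchTab's resolver, only a sketch for choosing prefixes.
classify_selector() {
  case "$1" in
    # Explicit prefixes always win.
    css:*)   echo css ;;
    xpath:*) echo xpath ;;
    text:*)  echo text ;;
    find:*)  echo semantic ;;
    # Bare //path is treated as XPath.
    //*)     echo xpath ;;
    # Bare #id, .class, or [attr] is treated as CSS.
    '#'*|.*|'['*) echo css ;;
    e[0-9]*)
      # Assumption: a ref is "e" followed only by digits (e5, e12).
      case "${1#e}" in
        *[!0-9]*) echo ambiguous ;;
        *)        echo ref ;;
      esac ;;
    # Anything else should get an explicit prefix.
    *) echo ambiguous ;;
  esac
}
```

For example, `classify_selector button` prints `ambiguous`, which suggests writing `css:button` or `text:button` instead of relying on auto-detection.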
Use && only when you do not need to inspect intermediate output before deciding the next step.
Good:
pinchtab nav https://example.com && pinchtab snap -i -c
pinchtab click --wait-nav e5 && pinchtab snap -i -c
pinchtab nav https://example.com --block-images && pinchtab text
Run commands separately when you must read the snapshot output first:
pinchtab nav https://example.com
pinchtab snap -i -c
# Read refs, choose the correct e#
pinchtab click e7
pinchtab snap -i -c
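For flows like the one above, where each action is followed by a fresh look at the page, a small wrapper can keep the two steps together. The sketch below uses only commands that appear in this document (`snap -d -i -c` is used in the checkout example). `pt_act` is a hypothetical helper name, and the sketch assumes `pinchtab` exits non-zero when an action fails.

```shell
# Hypothetical helper: run one PinchTab action, then print a diff snapshot
# so the next step always works from current refs.
# Assumption: `pinchtab` returns a non-zero exit status on failure.
pt_act() {
  # Stop early if the action itself failed; the page state is unknown.
  pinchtab "$@" || return 1
  # Show only what changed among interactive elements, in compact form.
  pinchtab snap -d -i -c
}

# Usage (refs come from the snapshot you just read):
#   pt_act click e7
#   pt_act fill e2 "quarterly report"
```

You still need to read the output between calls; the wrapper only removes the risk of forgetting the follow-up snapshot.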
Pick one of these five patterns before you start interacting with the site.
Use a temporary instance for public pages, scraping, or tasks that do not need login persistence.
pinchtab instance start
pinchtab instances
# Point CLI commands at the instance port you want to use.
pinchtab --server http://localhost:9868 nav https://example.com
pinchtab --server http://localhost:9868 text
Use this for recurring tasks against the same authenticated site.
pinchtab profiles
pinchtab instance start --profile work --mode headed
pinchtab --server http://localhost:9868 nav https://mail.google.com
If the login is already stored in that profile, you can switch to headless later:
pinchtab instance stop inst_ea2e747f
pinchtab instance start --profile work --mode headless
Use this when you need a durable profile and it does not exist yet.
curl -X POST http://localhost:9867/profiles \
-H "Content-Type: application/json" \
-d '{"name":"billing","description":"Billing portal automation","useWhen":"Use for billing tasks"}'
curl -X POST http://localhost:9867/profiles/billing/start \
-H "Content-Type: application/json" \
-d '{"headless":false}'
Then target the returned port with --server.
Use this for CAPTCHA, MFA, or first-time setup.
pinchtab instance start --profile work --mode headed
# Human completes login in the visible Chrome window.
pinchtab --server http://localhost:9868 nav https://app.example.com/dashboard
pinchtab --server http://localhost:9868 snap -i -c
Once the session is stored, reuse the same profile for later tasks.
Use this when the agent cannot call the CLI directly.
curl http://localhost:9867/health
curl -X POST http://localhost:9867/instances/launch \
-H "Content-Type: application/json" \
-d '{"name":"work","headless":true}'
curl -X POST http://localhost:9868/action \
-H "Content-Type: application/json" \
-d '{"kind":"click","selector":"e5"}'
If the server is exposed beyond localhost, require a token and use a dedicated automation profile. See TRUST.md and config.md.
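Before sending API calls in the fallback flow above, it can help to wait until the health endpoint answers. The following is a minimal sketch under stated assumptions: `wait_for_pinchtab` is a hypothetical helper name, the default base URL is the `http://localhost:9867` address used in the examples, and any successful HTTP response from `/health` is treated as "ready" (the response body is not inspected).

```shell
# Hypothetical helper: poll the health endpoint until it answers or we give up.
# Assumptions: the server listens on http://localhost:9867 by default and a
# successful HTTP response from /health means the server is ready.
wait_for_pinchtab() {
  base_url="${1:-http://localhost:9867}"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: silent, -S: keep error messages
    if curl -fsS "$base_url/health" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_for_pinchtab http://localhost:9867 15 || echo "PinchTab is not reachable" >&2
```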
pinchtab server # Start server foreground
pinchtab daemon install # Install as system service
pinchtab health # Check server status
pinchtab instances # List running instances
pinchtab profiles # List available profiles
pinchtab --server http://localhost:9868 snap -i -c # Target specific instance
pinchtab nav <url>
pinchtab nav <url> --new-tab
pinchtab nav <url> --tab <tab-id>
pinchtab nav <url> --block-images
pinchtab nav <url> --block-ads
pinchtab back # Navigate back in history
pinchtab forward # Navigate forward
pinchtab reload # Reload current page
pinchtab tab # List tabs or focus by ID
pinchtab tab new <url>
pinchtab tab close <tab-id>
pinchtab instance navigate <instance-id> <url>
pinchtab snap
pinchtab snap -i # Interactive elements only
pinchtab snap -i -c # Interactive + compact
pinchtab snap -d # Diff from previous snapshot
pinchtab snap --selector <css> # Scope to CSS selector
pinchtab snap --max-tokens <n> # Token budget limit
pinchtab snap --text # Text output format
pinchtab text # Page text content
pinchtab text --raw # Raw text extraction
pinchtab find <query> # Semantic element search
pinchtab find --ref-only <query> # Return refs only
Guidance:
- `snap -i -c` is the default for finding actionable refs.
- `snap -d` is the default follow-up snapshot for multi-step flows.
- `text` is the default for reading articles, dashboards, reports, or confirmation messages.
- `find --ref-only` is useful when the page is large and you already know the semantic target.

All interaction commands accept unified selectors (refs, CSS, XPath, text, semantic). See the Selectors section above.
pinchtab click <selector> # Click element
pinchtab click --wait-nav <selector> # Click and wait for navigation
pinchtab click --x 100 --y 200 # Click by coordinates
pinchtab dblclick <selector> # Double-click element
pinchtab type <selector> <text> # Type with keystrokes
pinchtab fill <selector> <text> # Set value directly
pinchtab press <key> # Press key (Enter, Tab, Escape...)
pinchtab hover <selector> # Hover element
pinchtab select <selector> <value> # Select dropdown option
pinchtab scroll <selector|pixels> # Scroll element or page
Rules:
- Use `fill` for deterministic form entry.
- Use `type` only when the site depends on keystroke events.
- Use `click --wait-nav` when a click is expected to navigate.
- Take a fresh snapshot after `click`, `press Enter`, `select`, or `scroll` if the UI can change.

pinchtab screenshot
pinchtab screenshot -o /tmp/pinchtab-page.png # Format driven by extension
pinchtab screenshot -q 60 # JPEG quality
pinchtab pdf
pinchtab pdf -o /tmp/pinchtab-report.pdf
pinchtab pdf --landscape
Use these only when the task explicitly requires them and safer commands are insufficient.
pinchtab eval "document.title"
pinchtab download <url> -o /tmp/pinchtab-download.bin
pinchtab upload /absolute/path/provided-by-user.ext -s <css>
Rules:
- `eval` is for narrow, read-only DOM inspection unless the user explicitly asks for a page mutation.
- `download` should prefer a safe temporary or workspace path over an arbitrary filesystem location.
- `upload` requires a file path the user explicitly provided or clearly approved for the task.

curl -X POST http://localhost:9868/navigate \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com"}'
curl "http://localhost:9868/snapshot?filter=interactive&format=compact"
curl -X POST http://localhost:9868/action \
-H "Content-Type: application/json" \
-d '{"kind":"fill","selector":"e3","text":"user@example.com"}'
curl http://localhost:9868/text
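When the text or selector sent to `/action` contains quotes, hand-written JSON inside `-d '...'` becomes fragile. One way to avoid that, assuming `jq` is installed, is to let `jq` build the body. `action_payload` is a hypothetical helper name; the field names (`kind`, `selector`, `text`) are the ones shown in the examples above.

```shell
# Hypothetical helper: build an /action body with jq so selectors and text
# containing quotes are escaped correctly.
# Assumption: jq is installed; field names follow the examples in this document.
action_payload() {
  kind="$1"
  selector="$2"
  if [ "$#" -ge 3 ]; then
    # Actions that carry text, such as fill.
    jq -cn --arg kind "$kind" --arg selector "$selector" --arg text "$3" \
      '{kind: $kind, selector: $selector, text: $text}'
  else
    # Actions without text, such as click.
    jq -cn --arg kind "$kind" --arg selector "$selector" \
      '{kind: $kind, selector: $selector}'
  fi
}

# Usage against the instance port used in this document:
#   curl -X POST http://localhost:9868/action \
#     -H "Content-Type: application/json" \
#     -d "$(action_payload fill e3 'quarterly report')"
```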
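Use the API when the agent cannot call the CLI directly or when you need profile-management routes that do not exist in the CLI yet.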
pinchtab nav https://pinchtab.com && pinchtab snap -i -c
pinchtab nav https://example.com/login
pinchtab snap -i -c
pinchtab fill e3 "user@example.com"
pinchtab fill e4 "correct horse battery staple"
pinchtab click --wait-nav e5
pinchtab text
pinchtab nav https://example.com
pinchtab snap -i -c
pinchtab fill e2 "quarterly report"
pinchtab press Enter
pinchtab text
pinchtab nav https://example.com/checkout
pinchtab snap -i -c
pinchtab click e8
pinchtab snap -d -i -c
When you know the page structure, skip the snapshot and use CSS or text selectors directly:
pinchtab click "text:Accept Cookies"
pinchtab fill "#search" "quarterly report"
pinchtab click "xpath://button[@type='submit']"
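Direct selectors skip the snapshot, so it is worth having a fallback for the case where the page structure is not what you expected. The sketch below tries the selector first and prints an interactive snapshot only if the click fails. `click_or_snap` is a hypothetical helper name, and the sketch assumes `pinchtab click` exits non-zero when the selector does not resolve, which this document does not state.

```shell
# Hypothetical helper: try a direct selector first; if the click fails,
# print an interactive snapshot so the caller can pick a ref instead.
# Assumption: `pinchtab click` exits non-zero when the selector matches nothing.
click_or_snap() {
  if pinchtab click "$1"; then
    return 0
  fi
  echo "selector '$1' did not resolve; current refs follow" >&2
  pinchtab snap -i -c
  return 1
}

# Usage:
#   click_or_snap "text:Accept Cookies"
```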
pinchtab profiles
pinchtab instance start --profile work --mode headed
# Human signs in once.
pinchtab --server http://localhost:9868 text
pinchtab instance start --profile work --mode headless
pinchtab instance start --profile staging --mode headless
pinchtab instances
Then point each command stream at its own port using --server.
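One way to keep the command streams apart is a thin wrapper per instance. This is a sketch only: `pt_work` and `pt_staging` are hypothetical names, and the ports `9868` and `9869` are assumptions; use whatever ports `pinchtab instances` actually reports.

```shell
# Hypothetical wrappers: one function per instance so command streams never mix.
# Assumption: `pinchtab instances` reported ports 9868 and 9869 for the two
# instances started above; substitute the ports it actually prints.
pt_work()    { pinchtab --server http://localhost:9868 "$@"; }
pt_staging() { pinchtab --server http://localhost:9869 "$@"; }

# Usage:
#   pt_work nav https://example.com && pt_work snap -i -c
#   pt_staging nav https://example.com && pt_staging text
```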
- Prefer `text`, `snap -i -c`, and `snap -d` before screenshots, PDFs, eval, downloads, or uploads.
- Use `--block-images` for read-heavy tasks that do not need visual assets.
- Run `pinchtab snap -d` after each state-changing action in long workflows.
- Use `pinchtab text` to confirm success messages, table updates, or navigation outcomes.
- Use `pinchtab screenshot` only when visual regressions, CAPTCHA, or layout-specific confirmation matters.

PinchTab is a fully open-source, local-only browser automation tool:
- The server binds to `127.0.0.1` by default. No external network calls are made by PinchTab itself.