Browser automation CLI using DOMShell MCP server. Maps Chrome's Accessibility Tree to a virtual filesystem for agent-native navigation.
A command-line interface for browser automation using DOMShell's MCP server. Navigate web pages using filesystem commands: ls, cd, cat, grep, click.
Node.js and npx (for DOMShell MCP server):
# Install Node.js from https://nodejs.org/
npx --version
Chrome/Chromium with DOMShell extension:
Python 3.10+
cd browser/agent-harness
pip install -e .
page — Page Navigation
page open <url> — Navigate to URL
page reload — Reload current page
page back — Navigate back in history
page forward — Navigate forward in history
page info — Show current page info

fs — Filesystem Commands (Accessibility Tree)
fs ls [path] — List elements at path
fs cd <path> — Change directory
fs cat [path] — Read element content
fs grep <pattern> [path] — Search for text pattern
fs pwd — Print working directory

act — Action Commands
act click <path> — Click an element
act type <path> <text> — Type text into input

session — Session Management
session status — Show session state
session daemon-start — Start persistent daemon mode
session daemon-stop — Stop daemon mode

# Open a page
cli-anything-browser page open https://example.com
# Explore structure
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
cli-anything-browser fs ls
# Go back to root
cli-anything-browser fs cd /
cli-anything-browser fs grep "Login"
cli-anything-browser act click /main/button[0]
cli-anything-browser act type /main/input[0] "[email protected]"
cli-anything-browser act click /main/button[0]
cli-anything-browser --json fs ls /
# Start persistent connection
cli-anything-browser session daemon-start
# Run commands (uses persistent connection)
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
# Stop daemon when done
cli-anything-browser session daemon-stop
cli-anything-browser
DOMShell uses a filesystem-like path for the Accessibility Tree:
/ — Root (document)
/main — Main landmark
/main/div[0] — First div in main
/main/div[0]/button[2] — Third button in first div
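The path examples above can be read as a sequence of named segments, each with an optional zero-based index. The following is a minimal sketch of that reading, not DOMShell's own parser: `parse_path` is a hypothetical helper, and the assumption that a segment is a name optionally followed by `[n]` comes only from the examples listed here.

```python
import re
from typing import List, Optional, Tuple

# One path segment: a name with an optional zero-based index, e.g. "button[2]".
# This grammar is an assumption inferred from the example paths.
_SEGMENT = re.compile(r"^(?P<name>[^\[\]/]+)(?:\[(?P<index>\d+)\])?$")


def parse_path(path: str) -> List[Tuple[str, Optional[int]]]:
    """Split an absolute path into (name, index) pairs; "/" yields []."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {path!r}")
    segments: List[Tuple[str, Optional[int]]] = []
    for part in path.strip("/").split("/"):
        if not part:
            continue  # skip empty parts; the root path "/" has no segments
        match = _SEGMENT.match(part)
        if match is None:
            raise ValueError(f"malformed segment {part!r} in {path!r}")
        index = match.group("index")
        segments.append((match.group("name"), int(index) if index is not None else None))
    return segments


print(parse_path("/main/div[0]/button[2]"))
# [('main', None), ('div', 0), ('button', 2)]
```

A helper like this can be handy for checking a path built by an agent before passing it to `act click` or `fs cd`.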
button[0] is the first button (indices are zero-based)
Use .. to go up one level
Use / for root

All commands support the --json flag for machine-readable output:
cli-anything-browser --json fs ls /
Returns:
{
"path": "/",
"entries": [
{"name": "main", "role": "landmark", "path": "/main"}
]
}
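As an illustration of consuming this output from a script, the sketch below parses a listing and collects entry paths by role. It assumes only the JSON shape shown above (`path`, `entries`, and per-entry `name`, `role`, `path`); `entries_by_role` is a hypothetical helper, and other commands may return different fields.

```python
import json
from typing import List

# Sample copied from the output shown above.
SAMPLE = '{"path": "/", "entries": [{"name": "main", "role": "landmark", "path": "/main"}]}'


def entries_by_role(raw: str, role: str) -> List[str]:
    """Return the paths of all entries whose role matches."""
    listing = json.loads(raw)
    return [e["path"] for e in listing.get("entries", []) if e.get("role") == role]


print(entries_by_role(SAMPLE, "landmark"))
# ['/main']
```

In practice the raw string would be the captured standard output of `cli-anything-browser --json fs ls /`, for example obtained with `subprocess.run(..., capture_output=True, text=True)`.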
The CLI provides clear error messages for common issues:
npx @apireno/domshell --version
Check the return value of is_available() before running commands.
For agent workflows with multiple commands, use daemon mode:
cli-anything-browser session daemon-start
cli-anything-browser session daemon-stop
This avoids the 1-3 second cold start overhead for each command.
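For scripted workflows, one way to make sure the daemon is always stopped is to wrap the start and stop commands in a context manager. This is a sketch under stated assumptions, not part of the CLI: `run_cli` and `daemon` are hypothetical names, and the code assumes `cli-anything-browser` is on the PATH and exits non-zero on failure.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

CLI = "cli-anything-browser"

Runner = Callable[[Sequence[str]], str]


def run_cli(args: Sequence[str]) -> str:
    """Run one CLI command and return its standard output."""
    result = subprocess.run([CLI, *args], capture_output=True, text=True, check=True)
    return result.stdout


@contextmanager
def daemon(run: Runner = run_cli) -> Iterator[Runner]:
    """Start daemon mode, yield the runner, and always stop the daemon on exit."""
    run(["session", "daemon-start"])
    try:
        yield run
    finally:
        run(["session", "daemon-stop"])


# Example use (requires the CLI to be installed):
# with daemon() as run:
#     print(run(["fs", "ls", "/"]))
#     print(run(["fs", "cd", "/main"]))
```

The `run` parameter is injectable so the wrapper can be exercised with a stub in tests, without a browser or the CLI present.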
IMPORTANT: When using this CLI with AI agents, be aware of the following security considerations:
The browser harness validates all URLs before navigation:
URLs must use an http:// or https:// scheme (scheme-less URLs like example.com are rejected)
Blocked: file://, javascript://, data://, vbscript://, about://, chrome://, and browser-internal schemes
Allowed: http:// and https:// only (configurable via CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES)
Private network blocking: set CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true to enable (disabled by default)

The Accessibility Tree includes all visible and hidden elements on a page. Malicious websites could:
Mitigation: When interacting with untrusted websites, consider:
--json flag for structured output that's easier to parse safely

By default, the browser can access localhost and private networks (192.168.x.x, 10.x.x.x, etc.). To block:
export CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true
cli-anything-browser page open http://localhost:8080 # Will be blocked
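To make the URL validation rules described earlier concrete, here is a rough sketch of a scheme check combined with optional private-network blocking. It is not the harness's actual implementation: `check_url` and `ALLOWED_SCHEMES` are hypothetical names, the `block_private` argument stands in for the `CLI_ANYTHING_BROWSER_BLOCK_PRIVATE` setting, and hostnames are not resolved through DNS, so a public name that points at a private address would pass this simplified check.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed default, mirroring the "http:// and https:// only" rule.
ALLOWED_SCHEMES = {"http", "https"}


def check_url(url: str, block_private: bool = False) -> bool:
    """Return True if the URL passes the scheme and private-network checks."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.hostname:
        return False  # rejects scheme-less input, file://, javascript:, and so on
    if block_private:
        host = parts.hostname
        if host == "localhost":
            return False
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            return True  # a DNS name; this sketch does not resolve it
        if addr.is_private or addr.is_loopback:
            return False
    return True


print(check_url("https://example.com"))                        # True
print(check_url("example.com"))                                # False
print(check_url("http://localhost:8080", block_private=True))  # False
```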
Multiple browser sessions share the same Chrome instance. Cookies and authentication state may persist across sessions. For sensitive operations, consider: