Playwright CLI scraper with stealth anti-bot support
| Target | Anti-Bot | Method |
|---|---|---|
| Static HTML | None | web_fetch tool (built-in) |
| Dynamic / JS-rendered | Low | playwright-simple.js |
| Cloudflare / 403 | High | playwright-stealth.js |
```bash
# Simple (fast, no anti-bot)
node skills/playwright-scraper-skill/scripts/playwright-simple.js "https://example.com"

# Stealth (Cloudflare/anti-bot)
node skills/playwright-scraper-skill/scripts/playwright-stealth.js "https://example.com"

# Env options
WAIT_TIME=8000 SAVE_HTML=true SCREENSHOT_PATH=/tmp/shot.png node scripts/playwright-stealth.js "https://example.com"
HEADLESS=false node scripts/playwright-stealth.js "https://example.com"  # show browser
```
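When driving the scripts from another Node program, it can help to build the command line and environment in one place. The following is a minimal sketch, not part of the skill itself: `buildInvocation` and its option names (`waitTime`, `saveHtml`, `screenshotPath`, `headless`) are hypothetical, and only the script paths and environment variable names come from the commands above. The variables are documented here only for the stealth script; whether the simple script honors them is not stated.

```javascript
const path = require('path');

// Script locations as shown in the commands above.
const SCRIPT_DIR = 'skills/playwright-scraper-skill/scripts';
const SCRIPTS = {
  simple: 'playwright-simple.js',
  stealth: 'playwright-stealth.js',
};

// Hypothetical helper: returns the command, arguments, and extra
// environment variables for one scraper run. It does not spawn anything.
function buildInvocation(mode, url, options = {}) {
  const script = SCRIPTS[mode];
  if (!script) throw new Error(`unknown mode: ${mode}`);

  // Only set the variables the caller asked for, so the script's own
  // defaults (whatever they are) stay in effect otherwise.
  const env = {};
  if (options.waitTime !== undefined) env.WAIT_TIME = String(options.waitTime);
  if (options.saveHtml !== undefined) env.SAVE_HTML = String(options.saveHtml);
  if (options.screenshotPath !== undefined) env.SCREENSHOT_PATH = options.screenshotPath;
  if (options.headless !== undefined) env.HEADLESS = String(options.headless);

  return {
    command: 'node',
    args: [path.posix.join(SCRIPT_DIR, script), url],
    env,
  };
}
```

Assuming the scripts exist at that path, the result could be passed to `execFileSync(inv.command, inv.args, { env: { ...process.env, ...inv.env }, encoding: 'utf8' })` from `child_process`. Passing the URL as a separate argument avoids shell quoting problems.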
Both scripts print JSON to stdout:
```json
{
  "title": "Page Title",
  "url": "https://...",
  "content": "text content (up to 5000 chars)",
  "elapsedSeconds": "3.45",
  "screenshot": "/path/to/screenshot.png"
}
```
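A caller will usually want to parse and sanity-check this output before using it. The sketch below is one possible approach. `parseScrapeResult` is a hypothetical helper, and it assumes that `screenshot` is only present when a screenshot was actually saved, which the output format above does not confirm.

```javascript
// Hypothetical helper: parse the JSON printed by either script and
// check that the documented string fields are present.
function parseScrapeResult(stdout) {
  const data = JSON.parse(stdout);

  for (const key of ['title', 'url', 'content', 'elapsedSeconds']) {
    if (typeof data[key] !== 'string') {
      throw new TypeError(`expected string field "${key}"`);
    }
  }

  return {
    title: data.title,
    url: data.url,
    content: data.content,
    // elapsedSeconds arrives as a string such as "3.45"; convert it
    // to a number so it can be compared or summed.
    elapsedSeconds: Number.parseFloat(data.elapsedSeconds),
    // Assumption: the field is absent when no screenshot was taken.
    screenshot: data.screenshot ?? null,
  };
}
```

Note that `elapsedSeconds` is a string in the raw output, so comparing it numerically without conversion would give misleading results.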
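1. Try web_fetch first: it is the fastest option, with zero overhead.
2. If the page is dynamic or JS-rendered, use playwright-simple.js.
3. If the site is behind Cloudflare or returns 403, use playwright-stealth.js.
4. If the stealth run still fails, increase WAIT_TIME, then try HEADLESS=false.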
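The escalation order (web_fetch, then playwright-simple.js, then playwright-stealth.js) can be expressed as a small loop that tries each method until one succeeds. This is only a sketch under stated assumptions: `escalate` is a hypothetical helper, and the runners below are stubs. Real runners would call web_fetch or spawn the scripts, and would throw when the response is empty or blocked.

```javascript
// Hypothetical escalation helper. Each step is { name, run }, where
// run(url) returns a result on success or throws on failure.
function escalate(url, steps) {
  const failures = [];
  for (const step of steps) {
    try {
      const result = step.run(url);
      return { method: step.name, result, failures };
    } catch (err) {
      // Record why this method failed and move on to the next one.
      failures.push({ method: step.name, reason: err.message });
    }
  }
  const error = new Error(`all methods failed for ${url}`);
  error.failures = failures;
  throw error;
}

// Stub runners for illustration only.
const demo = escalate('https://example.com', [
  { name: 'web_fetch', run: () => { throw new Error('empty body'); } },
  { name: 'playwright-simple.js', run: () => { throw new Error('403'); } },
  { name: 'playwright-stealth.js', run: (u) => ({ url: u, title: 'Example' }) },
]);
console.log(demo.method); // playwright-stealth.js
```

Keeping the list of failures makes it easier to see which cheaper methods were tried before falling back to the stealth script.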