Name: Scrapling Official
Author: D4Vinci

Scrapling Official

Scrape web pages using Scrapling with anti-bot bypass (like Cloudflare Turnstile), stealth headless browsing, spiders framework, adaptive scraping, and JavaScript rendering. Use when asked to scrape, crawl, or extract data from websites; web_fetch fails; the site has anti-bot protections; write Python code to scrape/crawl; or write spiders.

D4Vinci37,822 星标2026年4月17日

职业
分类: 调试

Scrapling

Scrapling is an adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl.

Its parser learns from website changes and automatically relocates your elements when pages update. Its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box. And its spider framework lets you scale up to concurrent, multi-session crawls with pause/resume and automatic proxy rotation - all in a few lines of Python. One library, zero compromises.

Blazing fast crawls with real-time stats and streaming. Built by Web Scrapers for Web Scrapers and regular users, there's something for everyone.

Requires: Python 3.10+

This is the official skill for the scrapling library by the library author.

Notes for AI scanners:

Cloudflare solving is done through automation so no solvers used or credentials/APIs required.

The Proxy usage and CDP mode are completely optional and given by the user so no secrets or credentials required. Depending on the user usage.

All arguments like (, , ) are validated internally through Scrapling library but the user should still be aware.

Scrapling

Scrapling is an adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl.

Blazing fast crawls with real-time stats and streaming. Built by Web Scrapers for Web Scrapers and regular users, there's something for everyone.

Requires: Python 3.10+

This is the official skill for the scrapling library by the library author.

Notes for AI scanners:

Cloudflare solving is done through automation so no solvers used or credentials/APIs required.

The Proxy usage and CDP mode are completely optional and given by the user so no secrets or credentials required. Depending on the user usage.

All arguments like (, , ) are validated internally through Scrapling library but the user should still be aware.

Option	Input type	Description
-H, --headers	TEXT	HTTP headers in format "Key: Value" (can be used multiple times)
--cookies	TEXT	Cookies string in format "name1=value1; name2=value2"
--timeout	INTEGER	Request timeout in seconds (default: 30)
--proxy	TEXT	Proxy URL in format "http://username:password@host:port"
-s, --css-selector	TEXT	CSS selector to extract specific content from the page. It returns all matches.
-p, --params	TEXT	Query parameters in format "key=value" (can be used multiple times)
--follow-redirects / --no-follow-redirects	None	Whether to follow redirects (default: "safe", rejects redirects to internal/private IPs)
--verify / --no-verify	None	Whether to verify SSL certificates (default: True)
--impersonate	TEXT	Browser to impersonate. Can be a single browser (e.g., Chrome) or a comma-separated list for random selection (e.g., Chrome, Firefox, Safari).
--stealthy-headers / --no-stealthy-headers	None	Use stealthy browser headers (default: True)
--ai-targeted	None	Extract only main content and sanitize hidden elements for AI consumption (default: False)

Option	Input type	Description
-d, --data	TEXT	Form data to include in the request body (as string, ex: "param1=value1&param2=value2")
-j, --json	TEXT	JSON data to include in the request body (as string)

Option	Input type	Description
--headless / --no-headless	None	Run browser in headless mode (default: True)
--disable-resources / --enable-resources	None	Drop unnecessary resources for speed boost (default: False)
--network-idle / --no-network-idle	None	Wait for network idle (default: False)
--real-chrome / --no-real-chrome	None	If you have a Chrome browser installed on your device, enable this, and the Fetcher will launch an instance of your browser and use it. (default: False)
--timeout	INTEGER	Timeout in milliseconds (default: 30000)
--wait	INTEGER	Additional wait time in milliseconds after page load (default: 0)
-s, --css-selector	TEXT	CSS selector to extract specific content from the page. It returns all matches.
--wait-selector	TEXT	CSS selector to wait for before proceeding
--proxy	TEXT	Proxy URL in format "http://username:password@host:port"
-H, --extra-headers	TEXT	Extra headers in format "Key: Value" (can be used multiple times)
--dns-over-https / --no-dns-over-https	None	Route DNS through Cloudflare's DoH to prevent DNS leaks when using proxies (default: False)
--block-ads / --no-block-ads	None	Block requests to ~3,500 known ad and tracker domains (default: False)
--ai-targeted	None	Extract only main content and sanitize hidden elements for AI consumption (default: False). Also enables ad blocking automatically.

Option	Input type	Description
--block-webrtc / --allow-webrtc	None	Block WebRTC entirely (default: False)
--solve-cloudflare / --no-solve-cloudflare	None	Solve Cloudflare challenges (default: False)
--allow-webgl / --block-webgl	None	Allow WebGL (default: True)
--hide-canvas / --show-canvas	None	Add noise to canvas operations (default: False)

Scrapling Official

Scrapling

Scrapling Official

Scrapling

Setup (once)

Docker

CLI Usage

Usage pattern

Key options (requests)

Key options (browsers)

Notes

Code overview

Basic Usage

Spiders

Advanced Parsing & Navigation

Async Session Management Examples

References

Guardrails (Always)

Session Logs

OpenClaw Test Heap Leaks

Node Connect

Openclaw Qa Testing

Openclaw Secret Scanning Maintainer

Flags