Curate proxywhirl proxy sources — prune broken/stale/duplicate sources and discover new high-quality ones via web research. Syncs all downstream files: sources.py, tests, docs, and reference inventories. Use when asked to update, clean up, audit, curate, or refresh proxy sources.
Maintain the proxy source list in proxywhirl/sources.py and keep all downstream files in sync: tests, docs, reference inventories, and exports.
Invoke the source-curator agent:
/agent source-curator
Or run the validation script directly:
uv run python scripts/curate_sources.py validate # Full validation report (JSON)
uv run python scripts/curate_sources.py check-candidate URL # Check a candidate source
Run scripts/curate_sources.py validate to get the health status of all sources.
See the full file inventory in .claude/agents/source-curator.md.
Key files: proxywhirl/sources.py (primary), references/known-sources.md (inventory), tests/unit/test_sources.py, tests/contract/test_proxy_sources.py, tests/unit/test_sources_audit.py, README.md (feature card count), and all code that imports sources (cli.py, tui.py, exports.py, __init__.py).
| Condition | Action |
|---|---|
| HTTP 404/410 for 7+ days | Remove entirely |
| GitHub repo archived | Remove entirely |
| GitHub repo not pushed in 90+ days | Disable (enabled=False) |
| GitHub repo <10 stars | Flag for review |
| Empty response / no proxy data | Disable if persistent |
| Duplicate of existing source | Remove the lower-quality one |
| In RECOMMENDED_SOURCES | NEVER prune without user approval |
| In API_SOURCES | NEVER prune (high-value) |
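
The pruning rules above can be read as a small decision function. The following is a minimal sketch, not the logic in scripts/curate_sources.py: the `SourceHealth` record, its field names, and the `Action` enum are hypothetical, and the precedence (remove before disable before flag, with protected collections overriding both) is an assumption about how overlapping conditions might be resolved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"
    FLAG = "flag for review"
    DISABLE = "disable (enabled=False)"
    REMOVE = "remove entirely"
    NEEDS_APPROVAL = "ask the user first"


@dataclass
class SourceHealth:
    """Hypothetical health record; every field name here is illustrative."""
    days_with_404_or_410: int = 0
    repo_archived: bool = False
    days_since_last_push: Optional[int] = None  # None for non-GitHub sources
    stars: Optional[int] = None                 # None for non-GitHub sources
    persistently_empty: bool = False
    duplicate_of_better_source: bool = False
    in_recommended: bool = False
    in_api_sources: bool = False


def raw_action(h: SourceHealth) -> Action:
    """Apply the condition/action table, ignoring protected collections."""
    if h.days_with_404_or_410 >= 7 or h.repo_archived or h.duplicate_of_better_source:
        return Action.REMOVE
    stale = h.days_since_last_push is not None and h.days_since_last_push >= 90
    if stale or h.persistently_empty:
        return Action.DISABLE
    if h.stars is not None and h.stars < 10:
        return Action.FLAG
    return Action.KEEP


def decide(h: SourceHealth) -> Action:
    """Layer the two NEVER-prune rules on top of the raw action."""
    action = raw_action(h)
    if action in (Action.REMOVE, Action.DISABLE):
        if h.in_api_sources:
            return Action.KEEP            # API_SOURCES are never pruned
        if h.in_recommended:
            return Action.NEEDS_APPROVAL  # RECOMMENDED_SOURCES need user sign-off
    return action
```

For example, `decide(SourceHealth(repo_archived=True, in_recommended=True))` yields `Action.NEEDS_APPROVAL` rather than a removal, which matches the last two rows of the table.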
New sources must meet ALL criteria:
| Source Type | Pattern | Example |
|---|---|---|
| GitHub repo | GITHUB_{OWNER}_{PROTOCOL} | GITHUB_MONOSANS_HTTP |
| API service | {SERVICE}_{PROTOCOL} | PROXY_SCRAPE_SOCKS5 |
| Web-hosted | {DOMAIN}_{PROTOCOL} | PROXYSPACE_HTTP |
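
As an illustration of the naming patterns, a helper along these lines could derive a constant name from its parts. This is a sketch under assumptions: `constant_name` and its `kind` argument are hypothetical and not part of proxywhirl, and the caller is assumed to pass the owner, service, or domain label already reduced to the form it should take in the name.

```python
import re


def _token(text: str) -> str:
    """Upper-case the text and collapse runs of non-alphanumerics into one underscore."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").upper()


def constant_name(kind: str, name: str, protocol: str) -> str:
    """Build a source constant name.

    kind: "github", "api", or "web" (hypothetical labels for the three source types).
    name: repo owner, service name, or domain label.
    """
    prefix = "GITHUB_" if kind == "github" else ""
    return f"{prefix}{_token(name)}_{_token(protocol)}"


print(constant_name("github", "monosans", "http"))    # GITHUB_MONOSANS_HTTP
print(constant_name("api", "proxy-scrape", "socks5"))  # PROXY_SCRAPE_SOCKS5
print(constant_name("web", "proxyspace", "http"))      # PROXYSPACE_HTTP
```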
| Collection | Criteria |
|---|---|
| ALL_HTTP_SOURCES | Any source providing HTTP/HTTPS proxies |
| ALL_SOCKS4_SOURCES | Any source providing SOCKS4 proxies |
| ALL_SOCKS5_SOURCES | Any source providing SOCKS5 proxies |
| ALL_SOURCES | Auto-computed: HTTP + SOCKS4 + SOCKS5 |
| RECOMMENDED_SOURCES | Only the most reliable, fast, pre-validated sources (user approval required) |
| API_SOURCES | Only API-based sources (GeoNode, etc.) |
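
One way to picture how the collections relate is the sketch below. It uses a stand-in `Source` dataclass and placeholder URLs rather than the real `ProxySourceConfig` objects, and `check_collections` is a hypothetical helper. It assumes that RECOMMENDED_SOURCES and API_SOURCES are meant to be subsets of ALL_SOURCES and that each URL appears only once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Stand-in for ProxySourceConfig with only the fields this sketch needs."""
    url: str
    protocol: str = "http"


# Placeholder collections mirroring the table; the URLs are not real sources.
ALL_HTTP_SOURCES = [Source("https://example.invalid/http.txt")]
ALL_SOCKS4_SOURCES = [Source("https://example.invalid/socks4.txt", "socks4")]
ALL_SOCKS5_SOURCES = [Source("https://example.invalid/socks5.txt", "socks5")]

# ALL_SOURCES is derived from the per-protocol lists, never edited by hand.
ALL_SOURCES = ALL_HTTP_SOURCES + ALL_SOCKS4_SOURCES + ALL_SOCKS5_SOURCES
RECOMMENDED_SOURCES = [ALL_HTTP_SOURCES[0]]


def check_collections(all_sources, subsets):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    seen = set()
    for source in all_sources:
        if source.url in seen:
            problems.append(f"duplicate URL in ALL_SOURCES: {source.url}")
        seen.add(source.url)
    for name, members in subsets.items():
        for source in members:
            if source.url not in seen:
                problems.append(f"{name} has a source missing from ALL_SOURCES: {source.url}")
    return problems


print(check_collections(ALL_SOURCES, {"RECOMMENDED_SOURCES": RECOMMENDED_SOURCES}))
```

Because ALL_SOURCES is a concatenation, adding a source to the right per-protocol list is enough for it to show up in the combined list.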
# owner/repo-name - Description (stars, update frequency)
GITHUB_OWNER_PROTOCOL = ProxySourceConfig(
    url="https://raw.githubusercontent.com/owner/repo/branch/file.txt",
    format="plain_text",
    protocol="socks5",  # omit for HTTP (default)
    trusted=True,  # only if source pre-validates proxies
)
See the full checklist in .claude/agents/source-curator.md.
Quick verification:
uv run python -c "from proxywhirl.sources import ALL_SOURCES; print(f'{len(ALL_SOURCES)} sources loaded')"
uv run pytest tests/unit/test_sources.py tests/unit/test_sources_audit.py tests/contract/test_proxy_sources.py -v