Manage the Grounded Docs MCP Server documentation index. Covers scraping and indexing documentation from URLs or local files, refreshing existing indexes with changed content, and removing libraries from the index. Use when you need to add, update, or delete indexed documentation.
Index, refresh, and remove library documentation in the local Grounded Docs store. These commands modify the index and produce plain-text status messages on stdout.
Download and index documentation from a URL or local directory.
npx @arabold/docs-mcp-server@latest scrape <library> <url> [options]
| Flag | Alias | Default | Description |
|---|---|---|---|
--version <ver> | -v | Library version label |
--max-pages <n> | -p | config default | Maximum pages to scrape |
--max-depth <n> | -d | config default | Maximum navigation depth |
--max-concurrency <n> | -c | config default | Concurrent page requests |
--ignore-errors | true | Continue on individual page errors |
--scope subpages|hostname|domain | subpages | Crawling boundary |
--follow-redirects | true | Follow HTTP redirects |
--no-follow-redirects | Disable following redirects |
--scrape-mode auto|fetch|playwright | auto | HTML processing strategy |
--include-pattern <glob> | URL include pattern (repeatable) |
--exclude-pattern <glob> | URL exclude pattern (repeatable, takes precedence) |
--header "Name: Value" | Custom HTTP header (repeatable) |
--embedding-model <model> | Embedding model configuration |
--server-url <url> | Remote pipeline worker URL |
--clean | true | Clear existing documents before scraping |
--quiet | Suppress non-error diagnostics |
--verbose | Enable debug logging |
Examples:
# Scrape React docs, version-tagged
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react --version 19.0.0
# Scrape local files
npx @arabold/docs-mcp-server@latest scrape mylib file:///Users/me/docs/my-library
# Scrape with depth and page limits
npx @arabold/docs-mcp-server@latest scrape nextjs https://nextjs.org/docs --max-pages 200 --max-depth 3
# Scrape with custom headers (e.g. authentication)
npx @arabold/docs-mcp-server@latest scrape internal-api https://docs.internal.com \
--header "Authorization: Bearer tok_xxx"
# Exclude changelog pages
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react \
--exclude-pattern "**/changelog*"
Output is a plain-text status line, e.g. Successfully scraped 42 pages.
Progress updates appear on stderr during the run.
Re-scrape an existing library version, skipping unchanged pages via HTTP ETags.
npx @arabold/docs-mcp-server@latest refresh <library> [options]
| Flag | Alias | Description |
|---|---|---|
--version <ver> | -v | Version to refresh (omit for latest) |
--embedding-model <model> | Embedding model configuration | |
--server-url <url> | Remote pipeline worker URL | |
--quiet | Suppress non-error diagnostics | |
--verbose | Enable debug logging |
Example:
npx @arabold/docs-mcp-server@latest refresh react --version 19.0.0
The library and version must already be indexed. Use scrape for first-time
indexing.
Delete a library (or a specific version) from the index.
npx @arabold/docs-mcp-server@latest remove <library> [options]
| Flag | Alias | Description |
|---|---|---|
--version <ver> | -v | Specific version to remove (omit to remove latest) |
--server-url <url> | Remote pipeline worker URL | |
--quiet | Suppress non-error diagnostics | |
--verbose | Enable debug logging |
Example:
npx @arabold/docs-mcp-server@latest remove react --version 18.3.1
This is destructive and cannot be undone. Re-run scrape to re-index.
All three commands write plain-text status messages to stdout and
diagnostics to stderr. The global --output flag is accepted but has no
effect because the output is plain text, not structured data.
In non-interactive sessions, diagnostics are suppressed by default. Use
--verbose (or set LOG_LEVEL=INFO) to re-enable them. Use --quiet to
suppress all non-error diagnostics regardless of session type.
# 1. Index documentation for the first time
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react --version 19.0.0
# 2. Later, refresh to pick up any changes
npx @arabold/docs-mcp-server@latest refresh react --version 19.0.0
# 3. Clean up old versions
npx @arabold/docs-mcp-server@latest remove react --version 18.3.1
--max-pages and --max-depth to limit
scope when you only need a subset.file:// URL scheme
(e.g. file:///absolute/path/to/docs).--clean is on by default for scrape, meaning existing documents for
the same library+version are removed before re-indexing. Pass --no-clean to
append instead.refresh only works on previously indexed content. It uses HTTP ETags to
skip pages that have not changed, making it much faster than a full re-scrape.