MinerU Extract (official API)

Use MinerU as an upstream “content normalizer”: submit a URL to MinerU, poll for completion, download the result zip, and extract the main Markdown.

Quick start (MCP-aligned)

We align to the MinerU MCP mental model, but we do not run an MCP server.

Primary script (MCP-style): scripts/mineru_parse_documents.py
- Input: --file-sources (comma/newline-separated)
- Output: JSON contract on stdout: { ok, items, errors }
Low-level script (single URL): scripts/mineru_extract.py

Auth:

Default model heuristic:

Use MinerU as an upstream “content normalizer”: submit a URL to MinerU, poll for completion, download the result zip, and extract the main Markdown.

We align to the MinerU MCP mental model, but we do not run an MCP server.

Primary script (MCP-style): scripts/mineru_parse_documents.py
- Input: --file-sources (comma/newline-separated)
- Output: JSON contract on stdout: { ok, items, errors }
Low-level script (single URL): scripts/mineru_extract.py

Auth:

Default model heuristic: