PDF converter powered by MinerU — convert PDF to Word, Markdown, HTML, LaTeX, or plain text. Also handles image-to-text OCR, scanned document recognition, and Office formats (DOCX, PPTX, Excel). Supports 80+ languages. Use this skill when the user wants to convert, extract, read, parse, or summarize any PDF or document. Also applies when the user shares a PDF file or link and asks about its content, needs tables or formulas extracted, wants PDF OCR, or says things like 'turn this into a doc' or 'what does this paper say'.
Convert PDF, images, Office docs, and more to clean Markdown using the MinerU Open API CLI. No API key needed for basic use.
Reply to the user in the SAME language they use. This is non-negotiable.
Extraction is often just the first step. The typical flow is:
- Run mineru-open-api to convert the document to Markdown

MinerU outputs raw Markdown; it doesn't interpret or restructure the content. If the user asks to "extract the tables", "summarize the paper", or "find the key findings", you need to read the output and do that work yourself. MinerU handles the OCR and layout; you handle the understanding.
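As a rough illustration of that downstream work, the sketch below pulls the body of a references section out of converted Markdown. The function name `references_section` is hypothetical, and it assumes the converted output marks sections with `#`-style headings and titles the section "References"; real MinerU output may be structured differently.

```sh
# Hypothetical filter: print the body of a "References" section from Markdown on stdin.
# Assumes the section starts at a heading containing "references" and ends at the next heading.
references_section() {
    awk '
        /^#+ / {                                    # any ATX heading line
            in_refs = (tolower($0) ~ /references/)  # enter or leave the section
            next                                    # do not print the heading itself
        }
        in_refs { print }
    '
}

# Self-contained demo on inline sample text (not real MinerU output):
printf '%s\n' '# Intro' 'Some text.' '## References' '[1] First entry' '[2] Second entry' '# Appendix' 'More text.' \
    | references_section
```

In practice the filter would sit at the end of a pipe, for example `mineru-open-api flash-extract paper.pdf | references_section`, since flash-extract writes to stdout when -o is omitted.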
Use -o to save to a file when the user wants persistent output (conversion, batch processing). Skip -o and read stdout directly when the content is consumed immediately (summarization, Q&A).
For example:
- -o to save to file, done
- -o to save, then read the file and pull out the tables
- -o, then parse the references section

Best for quick reads. No API key, no setup.
mineru-open-api flash-extract report.pdf # to stdout (for immediate consumption)
mineru-open-api flash-extract report.pdf -o ./output/ # save to file
mineru-open-api flash-extract report.pdf -o ./output/ --pages 1-10 # page range
mineru-open-api flash-extract report.pdf -o ./output/ --language en # language hint
mineru-open-api flash-extract https://example.com/paper.pdf # URL input
Supports: PDF, images (PNG, JPG, WebP...), DOCX, PPTX, Excel (XLS, XLSX)
Limits: 10 MB / 20 pages per document
Output: Markdown only; images, tables, and formulas may become placeholders
Use flash-extract as the default unless the user needs more.
Use when the user needs full-fidelity output: preserved images, accurate tables, LaTeX formulas, or non-Markdown formats. Requires a token via mineru-open-api auth.
mineru-open-api extract report.pdf # to stdout
mineru-open-api extract report.pdf -o ./out/ # save with all assets
mineru-open-api extract report.pdf -o ./out/ -f md,docx # multiple output formats
mineru-open-api extract report.pdf -o ./out/ --ocr # force OCR for scanned docs
mineru-open-api extract *.pdf -o ./results/ # batch processing
mineru-open-api extract --list files.txt -o ./results/ # batch from file list
Supports: PDF, images, DOC, DOCX, PPT, PPTX, HTML
Limits: 200 MB / 600 pages per document
Output formats: md, json, html, latex, docx (comma-separated with -f)
Features: formula recognition (on by default), table recognition (on by default), OCR toggle, batch mode, model selection (vlm, pipeline, html)
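A small pre-flight check can catch a mistyped -f value before a document is uploaded. The sketch below is a hypothetical helper (`valid_formats` is not part of the CLI) that accepts only the formats listed above; it prints the command rather than running it.

```sh
# Hypothetical pre-flight check for a comma-separated -f value.
# Accepts only the output formats listed above: md, json, html, latex, docx.
valid_formats() {
    old_ifs=$IFS
    IFS=,
    for fmt in $1; do
        case "$fmt" in
            md|json|html|latex|docx) ;;   # known format, keep going
            *)
                IFS=$old_ifs
                echo "unsupported format: $fmt" >&2
                return 1
                ;;
        esac
    done
    IFS=$old_ifs
    [ -n "$1" ]   # an empty list is also rejected
}

# Print (rather than run) the command once the format list checks out.
if valid_formats "md,docx"; then
    echo "mineru-open-api extract report.pdf -o ./out/ -f md,docx"
fi
```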
If the user hasn't authenticated yet, guide them to run mineru-open-api auth first.
| Situation | Mode |
|---|---|
| "What does this PDF say?" | flash-extract |
| Quick summary or content scan | flash-extract |
| Need images/tables/formulas preserved | extract |
| Document > 10 MB or > 20 pages | extract |
| Batch converting multiple files | extract |
| Need DOCX/LaTeX/HTML output | extract |
| Scanned document needs OCR | extract with --ocr |
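The table above can be approximated in a script. This is a minimal sketch, not part of the CLI: `choose_mode` is a hypothetical helper, the caller supplies size and page count, and treating 10 MB as 10 × 1024 × 1024 bytes (with both limits inclusive) is an assumption.

```sh
# Hypothetical helper: pick a MinerU mode from file size, page count, and fidelity needs.
# Thresholds follow the flash-extract limits (10 MB / 20 pages); the byte value is an assumption.
choose_mode() {
    size_bytes=$1            # file size in bytes
    pages=$2                 # page count
    needs_assets=${3:-no}    # "yes" if images/tables/formulas or non-Markdown output are needed

    max_bytes=$((10 * 1024 * 1024))
    if [ "$needs_assets" = "yes" ] || [ "$size_bytes" -gt "$max_bytes" ] || [ "$pages" -gt 20 ]; then
        echo "extract"
    else
        echo "flash-extract"
    fi
}

choose_mode 2500000 12        # prints: flash-extract
choose_mode 52428800 12       # prints: extract (file too large)
choose_mode 2500000 12 yes    # prints: extract (assets must be preserved)
```

Batch jobs and scanned documents are not covered by this sketch; per the table, both go to extract (with --ocr for scans).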
Default is ch (Chinese + English). Use --language to specify others. Common codes:
| Language | Code | Language | Code |
|---|---|---|---|
| Chinese + English | ch | Japanese | japan |
| English | en | Korean | korean |
| French | fr | Chinese Traditional | chinese_cht |
| German | de | Spanish | es |
| Russian | ru | Arabic | ar |
| Portuguese | pt | Hindi | hi |
| Italian | it | Vietnamese | vi |
| Thai | th | Turkish | tr |
80+ languages supported in total — use the PaddleOCR language code for any language not listed above.
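For scripted use, the table above can be turned into a lookup. The sketch below is illustrative only: `lang_code` is a hypothetical function that covers just the listed languages and expects the names spelled exactly as in the table.

```sh
# Hypothetical lookup: map a language name from the table to its --language code.
lang_code() {
    case "$1" in
        "Chinese + English")   echo ch ;;
        "Chinese Traditional") echo chinese_cht ;;
        English)    echo en ;;
        French)     echo fr ;;
        German)     echo de ;;
        Russian)    echo ru ;;
        Portuguese) echo pt ;;
        Italian)    echo it ;;
        Thai)       echo th ;;
        Japanese)   echo japan ;;
        Korean)     echo korean ;;
        Spanish)    echo es ;;
        Arabic)     echo ar ;;
        Hindi)      echo hi ;;
        Vietnamese) echo vi ;;
        Turkish)    echo tr ;;
        *) return 1 ;;   # not in the table: look up the PaddleOCR code by hand
    esac
}

# Print (rather than run) a command with the resolved code.
code=$(lang_code French) && echo "mineru-open-api flash-extract report.pdf --language $code"
```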
Both commands send the document to MinerU's API (mineru.net) for processing. This is a stateless API call with no persistent storage. MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
- Use the -v flag to see HTTP request/response details (e.g., mineru-open-api flash-extract report.pdf -v)
- npm i -g mineru-open-api (Node.js)
- uv tool install mineru-open-api (Python/uv)
- curl -fsSL https://cdn-mineru.openxlab.org.cn/open-api-cli/install.sh | sh
- irm https://cdn-mineru.openxlab.org.cn/open-api-cli/install.ps1 | iex
- Run mineru-open-api auth to set up your token
- Pass --timeout 600 (seconds)
- Set --language explicitly (e.g., --language en for English docs)
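Before reaching for one of the install commands above, a script can check whether the CLI is already on PATH. This is a minimal sketch; `require_cmd` is a hypothetical helper, not something the CLI provides.

```sh
# Hypothetical pre-flight check: succeed only if the named command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found on PATH; install it before continuing" >&2
    return 1
}

if require_cmd mineru-open-api; then
    echo "CLI available"
fi
```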