Extract text from PDFs as structured, semantic Markdown. Use when converting a PDF to Markdown, extracting text from a PDF, processing one or more PDFs into Markdown output, reading PDF contents for analysis, ingesting documents for RAG pipelines, preparing PDFs for LLM context, or any task where PDF text needs to be in a machine-readable format. ALWAYS use this skill when the user has a PDF and needs its content as text or Markdown — even if they don't explicitly say "convert to markdown".
Convert PDFs into structured, semantic Markdown that preserves the document's logical structure (headings, tables, lists, and reading order) rather than producing flat text. The result is significantly higher quality than reading a PDF directly with the read tool, which extracts only raw text without structure.
Before running any commands, set SKILL_DIR to the absolute path of the directory containing this SKILL.md file. Use $SKILL_DIR/bin/pdf-to-markdown in all commands below.
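A minimal sketch of setting SKILL_DIR in a POSIX-style shell, assuming the location of SKILL.md is already known. The helper name `resolve_skill_dir` and the example path are hypothetical and not part of the skill itself.

```shell
# Hypothetical helper: print the absolute directory that contains the given file.
resolve_skill_dir() {
  # cd into the file's directory inside a subshell and print its physical path,
  # so the caller's working directory is left untouched
  (cd -- "$(dirname -- "$1")" && pwd -P)
}

# Example (the path is a placeholder for wherever this SKILL.md actually lives):
# SKILL_DIR="$(resolve_skill_dir /path/to/skill/SKILL.md)"
# "$SKILL_DIR/bin/pdf-to-markdown" INPUT.pdf OUTPUT.md
```

Resolving to an absolute path means later commands keep working even if the working directory changes between calls.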
The $SKILL_DIR/bin/pdf-to-markdown wrapper automatically downloads the platform-specific binary from the CDN and installs it into ~/.local/share/nutrient/cli/. The binary is cached and the wrapper checks for updates only every 6 hours, so subsequent runs are fast.
$SKILL_DIR/bin/pdf-to-markdown INPUT.pdf OUTPUT.md
If OUTPUT.md is omitted, the converter writes the Markdown to stdout instead.
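Because the Markdown goes to stdout when no output path is given, it can be piped into other tools or captured in a variable. The following is a sketch only, assuming SKILL_DIR is already set; the helper name `pdf_preview` is hypothetical.

```shell
# Hypothetical helper: print the first N lines (default 20) of the converted Markdown.
# Assumes SKILL_DIR points at the directory containing SKILL.md.
pdf_preview() {
  "$SKILL_DIR/bin/pdf-to-markdown" "$1" | head -n "${2:-20}"
}

# Example: capture the full Markdown in a shell variable instead of a file.
# markdown="$("$SKILL_DIR/bin/pdf-to-markdown" INPUT.pdf)"
```

A preview like this is useful for checking that headings and tables came through before writing the full output to disk.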
For multiple files, pass directories instead of individual files. The converter processes all PDFs in the input directory in parallel, which is much faster than converting one at a time.
$SKILL_DIR/bin/pdf-to-markdown INPUT_DIR/ OUTPUT_DIR/
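The directory form can be wrapped in a small function that validates its arguments first. This is a sketch under stated assumptions: SKILL_DIR is already set, the helper name `convert_pdf_dir` is hypothetical, and the `mkdir -p` is only a precaution, since the text above does not say whether the converter creates OUTPUT_DIR itself.

```shell
# Hypothetical helper: convert every PDF in a directory with a single invocation.
# Assumes SKILL_DIR points at the directory containing SKILL.md.
convert_pdf_dir() {
  in_dir=$1
  out_dir=$2
  # Fail early if the input is not a directory
  [ -d "$in_dir" ] || { echo "not a directory: $in_dir" >&2; return 1; }
  # Precaution only: it is not documented whether the converter creates OUTPUT_DIR
  mkdir -p "$out_dir" || return 1
  # One call for the whole directory, so the converter can parallelize internally
  "$SKILL_DIR/bin/pdf-to-markdown" "$in_dir/" "$out_dir/"
}

# Example:
# convert_pdf_dir ./pdfs ./markdown
```

Passing the directory once, rather than looping over files in the shell, leaves the parallel processing to the converter.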
$SKILL_DIR/bin/pdf-to-markdown INPUT [OUTPUT]
Free for processing up to 1,000 documents per calendar month.
Commercial license required for:
Contact [email protected] for commercial licensing.