Convert PDF files to well-structured Markdown documents. Extracts text with heading detection, optional image extraction, and page range selection. Use this skill whenever the user wants to convert, extract, or transform any PDF into Markdown — including research papers, technical docs, reports, manuals, or datasheets. Trigger even when the user says things like "extract text from this PDF", "convert PDF to md", "turn this PDF into markdown", or "get the content from this PDF".
Convert a PDF file into a clean, well-structured Markdown document — preserving headings, paragraphs, and optionally images.
- `#`, `##`, `###` headings
- Horizontal rule separators (`---`) between pages for clarity

Install the dependency: `pip install pymupdf`
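The heading detection mentioned above is not specified in this document. One common heuristic is to compare each line's font size with the body text size. The following is a minimal sketch of that idea only, not the script's actual logic: the function names `heading_prefix` and `to_markdown_line`, the ratio thresholds, and the default body size of 11 pt are all assumptions chosen for illustration.

```python
def heading_prefix(font_size: float, body_size: float) -> str:
    """Map a font size to a Markdown heading prefix (thresholds are assumed)."""
    ratio = font_size / body_size
    if ratio >= 1.6:
        return "# "
    if ratio >= 1.3:
        return "## "
    if ratio >= 1.1:
        return "### "
    return ""  # body text: no heading marker


def to_markdown_line(text: str, font_size: float, body_size: float = 11.0) -> str:
    """Prefix a line of extracted text with a heading marker when it looks like one."""
    return heading_prefix(font_size, body_size) + text.strip()


print(to_markdown_line("Introduction", 20.0))
print(to_markdown_line("Methods", 15.0))
print(to_markdown_line("Plain paragraph text.", 11.0))
```

A size-ratio rule like this tends to misfire on PDFs with unusual typography, which is one reason a plain-text mode such as `--no-formatting` is useful as a fallback.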
scripts/pdf_to_markdown.py
python scripts/pdf_to_markdown.py input.pdf output.md
# Extract images alongside the markdown
python scripts/pdf_to_markdown.py input.pdf output.md --images
# Convert only pages 1 through 5
python scripts/pdf_to_markdown.py input.pdf output.md --pages "1-5"
# Convert specific pages
python scripts/pdf_to_markdown.py input.pdf output.md --pages "1,3,7-10"
# Plain text extraction without heading detection
python scripts/pdf_to_markdown.py input.pdf output.md --no-formatting
# Combined options
python scripts/pdf_to_markdown.py input.pdf output.md --images --pages "1-20"
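The `--pages` values shown above mix single pages and inclusive ranges. As an illustration of how such a spec could be expanded, here is a small sketch; `parse_page_spec` is a hypothetical helper, and the script's own parser may behave differently (for example in how it treats duplicates or out-of-range pages).

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3,7-10" into sorted, unique 1-based page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(parse_page_spec("1,3,7-10"))
```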
When a user asks to convert a PDF to Markdown:
- Input: `.pdf` files
- Command: `python .claude/skills/pdf-to-markdown/scripts/pdf_to_markdown.py <input.pdf> <output>.md [--images] [--pages "..."] [--no-formatting]`
Place the output alongside the input file unless the user specifies a different location.

| Situation | Recommendation |
|---|---|
| User doesn't mention images | Skip `--images` (text-only is faster and cleaner) |
| User wants figures or diagrams preserved | Use `--images` |
| User wants only specific pages | Use `--pages "1-5"` with the requested range |
| PDF has complex multi-column layout | Consider `--no-formatting` for cleaner raw text |
| User mentions a large PDF (100+ pages) | Suggest `--pages` to process in smaller batches |
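The recommendations in the table can be turned into a command line mechanically. The sketch below assembles the argument list from a few choices and defaults the output path to sit next to the input file. `build_command` and its parameters are hypothetical names for illustration; only the script path and the three flags come from this document.

```python
from pathlib import Path

SCRIPT = "scripts/pdf_to_markdown.py"  # path as used in the usage examples


def build_command(pdf_path, output=None, images=False, pages=None, no_formatting=False):
    """Return the argument list for one conversion run."""
    pdf = Path(pdf_path)
    # Default: write the Markdown file alongside the input PDF.
    out = Path(output) if output is not None else pdf.with_suffix(".md")
    command = ["python", SCRIPT, str(pdf), str(out)]
    if images:
        command.append("--images")
    if pages:
        command += ["--pages", pages]
    if no_formatting:
        command.append("--no-formatting")
    return command


print(build_command("report.pdf", pages="1-5"))
```

The resulting list can be passed to `subprocess.run` directly, which avoids shell quoting problems with the page spec.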
After running the script, Claude may optionally:

| Error | Cause | Fix |
|---|---|---|
| Input file not found | Wrong path | Verify the file path and confirm filename |
| Missing dependency — pymupdf | PyMuPDF not installed | `pip install pymupdf` |
| Warning: not a PDF | Non-.pdf extension | Check if the file is actually a PDF |
| Poor text extraction | Scanned/image-based PDF | The PDF may need OCR preprocessing (e.g., `ocrmypdf`) |
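Several of these errors can be caught before the script runs. The following is a rough preflight sketch under stated assumptions: `preflight` and `looks_like_pdf` are hypothetical helpers, the message strings only mirror the table above, and the header check relies on PDF files normally beginning with the bytes `%PDF-`.

```python
import importlib.util
from pathlib import Path


def looks_like_pdf(path) -> bool:
    """Return True if the file starts with the usual PDF header bytes."""
    try:
        with open(path, "rb") as handle:
            return handle.read(5) == b"%PDF-"
    except OSError:
        return False  # missing or unreadable file


def preflight(pdf_path) -> list[str]:
    """Collect problems that would likely make the conversion fail."""
    problems = []
    path = Path(pdf_path)
    if not path.is_file():
        problems.append(f"Input file not found: {path}")
    elif path.suffix.lower() != ".pdf" and not looks_like_pdf(path):
        problems.append(f"Warning: not a PDF: {path}")
    # PyMuPDF is importable as "pymupdf" in recent releases and "fitz" in older ones.
    if all(importlib.util.find_spec(name) is None for name in ("pymupdf", "fitz")):
        problems.append("Missing dependency: run `pip install pymupdf`")
    return problems


for problem in preflight("input.pdf"):
    print(problem)
```

Scanned, image-only PDFs pass all of these checks and still yield little text, so poor extraction has to be judged from the output.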