Name: PDF Reading
Author: aniketpanjwani

Use this skill for PDF-first work on papers and reports. The default path is Docling-first.

Rules

Never read a PDF directly. Convert it first, then read the generated artifact.
Default to Docling for reading. Use --fast only when the user explicitly prefers speed over fidelity.
Extract figures or tables only when the user asks for them or when the answer depends on them.
For tables, prefer the cleaned Markdown for quick review, but inspect the crop image when numeric fidelity matters.
If values or labels look suspicious, cite the crop or page artifact rather than trusting the structured export.
If Docling fails, fall back to pdftotext -layout and continue with a degraded-quality warning.
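The fallback rule can be sketched as follows. This is a minimal illustration, not the contents of scripts/pdf_extract.py: the function names and the warning text are hypothetical. It assumes Docling's Python package is installed for the primary path and that the pdftotext binary from Poppler is on PATH for the fallback.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Tuple


def docling_to_markdown(pdf_path: Path) -> str:
    """Primary path: convert the PDF to Markdown with Docling."""
    # Imported lazily so a missing Docling install also triggers the fallback.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(str(pdf_path))
    return result.document.export_to_markdown()


def pdftotext_layout(pdf_path: Path) -> str:
    """Fallback path: plain text via `pdftotext -layout`, written to stdout ("-")."""
    proc = subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return proc.stdout


def convert_with_fallback(
    pdf_path: Path,
    primary: Callable[[Path], str] = docling_to_markdown,
    fallback: Callable[[Path], str] = pdftotext_layout,
) -> Tuple[str, Optional[str]]:
    """Return (text, warning). warning is None when the primary path succeeded."""
    try:
        return primary(pdf_path), None
    except Exception as exc:  # any Docling failure, including ImportError
        warning = (
            f"Docling failed ({exc!r}); using pdftotext -layout. "
            "Output is degraded quality: tables and figures may be lost."
        )
        return fallback(pdf_path), warning
```

The caller reads the returned text, never the PDF itself, and surfaces the warning to the user whenever it is not None.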

Quick Start

Run the extractor on a PDF:

python3 scripts/pdf_extract.py /path/to/paper.pdf

Extract figures with structural labels:
