Use this skill for PDF generation, conversion, inspection, extraction, editing, form filling, OCR, redaction, or render comparison. Triggers include requests to create a PDF; convert Markdown, HTML, LaTeX, DOCX, or PPTX to PDF; extract text, tables, or images; fill or inspect forms; OCR scans; compare revisions; or redact content.
Use the repo-local toolkit under tools/pdfs/. The default operating loop is:
Even if the user wants a PDF deliverable, PDF is not always the right authoring format.
python3 tools/pdfs/scripts/lo_convert_to_pdf.py ...
If you are hand-tuning line breaks in a programmatically generated PDF, stop and reconsider whether DOCX or PPTX is the better source format.
Render before and after any meaningful change:
python3 tools/pdfs/scripts/render_pdf.py input.pdf --out_dir /tmp/pdf-renders-in --dpi 200
python3 tools/pdfs/scripts/compare_renders.py before.pdf after.pdf --out_dir /tmp/pdf-diff --dpi 200
Rendered PNGs are the source of truth for layout QA. Do not trust extracted text alone for tables, forms, spacing, or clipping.
python3 tools/pdfs/scripts/pdf_inspect.py input.pdf
python3 tools/pdfs/scripts/pdf_extract.py text input.pdf --method pdfplumber
python3 tools/pdfs/scripts/pdf_extract.py tables input.pdf
python3 tools/pdfs/scripts/pdf_extract.py forms input.pdf --include_widgets
python3 tools/pdfs/scripts/pdf_edit.py paginate input.pdf -o output.pdf
python3 tools/pdfs/scripts/pdf_edit.py merge a.pdf b.pdf -o merged.pdf
python3 tools/pdfs/scripts/pdf_edit.py rotate input.pdf -o rotated.pdf --pages 1 --degrees 90
python3 tools/pdfs/scripts/pdf_preflight.py input.pdf
python3 tools/pdfs/scripts/pdf_redact.py text input.pdf redacted.pdf --text "secret" --ignore_case
python3 tools/pdfs/scripts/ocr_pdf.py scan.pdf -o searchable.pdf --force
python3 tools/pdfs/scripts/md_to_pdf.py input.md -o output.pdf
python3 tools/pdfs/scripts/html_to_pdf.py input.html -o output.pdf
python3 tools/pdfs/scripts/latex_to_pdf.py input.tex -o output.pdf
python3 tools/pdfs/scripts/lo_convert_to_pdf.py input.docx -o output.pdf
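Choosing among the four converters above comes down to the source file's extension. A minimal sketch of that dispatch follows; the `CONVERTERS` table and `convert_cmd` function are hypothetical names, and routing `.pptx` through `lo_convert_to_pdf.py` is an assumption based on it being the LibreOffice-based converter, not something the commands above show directly.

```python
from pathlib import Path

SCRIPTS = "tools/pdfs/scripts"  # repo-local toolkit path used throughout this document

# Extension -> converter script, taken from the commands listed above.
CONVERTERS = {
    ".md": "md_to_pdf.py",
    ".html": "html_to_pdf.py",
    ".tex": "latex_to_pdf.py",
    ".docx": "lo_convert_to_pdf.py",
    ".pptx": "lo_convert_to_pdf.py",  # assumption: LibreOffice path also handles PPTX
}


def convert_cmd(src, out):
    """Return the argv that would convert src to out, chosen by file extension."""
    suffix = Path(src).suffix.lower()
    if suffix not in CONVERTERS:
        raise ValueError(f"no converter listed for {suffix!r}")
    return ["python3", f"{SCRIPTS}/{CONVERTERS[suffix]}", src, "-o", out]
```

For example, `convert_cmd("input.md", "output.pdf")` reproduces the Markdown command above as an argument list that can be passed to `subprocess.run`.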
Best-effort Python path:
python3 tools/pdfs/scripts/pdf_edit.py fill-form in.pdf --values values.json -o out.pdf
If the form is stubborn, use the Node helpers:
bash tools/pdfs/js/install_deps.sh
node tools/pdfs/js/extract_form_fields.mjs --input in.pdf
node tools/pdfs/js/fill_form.mjs --input in.pdf --values values.json --output out.pdf --flatten
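The two form-filling paths above can be chained as a fallback. The sketch below is illustrative only: `fill_form_cmds` and `fill_with_fallback` are hypothetical names, and it assumes a non-zero exit code is how a "stubborn" form shows up, which may not hold (a fill can exit cleanly and still render wrong, so the output should still be rendered and inspected). It also assumes `install_deps.sh` has already been run for the Node path.

```python
import subprocess


def fill_form_cmds(src, values_json, out):
    """Return the Python-path and Node-path commands, in the order shown above."""
    python_path = ["python3", "tools/pdfs/scripts/pdf_edit.py", "fill-form",
                   src, "--values", values_json, "-o", out]
    node_path = ["node", "tools/pdfs/js/fill_form.mjs", "--input", src,
                 "--values", values_json, "--output", out, "--flatten"]
    return [python_path, node_path]


def fill_with_fallback(src, values_json, out, runner=None):
    """Try each path in turn; return the program name of the first that exits 0.

    runner takes an argv list and returns an exit code. It is injectable so the
    fallback logic can be exercised without the toolkit installed.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    for cmd in fill_form_cmds(src, values_json, out):
        if runner(cmd) == 0:
            return cmd[0]
    raise RuntimeError("both form-filling paths failed")
```

Keeping the runner injectable is a design choice for testing: the ordering and fallback can be verified with a stub before pointing it at real PDFs.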
tools/pdfs/tasks/js_tools.md: Node helpers for forms and PDF.js extraction
tools/pdfs/tasks/forms_debugging.md: widget-level debugging workflow
tools/pdfs/troubleshooting/common.md: renderer and OCR troubleshooting
tools/pdfs/examples/smoke_test.md: runnable smoke flows
tools/pdfs/scripts/render_pdf.py: render PDF pages to PNGs
tools/pdfs/scripts/compare_renders.py: render and diff two PDFs
tools/pdfs/scripts/pdf_inspect.py: metadata and structure overview
tools/pdfs/scripts/pdf_extract.py: text, tables, images, attachments, annotations, forms
tools/pdfs/scripts/pdf_edit.py: merge, split, rotate, crop, paginate, encrypt, optimize, fill-form
tools/pdfs/scripts/pdf_preflight.py: warnings and normalization hints
tools/pdfs/scripts/pdf_redact.py: true redaction
tools/pdfs/scripts/ocr_pdf.py: OCR wrapper
tools/pdfs/scripts/md_to_pdf.py: Markdown to PDF
tools/pdfs/scripts/html_to_pdf.py: HTML to PDF
tools/pdfs/scripts/latex_to_pdf.py: LaTeX to PDF
tools/pdfs/scripts/lo_convert_to_pdf.py: LibreOffice-based conversion
tools/pdfs/js/*.mjs: PDF.js and pdf-lib helpers
- over typographic dashes in generated content when renderer compatibility is uncertain.