Name: Oma Pdf
Author: first-fluke

PDF Skill - PDF to Markdown Conversion

When to use

| Format | Flag | Use case |
| --- | --- | --- |
| Markdown | `--format markdown` | Default. Clean text for LLM/RAG |
| JSON | `--format json` | Structured data with bounding boxes |
| HTML | `--format html` | Web display |
| Text | `--format text` | Plain text extraction |
| Combined | `--format markdown,json` | Multiple formats at once |
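
A minimal sketch of how the `--format` value could be assembled in a wrapper script, assuming only the four format names and the comma-separated combined form listed in the table. The helper name `format_args` and the validation behaviour are illustrative assumptions, not part of the tool.

```python
# Format names taken from the table; nothing else is assumed to be valid.
SUPPORTED_FORMATS = ("markdown", "json", "html", "text")


def format_args(formats=("markdown",)):
    """Return the `--format` argument pair for one or more output formats.

    Hypothetical helper: it only builds the flag, it does not run the tool.
    """
    if not formats:
        raise ValueError("at least one format is required")
    unknown = [f for f in formats if f not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported format(s): {', '.join(unknown)}")
    # Drop duplicates while keeping the caller's order.
    unique = list(dict.fromkeys(formats))
    return ["--format", ",".join(unique)]


if __name__ == "__main__":
    print(format_args(["markdown", "json"]))  # ['--format', 'markdown,json']
```

The returned list can be appended to whatever command line invokes the converter; the default mirrors the table, where Markdown is the default format.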

| Issue | Solution |
| --- | --- |
| Garbled text in output | Try `--use-struct-tree` for tagged PDFs |
| Scanned PDF (no text layer) | Use hybrid mode with `--force-ocr` |
| Tables not extracted properly | Use hybrid mode for complex or borderless tables |
| Non-English PDF | Add `--ocr-lang` with the appropriate language codes |
| Large PDF (100+ pages) | Process in page ranges or use batch mode |
| Formulas not extracted | Use hybrid mode with `--enrich-formula` |
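
For the large-PDF case, the page-range approach could be sketched as below. This only computes the ranges; how a range is passed to the converter is not shown in this section, so no flag is assumed. The helper name `page_ranges` and the chunk size of 50 are arbitrary illustrative choices.

```python
def page_ranges(total_pages, chunk_size=50):
    """Split pages 1..total_pages into inclusive (start, end) ranges.

    Hypothetical helper; chunk_size=50 is an assumption, not a tool default.
    """
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


if __name__ == "__main__":
    # A 120-page document split into chunks of at most 50 pages.
    print(page_ranges(120))  # [(1, 50), (51, 100), (101, 120)]
```

Each range can then be converted separately and the outputs concatenated in order.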