Extract text and structured data from images and PDFs using OpenAI Vision
Extract text, data, and structured content from images and PDF documents using OpenAI Vision.
# Extract text from an image
bun run src/index.ts extract --input ./receipt.png --output ./receipt.txt
# Extract as Markdown from a PDF
bun run src/index.ts extract -i ./document.pdf -o ./document.md -f markdown
# Extract with custom prompt
bun run src/index.ts extract \
--input ./invoice.png \
--format json \
--prompt "Extract invoice number, date, total amount, and line items"
# High-detail extraction for small text
bun run src/index.ts extract \
--input ./handwriting.jpg \
--detail high \
--format text
| Option | Short | Description |
|---|---|---|
--input | -i | Input file path (required) |
--output | -o | Output file path (optional) |
--format | -f | Output format: text, markdown, json |
--prompt | -p | Custom extraction prompt |
--model | -m | OpenAI model (default: gpt-4o) |
--detail | -d | Image detail: low, high, auto |
export OPENAI_API_KEY="your-openai-key"
bun run src/index.ts extract \
--input ./receipt.jpg \
--format json \
--prompt "Extract store name, date, items with prices, subtotal, tax, and total"
bun run src/index.ts extract \
--input ./report.pdf \
--format markdown \
--output ./report.md
bun run src/index.ts extract \
--input ./notes.jpg \
--detail high \
--prompt "Transcribe the handwritten text, preserving the structure"