Extract structured invoice data from supplier invoice images using OCR + Gemini 2.5 Flash vision. Use when asked to extract data from invoice images, scan invoices, or parse supplier invoice photos.
cd /Users/jacques/DevFolder/sage_v2
source .venv/bin/activate
export SAGE_PROJECT_ROOT="$(pwd)" PYTHONPATH="$(pwd):$PYTHONPATH"
Drop the image into watch-folder/supplier-invoices/<SupplierName>/. The daemon then handles OCR → Gemini vision → classify → validate.
curl -X POST http://localhost:49000/api/processing/ingest \
  -F "file=@/path/to/invoice.jpg" \
  -F "supplier=SupplierName"
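The same upload can be scripted from Python. The following is a minimal sketch using only the standard library; the endpoint URL and the `file` and `supplier` field names are taken from the curl command above, while the helper names `build_multipart` and `ingest_invoice` are hypothetical. The response format is not documented here, so the sketch returns the raw response bytes.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Endpoint as used in the curl example above.
INGEST_URL = "http://localhost:49000/api/processing/ingest"


def build_multipart(fields: dict, file_field: str, filename: str, data: bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body, content_type_header). Hypothetical helper, not part of the project.
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    # One part per plain text field (for example supplier=SupplierName).
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # The file part carries a filename and its own Content-Type.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
         f"Content-Type: {file_type}\r\n\r\n").encode()
        + data
        + b"\r\n"
    )
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def ingest_invoice(image_path: str, supplier: str, url: str = INGEST_URL) -> bytes:
    """POST an invoice image to the ingest endpoint and return the raw response body."""
    path = Path(image_path)
    body, content_type = build_multipart(
        {"supplier": supplier}, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling `ingest_invoice("/path/to/invoice.jpg", "SupplierName")` should send the same form fields as the curl command, provided the service is listening on port 49000.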
python3 scripts/capture_invoice.py --image /path/to/invoice.jpg --supplier "SupplierName"
{
  "invoice_number": "INV-12345",
  "invoice_date": "2025-01-15",
  "supplier_name": "ACDC Express",
  "subtotal": 1000.00,
  "vat_amount": 150.00,
  "vat_rate": 0.15,
  "total_amount": 1150.00,
  "currency": "ZAR",
  "line_items": [
    {
      "description": "Solar Panel 400W",
      "quantity": 2,
      "unit_price": 500.00,
      "line_total": 1000.00,
      "tax_amount": 150.00
    }
  ]
}
subtotal and unit_price must be pre-VAT.
total_amount is the INCLUSIVE amount (subtotal + VAT).
vat_rate should be 0.15 for South African suppliers.

backend/core/pipeline/manager.py: _extract_image_with_llm() (Gemini vision)
backend/core/processors/parsers/image/image_parser.py: image OCR parser
backend/core/processors/parsers/ocr/ocr_reader.py: EasyOCR wrapper
backend/core/processors/parsers/invoice/invoice_parser.py: InvoiceMetadata dataclass
backend/core/parsers/llm_invoice_parser.py: LLM vision fallback (supports Gemini)
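The VAT rules above (pre-VAT subtotal and unit_price, VAT-inclusive total_amount, vat_rate of 0.15) can be checked mechanically on an extracted payload. The following is a minimal sketch, not the pipeline's own validation step: the function name `check_invoice` and the 0.01 rounding tolerance are assumptions, and so are the two line-item checks (line totals summing to the subtotal, and quantity times unit_price equalling line_total), which are inferred from the example output rather than stated anywhere.

```python
from decimal import Decimal

# Assumed rounding tolerance of one cent; not taken from the pipeline.
TOLERANCE = Decimal("0.01")


def _dec(value) -> Decimal:
    """Convert a JSON number to Decimal via str() to avoid float artefacts."""
    return Decimal(str(value))


def check_invoice(invoice: dict, expected_vat_rate: str = "0.15") -> list:
    """Return a list of problems; an empty list means the amounts are consistent."""
    problems = []
    subtotal = _dec(invoice["subtotal"])
    vat = _dec(invoice["vat_amount"])
    total = _dec(invoice["total_amount"])
    rate = _dec(invoice["vat_rate"])

    if rate != Decimal(expected_vat_rate):
        problems.append(f"vat_rate is {rate}, expected {expected_vat_rate}")
    # subtotal is pre-VAT, so VAT should be subtotal * rate.
    if abs(subtotal * rate - vat) > TOLERANCE:
        problems.append("vat_amount does not equal subtotal * vat_rate")
    # total_amount is the inclusive amount: subtotal + VAT.
    if abs(subtotal + vat - total) > TOLERANCE:
        problems.append("total_amount does not equal subtotal + vat_amount")

    items = invoice.get("line_items", [])
    # Assumption: pre-VAT line totals add up to the subtotal.
    if items:
        lines_sum = sum((_dec(i["line_total"]) for i in items), Decimal("0"))
        if abs(lines_sum - subtotal) > TOLERANCE:
            problems.append("line totals do not add up to subtotal")
    # Assumption: line_total is quantity * unit_price (both pre-VAT).
    for item in items:
        expected = _dec(item["quantity"]) * _dec(item["unit_price"])
        if abs(expected - _dec(item["line_total"])) > TOLERANCE:
            problems.append(f"line_total mismatch for {item['description']!r}")
    return problems
```

Run against the example output, `check_invoice` returns an empty list. A payload where total_amount was captured as the pre-VAT figure (1000.00 instead of 1150.00) would be reported as a total mismatch.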