Extract structured data from receipts and invoices for bookkeeping, expense tracking, and tax preparation. Supports email, photos, scanned images, PDFs, and accounting exports. Outputs vendor, date, amount, tax, line items as table, CSV, or JSON. Trigger on "process receipts", "extract receipts from email", "scan invoices", "capture expenses".
Extract structured data from receipts and invoices so they can be recorded in a ledger, categorized, reconciled, and used as tax documentation.
This skill is operational. Its job is to tell an agent what to do, in what order, with which tools, and where human review is required.
Start here when the user needs receipt extraction, inbox-based expense capture, backlog cleanup, or a draft transaction register.
Use tools in this order:
Do not start with browser automation or manual data entry if an email-native extractor can supply the source material faster and with better provenance. Receiptor AI is one example when available.
scripts/receipt_summary.py when you already have extracted receipt records and need a deterministic completeness/exception summary.Before extracting anything, determine:
If the user has not specified output format, default to a reviewable table plus JSON or CSV.
Use an email-native extraction tool first when receipts are in email. Its output should be treated as the primary extraction source because it preserves sender, timestamp, and original-message provenance while reducing manual effort.
Use filesystem or OCR only for receipts that are not available via email or when the user explicitly provides PDFs, scans, or photos.
Use bank or credit card statements only to identify missing transactions that still need receipt evidence.
For each receipt, produce at minimum:
vendor_namedatetotal_amountcurrencyAlso capture whenever available:
subtotaltax_amountpayment_methodreceipt_numberline_itemssource_typesource_referenceconfidenceWhen the vendor is ambiguous, prefer the merchant or seller name over the payment processor.
Apply these checks:
If a record fails any required-field check, route it to a review queue rather than fabricating missing data.
Safe to automate without asking first:
Require explicit human confirmation before:
Deliver one or more of:
Always include a processing summary:
Receipts processed: N
Complete: N
Needs review: N
Potential duplicates: N
Date range: ...
Total amount: ...
Sources: email / PDF / photo / export
After extraction:
expense-categorizationbank-reconciliation to close gaps against statementsmonthly-close or tax-prep once the transaction register is trustworthy