Transform a folder of financial documents (bank statements, credit card exports, checking/savings JSON dumps) into a privacy-preserving consolidated CSV plus a beautiful interactive HTML financial report. Handles PDF, XLSX, JSON, and CSV sources. Strips all personally identifiable information except first names. Categorizes every transaction into 3 levels (source → merchant → category), flags tax-relevant items, and surfaces insights about spending patterns, subscriptions, savings optimization, and tax deductibility. Use when a user drops a folder of statements and says "help me understand our finances", "prepare for tax season", "analyze our spending", or "make sense of all these statements".
Turn a messy folder of financial documents into a clean, analyzable, shareable picture of a household's money — without ever exposing account numbers, addresses, phone numbers, or last names. The workflow is explicitly designed so a user can hand the resulting HTML report to a spouse, accountant, or financial advisor without privacy concerns.
The canonical workflow:
These rules exist because the user will share the output with family and friends. A single leaked account number could expose them to fraud.
NEVER write any of these to output files (CSV, HTML, memory, or logs):
DO write:
When in doubt, redact. It is always safe to strip more than needed.
The skill handles these file types. When you encounter an unfamiliar format, read a sample and adapt — the Python script is designed to be extended.
| Format | Example sources | Parser |
|---|---|---|
| PDF statements | Chase, BofA, Citi, Capital One, Discover credit card statements | pdfplumber → regex extraction of transaction blocks |
| XLSX transaction exports | AMEX, Fidelity, Schwab activity exports | openpyxl → auto-detect header row |
| JSON account dumps | Navy Federal, USAA, PNC (many credit unions) | Built-in json |
| CSV exports | Almost every bank offers this as fallback | Built-in csv |
| OFX/QFX | Quicken-style downloads | ofxparse (install on demand) |
The skill needs Python 3 with three libraries. Check if they're installed and install them if missing:
python3 -c "import openpyxl, pdfplumber" 2>&1
# If ModuleNotFoundError:
pip install openpyxl pdfplumber
Some users will have Python in a virtual environment. If pip install fails
with "externally-managed-environment", guide them to create a venv or use
pipx install. Do NOT suggest --break-system-packages.
Ask the user for the folder path, then list its contents. Classify each file by likely source:
Looking at /path/to/folder, I found:
📄 5 PDF files matching "*statements*" → credit card statements (Chase?)
📊 3 XLSX files with "AMEX" in the name → AMEX card exports
📋 2 JSON files with "Checking" in the name → bank account dumps
❓ 1 PDF file named "W2_2025.pdf" → probably tax form, skip
If you find files that look like tax forms (W-2, 1099, 1098) or ID documents, do not parse them. They contain SSNs and personal addresses.
Ask the user the following, one topic at a time so it doesn't feel like a form. Nothing here is optional for a high-quality report, but all of it is minimal and non-sensitive.
(a) Household members — first names only
"Whose finances are these? Give me first names only — something like 'Alex and Jamie' or just 'Alex' if you're solo."
(b) The year being analyzed
"Which tax year should I focus on? (I'll filter transactions to that calendar year — other years will be ignored.)"
(c) Tax filing structure
"What's your tax filing situation? Pick one:
- Single
- Married Filing Jointly (MFJ)
- Married Filing Separately (MFS)
- Head of Household (HoH)
- Qualifying Widow(er)"
This drives the correct standard deduction baseline, itemization threshold, and SALT cap behavior in the report insights.
(d) Dependents
"Any dependents? For each, give me just:
- First name (or nickname like 'youngest')
- Age or date of birth
- Relationship ('child', 'parent', etc.)
- Any special notes about them for the year (e.g., 'in preschool until August', 'starts college in fall', 'full-time caregiver')"
Ages and relationships are critical for tax credits:
(e) Any specific categorization corrections
"Are there any regular payments in your statements that might be categorized wrong by a generic tool? For example, a preschool that shows up as 'University' in the merchant name, or a recurring charge that's actually a business expense?"
Write any corrections into the family config's category_overrides
section so they apply every year.
(f) Files to skip
"Any files in the folder I should ignore?"
Do NOT ask for:
Use the Python script at scripts/build_report.py (in this skill directory).
The script takes a folder path and a JSON config file describing the family
members and optional custom categorization rules.
python3 ~/.claude/skills/family-finance-report/scripts/build_report.py \
--input-dir "/path/to/folder" \
--output-dir "/path/to/folder" \
--family-config "/path/to/folder/.family_config.json"
The script will:
transactions.csv (the analyzable dataset)report_data.json (structured data for the HTML template)After the script produces report_data.json, use the template at
templates/report_template.html as the starting point. The template
contains placeholders like {{TOTAL_SPENDING}} that you fill in with
real values. Save the final report as Finance_Report_{YEAR}.html in
the same folder.
Important: The HTML should be a single self-contained file. Load Chart.js from CDN so the report works offline once loaded. Do not embed any external tracking or analytics.
After generating the raw report, read the CSV and look for non-obvious findings to add to the narrative. These are the insights that wow users:
Tell the user what you produced:
I generated three files in /path/to/folder:
1. transactions.csv — 1,851 categorized transactions, 99% coverage
(open in Excel/Numbers for ad-hoc analysis)
2. Finance_Report_2025.html — interactive dashboard with charts
(open in a browser; shareable with family)
3. report_data.json — structured data powering the report
(keep for re-generation or custom analysis)
Here are the 3 most interesting findings:
- [finding 1]
- [finding 2]
- [finding 3]
The script ships with ~40 default merchant patterns covering common U.S. merchants (groceries, gas, dining, utilities, subscriptions, etc.). Aim for >95% categorization coverage. For the remaining uncategorized