Review, categorize, and approve unapproved YNAB transactions via DuckDB staging + YNAB MCP
Batch-review unapproved YNAB transactions: auto-approve matched, detect cross-account duplicates, classify by memo patterns, and present ambiguous items for user decision. Designed for Vietnamese bank statements (TCB, SCB, MoMo) imported via bank-statement-wrangler pipeline.
ynab-dump CLI → staging/transactions.json → DuckDB queries (READ)
↓
classify + present
↓
YNAB MCP (WRITE only)
DuckDB for reads — query staging/transactions.json via fs:query_duckdb tool. Full SQL: JOINs, GROUP BY, window functions, CTEs. No 80KB truncation, no narrow date windows.
YNAB MCP for writes — via for approvals, categorization, memo updates, transfer conversion.
utils:ynab__updateTransactionscall_tool_destructiveynab-dump --since <window> before invoking this skillutils server, prefix ynab__, use call_tool_destructivefs server: fs:query_duckdb via call_tool_readretrieve_tools("ynab budget transactions") and retrieve_tools("duckdb query")scripts/bank-statement-wrangler/staging/transactions.json
All DuckDB queries use this path. Shorthand in examples below: TXN_JSON.
Before anything, confirm staging data exists and is fresh:
SELECT
COUNT(*) AS total,
MIN(date) AS earliest,
MAX(date) AS latest,
SUM(CASE WHEN approved = false THEN 1 ELSE 0 END) AS unapproved,
SUM(CASE WHEN approved = false AND matched_transaction_id IS NOT NULL THEN 1 ELSE 0 END) AS matched_unapproved
FROM read_json_auto('scripts/bank-statement-wrangler/staging/transactions.json')
WHERE deleted = false
If no data or stale dates → ask user to run ynab-dump --since <window>.
Matched = matched_transaction_id IS NOT NULL and approved = false. These are bank-imported transactions that YNAB already paired with manual entries. Approve immediately, zero review needed.
SELECT id, account_name, date, memo, amount/1000 AS amount_vnd
FROM read_json_auto('TXN_JSON')
WHERE deleted = false
AND approved = false
AND matched_transaction_id IS NOT NULL
ORDER BY account_name, date
Collect all IDs → batch updateTransactions with {id, approved: true} (max 50 per call).
Find transaction pairs across different accounts with same |amount| within ±3 days, where at least one side is unapproved:
WITH txns AS (
SELECT *, amount/1000 AS amount_vnd
FROM read_json_auto('TXN_JSON')
WHERE deleted = false
)
SELECT
a.id AS id_a, a.account_name AS acct_a,
b.id AS id_b, b.account_name AS acct_b,
a.date AS date_a, b.date AS date_b,
a.amount_vnd AS amt_a, b.amount_vnd AS amt_b,
a.memo AS memo_a, b.memo AS memo_b,
a.approved AS appr_a, b.approved AS appr_b
FROM txns a
JOIN txns b
ON ABS(a.amount) = ABS(b.amount)
AND a.account_id != b.account_id
AND a.amount != 0
AND ABS(DATE_DIFF('day', a.date::DATE, b.date::DATE)) <= 3
AND a.id < b.id
WHERE NOT (a.approved AND b.approved)
ORDER BY a.date DESC, ABS(a.amount) DESC
Present to user as table. Possible actions per pair:
payee_id to destination account's transfer payee⚠️ BLOCKING: Phase 1.2 (duplicates) must COMPLETE before spawning Phase 2 subagents. Running them in parallel caused SCB "CIA CASH IN" records to be classified as MoMo transfers when they were actually duplicates of approved TCB records. The duplicates agent identifies which transactions to EXCLUDE from classification.
After Phase 1, query remaining unmatched unapproved per account:
SELECT id, date, amount/1000 AS amount_vnd, memo,
category_name, payee_name, account_name
FROM read_json_auto('TXN_JSON')
WHERE deleted = false
AND approved = false
AND matched_transaction_id IS NULL
AND account_name LIKE '%tcb%' -- or '%scb%' for SCB subagent
ORDER BY date
Subagent strategy: Spawn parallel subagents per account, each with:
{id, category_id, approved: true, memo?} objectsClassification tiers:
DO NOT use AskUserQuestion for multi-item reviews. Present as inline markdown table:
| # | Account | Date | Memo | Amount | Proposed |
|---|---------|------|------|--------|----------|
Let user respond naturally in chat with decisions.
Batch all confirmed categorizations into updateTransactions calls (max 50 per call):
call_tool('utils', 'ynab__updateTransactions', {
plan_id: 'last-used',
transactions: [
{id: 'uuid', category_id: 'cat-uuid', approved: true},
{id: 'uuid', category_id: 'cat-uuid', memo: 'Human-readable memo', approved: true},
{id: 'uuid', payee_id: 'transfer-payee-uuid', approved: true}
]
});
Memo updates: When original memo is cryptic (e.g., QR S624260M, MCCVMAYBIHYPKNP 464728034) and user identifies it, include a memo field with human-readable description.
User re-runs ynab-dump → check counts drop to zero. Or query staging directly if user hasn't re-dumped:
SELECT account_name, COUNT(*) AS remaining
FROM read_json_auto('TXN_JSON')
WHERE deleted = false AND approved = false
GROUP BY 1 ORDER BY 2 DESC
Note: staging data is a snapshot — approvals via MCP won't reflect until next dump. Use this query only as pre-dump sanity check against the user's in-session notes.
| Memo Pattern | Category | Action |
|---|---|---|
OP*XanhSM or XanhSM | Biz travel | category_id |
Grab* | Biz travel | category_id |
CASHIN \d+ 0903194808 | Transfer → MoMo | payee_id = MoMo transfer |
LUU TUAN KIET chuyen | ⚠️ AMBIGUOUS | Could be wallet transfer, biz travel, or sis allowance — ASK |
T.Toan QR - tai | Likely food | Ask if unsure |
GD THE TREN INTERNET...TAI {merchant} | Parse merchant name | Map to category |
GD THE QUA POS...TAI {merchant} | Parse merchant name | Map to category |
Credit Int from Another Acc | Income / FU money | Ask user |
{person} chuyen tien | Incoming transfer | Check personal context |
PAYMENT FOR SALARY | Income | Approve without category → Ready to Assign |
TRO CAP TET | Income (Tet bonus) | Approve without category → Ready to Assign |
Interest amount | ❗️Stuff I forgot to budget for | Bank interest |
PHARMACITY | ❗️Stuff I forgot to budget for | Pharmacy |
GITHUB, INC. | 🧐WIG tools | Dev subscription |
cafe, coffee, pho, mi, bun, com, banh, hu tiu, caztus, doner, kebab, bap rang, bap, bot chien, matcha, cookie, com tam, tra sua, sinh to, tau hu, oc, lau, nem, nui, taco, pizza, breakfast, snack, snacks, groceries, bluish, hadilao, bon chon, spicy box
HIGHLANDS, 7-ELEVEN, MINISTOP, CARL JR, METRO HCM, PAYOO, FAMIMA, CIRCLE K, GS25, THUC COFFEE, QUAY NFK
| Memo Pattern | Category | Notes |
|---|---|---|
...THE 422151XXXXXX9444...Grab* | Biz travel | International card |
...THE 422151XXXXXX9444...IFTTT | 📊 BI | |
...THE 422151XXXXXX9444...YOU NEED A BUDGET | 🌳 YNAB subscription | |
...THE 422151XXXXXX9444...CLAUDE.AI | LLM | Claude subscription |
...THE 422151XXXXXX9444...OPENAI | LLM | ChatGPT subscription |
...THE 422151XXXXXX9444...HETZNER | 📊 BI | Hosting |
...THE 422151XXXXXX9444...GOOGLE*CLOUD | 📊 BI | GCP |
...THE 422151XXXXXX9444...BACKBLAZE | B2 cloud storage | Cloud backup |
...THE 422151XXXXXX9444...GITHUB | 🧐WIG tools | Dev subscription |
...THE 422151XXXXXX9444...GS25 | 🛒 food & drinks | Convenience store |
CR{date}-ANTHROPIC | LLM | Anthropic refund/credit (inflow) |
MM\d+ CIA CASH IN-0903194808 | Transfer → MoMo OR duplicate | ⚠️ PRE-CHECK: extract txn ID from memo, query TCB for matching Nạp tiền...{same txn ID}. If TCB match exists AND approved → SCB is a DUPLICATE, delete it. If no TCB match → genuine SCB→MoMo transfer. |
PHI T\d+/\d+- | Bank fee | ❗️Stuff I forgot to budget for |
CHI LAI TK TIEN GUI | Refund/interest | ❗️Stuff I forgot to budget for |
| Same memo + 10,000₫ fee | Same category as main charge | SCB adds separate fee row for foreign charges |
| Memo Pattern | Category | Action |
|---|---|---|
Gói xem phim Netflix | 📺 netflix | category_id |
Điện lực Hồ Chí Minh | electric & water | category_id |
Nước Tân Hoà | electric & water | category_id |
Thanh toán Google | 📊 BI | category_id |
Thanh toán Apple Services | 📊 BI | category_id |
Nạp tiền điện thoại Mobifone | 📱 Phone | category_id |
Thanh toán XANH SM / Thanh toán be | Biz travel | category_id |
Nạp tiền vào ví từ Techcombank | Transfer from TCB | payee_id = TCB transfer |
Nạp tiền vào ví từ Sacombank | Transfer from SCB | payee_id = SCB transfer |
Nạp tiền vào Ví để thanh toán dịch vụ | Transfer from TCB | Bill auto-debit top-up, default bank |
Thanh toán NGUYỄN ĐẶNG BÍCH CHI | 🛒 food & drinks | Cafe purchases |
Thanh toán Mua vé xem phim | 🍽️ Date & dine | Movie date |
| Account | ID | Transfer Payee ID |
|---|---|---|
| 🔴tcb | 59c7a8a2-2242-4fab-95d9-d6a24bdee35c | 962ff7c0-725c-410b-a2ff-f8edd3264abb |
| 🔵scb | 4a6175f8-ba3e-4dee-a3db-8ccd3d087d69 | 00bfefa1-1b3a-4398-945e-69dd0fac037c |
| 🎟️momo | f0014267-18eb-4ca4-a01a-d1f1350c3768 | 8ec03915-4081-418c-8958-414d0fcb5b97 |
| 💸wallet | a837e9a3-a651-40ec-9640-74ad72ac7667 | e4d53712-a633-4edb-acea-f04026806f2a |
| 🟢vcb | 29496fb1-75b0-435b-941f-aa09c1fa2136 | 573d1e4b-c5c1-40ee-8a62-c2189367d513 |
| Category | ID |
|---|---|
| 🛒 food & drinks | e0113187-1adb-411e-b335-5e172088c757 |
| 🍽️ Date & dine | 4453f8e7-eabc-40c5-bf4f-9edd825e96bf |
| Biz travel | 6e11f0d6-f55d-4036-8430-a1ac8cd42f03 |
| 📊 BI | 55bd34da-dce1-4ff2-834d-869085086d97 |
| 📱 Phone | 79c74407-a2e2-481f-b914-558df817ffa6 |
| 📺 netflix | d02b4646-65e7-494c-9f1e-77720a218a71 |
| electric & water | 336c5a02-6065-4a59-86fb-b2151d9b2d37 |
| 🌳 YNAB subscription | dd32f3e7-787c-448c-8352-db62c572f4d4 |
| 🧐WIG tools | 79131af0-ff3b-47dd-9b8c-2c7601627fc4 |
| ⛽️ Gas | 38ce93cd-417c-45f8-afaa-6fa8aab36a21 |
| Sis allowance | a5348013-eae0-4406-89f9-1625c3cfffe1 |
| FU money | ce164a88-8d4c-42fb-bbfe-aafc46964ad5 |
| 🍿 Entertainment | 17ee52cf-5245-4cb4-b56c-28916e9084d2 |
| ❗️Stuff I forgot to budget for | 7e257345-66c5-484f-a5aa-3ccbae0992ad |
| Uncategorized | a9bc5299-4cd3-4734-8a99-5294d81d4239 |
| LLM | 5cbbe75e-88c5-4a2d-a917-f476a4affc89 |
| 🎁 family support | 292d5dd0-23e2-4563-9d85-95eb05c95093 |
| B2 cloud storage | 90cac504-ee08-4e30-a1f7-9dbb04183e1d |
| cloudfare domain | 780d23b5-b7b6-4fc4-870c-cb062e060b2a |
| 🎊 Celebrations | get from API if needed |
| Name / Pattern | Who | Default Category |
|---|---|---|
| NGUYỄN ĐẶNG BÍCH CHI | Acquaintance | 🛒 food & drinks (cafe) |
| QUANG BOI KHA | Girlfriend | 🍽️ Date & dine |
| DO NGUYEN LAM GIANG | Work colleague | Biz travel |
| LUU TUAN KIET | User's own name | ⚠️ AMBIGUOUS — could be wallet transfer, biz expense, or sis allowance. ASK. |
| 0903194808 | User's MoMo phone | Used in CASHIN patterns |
| NGUYEN HOANG LONG | Food vendor (QR) | 🛒 food & drinks |
| NGUYEN THI THANH | VietQR merchant | 🛒 food & drinks (small amounts) |
| NGUYEN LE MINH NGOC | VietQR merchant | 🛒 food & drinks |
| NGUYEN THI BICH HANG | Unknown — reimbursement | 🛒 food & drinks |
| CDJOON | Employer | Income → Ready to Assign |
| Vietnamese Pattern | English Meaning |
|---|---|
GD THE TREN INTERNET SO THE...NGAY...TAI | Online card payment on [date] at [merchant] |
GD THE QUA POS SO THE...NGAY...TAI | POS card payment on [date] at [merchant] |
T.Toan QR - tai | QR code payment at [location] |
CASHIN {id} {phone} | MoMo wallet top-up |
{name} chuyen tien / {name} chuyen | Person-to-person transfer |
Credit Int from Another Acc | Interest or transfer from savings account |
Nạp tiền vào Ví để thanh toán dịch vụ | MoMo auto-debit from bank to pay bill |
Nạp tiền vào ví từ {bank} | Manual MoMo top-up from specific bank |
Nạp tiền điện thoại | Phone credit top-up |
Thanh toán {merchant} | MoMo payment to merchant |
Tái tục Bảo hiểm | Insurance renewal |
MM{id} CIA CASH IN | SCB-side memo for MoMo top-up |
Điện lực | Electricity |
Nước | Water |
updateTransactions call.memo field in the update.ynab-dump again after each round to confirm counts.Nạp tiền...{txn ID}. If match → SCB is a duplicate, delete it. This is the #1 source of double-counting.When spawning per-account classification subagents, inject:
## Task: Classify {ACCOUNT} unapproved transactions
Query staging data via DuckDB, classify each transaction, return results.
### Tool Access
- `fs:query_duckdb` via `mcp__proxy__call_tool_read` — SQL on staging JSON
- Staging path: scripts/bank-statement-wrangler/staging/transactions.json
### Your Query
SELECT id, date, amount/1000 AS amount_vnd, memo, category_name, payee_name
FROM read_json_auto('scripts/bank-statement-wrangler/staging/transactions.json')
WHERE deleted = false AND approved = false AND matched_transaction_id IS NULL
AND account_name LIKE '%{ACCOUNT_FILTER}%'
ORDER BY date
### Classification Rules
{PASTE AUTO-CATEGORIZATION RULES FOR THIS ACCOUNT}
{PASTE FOOD KEYWORDS}
{PASTE PERSONAL CONTEXT}
{PASTE VIETNAMESE MEMO DECODER}
### Category IDs
{PASTE CATEGORY ID TABLE}
### Output Format
Return TWO lists:
1. **auto_categorize** — JSON array, one object per line. Main agent will paste directly into `updateTransactions`:
```json
{"id": "full-uuid", "category_id": "full-uuid", "approved": true}
{"id": "full-uuid", "payee_id": "full-uuid", "approved": true}
{"id": "full-uuid", "category_id": "full-uuid", "memo": "human-readable", "approved": true}
CRITICAL: Use full UUIDs from the query results — no placeholders, no abbreviations.