Analyse a folder of txtified tax source documents, map each file to its tax purpose, and save the result as source-map.md. Use when given a directory of converted tax documents and asked to categorise or organise them.
Analyse every file in the given directory, infer each file's tax purpose, and write source-map.md into that same directory.
/map-tax-docs examples-cases-data/txtified/2024-2025
If no path is given, ask the user which directory to analyse.
Use the argument provided, or ask. Confirm the path exists before proceeding.
List every file in the directory (excluding subdirectories, .failed files, source-map.md, and hidden files such as .textify-cache.sqlite). Note which .failed files exist separately — they are important and must be reported.
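The listing rule above can be sketched in Python as follows. This is a minimal sketch rather than part of the command itself: the function name `list_inputs` is hypothetical, and it assumes that "hidden" means a name starting with a dot and that failed conversions carry a `.failed` suffix.

```python
from pathlib import Path


def list_inputs(directory):
    """Split a directory's files into source documents and .failed markers.

    Assumptions (not stated in the instructions): hidden files are those
    whose name starts with a dot, and failures end with the suffix ".failed".
    """
    sources, failed = [], []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue  # subdirectories are excluded
        name = entry.name
        if name.startswith(".") or name == "source-map.md":
            continue  # hidden files (e.g. .textify-cache.sqlite) and the output file
        if name.endswith(".failed"):
            failed.append(name)  # kept separately so they can be reported
        else:
            sources.append(name)
    return sources, failed
```

Returning the `.failed` names as a second list keeps them out of the analysis pass while making sure they are not silently dropped from the report.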
Look for a file matching pozn*.txt or pozn*.docx.txt. If found:
For every non-notes file read enough to identify:
For large or garbled OCR files, read the first ~100 lines — enough to identify, no more.
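Locating the notes file and reading only the head of a large file could look roughly like this. The helper names `find_notes_file` and `read_head` are hypothetical, and the sketch assumes UTF-8 text, replacing undecodable bytes so that garbled OCR output does not raise an error.

```python
from fnmatch import fnmatch
from itertools import islice


def find_notes_file(names):
    """Return the first name matching pozn*.txt or pozn*.docx.txt, else None."""
    for name in sorted(names):
        # pozn*.txt already covers pozn*.docx.txt; both are kept to mirror the text
        if fnmatch(name, "pozn*.txt") or fnmatch(name, "pozn*.docx.txt"):
            return name
    return None


def read_head(path, max_lines=100):
    """Read at most max_lines lines, tolerating garbled OCR bytes."""
    # Assumption: files are UTF-8; bad bytes are replaced instead of raising.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(islice(handle, max_lines))
```

Using `islice` stops reading after the requested number of lines, so a very large file is never loaded in full.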
Match each file to a pozn category. Rules:
- .failed files: report their error message content (usually "Incorrect password" or a tool error), their likely category based on filename, and any password hint visible in the notes.
- If a vysledek/ or similar output folder exists, list its contents separately as "Output / filed documents"; these are not input source documents.

Flag explicitly:
Write the analysis to <target-directory>/source-map.md.
Structure of the output file:
# Source Document Mapping — Tax Year <YEAR>
Inferred from `<notes-file>` and directory contents.
---
## 1. <Category name from pozn>
*Brief description of what this category covers*
| File | Notes |
|------|-------|
| `filename.ext` | What it is, key amounts, period, status. |
---
## 2. <Next category>
...
---
## vysledek/ directory (OUTPUT — not source documents)
...
---
## Extra files not referenced in notes
...
---
## Issues / Missing documents
| Issue | Detail |
|-------|--------|
| ⚠️ ... | ... |
| ❓ ... | ... |
| ℹ️ ... | ... |
Formatting rules:
- ✓ when a file clearly matches a pozn category.
- ⚠️ for failures, missing files, or password issues.
- ❓ for ambiguous/unresolved items.
- ℹ️ for informational notes (duplicates, context-only documents).

Report back to the user:

- source-map.md.

A completed example for tax year 2024 exists at:
examples-cases-data/txtified/2024-2025/source-map.md
Use it as a reference for output format and depth of analysis expected.
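As an illustration of the output structure described earlier, one numbered category section could be rendered as in the sketch below. The function name `render_category` and its parameters are hypothetical, and the blank lines between heading, description, and table are an assumption about spacing rather than something the template prescribes.

```python
def render_category(index, name, description, rows):
    """Render one numbered category section in the layout of the template.

    rows is a list of (filename, note) pairs; all names here are hypothetical.
    """
    lines = [
        f"## {index}. {name}",
        "",
        f"*{description}*",
        "",
        "| File | Notes |",
        "|------|-------|",
    ]
    for filename, note in rows:
        safe_note = note.replace("|", "\\|")  # keep pipes from breaking the table
        lines.append(f"| `{filename}` | {safe_note} |")
    return "\n".join(lines) + "\n"
```

Status markers such as ✓ or ⚠️ would simply be part of the note text passed in for each row.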
Common document types you will encounter:
| Czech name | English | Tax relevance |
|---|---|---|
| Potvrzení o zdanitelných příjmech | Employment income confirmation | §6 income, employer withholding |
| Evidenční list důchodového pojištění (ELDP) | Pension insurance record | Social security, not income tax directly |
| Výpis z Registru Obyvatel | Population register extract | Identity document |
| Oznámení o podpoře v nezaměstnanosti | Unemployment benefit notice | §6 income (benefit period) |
| Potvrzení o délce evidence | Unemployment registration confirmation | §6 / supplementary info |
| Potvrzení o příspěvcích na penzijní spoření | Pension contribution confirmation | §15 deduction |
| Výpis / výkaz (Degiro, Mintos, Twino, Zonky) | Investment income statement | §8 or §10 income |
| Roční výkaz (brokerage) | Annual brokerage statement | Capital gains / dividends |
| Dohoda o rozvázání pracovního poměru | Employment termination agreement | Severance (§6), context |
| Výplatní páska | Payslip | §6 income detail |
| Krypto výpočet / export | Crypto tax calculation / trade export | §7 or §10 income |
| Přehled pro OSSZ / VZP | Annual overview for the self-employed (OSVČ) | Self-employment contributions |
| DPFO XML | Personal income tax return (filed) | Output, not input |