United States federal income tax filing assistant for US citizens, resident aliens, and nonresident aliens. Guides users through the entire IRS filing workflow: determining filer type and filing status, identifying which forms and schedules are needed, calculating amounts from source documents (W-2, 1099s, etc.), and filling IRS PDF forms using pypdf. This skill covers US federal taxes only — not state, local, or non-US tax systems. Use this skill whenever the user mentions taxes, tax returns, 1040, 1040-NR, filing status, "which forms do I need", "help me file my taxes", "tax prep", deductions, NRA tax filing, nonresident alien taxes, treaty-based returns, "check my tax forms", "review my 1040-NR", or any request related to preparing a US federal tax return — even if they don't specify their filer type. This skill handles the routing and covers both citizen/RA (Form 1040) and NRA (Form 1040-NR) workflows end to end, including PDF form filling and validation.
A guided workflow for preparing US federal income tax returns. This skill covers all filer types — US citizens, resident aliens (RA), and nonresident aliens (NRA) — by first determining the correct filer type, then routing to the appropriate forms and procedures. Both citizen/RA and NRA workflows are fully self-contained in this skill, including PDF form field mappings, cross-form validation, and the safe update_form.py script.
Before anything else, ask the user what documents they have. Common source docs:
| Document | What it tells you |
|---|---|
| W-2 | Wages, federal/state tax withheld, employer HSA contributions |
| 1099-NEC | Contractor / self-employment income |
| 1099-INT | Bank interest |
| 1099-DIV | Dividends (qualified and ordinary) |
| 1099-B | Stock/crypto sales (proceeds and cost basis) |
| 1099-MISC | Other income (royalties, rents, etc.) |
| 1099-SA / 5498-SA | HSA distributions and contributions |
| 1098 | Mortgage interest paid |
| 1098-T | Tuition paid (education credits) |
| I-94 | Travel history (needed for NRA determination) |
If the user has an I-94 or mentions a visa type, that's a strong signal they may be NRA — proceed to Step 2 with that in mind.
This is the critical routing decision. Read references/filing-status.md for the full decision tree. The short version:
Ask the user directly if unclear. Don't assume.
Read references/form-routing.md to determine which schedules and forms are needed based on the user's income types. For field-level details on individual schedule lines and common pitfalls, read references/common-schedules.md when filling specific forms.
- Determine the filer type and filing status (references/filing-status.md)
- Determine which forms and schedules are needed (references/form-routing.md)
- Fill the forms with scripts/update_form.py (bundled with this skill) or write equivalent code following the three critical rules:
  - write to a different output path
  - auto_regenerate=False
  - iterate all pages

This section covers the complete NRA filing workflow. For NRA-specific field-to-line mappings, see references/form-field-maps.md. For PDF recovery procedures, see references/pypdf-recovery.md.
These rules prevent data loss. Violating them will corrupt PDF files. The bundled scripts/update_form.py enforces all three automatically — use it instead of writing update logic from scratch.
NEVER write output to the same path as input. PdfReader uses lazy reading — if you write to the same file, you truncate it while the reader still holds references into it. Page annotations (already in memory) may survive, but the AcroForm catalog gets corrupted during the partial read/write overlap. Always write to a temp path first, then copy.
Always use auto_regenerate=False when calling update_page_form_field_values(). The default True removes /AP (appearance stream) entries from each field. Without appearance streams, some PDF viewers render the field as blank even though the /V value is correct — the data is there but invisible.
Iterate all pages when updating fields, even if you think fields are on page 1. Some IRS forms silently split fields across pages — if you only update page 0, fields on page 1 will be silently skipped with no error.
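Where equivalent code is written instead of calling the bundled script, the three rules might combine roughly as in the sketch below. The helper name `fill_form_safely` is hypothetical and this is not the contents of scripts/update_form.py. It assumes a pypdf version that provides `PdfWriter.append` and the `auto_regenerate` parameter.

```python
import os
import shutil
import tempfile


def fill_form_safely(src_path, dst_path, values):
    """Hypothetical helper: fill form fields while following the three rules."""
    # Rule 1: refuse to write over the file the reader is still reading from.
    if os.path.abspath(src_path) == os.path.abspath(dst_path):
        raise ValueError("output path must differ from input path")

    # Imported here so the path check above works even without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    writer.append(reader)  # copy the source document into the writer

    # Rule 3: visit every page, not just page 0.
    for page in writer.pages:
        if "/Annots" in page:  # skip pages that carry no fields
            # Rule 2: keep the existing appearance streams.
            writer.update_page_form_field_values(
                page, values, auto_regenerate=False
            )

    # Write to a temp file first, then copy to the destination.
    fd, tmp_path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    try:
        with open(tmp_path, "wb") as out:
            writer.write(out)
        shutil.copy(tmp_path, dst_path)
    finally:
        os.remove(tmp_path)
    return dst_path
```

The path check runs before the PDF is opened, so an accidental in-place write fails fast instead of truncating the input.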
If a PDF gets corrupted (field tree broken but annotation values survive):
- Read the surviving values from the page annotations: page.get("/Annots") → annot.get("/V")
- Rebuild the /Fields array from page annotations
- Read references/pypdf-recovery.md when you see this symptom; it has the full step-by-step repair procedure

A bundled script at scripts/update_form.py encodes all three critical rules above plus post-write verification. Use it for all form updates:
# CLI usage — fix a field
python scripts/update_form.py Form1040NR.pdf /tmp/Form1040NR_fixed.pdf --set "f1_53=5000"
# Fix multiple fields and clear one
python scripts/update_form.py Form8843.pdf /tmp/Form8843_fixed.pdf \
--set "f1_14=338" "f1_17=338" --clear "f1_15"
# Or import as a library in your own script
from scripts.update_form import update_form
import shutil
update_form("Form.pdf", "/tmp/Form_fixed.pdf", {"f1_53": "5000"}, clear_fields=["f1_65"])
shutil.copy("/tmp/Form_fixed.pdf", "Form.pdf") # only then overwrite original
The script automatically verifies that fields survived the write and warns if the output looks corrupted. Ensure pypdf is available: pip install pypdf --break-system-packages.
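The internals of the script's verification are not shown here. As a rough illustration of what "fields survived the write" could mean, the sketch below compares field dictionaries read before and after an update (for example, from `get_form_text_fields()`). The function name `find_lost_fields` and the sample field values are assumptions made for this example.

```python
def find_lost_fields(before, after, updates, cleared=()):
    """Compare field dicts read before and after a write.

    Returns {field_name: (expected, actual)} for each mismatch; an empty
    dict means every field holds the value it should.
    """
    expected = dict(before)
    expected.update(updates)
    for name in cleared:
        expected[name] = ""

    problems = {}
    for name, want in expected.items():
        got = after.get(name)
        # Treat None and "" alike: both mean the field is empty.
        if (got or "") != (want or ""):
            problems[name] = (want, got)
    return problems


before = {"f1_53": "100", "f1_65": "7", "f1_70": "12"}
after = {"f1_53": "5000", "f1_65": None}  # f1_70 vanished in the write
print(find_lost_fields(before, after, {"f1_53": "5000"}, cleared=["f1_65"]))
# {'f1_70': ('12', None)}
```

A non-empty result is a cue to stop and inspect the output file before copying it over the original.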
Before modifying any form, always extract and map fields first.
Step 1: Extract all field names and values
from pypdf import PdfReader
reader = PdfReader("Form.pdf")
fields = reader.get_form_text_fields()
for name, value in sorted(fields.items()):
    # Keep only the last path segment and drop the "[0]" suffix
    short = name.split(".")[-1].replace("[0]", "")
    print(f"{short} = {value}")
Step 2: Map fields to line numbers via Y-position
Before this step, read references/form-field-maps.md for the expected field-to-line table — it covers 1040-NR, 8843, Schedule NEC, Schedule OI, Form 8833, Form 8889, and Schedule 1. Use it as a reference while verifying the Y-position analysis below.
IRS PDFs use positional layout. Extract annotation rectangles to determine which line a field corresponds to:
page = reader.pages[0]
annots = page.get("/Annots") or []  # a page without fields has no /Annots
field_positions = []
for annot_ref in annots:
    annot = annot_ref.get_object()
    t = str(annot.get("/T", ""))
    v = annot.get("/V", "")
    rect = annot.get("/Rect", [])
    ft = str(annot.get("/FT", ""))
    if ft == "/Tx":  # text fields only
        y = float(rect[1]) if rect else 0
        x = float(rect[0]) if rect else 0
        field_positions.append((y, x, t, v))
# Sort by Y descending = top of page to bottom (matches line order)
for y, x, t, v in sorted(field_positions, reverse=True):
    short = t.split(".")[-1].replace("[0]", "")
    print(f"Y={y:.0f} X={x:.0f} {short} = {v}")
Compare the Y-position ordering against the physical form layout to create a definitive field-to-line map.
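One way to make that comparison less error-prone is to group fields whose Y values are nearly equal into rows, then pair the rows with line labels read off the printed form. The sketch below is a minimal illustration. The function name `group_into_rows`, the 3-point tolerance, and the sample positions and line labels are all assumptions, not values taken from any IRS form.

```python
def group_into_rows(field_positions, tolerance=3.0):
    """Group (y, x, name, value) tuples into rows of field names.

    Rows come back top to bottom; names within a row run left to right.
    """
    rows = []
    for y, x, name, _value in sorted(field_positions, key=lambda f: -f[0]):
        # Same row if this field sits within `tolerance` of the row's top Y.
        if rows and rows[-1]["top"] - y <= tolerance:
            rows[-1]["cells"].append((x, name))
        else:
            rows.append({"top": y, "cells": [(x, name)]})
    return [[name for _x, name in sorted(row["cells"])] for row in rows]


# Made-up positions and line labels, for illustration only.
positions = [
    (700.0, 100.0, "f1_01", ""),
    (700.4, 400.0, "f1_02", ""),
    (680.0, 400.0, "f1_03", ""),
]
printed_lines = ["1a", "1b"]  # read off the physical form, top to bottom
line_map = dict(zip(printed_lines, group_into_rows(positions)))
print(line_map)  # {'1a': ['f1_01', 'f1_02'], '1b': ['f1_03']}
```

If the number of rows does not match the number of printed lines, `zip` silently truncates, so it is worth checking both lengths before trusting the map.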
Step 3: Check checkboxes and radio buttons
all_fields = reader.get_fields() or {}  # get_fields() returns None when the PDF has no form
for name, field in sorted(all_fields.items()):
    v = field.get("/V", "")
    ft = field.get("/FT", "")
    if ft == "/Btn":
        short = name.split(".")[-1].replace("[0]", "")
        print(f"{short} = {v} (button)")
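When reading that output, note that in PDF forms the unchecked state of a checkbox is named /Off, while the name of the on state varies from field to field (for example /1, /2, or /Yes). A small helper along these lines, with the hypothetical name `is_checked`, may make the printed /V values easier to interpret.

```python
def is_checked(value):
    """Return True when a /Btn field's /V names an on state."""
    if value is None:
        return False
    # An empty value or /Off means nothing is selected.
    return str(value) not in ("", "/Off")


for raw in ("/Off", "/1", "/Yes", ""):
    print(repr(raw), is_checked(raw))
```

For a radio group, a checked result only says that some option is selected; the /V value itself identifies which one.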
A typical NRA (F-1 OPT) filing includes these forms. See references/form-field-maps.md for complete field-to-line mappings.
| Form | Purpose | Key Fields |
|---|---|---|
| 1040-NR | Main return | Income lines, AGI, tax, withholding, refund |
| Schedule 1 | Additional income/adjustments | Contractor income (Line 8h), HSA deduction |
| Schedule NEC | Tax on non-effectively-connected income | Dividends, capital gains, NEC tax |
| Schedule OI | Other information | Visa type, country, treaty claims, days present |
| Form 8843 | Statement for exempt individuals | Days of presence, visa status, exclusion days |
| Form 8833 | Treaty-based return position | Treaty article, exemption amount |
| Form 8889 | HSA | Contributions, employer contributions, deduction |
- Read references/form-field-maps.md
- Update with scripts/update_form.py (different output path!)

After filling, validate these consistency checks:
Watch for these — they are the most frequent mistakes when auto-filling:
This section covers the US-China treaty as a concrete example. Similar treaties exist for other countries (e.g., India Article 21(2), South Korea Article 21(1)) — verify article numbers and rates against the specific treaty if your country differs.
For Chinese nationals on F-1 visa:
auto_regenerate=False, iterate all pages)