Use this skill when the user asks to check PDF accessibility, ADA compliance, Section 508 compliance, PDF/UA compliance, populate PDF metadata, make a PDF accessible, fix PDF metadata, run an accessibility audit, check alt text, verify heading hierarchy, or discusses PDF accessibility requirements and WCAG compliance for PDFs.
You are helping the user make their PDFs ADA-compliant by analyzing them for missing accessibility metadata and fixing what can be fixed programmatically.
The scripts are bundled with this skill at:
- `${CLAUDE_SKILL_DIR}/scripts/pdf_accessibility_audit.py`
- `${CLAUDE_SKILL_DIR}/scripts/pdf_metadata_fix.py`
- `${CLAUDE_SKILL_DIR}/scripts/pdf_structure_generator.py`
- `${CLAUDE_SKILL_DIR}/scripts/regression_test.py`

Required: pypdf (`pip install pypdf`)
Optional: pikepdf (`pip install pikepdf`; enables structure tree generation for untagged PDFs)
The user's argument is: $ARGUMENTS
If the argument is a folder, find all `*.pdf` files in it and process each.
Run the audit:
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_accessibility_audit.py" "<pdf_path>" 2>/dev/null
Parse the JSON output.
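A minimal sketch of the parsing step, assuming the audit prints a single JSON object on stdout. Only `structure.has_structure_tree` and `non_descriptive_links` are names taken from this document; the helper `parse_audit`, the placement of `non_descriptive_links` at the top level, and the sample report are illustrative assumptions.

```python
import json


def parse_audit(raw: str) -> dict:
    """Parse the audit's stdout and pull out the fields used in later steps."""
    report = json.loads(raw)
    structure = report.get("structure", {})
    return {
        # Named in this document: structure.has_structure_tree
        "has_structure_tree": bool(structure.get("has_structure_tree", False)),
        # Assumption: non_descriptive_links is a top-level array in the report.
        "non_descriptive_links": report.get("non_descriptive_links", []),
    }


# Hypothetical audit output, for illustration only.
sample = '{"structure": {"has_structure_tree": false}, "non_descriptive_links": []}'
summary = parse_audit(sample)
print(summary)
```

Missing keys fall back to "no structure tree" and "no links", so an unexpected report shape leads to the more cautious path.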
Build a fixes JSON object:
Metadata (always auto-fixable):
- `title`: If missing or looks like a filename, read page 1 visually and infer a meaningful title.
- `author`: If missing, always ask the user before proceeding: "Who is the author of this document?" Do not leave blank or skip; author is a required accessibility metadata field. In batch mode, ask once and apply the same author to all files unless the user specifies otherwise.
- `subject`: If missing, infer from content.
- `language`: If missing, default to "en-US".
- `display_doc_title`: Set to true.
- `set_tagged`: Set to true ONLY after confirming the PDF has a structure tree (from Acrobat Autotag or Path B). Do NOT set on an untagged PDF; it would make the audit falsely report the PDF as tagged.
- `set_pdfua`: Set to true.

Bookmarks (always auto-fixable):
- `generate_bookmarks`: Set to true if no bookmarks exist and the document has more than 1 page.

Link descriptions (Claude generates):
- Read the `non_descriptive_links` array from the audit.
- Build a `link_descriptions` dict mapping "pageNum_linkIndex" to description string.
- Example: {"2_0": "YouTube: Technical Due Process lecture", "7_0": "DHS AI use case inventory"}

Alt text (Claude generates; MANDATORY for all figures):
- Build an `alt_texts` dict mapping figure index (as string) to alt text string.

If the audit reports structure.has_structure_tree == false, the PDF has no structure tree.
This is the single most important accessibility feature — without it, screen readers
get unstructured text with no headings, lists, or figure descriptions.
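The fixes object and the `set_tagged` rule described above can be sketched as follows. The field names come from this document. The function `build_fixes`, its parameters (`page_count`, `has_bookmarks`), and the choice to omit a key instead of writing an explicit false are assumptions; the bundled scripts may expect a different shape.

```python
import json
from typing import Optional


def build_fixes(audit: dict, title: str, author: str, page_count: int,
                has_bookmarks: bool, language: Optional[str] = None) -> dict:
    """Assemble a fixes object from already-resolved metadata values."""
    has_tree = bool(audit.get("structure", {}).get("has_structure_tree", False))
    fixes = {
        "title": title,
        "author": author,
        "language": language or "en-US",  # default when the PDF declares none
        "display_doc_title": True,
        "set_pdfua": True,
    }
    # Only claim the PDF is tagged when a structure tree really exists.
    if has_tree:
        fixes["set_tagged"] = True
    # Bookmarks are generated only for multi-page documents that lack them.
    if not has_bookmarks and page_count > 1:
        fixes["generate_bookmarks"] = True
    return fixes


# Hypothetical inputs, for illustration only.
audit = {"structure": {"has_structure_tree": False}}
fixes = build_fixes(audit, title="Quarterly Report", author="Jane Doe",
                    page_count=12, has_bookmarks=False)
print(json.dumps(fixes, indent=2))
```

For this untagged sample, `set_tagged` is left out, so a later audit cannot report the file as tagged before a structure tree exists.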
Ask the user:
This PDF has no structure tree (the most important feature for screen readers).
Do you have an Acrobat-tagged version of this PDF? If you've already run Adobe Acrobat Pro > Accessibility > Autotag Document on this file, give me the tagged version and I'll apply all remaining fixes (metadata, bookmarks, link descriptions, alt text, PDF/UA flag) on top of Acrobat's structure tree.
If not, I'll handle everything automatically — I'll generate a structure tree with headings, lists, figures, and tables, plus all metadata fixes, in one pass.
Note: Acrobat's structure tree is higher quality (deeper heading hierarchy, better table detection, MCID-linked content). But our automated path produces a solid structure tree that works well for most documents, especially slide decks.
Path A (user provides an Acrobat-tagged PDF):
Wait for the user to provide the tagged file path. Once they do, re-run the audit on the tagged file to confirm the structure tree now exists:
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_accessibility_audit.py" "<tagged_pdf_path>" 2>/dev/null
Verify structure.has_structure_tree == true. If it's still false, the user may
not have saved correctly — ask them to try again.
Re-determine fixes based on the new audit of the tagged file. Now that the
structure tree exists, set set_tagged: true and generate alt text targeting the
actual /Figure elements in the structure tree.
Run pypdf on the tagged file:
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_metadata_fix.py" "<tagged_pdf_path>" "<fixes_json_path>" 2>/dev/null
Clean up the temp fixes JSON.
The output _accessible.pdf will have the Acrobat structure tree PLUS all metadata,
bookmarks, link descriptions, and PDF/UA identifier.
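One way to handle the temporary fixes JSON and its cleanup is a small context manager. This is a sketch under assumptions: the helper name `temp_fixes_file` and the `fixes_` filename prefix are hypothetical, and the call to `pdf_metadata_fix.py` is only indicated by a comment.

```python
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_fixes_file(fixes: dict):
    """Write fixes to a temporary JSON file and delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=".json", prefix="fixes_")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(fixes, handle)
        yield path
    finally:
        # Clean up even if the fix script fails.
        if os.path.exists(path):
            os.remove(path)


with temp_fixes_file({"set_tagged": True, "set_pdfua": True}) as fixes_path:
    # Pass fixes_path as the second argument to pdf_metadata_fix.py here.
    existed_during = os.path.exists(fixes_path)
existed_after = os.path.exists(fixes_path)
print(existed_during, existed_after)
```

Putting the removal in `finally` means the temp file is deleted whether or not the fix step succeeds.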
Path B (no tagged version; generate the structure tree):
Check if pikepdf is available:
python3 -c "import pikepdf; print('available')" 2>/dev/null
If pikepdf IS available: Write the fixes JSON to a temp file, then run the structure tree generator with the fixes included (single-pass pipeline — all changes via pikepdf):
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_structure_generator.py" "<pdf_path>" "<output_accessible.pdf>" "<fixes_json_path>" 2>/dev/null
This generates the structure tree, bookmarks, and applies all metadata fixes
in a single operation. Do NOT run pdf_metadata_fix.py afterward — the
pikepdf output should not be re-processed by pypdf (this corrupts the file).
If pikepdf is NOT available:
Tell the user: "pikepdf is not installed. Run pip install pikepdf to enable
structure tree generation, or provide an Acrobat-tagged version of the PDF."
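The pikepdf branch above can also be expressed in Python. This is an illustrative sketch: `module_available` and `choose_untagged_action` are hypothetical helpers, and `importlib.util.find_spec` only checks that the module can be located, whereas the shell one-liner actually imports it.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the named top-level module can be located."""
    return importlib.util.find_spec(name) is not None


def choose_untagged_action(pikepdf_installed: bool) -> str:
    """Pick the next step for an untagged PDF."""
    if pikepdf_installed:
        # Single pass: do not run pdf_metadata_fix.py on the result.
        return "run pdf_structure_generator.py with the fixes JSON"
    return "ask the user to run pip install pikepdf or provide an Acrobat-tagged PDF"


action = choose_untagged_action(module_available("pikepdf"))
print(action)
```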
Show a structured checklist. Adapt based on which path was used:
## Auto-Fixed
- [x] Structure tree (Acrobat Autotag)
- [x] Document title set to "<title>"
- [x] Language set to en-US
- [x] Tagged PDF flag set
- [x] Display title enabled
- [x] PDF/UA identifier added
- [x] Bookmarks generated for N pages
- [x] Descriptive text added to N links
- [x] Alt text added to N figures
## Needs Human Review
- [ ] Verify heading hierarchy matches document structure
- [ ] Review alt text accuracy for each image
- [ ] Check reading order on multi-column slides
Saved: `filename_accessible.pdf`
## Auto-Fixed (Acrobat Accessibility Checks)
- [x] Document title set to "<title>"
- [x] Language set to en-US
- [x] Display title enabled
- [x] PDF/UA identifier added
- [x] Bookmarks generated for N pages
- [x] Descriptive text added to N links
- [x] Structure tree generated (N elements: headings, lists, figures, tables)
- [x] Tagged content (BDC/EMC) — all page content marked with structure MCIDs
- [x] Tagged annotations — link annotations tagged as /Link structure elements
- [x] Tab order set to structure order on all pages
## Needs Human Review
- [ ] Verify heading hierarchy matches document structure (heuristic-based)
- [ ] Review alt text accuracy for each image
- [ ] Check reading order on multi-column slides
Saved: `filename_accessible.pdf`
After showing the checklist, ask:
Want me to save this checklist as a file? (markdown / spreadsheet)
If user asks to save:
- filename_remediation.md with full checklist, page numbers, Acrobat instructions

When processing a folder:
Step 1: Collect shared metadata upfront. Before auditing, ask the user TWO questions:
- Who is the author of these documents?
- Fix all files, or do you want to select specific ones after seeing the summary?
This avoids re-asking per file. Store the author for all files.
Step 2: Audit all PDFs.
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_accessibility_audit.py" --summary file1.pdf file2.pdf file3.pdf 2>/dev/null
This outputs a JSON array of compact summaries (one Python process for all files).
Show summary table:
| File | Pages | Pass | Warn | Fail | Has Tags | Figures No Alt |
|------|-------|------|------|------|----------|----------------|
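The table above could be filled from the `--summary` output along these lines. The key names used here (`file`, `pages`, `pass`, `warn`, `fail`, `has_tags`, `figures_no_alt`) are assumptions chosen to mirror the column headers; the actual summary keys are not specified in this document and may differ.

```python
def render_summary_table(summaries):
    """Render compact audit summaries as a markdown table."""
    lines = [
        "| File | Pages | Pass | Warn | Fail | Has Tags | Figures No Alt |",
        "|------|-------|------|------|------|----------|----------------|",
    ]
    for item in summaries:
        cells = [
            item["file"],
            item["pages"],
            item["pass"],
            item["warn"],
            item["fail"],
            "yes" if item["has_tags"] else "no",
            item["figures_no_alt"],
        ]
        lines.append("| " + " | ".join(str(cell) for cell in cells) + " |")
    return "\n".join(lines)


# Hypothetical summaries; the real key names may differ.
table = render_summary_table([
    {"file": "file1.pdf", "pages": 12, "pass": 8, "warn": 2, "fail": 3,
     "has_tags": False, "figures_no_alt": 4},
])
print(table)
```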
Step 3: Process each file fully. For each file (or user-selected subset):
Run the full audit (without --summary) to get link and figure details.
For Path B files (untagged), process sequentially since each needs pikepdf:
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_structure_generator.py" "<pdf_path>" "<output_accessible.pdf>" "<fixes_json_path>" 2>/dev/null
For Path A files (already tagged), batch mode is available:
python3 "${CLAUDE_SKILL_DIR}/scripts/pdf_metadata_fix.py" --batch batch_fixes.json 2>/dev/null
Where batch_fixes.json is an array of {"input": "path.pdf", "fixes": {...}} entries.
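A minimal sketch of assembling that array, applying the shared author to every file unless a per-file override is given. The `{"input": ..., "fixes": {...}}` shape comes from this document; the helper `build_batch`, the `per_file_fixes` parameter, and the sample names are hypothetical.

```python
import json


def build_batch(paths, author, per_file_fixes=None):
    """Build batch entries, one per PDF, sharing the author by default."""
    per_file_fixes = per_file_fixes or {}
    batch = []
    for path in paths:
        fixes = {"author": author}
        # Per-file values win over the shared author.
        fixes.update(per_file_fixes.get(path, {}))
        batch.append({"input": path, "fixes": fixes})
    return batch


batch = build_batch(
    ["file1.pdf", "file2.pdf"],
    author="Jane Doe",
    per_file_fixes={"file2.pdf": {"author": "John Roe", "title": "Budget Overview"}},
)
# Write this text to batch_fixes.json before calling the --batch command.
print(json.dumps(batch, indent=2))
```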
Step 4: Show per-file results with individual checklists and a final summary table.
See ${CLAUDE_SKILL_DIR}/references/compliance-checklist.md for the full checklist.
Output files are saved with the `_accessible.pdf` suffix.