Generate and update wiki-style documentation for codebases using AST-aware structural analysis. Use when the user wants to generate documentation, update stale documentation, create a documentation wiki, or assess documentation coverage for a repository.
Generate comprehensive wiki-style documentation for any codebase using AST-aware structural analysis and Claude's reasoning.
Supported languages: Python, JavaScript, TypeScript, Java, C, C++, C#, Kotlin, PHP
Output: Flat directory of markdown files with Mermaid diagrams, cross-references, and a repository overview. Default output path: docs/wiki/ relative to repo root.
Compatibility: This skill produces its own documentation format. It is not compatible with CodeWiki or DeepWiki output. The "update existing documentation" flow only works with docs previously generated by this skill.
Warning: Full-codebase documentation generation is token-intensive. Repos with >200 components will use significant tokens. Repos with >500 components should be documented in phases (core modules first).
The analysis scripts are in the scripts/ subdirectory next to this SKILL.md file. Before running any commands, determine the absolute path to this skill directory. For example, if this SKILL.md is at /home/user/.claude/skills/doc-generation/SKILL.md, then the scripts are at /home/user/.claude/skills/doc-generation/scripts/.
Store this path in a variable for use throughout:
SKILL_SCRIPTS="/absolute/path/to/.claude/skills/doc-generation/scripts"
Before starting, verify these in order:
python3 --version
If not available, stop and tell the user.
python3 "$SKILL_SCRIPTS/analyze_repo.py" --check-deps /path/to/repo
When a repo path is provided, this scans the repo for file extensions, determines which languages are present, and only checks whether those parsers are installed. It exits 0 if all needed parsers are available, exits 1 if any are missing.
Without a repo path, it checks all parsers globally (useful for a one-time environment setup).
If needed parsers are missing, the command prints install instructions. Python analysis uses stdlib ast and never requires tree-sitter.
The target repo must be a git repository. If git rev-parse HEAD fails in the repo directory, stop and tell the user: "This directory is not a git repository. Documentation versioning requires git."
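The git prerequisite check can be sketched as a small helper (names are illustrative, not part of the skill's scripts):

```python
import subprocess

def is_git_repo(repo_path: str) -> bool:
    """Return True if `git rev-parse HEAD` succeeds in repo_path."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_path,
            capture_output=True,  # we only care about the exit code
        )
    except FileNotFoundError:  # git binary not installed at all
        return False
    return result.returncode == 0
```

Note this also fails (correctly) in a freshly initialized repo with no commits, since `git rev-parse HEAD` needs at least one commit.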
Check if docs/wiki/.doc-meta.json exists in the target repo: if it does, follow the "Updating Existing Documentation" flow below; otherwise proceed with fresh generation.
Run the analysis script. Determine the absolute path to the scripts directory first.
python3 "$SKILL_SCRIPTS/analyze_repo.py" /path/to/repo --output /path/to/repo/.doc-analysis.json
Scoping options (pass as needed):
- --include "*.py,*.ts" — Only analyze files matching these patterns
- --exclude "*test*,*spec*,*mock*" — Skip files matching these patterns
- --focus "src/core,src/api" — Tag directories for deeper documentation (see Phase 3)
- --max-files 300 — Limit total files analyzed
- --verbose — Show detailed logging

Read the output JSON with the Read tool. Report to the user:
Critical check: If languages_skipped is non-empty, stop and warn the user. Offer two options:
If >200 components, warn: "This repo has {N} components. Documentation generation will be token-intensive. Proceed?"
If >500 components, suggest phased approach: "Consider documenting core modules first using --include to scope to specific directories. Which areas are most important?"
Using the analysis JSON, group components into logical modules. This is your reasoning — no external tool needed.
Clustering heuristics (apply in order):
File path proximity: Components in the same directory or subdirectory belong together. Group by the first 2-3 path segments (e.g., src/auth/* → "authentication" module).
Dependency relationships: Components that call each other frequently belong together. Use the relationships array from the analysis.
Naming conventions: Components with common prefixes or domain terms (auth, db, api, payment, user) form natural modules.
For small repos (<30 components): Skip clustering. Document everything in a single comprehensive overview.
For large repos (>100 components): Create hierarchical sub-modules. A top-level module like "api-layer" might have sub-modules for routes, middleware, and validators.
Module naming: Use lowercase-with-hyphens for all module names. These become filenames: authentication.md, database-layer.md, api-routes.md.
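As a sketch of the first heuristic, path-proximity clustering is a group-by on leading path segments. The `relative_path` and `name` fields follow the analysis JSON described above; the exact schema may differ:

```python
from collections import defaultdict

def cluster_by_path(components: list[dict], depth: int = 2) -> dict:
    """Group components by their first `depth` path segments.

    Each component is a dict like
    {"relative_path": "src/auth/service.py", "name": "AuthService"}.
    """
    modules = defaultdict(list)
    for comp in components:
        segments = comp["relative_path"].split("/")[:depth]
        # "src/auth/service.py" -> module key "src-auth"
        key = "-".join(segments).lower()
        modules[key].append(comp["name"])
    return dict(modules)
```

The raw keys (e.g. "src-auth") are only a starting point; rename them to domain terms like "authentication" before writing docs.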
Persist the module tree: Write docs/wiki/module-tree.json with the clustering result:
{
"authentication": {
"components": ["src.auth.service.AuthService", "src.auth.models.User"],
"description": "User authentication and session management",
"doc_file": "authentication.md"
},
"database-layer": {
"components": ["src.db.pool.ConnectionPool", "src.db.models.BaseModel"],
"description": "Database connection and ORM layer",
"doc_file": "database-layer.md"
}
}
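The persistence step itself is a plain JSON write; a minimal sketch (helper name is illustrative):

```python
import json
from pathlib import Path

def write_module_tree(module_tree: dict, wiki_dir: str) -> None:
    """Write the clustering result to docs/wiki/module-tree.json."""
    out = Path(wiki_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "module-tree.json").write_text(
        json.dumps(module_tree, indent=2) + "\n"
    )
```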
Process modules leaf-first (modules with no sub-modules first, then parent modules).
Focus areas: If --focus was used, each component in the analysis JSON has an is_focus field (true/false). Modules containing focus components get deeper documentation: more detailed component descriptions, more code snippets, additional Mermaid diagrams for internal flows. Non-focus modules get standard documentation.
For each module, use the Read tool to read the actual source files of its components, then write a markdown documentation file.
Each module doc ({module-name}.md) should include:
graph TD
A[ComponentA] --> B[ComponentB]
B --> C[ComponentC]
graph LR
ThisModule --> DatabaseModule
ThisModule --> ConfigModule
Cross-references to related modules, e.g. [Database Layer](database-layer.md)

Important rules:
- Use relative links only (all files live flat in docs/wiki/)
- Link format: [Display Name](filename.md) — all files are siblings, filenames use lowercase-with-hyphens

After all module docs are written, generate overview.md:
graph TD
API[API Layer] --> Auth[Authentication]
API --> DB[Database]
Auth --> DB
API --> Cache[Cache Layer]
Before declaring success, verify:
Cross-reference links: Scan each generated .md file for markdown links of the form [...](*.md). Verify every linked filename exists in docs/wiki/. Report any broken links.
Mermaid diagrams: Scan each file for mermaid code blocks. Read each one and verify:
- The first line declares a valid diagram type (graph, flowchart, sequenceDiagram, classDiagram, stateDiagram, erDiagram)

Completeness: Every module in module-tree.json has a corresponding .md file. The overview.md exists and links to all modules.
No empty sections: Scan for markdown headings followed immediately by another heading or end-of-file with no content between them. Fill or remove empty sections.
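The link and empty-section scans can be scripted; a sketch under the assumption that all docs sit flat in one directory (regexes are approximations):

```python
import re
from pathlib import Path

MD_LINK = re.compile(r"\]\(([^)]+\.md)\)")

def find_broken_links(wiki_dir: str) -> list[tuple[str, str]]:
    """Return (file, target) pairs whose linked .md file is missing."""
    wiki = Path(wiki_dir)
    existing = {p.name for p in wiki.glob("*.md")}
    broken = []
    for page in wiki.glob("*.md"):
        for target in MD_LINK.findall(page.read_text()):
            if target not in existing:
                broken.append((page.name, target))
    return broken

def find_empty_sections(text: str) -> list[str]:
    """Return headings followed by another heading or EOF, with no body."""
    lines = text.splitlines()
    empty = []
    for i, line in enumerate(lines):
        if line.startswith("#"):
            rest = [l for l in lines[i + 1:] if l.strip()]
            if not rest or rest[0].startswith("#"):
                empty.append(line.strip())
    return empty
```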
Write .doc-meta.json to the output directory:
{
"generated_at": "2026-03-14T10:30:00Z",
"commit_hash": "abc123def456",
"module_tree": {
"authentication": {
"components": ["src.auth.service.AuthService"],
"doc_file": "authentication.md"
}
},
"files_generated": ["overview.md", "authentication.md", "database-layer.md"],
"component_count": 142,
"repo_name": "my-service",
"output_path": "docs/wiki",
"scoping": {
"include_patterns": null,
"exclude_patterns": null
}
}
Report completion: "Documentation generated: {N} module docs + overview. Output: docs/wiki/"
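Assembling the metadata file is mechanical once the commit hash is in hand; a sketch (the commit hash comes from `git rev-parse HEAD`, and the helper name is illustrative):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def write_doc_meta(wiki_dir: str, module_tree: dict,
                   commit_hash: str, component_count: int) -> dict:
    """Write .doc-meta.json next to the generated docs and return it."""
    meta = {
        "generated_at": datetime.now(timezone.utc)
            .strftime("%Y-%m-%dT%H:%M:%SZ"),
        "commit_hash": commit_hash,
        "module_tree": module_tree,
        # record every markdown file actually present in the output dir
        "files_generated": sorted(
            p.name for p in Path(wiki_dir).glob("*.md")),
        "component_count": component_count,
        "output_path": wiki_dir,
    }
    (Path(wiki_dir) / ".doc-meta.json").write_text(
        json.dumps(meta, indent=2) + "\n")
    return meta
```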
Use this flow when .doc-meta.json already exists.
Read docs/wiki/.doc-meta.json using the Read tool. Extract the commit_hash field.
Run git rev-parse HEAD in the repo directory to get the current commit.
If they match, docs are current. Tell the user: "Documentation is up to date (generated at commit {hash})."
If they differ, proceed.
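The staleness check reduces to comparing the stored hash with HEAD; a sketch (obtaining the current hash via `git rev-parse HEAD` is left to the caller):

```python
import json
from pathlib import Path

def docs_are_current(wiki_dir: str, head_commit: str) -> bool:
    """Compare .doc-meta.json's commit_hash with the current HEAD."""
    meta_path = Path(wiki_dir) / ".doc-meta.json"
    if not meta_path.exists():
        return False  # no metadata -> fresh generation needed
    meta = json.loads(meta_path.read_text())
    return meta.get("commit_hash") == head_commit
```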
git diff --name-only {stored_hash} HEAD
This gives you the list of files that changed since docs were last generated.
Run the analysis script again, using the same scoping options from .doc-meta.json if they were set:
python3 "$SKILL_SCRIPTS/analyze_repo.py" /path/to/repo --output /path/to/repo/.doc-analysis.json
Read the new analysis JSON and the stored .doc-meta.json module tree. Compare:
A component counts as changed if its relative_path appears in the git diff file list. A module is affected if any of its components were changed, added, or removed.
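That rule can be sketched as a set intersection; this assumes a lookup from component id to relative_path built from the analysis JSON (the parameter names are illustrative):

```python
def affected_modules(module_tree: dict, component_paths: dict,
                     changed_files: set) -> list[str]:
    """Return module names whose components touch any changed file.

    module_tree: {"authentication": {"components": [...]}, ...}
    component_paths: component id -> relative_path (from the analysis JSON)
    changed_files: lines from `git diff --name-only {stored_hash} HEAD`
    """
    affected = []
    for name, module in module_tree.items():
        paths = {component_paths.get(c) for c in module["components"]}
        # None means a component no longer exists in the new analysis
        # (deleted or renamed) -> its module is affected too
        if paths & changed_files or None in paths:
            affected.append(name)
    return affected
```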
For each affected module:
For unaffected modules: leave their docs untouched.
Run the same validation checks from Phase 5 of Fresh Generation (cross-references, Mermaid, completeness, empty sections).
If any module was affected:
- Regenerate overview.md to ensure it reflects the current state
- Update .doc-meta.json with the new commit hash, timestamp, and module tree

Report: "Updated documentation for {N} modules. {M} modules unchanged."
docs/wiki/
├── .doc-meta.json # Generation metadata (commit hash, module tree, scoping)
├── module-tree.json # Module clustering structure
├── overview.md # Repository overview with architecture diagram
├── authentication.md # Module documentation
├── database-layer.md # Module documentation
├── api-routes.md # Module documentation
└── ... # Additional module docs
Do not proceed silently in these situations:
- If the repo exceeds the component thresholds above, suggest --include or --exclude to focus on core areas first.
- If docs/wiki/ exists but has no .doc-meta.json, it wasn't generated by this skill. Ask whether to overwrite or use a different output path.

Good documentation answers: What does this do? Why does it exist? How does it connect to the rest of the system? How would I use or modify it?
Mermaid diagram types to use:
- graph TD or graph LR — Architecture and dependency diagrams
- sequenceDiagram — For request/response flows or multi-step processes
- classDiagram — For class hierarchies (use sparingly)
- flowchart — For decision trees or process flows

What to include:
What NOT to include: