Codebase architect - Maps and documents SystemVerilog projects. This skill should be used when the user wants to understand a codebase structure, generate architecture documentation, or onboard to a new RTL project. Example requests: "map this codebase", "document the architecture", "show module hierarchy"
Maps SystemVerilog codebases using parallel subagents.
CRITICAL: You orchestrate, Sonnet reads. Never read codebase files directly. Always delegate file reading to Sonnet subagents - even for small codebases. You plan the work, spawn subagents, and synthesize their reports.
| Codebase Tokens | Agents | Rationale |
|---|---|---|
| < 50k | 2 | Minimum for parallelism |
| 50k-300k | 3 | Balance load, related files together |
| 300k-600k | 4-5 | Efficient parallel analysis |
| 600k-1M | 6-8 | Stay under 150k per agent |
| > 1M | 8-10 | Cap at 10, use incremental updates |
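The table above can be read as a lookup from estimated token count to agent count. The following is a minimal sketch of that lookup, not part of the skill itself: the function name `agents_for_tokens`, the choice to treat each band's upper bound as exclusive, and the target of roughly 125k tokens per agent (picked only so that results stay under the table's 150k-per-agent ceiling) are all assumptions made for illustration.

```shell
# Hypothetical helper: map an estimated token count to a subagent count.
# Bands follow the table; the 125k-per-agent target is an assumption.
agents_for_tokens() {
  local tokens=$1 lo hi n
  if   [ "$tokens" -lt 50000 ];   then lo=2; hi=2
  elif [ "$tokens" -lt 300000 ];  then lo=3; hi=3
  elif [ "$tokens" -lt 600000 ];  then lo=4; hi=5
  elif [ "$tokens" -le 1000000 ]; then lo=6; hi=8
  else                                 lo=8; hi=10
  fi
  # Ceiling division by the per-agent target, then clamp to the band.
  n=$(( (tokens + 124999) / 125000 ))
  if [ "$n" -lt "$lo" ]; then n=$lo; fi
  if [ "$n" -gt "$hi" ]; then n=$hi; fi
  echo "$n"
}
```

For example, `agents_for_tokens 550000` picks the upper end of the 4-5 band, and anything far above 1M is capped at 10.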
Rules:
.gateflow/map/ files
CLAUDE.md with summary
.gateflow/map/
├── CODEBASE.md # Main summary (AI-friendly index)
├── hierarchy.md # Module tree diagram
├── signals.md # Port and signal flow
├── clock-domains.md # CDC analysis, resets
├── fsm.md # State machine diagrams
├── packages.md # Package dependencies
├── types.md # Structs, unions, typedefs
├── functions.md # Functions and tasks
├── macros.md # Preprocessor directives
├── verification.md # SVA, coverage, checkers
├── interfaces.md # Interfaces, modports (if found)
├── classes.md # UVM/OOP classes (if found)
├── generate.md # Generate blocks (if found)
├── dpi.md # DPI imports/exports (if found)
├── recipe.md # Compile order, filelists
└── modules/ # Per-module detail pages
    └── <module_name>.md
ls .gateflow/map/CODEBASE.md 2>/dev/null
If exists: Check for changes since last map:
# Read last commit from metadata
last_commit=$(cat .gateflow/map/.last_scan_commit 2>/dev/null)
[ -n "$last_commit" ] && git diff --name-only "$last_commit" HEAD -- "*.sv" "*.svh" 2>/dev/null
If not exists: Proceed to full mapping.
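The existence check and the change check can be combined into one decision. This is a rough sketch under stated assumptions: the function names `map_mode` and `changed_sv_files` are hypothetical, and falling back to a full map when `.last_scan_commit` is missing or empty is an assumed policy that the text above does not spell out.

```shell
# Hypothetical helper: decide between a full map and an incremental update.
# Assumption: a missing or empty .last_scan_commit falls back to a full map.
map_mode() {
  local map_dir=${1:-.gateflow/map}
  if [ -f "$map_dir/CODEBASE.md" ] && [ -s "$map_dir/.last_scan_commit" ]; then
    echo "incremental"
  else
    echo "full"
  fi
}

# List SystemVerilog files changed since the recorded commit.
# Only meaningful inside a git work tree and in incremental mode.
changed_sv_files() {
  local map_dir=${1:-.gateflow/map} last_commit
  last_commit=$(cat "$map_dir/.last_scan_commit" 2>/dev/null) || return 1
  [ -n "$last_commit" ] || return 1
  git diff --name-only "$last_commit" HEAD -- "*.sv" "*.svh"
}
```

A caller might run `[ "$(map_mode)" = "incremental" ] && changed_sv_files` and hand only the changed files to subagents.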
mkdir -p .gateflow/map/modules
Scan files with token counts:
find . \( -name "*.sv" -o -name "*.svh" \) -not -path "./.gateflow/*" | while IFS= read -r f; do
  # Approximate the token count as bytes / 4
  tokens=$(wc -c < "$f" | awk '{print int($1/4)}')
  echo "$tokens $f"
done | sort -rn
Build assignment table:
| File | Tokens | Assignment |
|---|---|---|
| top.sv | 50000 | Agent 1 |
| uart_tx.sv | 8000 | Agent 1 |
| hmac_core.sv | 120000 | Agent 2 (LARGE - use Grep) |
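One way to produce a table like this from the scan output is a greedy split: take files largest first and give each to the least-loaded agent. The sketch below is only an illustration. The function name `assign_files` is hypothetical, agent numbering is arbitrary, and the greedy rule balances load only; it does not keep related files together, so the result may still need manual regrouping.

```shell
# Hypothetical helper: assign_files N
# Reads "tokens path" lines on stdin and prints a markdown assignment table.
assign_files() {
  local agents=$1
  sort -rn | awk -v n="$agents" '
    BEGIN { print "| File | Tokens | Assignment |"; print "|---|---|---|" }
    {
      tokens = $1
      path = $0
      sub(/^[0-9]+[ \t]+/, "", path)      # keep paths that contain spaces
      best = 1                            # find the least-loaded agent
      for (i = 2; i <= n; i++)
        if (load[i] + 0 < load[best] + 0) best = i
      load[best] += tokens
      note = (tokens > 80000) ? " (LARGE - use Grep)" : ""
      printf "| %s | %d | Agent %d%s |\n", path, tokens, best, note
    }'
}
```

It can be fed directly from the scan, for example by piping the `find ... | sort -rn` output into `assign_files 3`.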
For files exceeding 80k tokens, use chunked analysis:
# Get module declaration
grep -n "^\s*module\s" large_file.sv
# Get ports
grep -nwE "input|output|inout" large_file.sv
# Get instances
grep -n "^\s*\w\+\s\+\w\+\s*(" large_file.sv
Read file with offset=0, limit=500 (header, ports)
Read file with offset=500, limit=500 (logic section 1)
... continue until covered
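The chunk boundaries can be computed up front rather than tracked by hand. This is a small sketch, assuming `offset` and `limit` are counted in lines and using the 500-line chunk size from the example above; the function name `chunk_plan` is hypothetical.

```shell
# Hypothetical helper: chunk_plan FILE [LIMIT]
# Prints one "offset=... limit=..." pair per chunk needed to cover FILE.
chunk_plan() {
  local file=$1 limit=${2:-500} total offset=0
  # Arithmetic expansion strips any padding that wc adds on some systems.
  total=$(( $(wc -l < "$file") ))
  while [ "$offset" -lt "$total" ]; do
    echo "offset=$offset limit=$limit"
    offset=$((offset + limit))
  done
}
```

For a 1200-line file this yields three reads, starting at offsets 0, 500, and 1000.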
CRITICAL: Spawn ALL subagents in a SINGLE message.
Use Task tool with:
subagent_type: "Explore"
Example - spawn 3 agents in ONE message:
Task 1:
  description: "Analyze UART files"
  subagent_type: "Explore"
  prompt: |
    Read and analyze these SystemVerilog files:
    - rtl/uart_pkg.sv
    - rtl/uart_tx.sv
    - rtl/uart_rx.sv
    For EACH file, extract:
    1. Module/Package name
    2. Purpose (one-line)
    3. Ports table: name, direction, width
    4. Parameters: name, type, default
    5. Instances: what it instantiates
    6. FSM states (if any)
    7. Clock/Reset signals
    8. Package imports
    Return structured markdown.
Task 2:
  description: "Analyze SHA files"
  subagent_type: "Explore"
  prompt: |
    Read and analyze these SystemVerilog files:
    - rtl/sha2_pad.sv
    - rtl/sha2_core.sv
    [Same extraction request...]
Task 3:
  description: "Analyze large file with Grep"
  subagent_type: "Explore"
  prompt: |
    This file is large. Use Grep to extract structure first:
    - rtl/hmac_core.sv (120k tokens)
    1. Grep for module declaration
    2. Grep for ports
    3. Grep for instances
    4. Read specific sections if needed
    Return structured markdown.
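Because the standard extraction request is the same for every group of regular-sized files, the prompt text can be generated from a file list. The following is a sketch only: `build_prompt` is a hypothetical name, and the wording is copied from the Task 1 example rather than from any fixed template.

```shell
# Hypothetical helper: build_prompt FILE...
# Prints the standard extraction prompt for the given SystemVerilog files.
build_prompt() {
  local f
  echo "Read and analyze these SystemVerilog files:"
  for f in "$@"; do
    printf '%s\n' "- $f"
  done
  cat <<'EOF'
For EACH file, extract:
1. Module/Package name
2. Purpose (one-line)
3. Ports table: name, direction, width
4. Parameters: name, type, default
5. Instances: what it instantiates
6. FSM states (if any)
7. Clock/Reset signals
8. Package imports
Return structured markdown.
EOF
}
```

Calling `build_prompt rtl/uart_pkg.sv rtl/uart_tx.sv rtl/uart_rx.sv` reproduces the body of the Task 1 prompt; large files would still need the Grep-first variant shown in Task 3.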
After all subagents complete:
---
last_mapped: YYYY-MM-DDTHH:MM:SSZ
total_files: N
total_tokens: N