μArch design criteria, clock domain rules, protocol assignment rules, BFM validation requirements, signal naming conventions, and checklists for the Phase 3 μArch design pipeline. Pure reference — no orchestration.
Each module document MUST contain:
Every P3 run MUST produce `docs/phase-3-uarch/req-uarch-traceability.md`, mapping each requirement from `docs/phase-1-research/iron-requirements.json` (REQ-F-*) and `docs/phase-2-architecture/iron-requirements.json` (REQ-A-*) to the uArch module(s) and section(s) implementing it.

Format:
| REQ ID | uArch Module(s) | Section(s) | Status |
|----------|-------------------------|-----------------------------|----------|
| REQ-001 | intra_pred | 3.2 Mode Decision FSM | MAPPED |
| REQ-002 | deblock_filter, sao | 2.1 Pipeline, 4.3 SAO Ctrl | MAPPED |
| REQ-003 | — | — | UNMAPPED |
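A minimal sketch of how a table in this format could be checked for unmapped requirements. The function names (`parse_traceability`, `unmapped_requirements`, `coverage`) are hypothetical helpers for illustration, not part of the pipeline.

```python
# Sample rows copied from the format example above.
SAMPLE_TABLE = """
| REQ ID | uArch Module(s) | Section(s) | Status |
|----------|-------------------------|-----------------------------|----------|
| REQ-001 | intra_pred | 3.2 Mode Decision FSM | MAPPED |
| REQ-002 | deblock_filter, sao | 2.1 Pipeline, 4.3 SAO Ctrl | MAPPED |
| REQ-003 | — | — | UNMAPPED |
"""


def parse_traceability(table_text):
    """Parse the Markdown traceability table into a list of row dicts."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue  # not a 4-column table row
        is_header = cells[0] == "REQ ID"
        is_separator = cells[0] != "" and set(cells[0]) <= set("-:")
        if is_header or is_separator:
            continue
        rows.append(dict(zip(("req_id", "modules", "sections", "status"), cells)))
    return rows


def unmapped_requirements(rows):
    """Return the REQ IDs whose status is anything other than MAPPED."""
    return [row["req_id"] for row in rows if row["status"] != "MAPPED"]


def coverage(rows):
    """Fraction of requirements with status MAPPED (0.0 for an empty table)."""
    if not rows:
        return 0.0
    return sum(row["status"] == "MAPPED" for row in rows) / len(rows)
```

For the sample rows, `unmapped_requirements` reports `REQ-003`, so the table would not yet meet a 100% coverage target.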
When requirements.json changes after P3 completion, this table identifies which uArch sections need re-design. The P3 orchestrator uses mtime comparison to detect staleness.

- `clk` / `rst_n`
- `{domain}_clk` / `{domain}_rst_n` (e.g., `sys_clk`, `pixel_clk`)

Every storage element in the μArch spec MUST include a storage type decision with rationale:
| Total Bits | Ports | Storage Type | Rationale |
|---|---|---|---|
| ≤256 | any | Flip-flop array | SRAM macro overhead exceeds register cost at this size |
| 257–4096 | ≤2 R/W | SRAM wrapper (recommended) | Area-efficient; register file acceptable with documented rationale |
| 257–4096 | >2 R/W | Register file | Multi-port SRAM macros are rare; register file provides arbitrary port count |
| >4096 | ≤2 R/W | SRAM wrapper (mandatory) | Register file at this size wastes area and power |
| >4096 | >2 R/W | Banked SRAM or register file | Bank SRAM to reduce port count; register file only with PPA justification |
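The table above can be read as a simple decision function. The sketch below is one way to encode it, assuming `total_bits` and `rw_ports` are already known for the storage element. The function name is hypothetical, and the sketch covers only the table rows, not the read-latency, streaming-buffer, or exception rules.

```python
def select_storage_type(total_bits, rw_ports):
    """Return the storage type from the table for a given size and port count.

    Encodes only the five table rows; any further rules or exceptions
    must be applied separately.
    """
    if total_bits <= 0 or rw_ports <= 0:
        raise ValueError("total_bits and rw_ports must be positive")
    if total_bits <= 256:
        return "Flip-flop array"  # any port count
    if total_bits <= 4096:
        if rw_ports <= 2:
            return "SRAM wrapper (recommended)"
        return "Register file"
    if rw_ports <= 2:
        return "SRAM wrapper (mandatory)"
    return "Banked SRAM or register file"
```

For example, a 64-entry by 32-bit buffer (2048 bits) with one read and one write port falls in the "SRAM wrapper (recommended)" row.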
Read latency as selection criterion:
Streaming buffer rule:
Exceptions requiring register file regardless of size:
SpyGlass compatibility: SpyGlass set_option mthresh <bits> flags register arrays exceeding the threshold as SRAM candidates. The default varies by methodology; teams typically set mthresh to 4096–65536 bits based on target technology. Align the plugin's >4096-bit mandatory SRAM rule with your project's mthresh setting in rat_config.json if different.
SRAM wrapper interface specification (required for every SRAM instance in μArch doc):
- `DEPTH`, `WIDTH` (derived `ADDR_W = $clog2(DEPTH)`)
- `clk`, `i_ce`, `i_we`, `i_addr`, `i_wdata`, `o_rdata`
- `clk`, `i_wen`, `i_waddr`, `i_wdata`, `i_ren`, `i_raddr`, `o_rdata`
- `wclk`, `i_wen`, `i_waddr`, `i_wdata`, `rclk`, `i_ren`, `i_raddr`, `o_rdata`

Every internal bus, FIFO, and result packing width MUST be derived from design parameters; hardcoded width constants are prohibited in parameterized designs.

Example derivation: `BLOCK_PIXELS * NUM_COMPONENTS * (BPC+1) + META_BITS`

When bus widths are hardcoded for a specific parameter set (e.g., BPC=12), switching parameters (e.g., BPC=8) leaves unused upper bits that:
`localparam WIDTH = <literal>` for datapath widths is flagged; the width must derive from upper parameters (e.g., `localparam L_FIFO_W = BLK_PIXELS * COMPONENTS * (BPC+1) + META`).

Three sub-gates, applied in order (G4a → G4b → G4c):
Invoke rtl-planner when execution risk is the blocker rather than local RTL details:
Expected rtl-planner output: explicit task dependency graph, critical path, and parallel work groups that the orchestrator can apply to Step 3/5 sequencing.
Invoke clock-architect when clocking strategy is non-trivial:
Expected clock-architect output: review report at
reviews/phase-3-uarch/clock-architecture-review.md and concrete fixes to
docs/phase-3-uarch/clock-domain-map.md.
- `i_` prefix (NOT `_i` suffix)
- `o_` prefix (NOT `_o` suffix)
- `io_` prefix
- `clk` (single) or `{domain}_clk`, NOT `clk_i`
- `rst_n` (single) or `{domain}_rst_n`, NOT `rst_ni`
- `u_` prefix (e.g., `u_fifo`)
- `gen_` prefix
- `typedef enum logic [N:0]` with UPPER_SNAKE_CASE values
- `snake_case_t` suffix (e.g., `state_t`, `bus_req_t`)
- UPPER_SNAKE_CASE (e.g., `DATA_WIDTH`)
- `logic` only (no `reg`/`wire`)

Mandatory 3 rounds, coordinated by rtl-architect:
4 mandatory + 1 conditional parallel reviewers each round:
domain-packages/{domain}/ exists (e.g., vcodec-architecture-expert for video codec).
When no domain expert is available, rtl-architect covers algorithm consistency in its scope.

- Round 1-2: review → rebuttal (designer accepts or rejects each finding with rationale) → tree exploration for accepted issues → targeted revision (rejections recorded in the per-round artifact)
- Last round (converged or max reached): cross-module interface audit, clock domain map consistency, memory conflict analysis, model consistency matrix, BFM final pass, μArch code review
- Convergence check after round >= min_rounds: finding_delta < 0.1, all critical resolved, wonder stable
- After max_rounds, if not converged: escalate to user via AskUserQuestion
Conditional reviewers:
Save to reviews/phase-3-uarch/feature-preservation.md:
# Phase 3 Review: Feature Preservation
- Date: YYYY-MM-DD
- Reviewer: rtl-architect
- Upper Spec: architecture.md
- Verdict: PASS | FAIL
## Feature Coverage Checklist
| Feature | Architecture Block | μArch Doc | Status |
## Findings
### [severity] Finding-N: ...
## Verdict
PASS | FAIL: [reason]
Each review round MUST produce a wonder-log entry:
`docs/phase-3-uarch/wonder-log.md`

Review rounds use dynamic convergence instead of fixed 3 rounds:
| Parameter | Value | Rationale |
|---|---|---|
| min_rounds | 2 | Minimum for meaningful review |
| max_rounds | 5 | Prevent infinite loops |
| finding_delta_threshold | 0.1 | < 10% new findings = stable |
| critical_resolution | ALL | All Critical/High must be resolved |
| wonder_stability | true | No new High-risk assumptions |
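A minimal sketch of how the parameters above could combine into a per-round decision. It assumes `finding_delta` is the ratio of new findings to total findings in the round; the source gives only the threshold, so that definition, the function name, and the return labels are hypothetical.

```python
MIN_ROUNDS = 2
MAX_ROUNDS = 5
FINDING_DELTA_THRESHOLD = 0.1


def review_decision(round_number, new_findings, total_findings,
                    open_critical_or_high, new_high_risk_assumptions):
    """Return 'converged', 'continue', or 'escalate' for a finished round.

    ASSUMPTION: finding_delta = new_findings / total_findings.
    """
    finding_delta = new_findings / total_findings if total_findings else 0.0
    converged = (
        round_number >= MIN_ROUNDS
        and finding_delta < FINDING_DELTA_THRESHOLD   # finding_delta_threshold
        and open_critical_or_high == 0                # critical_resolution = ALL
        and new_high_risk_assumptions == 0            # wonder_stability
    )
    if converged:
        return "converged"
    if round_number >= MAX_ROUNDS:
        return "escalate"  # hand over to the user after max_rounds
    return "continue"
```

Under these assumptions, a clean round 1 still returns `continue` because `min_rounds` has not been reached, while a round 5 with an open Critical finding returns `escalate`.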
- Early exit (round 2): when findings converge quickly (simple designs)
- Extended review (rounds 4-5): for complex designs with emergent issues
This is inspired by Ouroboros's ConvergenceCriteria:
Module decomposition & structure:
`rate_per_cycle × clock_freq ≥ target_throughput`. Pipeline changes that reduce net throughput below target are rejected.

Clock domain assignment:
Protocol assignment:
Register/SRAM/FSM allocation:
BFM validation:
Review & compliance:
Rebuttal & per-round artifacts:
Phase 3 receives docs/phase-2-architecture/open-requirements.json containing OPEN-2-* research topics.
For each OPEN-2-* item, the μArch team must:
`candidates` and `evaluation_criteria`

`docs/phase-3-uarch/iron-requirements.json` (REQ-U-*) with:

- `resolved_from`: the OPEN-2-* ID that was resolved
- `resolution_rationale`: why this candidate was selected
- `rejected_alternatives`: all non-selected candidates with rejection reasons
- `upstream_compliance`: verification that new REQ-U-* does not violate P1+P2 iron
- `violation_policy`: "agent_retry" (authority=3)
- `acceptance_criteria`: measurable criteria for the μArch decision

Each REQ-U-* entry SHOULD include structured acceptance_criteria:
"acceptance_criteria": [
{
"ac_id": "REQ-U-NNN.AC-M",
"description": "measurable criterion text",
"test_method": "assertion|cocotb|formal|inspection",
"verifiable": true
}
]
Rules:
- `ac_id` format: `{parent_req_id}.AC-{N}` (e.g., `REQ-U-012.AC-1`)
- `test_method` guides testbench-dev on verification approach:
  - `assertion`: protocol properties verifiable by SVA (e.g., valid stable during !ready)
  - `cocotb`: functional behavior verifiable by simulation (e.g., transfer completes correctly)
  - `formal`: invariants provable by formal verification (e.g., no deadlock)
  - `inspection`: non-automatable criteria; set `verifiable: false`
- `verifiable: false` criteria are excluded from automated coverage tracking; documented in RTM as `NOT_VERIFIABLE`
- `[]` is treated the same as an absent field (backward compatible)

Each REQ-U-* entry SHOULD include a traces_to field linking to upstream requirements:
"traces_to": ["REQ-F-NNN", "REQ-A-NNN"] // Upstream requirements this REQ-U-* decomposes from
Rules:
- `traces_to` linking to P1 REQ-F-* or P2 REQ-A-*

Phase 3 MUST NOT produce an open-requirements.json. All research topics must be resolved here. If unresolved items remain at Phase 3 exit → EXIT GATE FAIL. P4 (Implementation) requires all requirements to be iron; no open items may remain.
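The `acceptance_criteria` and `traces_to` rules for REQ-U-* entries lend themselves to a mechanical check. The sketch below is one possible validator under stated assumptions: the requirement ID is stored under a key named `id`, and the `NNN` and `N` placeholders are decimal digits. Neither is confirmed by the source, and `check_req_u` is a hypothetical name.

```python
import re

# ASSUMPTION: numeric IDs, e.g. REQ-U-012.AC-1, REQ-F-003, REQ-A-010.
AC_ID_RE = re.compile(r"^(REQ-U-\d+)\.AC-(\d+)$")
UPSTREAM_RE = re.compile(r"^REQ-[FA]-\d+$")
TEST_METHODS = {"assertion", "cocotb", "formal", "inspection"}


def check_req_u(entry):
    """Return a list of problems found in one REQ-U-* entry (empty if none)."""
    problems = []
    req_id = entry.get("id", "")  # ASSUMPTION: the ID key is named "id"
    # A missing field and an empty list are treated the same way.
    for ac in entry.get("acceptance_criteria", []):
        ac_id = ac.get("ac_id", "")
        match = AC_ID_RE.match(ac_id)
        if not match or match.group(1) != req_id:
            problems.append(f"{ac_id!r}: ac_id must be {req_id}.AC-N")
        method = ac.get("test_method")
        if method not in TEST_METHODS:
            problems.append(f"{ac_id!r}: unknown test_method {method!r}")
        if method == "inspection" and ac.get("verifiable", True):
            problems.append(f"{ac_id!r}: inspection requires verifiable: false")
    for upstream in entry.get("traces_to", []):
        if not UPSTREAM_RE.match(upstream):
            problems.append(f"{upstream!r}: traces_to must be REQ-F-* or REQ-A-*")
    return problems
```

An entry whose criteria all pass returns an empty list, which makes the function easy to call once per entry in `iron-requirements.json` and aggregate.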
After docs/phase-3-uarch/iron-requirements.json (REQ-U-*) is finalized:
- `upstream_iron`: `["docs/phase-1-research/iron-requirements.json", "docs/phase-2-architecture/iron-requirements.json"]`
- `target_artifacts`: Phase 3 output artifacts

Same as Phase 2, but challenges may target P1 or P2 iron requirements. Challenge report must identify which upstream authority is being challenged. PPA estimates required with mandatory fields: `frequency_mhz`, `area_gate_count`, `pixel_rate_mpps`, `achievable_fps`.
Apply ambiguity scoring to all new REQ-U-* decisions:
Artifacts saved:
- reviews/phase-3-uarch/feature-preservation.md
- reviews/phase-3-uarch/uarch-review.md (consolidated)
- reviews/phase-3-uarch/pipeline-diagram.md (Mermaid, see format below)
- docs/phase-3-uarch/clock-domain-map.md
- docs/phase-3-uarch/protocol-assignments.md
- docs/phase-3-uarch/phase-3-summary.md
- docs/phase-3-uarch/req-uarch-traceability.md (100% REQ coverage)
- docs/decisions/ADR-*.md generated (3-5 key μArch decisions)
- docs/phase-3-uarch/wonder-log.md exists with per-round assumption tracking
- All High-risk assumptions in wonder-log resolved or explicitly accepted
- docs/phase-3-uarch/upstream-feedback-report.md generated (P1/P2 gap analysis)
- docs/phase-3-uarch/requirement-delta.md generated (REQ implementability scan)
- Per-round review artifacts saved (r1.md through rN.md, minimum 2 rounds)
- All OPEN-2-* items from P2 resolved with rationale and rejected_alternatives
- Zero remaining open items (P4 entry invariant: no open-requirements.json produced)
- docs/phase-3-uarch/iron-requirements.json exists with REQ-U-* entries and resolved_from tracking
- Compliance check against P1+P2 iron: verdict = PASS
- Ambiguity score ≤ 0.5 for all new REQ-U-* requirements
Per diagram policy: use Mermaid for pipeline/flow diagrams (ASCII art prohibited).
flowchart LR
S0[S0: Fetch] -->|"data[31:0]"| S1[S1: Decode]
S1 -->|"op[7:0], operands"| S2[S2: Execute]
S2 -->|"result[47:0]"| S3[S3: Writeback]
S0 -.->|stall| S0
S2 -.->|forward| S1
Each stage node: SN[SN: stage_name]. Edges: data width annotation.
Stall/forward paths: dashed arrows (-.->).
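When the stage list already exists in structured form, a diagram in this format can be generated rather than written by hand. The sketch below is one way to do that; the function name `pipeline_mermaid` and its argument layout are hypothetical.

```python
def pipeline_mermaid(stages, data_labels, control_edges=()):
    """Build Mermaid flowchart text for a linear pipeline.

    stages:        stage names in order, e.g. ["Fetch", "Decode"]
    data_labels:   one width annotation per adjacent stage pair
    control_edges: (src_index, dst_index, label) tuples drawn as dashed arrows
    """
    if len(data_labels) != len(stages) - 1:
        raise ValueError("need exactly one data label per adjacent stage pair")
    defined = set()

    def node(index):
        # The first mention declares the node label; later mentions use the bare ID.
        if index in defined:
            return f"S{index}"
        defined.add(index)
        return f"S{index}[S{index}: {stages[index]}]"

    lines = ["flowchart LR"]
    for index, label in enumerate(data_labels):
        src = node(index)
        dst = node(index + 1)
        lines.append(f'{src} -->|"{label}"| {dst}')
    for src_index, dst_index, label in control_edges:
        lines.append(f"S{src_index} -.->|{label}| S{dst_index}")
    return "\n".join(lines)


diagram = pipeline_mermaid(
    ["Fetch", "Decode", "Execute", "Writeback"],
    ["data[31:0]", "op[7:0], operands", "result[47:0]"],
    [(0, 0, "stall"), (2, 1, "forward")],
)
```

With these inputs, `diagram` reproduces the four-stage example above line for line.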