Policy rules, 10-Wave pipeline definitions, coding conventions, wave overlap strategy, escalation conditions, and checklists for the Phase 4 RTL implementation pipeline. Pure reference — no orchestration.
Wave 0: Prepare (Preparation + Test Plan)
- Step 0a: Module enumeration, directory creation, TODO list (existing)
- Step 0b: Test plan generation per module via test-plan-writer agent
- Gate: sim/{module}/{module}_test_plan.md exists for every module
- Failure: retry once, then proceed with WARNING (Wave 6a must generate missing plans)
Wave 1: Write All — One rtl-coder per module, all parallel
Wave 2: Lint All — One lint-checker per module, all parallel, collect results
Wave 3: Fix Lint — ONLY FAIL modules, max 3 rounds, re-lint only fixes
Wave 4: Code Review — rtl-critic per lint-clean module, parallel
Wave 5: Bugfix — ONLY REVIEW_FAIL modules, max 3 review→fix iterations
Wave 6a: Tier 1 Smoke — testbench-dev + eda-runner per module, parallel
Wave 6b: Tier 2 Unit — Ref model comparison, REQ-U-* tracing, FSM >= 50%, line >= 60%, covergroups >= 1, codec conformance PASS/N/A
Wave 7: Module CDC — cdc-checker per multi-domain module, parallel
Wave 8: Module Proto — protocol-checker per bus-interface module, parallel
Wave 9: Refactoring — rtl-p4s-refactor for flagged modules, selective
Wave 10: Integration — smoke test + spec compliance + Stream B + Phase 4 gate
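The Wave 3 rule above (only FAIL modules, at most 3 rounds, re-lint only what was fixed) can be sketched as a bounded loop. This is a minimal illustration, not the pipeline's actual orchestration: `fix_lint_wave`, `fix`, and `lint` are hypothetical names, and the two callables stand in for whatever agents perform the fix and the re-lint.

```python
from typing import Callable, Dict

MAX_FIX_ROUNDS = 3  # "max 3 rounds" from the Wave 3 rule


def fix_lint_wave(
    lint_results: Dict[str, str],
    fix: Callable[[str], None],   # hypothetical: applies a lint fix to one module
    lint: Callable[[str], str],   # hypothetical: re-lints one module, returns "PASS" or "FAIL"
    max_rounds: int = MAX_FIX_ROUNDS,
) -> Dict[str, str]:
    """Return updated lint statuses after at most `max_rounds` fix rounds."""
    results = dict(lint_results)  # do not mutate the caller's Wave 2 results
    for _ in range(max_rounds):
        failing = [m for m, status in results.items() if status == "FAIL"]
        if not failing:
            break  # everything is lint-clean, nothing left to fix
        for module in failing:
            fix(module)
            results[module] = lint(module)  # re-lint only the module just fixed
    return results
```

Modules that were already PASS after Wave 2 are never passed to `fix` or `lint`, and a module that is still FAIL after the last round stays FAIL in the returned mapping so the caller can decide how to handle it.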
Key principles:
{
  "module": "{module}",
  "wave_1_write": "DONE",
  "wave_2_lint": "PASS|FAIL",
  "wave_3_fix": "PASS|SKIP",
  "wave_4_review": "REVIEW_PASS|REVIEW_FAIL",
  "wave_5_bugfix": "PASS|SKIP",
  "wave_6a_tier1_smoke": "PASS|FAIL",
  "wave_6b_tier2_unit": "PASS|FAIL",
  "wave_7_cdc": "CDC_PASS|CDC_FAIL|SKIP",
  "wave_8_protocol": "PROTOCOL_PASS|PROTOCOL_FAIL|SKIP",
  "wave_9_refactor": "DONE|SKIP",
  "wave_10_gate": "PASS|FAIL"
}
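A per-module record shaped like the template above can be checked mechanically. The sketch below assumes that a completed record stores exactly one of the pipe-separated alternatives in each wave field; that interpretation, and the function name `validate_state`, are assumptions rather than something this reference states.

```python
# Template copied from the state record; "A|B" lists the allowed alternatives.
TEMPLATE = {
    "wave_1_write": "DONE",
    "wave_2_lint": "PASS|FAIL",
    "wave_3_fix": "PASS|SKIP",
    "wave_4_review": "REVIEW_PASS|REVIEW_FAIL",
    "wave_5_bugfix": "PASS|SKIP",
    "wave_6a_tier1_smoke": "PASS|FAIL",
    "wave_6b_tier2_unit": "PASS|FAIL",
    "wave_7_cdc": "CDC_PASS|CDC_FAIL|SKIP",
    "wave_8_protocol": "PROTOCOL_PASS|PROTOCOL_FAIL|SKIP",
    "wave_9_refactor": "DONE|SKIP",
    "wave_10_gate": "PASS|FAIL",
}
ALLOWED = {key: set(alts.split("|")) for key, alts in TEMPLATE.items()}


def validate_state(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not record.get("module"):
        errors.append("module: missing")
    for key, allowed in ALLOWED.items():
        value = record.get(key)
        if value not in allowed:
            errors.append(f"{key}: {value!r} not in {sorted(allowed)}")
    return errors
```

Deriving `ALLOWED` from the template keeps the checker in step with the record definition: adding a status to the template is enough to make it acceptable.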
All RTL produced in Phase 4 MUST follow:
- Ports: inputs i_, outputs o_, bidirectional io_ (NOT suffix _i/_o)
- clk (single domain) or {domain}_clk (e.g., sys_clk); NOT clk_i, clk_sys
- rst_n (single domain) or {domain}_rst_n (e.g., sys_rst_n); NOT rst_ni
- i_ prefix
- logic only; reg and wire keywords FORBIDDEN
- always_ff for sequential, always_comb for combinational; no bare always
- typedef enum logic [N:0] for FSM states, typedef struct packed for grouped signals
- Instances: u_ prefix, generate prefix: gen_
- Parameters: UPPER_SNAKE_CASE, localparam: L_ prefix, types: snake_case_t

Storage elements specified as "SRAM wrapper" in Phase 3 μArch docs MUST use standardized wrappers:
Wrapper placement: rtl/common/ (shared across modules)
Standard wrappers (parameterized behavioral models for simulation, replaced by foundry macros for synthesis):
| Wrapper | File | Use |
|---|---|---|
| sram_sp | rtl/common/sram_sp.sv | Single-port: 1 R/W port, single clock |
| sram_tp | rtl/common/sram_tp.sv | Two-port: 1W + 1R, single clock (clk) |
| sram_dp | rtl/common/sram_dp.sv | Dual-port: 1W + 1R, dual clock (wclk/rclk) |
Parameter contract:
- DEPTH: number of entries (power-of-2 recommended for efficient address decode)
- WIDTH: bits per entry
- ADDR_W = $clog2(DEPTH) (localparam inside wrapper)

Port naming (SP example):
- clk: clock
- i_ce: chip enable (active-high)
- i_we: write enable (active-high)
- i_addr [ADDR_W-1:0]: address
- i_wdata [WIDTH-1:0]: write data
- o_rdata [WIDTH-1:0]: read data (1-cycle latency, registered output)

TP port naming (two-port: 1W + 1R, single clock):
- clk: clock
- i_wen: write enable
- i_waddr [ADDR_W-1:0]: write address
- i_wdata [WIDTH-1:0]: write data
- i_ren: read enable
- i_raddr [ADDR_W-1:0]: read address
- o_rdata [WIDTH-1:0]: read data (1-cycle latency)

DP port naming (dual-port: 1W + 1R, dual clock):
- wclk: write clock
- i_wen: write enable
- i_waddr [ADDR_W-1:0]: write address
- i_wdata [WIDTH-1:0]: write data
- rclk: read clock
- i_ren: read enable
- i_raddr [ADDR_W-1:0]: read address
- o_rdata [WIDTH-1:0]: read data (1-cycle latency, rclk domain)

Instance naming: u_mem_{purpose} (e.g., u_mem_coeff, u_mem_line_buf)
Foundry macro replacement strategy:
- Replace the sram_sp body with a foundry-specific macro instantiation behind an `ifdef SYNTHESIS guard

Wave 1 responsibility: rtl-coder creates rtl/common/sram_*.sv wrappers if not already present, then instantiates them in modules per μArch spec.
Sequential dependency loops in always_comb (e.g., error diffusion, SAD accumulation, carry chains) create deep combinational paths that risk timing closure.
Pattern: for (s = 0; s < N; s++) where iteration s depends on iteration s-1
→ split at N/2 with registered intermediate results (when latency budget allows)
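A small behavioral model can make the split concrete. The Python sketch below is not RTL and is not taken from the pipeline; `chain`, `split_chain`, and the 8-bit accumulate step are illustrative assumptions. It models a dependent chain evaluated in one combinational pass versus the same chain cut at N/2 with the intermediate value held in a register.

```python
from typing import Callable, List, Tuple

Step = Callable[[int, int], int]


def chain(acc: int, xs: List[int], step: Step) -> int:
    """Unsplit loop: iteration s consumes the result of iteration s-1."""
    for x in xs:
        acc = step(acc, x)
    return acc


def split_chain(init: int, xs: List[int], step: Step) -> Tuple[int, int]:
    """Split at N/2. Returns (result, longest combinational chain length)."""
    half = len(xs) // 2
    stage1_reg = chain(init, xs[:half], step)    # cycle 0: first half, then registered
    result = chain(stage1_reg, xs[half:], step)  # cycle 1: second half starts from the register
    depth = max(half, len(xs) - half)            # dependent steps per cycle after the split
    return result, depth


def add8(acc: int, x: int) -> int:
    """Example step: 8-bit wrapping accumulate (an assumed stand-in for SAD or carry logic)."""
    return (acc + x) & 0xFF
```

For eight inputs the split version returns the same value as the unsplit loop while the longest dependent chain per cycle drops from 8 steps to 4. The cost is one extra cycle of latency, and in hardware the second-half inputs must also be delayed by one cycle to line up with the registered intermediate.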
Examples where split was beneficial (ASIC):
Per-module review by rtl-critic:
Classification: REVIEW_PASS (0 critical/major findings) or REVIEW_FAIL.
Wave 4 code review MUST include automated structural checks (performed by rtl-critic):
Structural findings are classified:
REVIEW_PASS requires zero CRITICAL structural findings.
- Wave 6a testbench: sim/{module}/tb_{module}.sv
- Wave 6b testbench: sim/{module}/tb_{module}.sv (adds reference comparison logic to the existing Wave 6a TBs; does NOT replace them).

Gate: sim/{module}/{module}_unit_results.json with ref_mismatches=0, coverage meeting thresholds (FSM >= 50%, line >= 60%), req_ids populated, func_coverage.covergroups_defined >= 1, and codec_conformance PASS/N/A.

When iron-requirements has structured acceptance_criteria (with ac_id) for a REQ-U-* requirement, ac_ids should be populated for each unit test feature covering that requirement. This is advisory at P4 (not a hard block): incomplete AC coverage is flagged as WARNING and deferred to P5 for closure. When there is no structured AC, the existing req_ids gate applies unchanged.
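The Wave 6b gate above can be expressed as a short check over the parsed results file. The keys `ref_mismatches`, `req_ids`, `func_coverage.covergroups_defined`, and `codec_conformance` come from the gate description; the `coverage.fsm_pct` and `coverage.line_pct` keys are hypothetical, since this reference does not say how the coverage numbers are stored.

```python
FSM_MIN_PCT = 50.0   # FSM coverage threshold from the gate
LINE_MIN_PCT = 60.0  # line coverage threshold from the gate


def check_unit_gate(results: dict) -> list:
    """Return the list of failed gate conditions; an empty list means the gate passes."""
    failures = []
    if results.get("ref_mismatches") != 0:
        failures.append("ref_mismatches must be 0")

    coverage = results.get("coverage", {})  # hypothetical layout, see lead-in
    if coverage.get("fsm_pct", 0.0) < FSM_MIN_PCT:
        failures.append("FSM coverage below 50%")
    if coverage.get("line_pct", 0.0) < LINE_MIN_PCT:
        failures.append("line coverage below 60%")

    if not results.get("req_ids"):
        failures.append("req_ids not populated")
    if results.get("func_coverage", {}).get("covergroups_defined", 0) < 1:
        failures.append("no covergroups defined")
    if results.get("codec_conformance") not in ("PASS", "N/A"):
        failures.append("codec_conformance must be PASS or N/A")
    return failures
```

A caller would typically load sim/{module}/{module}_unit_results.json with `json.load` and pass the resulting dict in. Missing keys are treated as failures rather than skipped, so an incomplete results file cannot pass the gate by omission. The advisory `ac_ids` check is deliberately left out because it is not a hard block at P4.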
- rtl-p4-rapid-impl produces: lint-clean modules, module-level CDC pass, smoke functional pass, block sanity pass. State in p4-state.json.
- rtl-p4-implement adds: code review (Wave 4-5), Tier 1 smoke tests (Wave 6a), Tier 2 unit tests (Wave 6b), protocol checks (Wave 8), refactoring (Wave 9), integration gate (Wave 10), Stream B artifacts.
- rtl/*/*.sv exists + lint passes.

- rtl-p4s-bugfix: Used in Wave 5 for review-driven fixes, and Wave 6 for test-driven fixes
- rtl-p4s-refactor: Used in Wave 9 for code quality improvements
- rtl-p4s-unit-test: Tier 2 testing (used in Wave 6b for mandatory per-module verification; also available standalone)
- rtl-lint-check: Used in Waves 2-3 and after any code modification

ALL of the following must be true before Phase 5:
RTL Files:
Lint:
- verilator --lint-only -Wall with zero errors

Code Review:
- .rat/scratch/phase-4/

Test Plan:
Unit Test (Tier 1 + Tier 2):
- req_ids populated (REQ-U-* tracing)
- func_coverage.covergroups_defined >= 1 per module

AC Coverage (advisory):
CDC:
Protocol:
Refactoring:
Integration + Gate:
Stream B (content quality verified):
- property/assert per module
- clock-domain-map.md
- REQ- tag per module + contains test function/task

Naming Conventions:
- i_/o_/io_ prefix (NOT suffix _i/_o)
- clk or {domain}_clk; NOT clk_i, clk_sys
- rst_n or {domain}_rst_n; NOT rst_ni
- u_ prefix, generates: gen_ prefix
- logic only; no reg/wire keywords
- always_ff/always_comb; no bare always
- UPPER_SNAKE_CASE, localparam: L_ prefix, types: snake_case_t

Summary: