Systematic reduction of the Butlers test suite to ~2,000 contract-driven tests. Each surviving test must trace to an architectural invariant, an RFC wire contract, or an OpenSpec capability.

Before You Start

Rediscover current state — never trust hardcoded counts in this skill:

CURRENT=$(find tests/ -name '*.py' -exec grep -cH 'def test_' {} + 2>/dev/null | awk -F: '{sum+=$NF} END {print sum+0}')
echo "Current test count: $CURRENT (skill baseline was 13,675 on 2026-04-05)"

Check epic status: bd list --parent bu-rhztl — see which beads are done/in-progress
Read your bead: bd show <bead-id> for targets and acceptance criteria
Load doctrine: read about/heart-and-soul/ for invariants, relevant RFCs in about/legends-and-lore/
Run scoped discovery on your domain — see references/discovery.md

Butler Test Condensation

Before You Start

Resuming Mid-Epic

Three-Tier Test Architecture

Tier 1: Architectural Invariants (~200 tests) — `tests/contracts/`

Tier 2: Wire Contracts (~500-800 tests)

Tier 3: Capability Behavior (~800-1200 tests)

Condensation Workflow Per Domain

Quality Gates

Updating OpenSpec When Tests Reveal Gaps

Domain-Specific Guidance

Test Classification Decision Matrix

Beads Epic
