Spec-driven development and multi-layer verification pipeline that replaces manual code review. Before writing any code, this skill defines specs with acceptance criteria (AC), then verifies implementation through 4 automated layers: guardrails, BDD tests, scope checking, and independent test generation. Use this skill whenever the user asks to implement a feature, fix a bug, add functionality, refactor code, or make any code change — even if they don't mention specs or verification. Also trigger on: "코드 리뷰", "검증", "스펙", "AC", "수용 기준", "BDD", "가드레일", "merge 전에", "PR 체크", "테스트 먼저", "spec-driven", "리뷰 없이", "자동 검증", "verify", "acceptance criteria", "spec verify", "코드 검증", "구현해줘", "기능 추가", "버그 수정", "리팩토링". This skill should be the default workflow for any implementation task because it ensures code quality through specification, not just code review.
When an AI agent writes both code and tests, it can create a self-consistent but wrong implementation: the tests pass because they were shaped by the same misunderstanding as the code. This skill breaks that loop by deriving verification from the approved spec rather than from the code itself.
The human's job shifts from reading diffs to reviewing specs and acceptance criteria — a higher-leverage activity.
When a user requests any implementation task, follow this flow:
User Request → Phase 1 (Spec) → User Approval → Phase 2 (Implement) → Phase 3 (Verify) → Phase 4 (Report)
Never skip Phase 1. The spec can be lightweight for simple tasks, but it always exists.
Before writing a full spec document, gauge complexity:
| Complexity | Signal | Spec Level |
|---|---|---|
| Simple | 1-2 file bug fix, typo, config change | 2-3 ACs, inline (no separate file) |
| Medium | New feature, API endpoint, component | Full spec with all sections |
| Complex | Architecture change, migration, auth/security | Full spec + domain contracts + extra scrutiny |
Use the template at assets/spec_template.md as a starting point. The spec includes:
- Escalation patterns (references/escalation_patterns.md)

For guidance on writing effective ACs, see references/spec_writing_guide.md.
Present the spec to the user. Do not proceed to implementation until they approve. If they suggest changes, update the spec and re-present.
For simple tasks, present inline and ask: "Does this AC look right?"
For medium/complex tasks, save to .spec-verify/specs/<task-name>.md and present a summary.
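As an illustration, an inline spec for a simple task might look like this (the task and ACs here are hypothetical):

```text
Fix: trailing-slash 404 on /docs/  (simple)
AC1: Given a request to /docs/, when the router resolves it,
     then it serves the same page as /docs.
AC2: Given the fix is in place, when /docs is requested,
     then the response is 200 with no redirect loop.
```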
Once the spec is approved:
Convert each AC into an executable test. The test framework depends on the project — detect it automatically (jest, pytest, vitest, cargo test, go test, etc.).
Each Given/When/Then maps directly: Given becomes the test setup (arrange), When becomes the action under test (act), and Then becomes the assertions (assert).
Write these tests before any implementation code. They should fail initially — that's expected and correct.
Now write the code to make the tests pass. If a test needs modification to pass, that's a red flag — the spec may need updating, not the test. Pause and reconsider before changing any test.
Run the new tests. If failures exist, fix the implementation (not the tests). Repeat until green.
After implementation, run all four verification layers automatically. Each layer catches different categories of problems.
These are fast, binary checks. Run the project's existing toolchain first.
Auto-detect and run the project's own checks: linter/formatter, type checker, build, and the existing test suite.
Organization invariant checks (always run):
- scripts/check_secrets.sh

Domain contract checks (if configured in .spec-verify/config.yaml):
To detect the project environment, run scripts/detect_project.sh from this skill's directory.
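For illustration only, the kind of detection detect_project.sh performs might be sketched in Python like this (the real script is shell, and these file markers are common conventions assumed here, not its actual rule set):

```python
# Illustrative sketch of project-environment detection.
# Marker files and mappings are assumptions, not an exhaustive list.
import os

MARKERS = {
    "package.json": "node",      # then inspect for jest/vitest
    "pyproject.toml": "python",  # pytest is the usual default
    "Cargo.toml": "rust",        # cargo test
    "go.mod": "go",              # go test
}

def detect_project(root="."):
    found = [kind for marker, kind in MARKERS.items()
             if os.path.exists(os.path.join(root, marker))]
    if len(found) == 1:
        return found[0]
    # Ambiguous (polyglot repo) or unknown: per the skill flow, ask the user.
    return None
```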
This layer ensures the implementation didn't exceed its mandate.
Run scripts/scope_check.py which:
- Collects the changed file list (git diff --name-only)

Default escalation triggers (active even without config):
If any trigger fires, surface it clearly to the user with the specific file and reason. Don't block — just inform and ask them to acknowledge.
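In spirit, the check can be sketched as follows (the glob patterns, trigger reasons, and function shape are hypothetical; the real scope_check.py may differ):

```python
# Minimal sketch of a scope check: flag files outside the spec's
# allowed globs, and surface escalation triggers without blocking.
import fnmatch

# Hypothetical default escalation triggers for illustration.
DEFAULT_TRIGGERS = {
    "auth/**": "touches authentication code",
    "**/migrations/**": "database migration",
    ".github/**": "CI configuration change",
}

def scope_check(changed_files, allowed_globs):
    """Return (out_of_scope, escalations) for a list of changed paths."""
    out_of_scope = [f for f in changed_files
                    if not any(fnmatch.fnmatch(f, g) for g in allowed_globs)]
    escalations = [(f, reason)
                   for f in changed_files
                   for pattern, reason in DEFAULT_TRIGGERS.items()
                   if fnmatch.fnmatch(f, pattern)]
    return out_of_scope, escalations
```

Note that escalations are reported alongside, not instead of, the out-of-scope list: a file can be in scope yet still warrant user acknowledgement.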
This is the layer that breaks the "agent writes code and tests" loop.
If any independent test fails, do not edit that test. Re-read the spec to determine whether the implementation or the original AC tests embody a misreading, then fix the implementation or escalate the spec ambiguity to the user.
The key constraint: these tests must be derived from the spec, not reverse-engineered from the code.
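A concrete example of the loop being broken (spec wording, function name, and numbers are all hypothetical): suppose the spec says a discount applies to orders over $100, and the implementer misreads "over" as "at least":

```python
# Hypothetical spec line: "Given an order total OVER $100,
# when the total is computed, then a 10% discount applies."

# Implementation written under a misreading ("over" taken as ">="):
def total_with_discount(subtotal):
    if subtotal >= 100:  # bug: the spec says strictly over 100
        return subtotal * 0.9
    return subtotal

# Test written by the SAME agent -- shaped by the same misreading,
# so it passes despite the bug:
def implementer_test():
    assert total_with_discount(100) == 90.0

# Independent test derived from the spec text alone:
def independent_test():
    assert total_with_discount(100) == 100  # exactly $100 is not "over"
```

The implementer's test is green; only the independently derived test, written from the spec's wording rather than the code, catches the boundary error.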
After all layers complete, output a summary:
══════════════════════════════════════════
SPEC-VERIFY REPORT
══════════════════════════════════════════
Task: [task title]
Spec approved: ✓
──────────────────────────────────────────
Layer 1 - Guardrails: [N/N] passed
Layer 2 - AC Tests: [N/N] passed
Layer 3 - Scope Check: [CLEAN / ESCALATION(n)]
Layer 4 - Independent: [N/N] passed
──────────────────────────────────────────
VERDICT: [PASS / CONDITIONAL PASS / FAIL]
[If conditional, state the reason and required action]
══════════════════════════════════════════
Verdict logic: PASS when all four layers are clean; CONDITIONAL PASS when tests pass but a scope escalation awaits user acknowledgement; FAIL when any guardrail, AC test, or independent test fails.
On first run in a project, auto-detect the environment by running scripts/detect_project.sh. This script checks for language and package manifests, the test framework in use, and the available lint and type-check commands.
Store detected config in .spec-verify/config.yaml. If detection is ambiguous, ask the user.
See the config schema below for what can be customized:
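As a sketch only, .spec-verify/config.yaml might look like the following (field names, paths, and values are illustrative, not a guaranteed schema):

```yaml
# Illustrative only -- field names are assumptions, not a fixed schema.
project:
  language: python            # from detect_project.sh
  test_command: pytest -q
guardrails:
  domain_contracts:
    - scripts/check_api_compat.sh   # hypothetical example check
scope:
  allowed:
    - "src/**"
    - "tests/**"
  escalate:
    - pattern: "auth/**"
      reason: security-sensitive
```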