This skill should be used when the user says `/vibe-test:gate`. The single pass/fail for tier enforcement — runs audit + coverage (reusing fresh state when available), applies tier threshold via the locked weighted-score formula, and exits with 0 (pass) / 1 (threshold breach) / 2 (tool error). Auto-detects CI mode via `GITHUB_ACTIONS=true` or `--ci` flag and emits `::error::` / `::warning::` annotations + writes GitHub Actions summary markdown to `$GITHUB_STEP_SUMMARY`. In local mode: a diagnostic banner with 'what would it take to pass' guidance. Co-invokes `superpowers:verification-before-completion` when present for verification-frame decisions.
Read ../guide/SKILL.md for your overall behavior (persona, experience-level adaptation, Pattern #13 composition rules, version resolution). Then follow this command end to end.
Gate is the decision surface. Every other Vibe Test command measures, proposes, or repairs; gate decides. A builder who runs /vibe-test:gate is asking "given this repo's app type + tier + current coverage — does it clear the bar, yes or no?". The answer is a single exit code: 0 for pass, 1 for threshold breach, 2 for tool error. In CI mode the same logic emits GitHub Actions annotations; in local mode it renders a diagnostic banner with concrete "what would it take to pass" guidance.
Run audit (or reuse fresh state) → run coverage (or reuse fresh state) → apply tier threshold → emit three views → exit 0 / 1 / 2 based on the verdict.
--tier <tier> flag. If neither, attempt a fast inline classification via scan() + classifyAppType() (Pattern #16 shaping prereq — see Step 3 below). If even that fails (no framework detected, empty repo):
"I can't determine your app type — gate needs a tier to apply the threshold against. Run
/vibe-test:auditfirst, or pass--tier <prototype|internal|public-facing|customer-facing-saas|regulated>explicitly."Wait. Never guess tier silently for a pass/fail verdict.
<repo>/.vibe-test/state/audit.json exists and its last_updated is within the last 24 hours, reuse it. Otherwise scan + classify inline (~3s).<repo>/.vibe-test/state/coverage.json exists and its measured_at is within the last 1 hour AND it's from the current HEAD commit, reuse it. Otherwise run coverage inline.process.env.GITHUB_ACTIONS === 'true' OR --ci flag. CI mode changes output format (annotations instead of banner), not logic.--dry-run — available for hook-chained local runs. Computes + renders everything but exits 0 regardless of verdict. The would_exit field in the JSON sidecar carries the actual verdict the real run would have produced.../guide/SKILL.md. Persona opening/handoff lines, experience-level verbosity, Pattern #13 anchored + dynamic, session memory interfaces.../guide/references/data-contracts.md section "gate". You read audit.json, coverage.json, covered-surfaces.json. You write a gate.json summary sidecar at <repo>/.vibe-test/state/gate.json.../guide/references/plays-well-with.md. Entries with applies_to: gate are superpowers:verification-before-completion (co-invoke for verification-frame decisions).../guide/references/friction-triggers.md /vibe-test:gate section.src/reporter/tier-adaptive-language.ts.../session-logger/SKILL.md. start('gate', project) at entry, end({sessionUUID, command: 'gate', outcome}) at exit.All existing src/ modules — no new TypeScript introduced:
src/coverage/computeWeightedScore / TIER_THRESHOLDS — the locked formula.src/coverage/runCoverage — for inline coverage runs when state is stale.src/scanner/scan, classifyAppType, classifyModifiers — for inline classification when audit-state is stale or absent.src/state/project-state — readProjectState, projectStateSidecarPath, scopeHash.src/state/atomic-write — for the gate.json sidecar write.src/state/session-log / beacons — instrumentation.src/reporter — createReportObject, three renderers, getLanguageKnobs.Before any user-facing output:
shared.preferences.persona, shared.preferences.pacing, and plugins.vibe-test.testing_experience (fallback shared.technical_experience.level) from ~/.claude/profiles/builder.json.session-logger.start('gate', project_basename). Hold the returned sessionUUID in memory until Step 8.const ci = process.env.GITHUB_ACTIONS === 'true' || argsHave('--ci'). Hold this flag for the rest of the flow.Parse ../guide/references/plays-well-with.md via loadAnchoredRegistry(). Filter to entries whose applies_to includes gate.
superpowers:verification-before-completion — announce verbatim when present: "verification-before-completion owns per-task 'is this complete?' decisions. Gate owns the tier-threshold call. When we're both in scope, we co-invoke — no double verification." (Ga3 — gate owns tier-threshold; discipline skill owns per-task completion.)Surface at most ONE anchored complement announcement per invocation. In CI mode, compress to a single ::notice:: line.
<repo>/.vibe-test/state/audit.json. If present AND last_updated is within 24h:
classification.app_type, classification.tier, classification.modifiers, classification.confidence.audit_state_source: 'reused'.scan(repoRoot) → Inventory.classifyAppType({detection, routes, models, componentCount}) + classifyModifiers(...).--tier flag if passed; otherwise default to a conservative public-facing with a warning in the banner. Log friction_type: "default_tier_applied" at confidence low.audit_state_source: 'fresh-scan'.Attach the resulting classification to the ReportObject.
<repo>/.vibe-test/state/coverage.json. If present AND measured_at is within 1h AND commit_hash matches current HEAD:
per_level, weighted_score, denominator_honest.coverage_state_source: 'reused'.runCoverage({framework, cwd, adapterAccepted: <cached decline or null>, actualSourceFiles, c8TestCommand}) with the builder's most recent adapter decision (read from .vibe-test/state.json.coverage_adapter_declined). Do NOT prompt for adaptation in gate mode — gate is the decision surface, not the measurement surface. If the adapter has never been proposed, fall back to c8 --all.per_level from output (Tessl-deferred if present).coverage_state_source: 'fresh-run'.Call computeWeightedScore({perLevel, applicability, tier}) using:
perLevel from the coverage state (Step 2b).applicability from the classification matrix for the audit's app_type (audit-state contributes this when reused; inline classification rebuilds it).tier from the audit-state's classification.tier.The result: {score, threshold, passes, contributions}.
The verdict is a function of passes:
| Verdict | Exit code | Stdout (CI mode) | Banner (local mode) |
|---|---|---|---|
passes === true | 0 | ::notice::Vibe Test gate passed — {score} ≥ {threshold} ({tier}) | Green PASS section with score + threshold + tier |
passes === false | 1 | ::error::Vibe Test gate failed — {score} < {threshold} ({tier}) | Red BELOW section + "what would it take to pass" prescription |
| Tool error (coverage failed / tier undetectable / classification crash) | 2 | ::error::Vibe Test gate tool error — {reason} | Red "tool error" section + pointer to /vibe-test:fix |
If --dry-run was passed: compute the verdict, render banner, write gate.json with would_exit: <code>, then exit 0. The would_exit field is the real verdict.
When passes === false AND not in CI mode, compose concrete guidance. For each test level where applicability[level] === true:
smoke is cheapest (high component count × low per-test cost); integration is costliest."To clear the {tier} threshold ({threshold}), you need +{delta} weighted points. The cheapest path: - Raise smoke coverage from {current}% → {target}% (+{contribution} pts) - Raise behavioral coverage from {current}% → {target}% (+{contribution} pts) - …"
This mirrors the audit's gap ranking but from the threshold's perspective. Close the gap → clear the threshold.
In CI mode, collapse this to a single ::warning:: annotation per level with the math inline.
When superpowers:verification-before-completion is in the agent's available-skills list AND the verdict is PASS:
"Gate says this passes the {tier} threshold. Handing off to superpowers:verification-before-completion for the per-task 'is this actually complete?' check — gate owns tier-threshold; that skill owns task-completion."
verification-before-completion is absent: skip the co-invoke; the builder can always re-run /vibe-test:audit if they want a fresh gap list.When the verdict is FAIL or tool error: do NOT co-invoke — the work isn't complete and verification would waste the discipline skill's cycles.
Record the co-invoke as complements_invoked on the terminal session entry.
Build the ReportObject via createReportObject({command: 'gate', plugin_version, repo_root, scope, commit_hash}). Populate:
classification — from Step 2a.score — {current: score, target: threshold, per_level} from Step 3.findings — threshold-breach finding (severity: high, category: gap-<weakest-level>) if FAIL; tool-error finding (severity: critical, category: harness-break) if exit 2.actions_taken — {kind: 'other', description: 'tier threshold evaluated', target: '<tier>'}.deferrals — any Pattern #13 matches from Step 1.handoff_artifacts — ['docs/vibe-test/gate-<ISO>.md', '.vibe-test/state/gate.json'].next_step_hint — persona-adapted. Default on PASS: "Gate passed. Ship it.". Default on FAIL: "Run /vibe-test:generate to close the gaps. Re-run /vibe-test:gate after."Render views — mode-dependent:
Local mode:
renderMarkdown(report, {proseSlots}) → docs/vibe-test/gate-<ISO-date>.md.renderBanner(report, {columns, disableColors: !isTty}) → printed to chat.renderJson({report, repoRoot}) → .vibe-test/state/gate.json (skip schema validation — no gate-state.schema.json in v0.2; use skipValidation: true).CI mode:
$GITHUB_STEP_SUMMARY file when the env var is set. Use the same markdown body as the local-mode render.::notice:: / ::warning:: / ::error:: annotation prefixes to stdout instead of the banner. One annotation per finding (severity maps: critical|high → ::error::, medium → ::warning::, low|info → ::notice::)..vibe-test/state/gate.json — CI systems consume this for downstream gating beyond GitHub Actions.session-logger.end({sessionUUID, command: 'gate', outcome: 'completed' | 'errored', key_decisions: [<verdict>, <sources: audit/coverage reused vs fresh>], complements_invoked, artifact_generated}) — terminal entry. outcome is 'completed' for exit 0/1 and 'errored' for exit 2.beacons.append(repoRoot, {command: 'gate', sessionUUID, outcome: <as above>, hint: '<verdict>: {score}/{threshold} ({tier})'}) — Pattern #12.gate.json at <repo>/.vibe-test/state/gate.json. Fields: schema_version: 1, last_updated, plugin_version, project, verdict: 'pass' | 'fail' | 'tool-error', exit_code: 0 | 1 | 2, would_exit: 0 | 1 | 2 (only under --dry-run), score, threshold, tier, audit_state_source, coverage_state_source, ci_mode: boolean.On any state-write failure: log a runtime_hook_failure friction entry and continue — the exit code has already decided; logging is instrumentation.
Exit with the verdict-derived code. For --dry-run: exit 0 regardless (per the contract above). Never exit with any other code (3+).
Do NOT prescribe /clear between commands. Claude Code auto-compacts.
| Knob | first-time / beginner | intermediate | experienced |
|---|---|---|---|
| Verdict framing | Plain-English + threshold math in words | One-line verdict + math | Verdict + exit code + math |
| "What would it take" guidance | Full table + effort glosses + analogy | Compressed table | Top 1 level + delta |
| CI annotation verbosity | Same as intermediate (annotations have no tier knob — CI consumers read structured output) | Same | Same |
JSON output is level-invariant.
| Trigger | friction_type | confidence |
|---|---|---|
| Default tier applied (no audit-state, no --tier) | default_tier_applied | low |
| Coverage run crashed (tool error → exit 2) | harness_break | high |
Builder overrides verdict via --force (not v0.2; reserved for v0.3) | verdict_overridden | medium |
| Builder declines a Pattern #13 complement offer | complement_rejected | high |
When in doubt, don't log.
/vibe-test:fix; it does not attempt repair.superpowers:verification-before-completion owns per-task completion checks; gate owns tier-threshold verdicts. Explicitly no double verification.CI pipelines need a single source of truth for pass/fail. Gate is that source. Exit 0 for pass, 1 for threshold breach, 2 for tool error — no ambiguity, no stacking of "warnings" that nobody reads. The passes_tier_threshold flag in coverage.json is the same logic; gate is the command-line ergonomics around it.
Local mode adds the "what would it take to pass" prescription because a verdict is only half the value — the other half is knowing the cheapest path to clear the bar. The weighted-score formula is a pure function of per-level coverage × level weight × tier applicability; the delta analysis is the same formula in reverse.
CI mode collapses to GitHub Actions annotations because that's the ergonomics CI consumers expect. The structured .vibe-test/state/gate.json is the durable record; annotations are the in-PR surface.
Pattern #13 co-invoke with verification-before-completion is about composition boundaries. Gate is not a completion check — it's a quality gate. Verification-before-completion is a per-task discipline. Both are valuable; neither subsumes the other. The SKILL announces the boundary and defers cleanly.