Full-cycle shipping pipeline: plan → TDD → review → test → verify → eval → deliver. One command with hard quality gates and fix-retry loops. Requires ECC + Codex plugins.
One command to take a task from idea to merged PR with deterministic quality enforcement at every step.
Usage: /ship <task description>

MANDATORY FIRST ACTION: Before anything else, initialize the pipeline state file:
```bash
echo "# /ship pipeline state — $(date -Iseconds)" > .ship-pipeline-state
echo "# Task: <task description>" >> .ship-pipeline-state

# Auto-detect mobile app project
IS_MOBILE=false
if [[ -d "ios" || -d "android" || -f "pubspec.yaml" || -f "react-native.config.js" \
      || -f "capacitor.config.ts" || -f "capacitor.config.json" ]]; then
  IS_MOBILE=true
fi
if [[ -f "app.json" ]] && grep -q '"expo"' app.json 2>/dev/null; then
  IS_MOBILE=true
fi
echo "MOBILE_APP=${IS_MOBILE}" >> .ship-pipeline-state
```
```bash
# Initialize learnings directory
mkdir -p .claude/ship-learnings
for phase in plan tdd review test verify eval deliver; do
  if [[ ! -f ".claude/ship-learnings/${phase}.md" ]]; then
    echo "# /ship Learnings — ${phase} phase" > ".claude/ship-learnings/${phase}.md"
    echo "" >> ".claude/ship-learnings/${phase}.md"
    echo "Learnings are auto-captured after each /ship run. Read before starting the phase." >> ".claude/ship-learnings/${phase}.md"
    echo "" >> ".claude/ship-learnings/${phase}.md"
  fi
done
```
This file is used by enforcement hooks to prevent skipping steps. If this file does not exist, the pipeline gates are disabled.
The MOBILE_APP flag determines whether Phase 5C (Native Simulator QA) is mandatory.
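The enforcement hooks themselves are defined elsewhere, but the gating idea is simple: refuse to proceed when the state file is absent. A minimal sketch, assuming a hypothetical `ship_gate` helper (the name and wiring are not part of the pipeline; only the state-file path and the MOBILE_APP flag come from this document):

```shell
# Hypothetical gate helper: a phase may proceed only when the pipeline
# state file exists. If it is missing, gates are disabled and we stop.
ship_gate() {
  local state_file=".ship-pipeline-state"
  if [[ ! -f "$state_file" ]]; then
    echo "HARD STOP: $state_file not found; pipeline gates are disabled." >&2
    return 1
  fi
  # Surface the MOBILE_APP flag recorded at init time
  grep -E '^MOBILE_APP=' "$state_file" | tail -n 1
}
```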
Both plugins must be installed and configured:
If either is missing, stop immediately:
HARD STOP: Missing prerequisites.
Run the setup script: bash /path/to/ship-pipeline/setup.sh
7 phases. Each phase has a fix-retry loop. Hard gate between phases — no skipping, no exceptions.
Phase 1: PLAN → design + adversarial challenge
Phase 2: TDD → tests first, then implementation
Phase 3: REVIEW → 3 independent reviewers (Claude + Codex + security)
Phase 4: TEST → full regression + E2E
Phase 5: VERIFY → build + lint + types + UI spot-check + native simulator
Phase 6: EVAL → acceptance criteria validation
Phase 7: DELIVER → commit + PR
Every phase follows this exact protocol when something fails:
```
attempt = 1
while attempt <= 3:
    run ALL phase checks
    if ALL PASS → advance to next phase
    if ANY FAIL:
        1. DIAGNOSE the root cause (not the symptom)
        2. APPLY a fix (must differ from any previous attempt)
        3. RE-RUN the FULL phase checks (not just the failing item)
        attempt += 1
if still failing after 3 attempts:
    HARD STOP
    Report:
    - What failed
    - What was tried in each attempt
    - Why each fix didn't hold
    Ask user for direction
```
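The protocol can be sketched as a bash driver. Here `run_phase_checks` is a placeholder standing in for "run ALL phase checks"; it is not a real pipeline function:

```shell
# Sketch of the fix-retry driver (run_phase_checks is hypothetical).
run_with_retries() {
  local phase="$1" attempt=1 max_attempts=3
  while (( attempt <= max_attempts )); do
    if run_phase_checks "$phase"; then
      echo "PHASE ${phase}: PASS (attempt ${attempt})"
      return 0
    fi
    # Diagnose, apply a DIFFERENT fix, then re-run the FULL check set
    echo "PHASE ${phase}: attempt ${attempt} FAILED; diagnosing and retrying" >&2
    (( attempt++ ))
  done
  echo "HARD STOP: ${phase} still failing after ${max_attempts} attempts" >&2
  return 1
}
```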
| Rule | Rationale |
|---|---|
| Diagnose before fixing | No blind retry of the same action |
| Each retry must try a DIFFERENT approach | Prevents loops doing the same broken fix 3x |
| Re-run the FULL phase check, not just the failing item | A fix in one place can break another |
| Max 3 fix attempts per phase | Bounded — doesn't spin forever |
| Hard stop means hard stop | No "skip with warning", no "move on and come back" |
| Fixes must pass the same bar as first-time code | No `@ts-ignore`, no skipped tests, no loosened assertions, no `eslint-disable` |
| No superficial patches | The fix must actually resolve the issue, not suppress it |
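A mechanical pre-check for some of these rules is cheap. A sketch that scans a diff for forbidden suppressions (the helper name is hypothetical and the patterns are illustrative, not exhaustive):

```shell
# Sketch: scan a diff for quality-bar violations named in the table above.
check_quality_bar() {
  local diff_text="$1"
  if grep -qE '@ts-ignore|eslint-disable|\.skip\(' <<< "$diff_text"; then
    echo "FAIL: suppression found; fix the underlying issue instead" >&2
    return 1
  fi
  echo "PASS: no suppressions detected"
}
```

For example, run `check_quality_bar "$(git diff)"` before declaring a fix attempt done.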
Every /ship run captures learnings and feeds them back into future runs. This is not optional — it's baked into every phase.
```
.claude/ship-learnings/
├── plan.md     ← Phase 1 learnings
├── tdd.md      ← Phase 2 learnings
├── review.md   ← Phase 3 learnings
├── test.md     ← Phase 4 learnings
├── verify.md   ← Phase 5 learnings
├── eval.md     ← Phase 6 learnings
└── deliver.md  ← Phase 7 learnings
```
MANDATORY — Before starting any phase, read the corresponding learnings file:
```bash
cat .claude/ship-learnings/<phase>.md 2>/dev/null || true
```
If learnings exist, incorporate them into your approach for this phase. For example:
- plan.md → avoids repeating past design mistakes
- review.md → checks for recurring review findings
- verify.md → pre-fixes known build/lint patterns
- tdd.md → writes better tests based on past failures

MANDATORY — After every phase completes, update the learnings file. This is NOT a blind append — you must deduplicate and consolidate.
Read the existing .claude/ship-learnings/<phase>.md file first. If an entry for the same root cause already exists, increment its **Seen:** count and add the new date. Do NOT add a duplicate entry. Each entry follows this format:

```
### <takeaway as a concise rule>
**Seen:** <count>x — <dates>
**Category:** <root-cause category: type-error | import | race-condition | security | config | test-flake | etc.>
**Example:** <most recent concrete example>
**Resolution pattern:** <what to do when you see this>
```
If review.md already contains:
```
### Always use parameterized queries for database access
**Seen:** 1x — 2026-04-05
**Category:** security
**Example:** String interpolation in profile lookup → SQL injection flagged
**Resolution pattern:** Use $1/$2 placeholders, never template literals for SQL
```
And the same issue appears again on 2026-04-10, update the existing entry:
```
### Always use parameterized queries for database access
**Seen:** 2x — 2026-04-05, 2026-04-10
**Category:** security
**Example:** Raw SQL in report generator → injection flagged (2026-04-10)
**Resolution pattern:** Use $1/$2 placeholders, never template literals for SQL
```
Do NOT add a second entry for the same root cause.
If verify.md has 3 separate entries about missing type packages, merge into:
```
### Verify type packages are in dependencies (not devDependencies) when imported in source
**Seen:** 3x — 2026-04-06, 2026-04-08, 2026-04-12
**Category:** type-error
**Example:** @types/chart.js, @types/lodash, @types/node all failed build when in devDeps
**Resolution pattern:** After `npm install <pkg>`, check if `@types/<pkg>` needs to be in dependencies
```
| Rule | Why |
|---|---|
| Never blindly append | Prevents duplicate dumping |
| Read before writing | You need existing entries to compare against |
| No "clean pass" entries | They add noise without learning value |
| Consolidate at ~20 entries | Keeps the file scannable and useful as context |
| Takeaway is the heading | Makes scanning fast — headings are the rules |
| Track seen count | High-count entries are the most important learnings |
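Because every entry heading starts with `### `, the ~20-entry consolidation threshold can be checked mechanically. A sketch (the helper name is hypothetical):

```shell
# Sketch: count learning entries by their "### " headings.
learnings_entry_count() {
  [[ -f "$1" ]] || { echo 0; return; }
  grep -c '^### ' "$1" || true
}

# Warn when a file crosses the consolidation threshold.
count=$(learnings_entry_count .claude/ship-learnings/review.md)
if (( count > 20 )); then
  echo "review.md: ${count} entries; consolidate before appending"
fi
```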
On first /ship run, create the directory and empty learning files:
```bash
mkdir -p .claude/ship-learnings
for phase in plan tdd review test verify eval deliver; do
  if [[ ! -f ".claude/ship-learnings/${phase}.md" ]]; then
    echo "# /ship Learnings — ${phase^} Phase" > ".claude/ship-learnings/${phase}.md"
    echo "" >> ".claude/ship-learnings/${phase}.md"
    echo "Learnings are auto-captured after each /ship run. Read before starting the phase." >> ".claude/ship-learnings/${phase}.md"
    echo "" >> ".claude/ship-learnings/${phase}.md"
  fi
done
```
Add this to the Pipeline State Initialization block (runs once at pipeline start).
Purpose: Design the implementation and stress-test the approach before writing any code.
Read .claude/ship-learnings/plan.md before starting.

Invoke: /everything-claude-code:prp-plan <task description>
This produces:
Invoke: /codex:adversarial-review --wait --scope working-tree
This challenges:
If the adversarial review surfaces legitimate design flaws:
WAIT FOR USER CONFIRMATION BEFORE PROCEEDING TO PHASE 2.
Record learnings in .claude/ship-learnings/plan.md.

Purpose: Write tests first, then implement code to make them pass.
Read .claude/ship-learnings/tdd.md before starting.

Invoke: /everything-claude-code:tdd
This enforces:
If tests fail after implementation:
Record learnings in .claude/ship-learnings/tdd.md.

Purpose: Three independent reviewers catch different classes of issues.
Read .claude/ship-learnings/review.md before starting. Apply past findings as a pre-check before invoking reviewers.

Invoke: /everything-claude-code:code-review
Reviews for: code quality, patterns, maintainability, DRY violations, naming.
Invoke: /everything-claude-code:security-review
Reviews for: injection, XSS, auth bypass, secrets exposure, OWASP Top 10.
Invoke: /codex:review --wait
Reviews for: implementation correctness, edge cases, performance, idiomatic patterns.
Run the three reviewers (3A: code review, 3B: security review, 3C: Codex review) in parallel where possible.
Merge findings from all 3 reviewers. For each finding:
Record learnings in .claude/ship-learnings/review.md. Include specific review findings and their categories.

Purpose: Full regression suite + end-to-end tests confirm nothing is broken.
Read .claude/ship-learnings/test.md before starting. Watch for previously flaky tests or known timing issues.

Run the project's full test suite:

```bash
npm test
```
Invoke: /everything-claude-code:e2e
Runs Playwright end-to-end tests against critical user flows.
If any test fails:
Record learnings in .claude/ship-learnings/test.md. Note flaky tests, timing issues, and environment-specific failures.

Purpose: Build, lint, types, and UI verification confirm the project is shippable.
Read .claude/ship-learnings/verify.md before starting. Pre-fix known build/lint/type patterns from past runs.

Invoke: /everything-claude-code:verify
This runs:

- Build (`npm run build`)
- Type check (`npx tsc --noEmit`)
- Lint (`npm run lint`)

Invoke: /everything-claude-code:browser-qa
This runs:
Skip 5B if the change is backend-only with no UI impact.
Always mandatory to attempt. The pipeline must always try to run native simulator verification. If it fails due to environment/setup issues (no simulator installed, no Xcode, no Android SDK, etc.), ask the user whether to skip. Do NOT silently skip.
```bash
# Check simulator availability
SIMULATOR_AVAILABLE=false
SKIP_REASON=""

# Flutter
if [[ -f "pubspec.yaml" ]]; then
  if command -v flutter &>/dev/null && flutter devices 2>/dev/null | grep -qE "(simulator|emulator)"; then
    SIMULATOR_AVAILABLE=true
  else
    SKIP_REASON="Flutter SDK not installed or no simulator/emulator running"
  fi
# React Native / Expo
elif [[ -f "react-native.config.js" ]] || { [[ -f "app.json" ]] && grep -q '"expo"' app.json 2>/dev/null; }; then
  if command -v xcrun &>/dev/null && xcrun simctl list devices 2>/dev/null | grep -q "Booted"; then
    SIMULATOR_AVAILABLE=true
  elif command -v emulator &>/dev/null; then
    SIMULATOR_AVAILABLE=true
  else
    SKIP_REASON="No iOS Simulator booted and no Android Emulator available"
  fi
# Capacitor / native hybrid
elif [[ -f "capacitor.config.ts" || -f "capacitor.config.json" ]]; then
  if command -v xcrun &>/dev/null || command -v emulator &>/dev/null; then
    SIMULATOR_AVAILABLE=true
  else
    SKIP_REASON="No native toolchain (Xcode/Android SDK) found for Capacitor"
  fi
# iOS/Android dirs exist (generic native project)
elif [[ -d "ios" || -d "android" ]]; then
  if command -v xcrun &>/dev/null || command -v emulator &>/dev/null; then
    SIMULATOR_AVAILABLE=true
  else
    SKIP_REASON="ios/ or android/ directories exist but no simulator toolchain found"
  fi
# Web-only project — no native markers
else
  SKIP_REASON="No native mobile project markers detected (ios/, android/, pubspec.yaml, etc.)"
fi
```
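Whichever branch fires, the result can be appended to the state file so later gates and the final report can read it, mirroring the MOBILE_APP flag. A sketch (the helper name is an assumption):

```shell
# Sketch: record the detection result in the pipeline state file.
record_simulator_state() {
  echo "SIMULATOR_AVAILABLE=${SIMULATOR_AVAILABLE}" >> .ship-pipeline-state
  if [[ "${SIMULATOR_AVAILABLE}" == "false" && -n "${SKIP_REASON}" ]]; then
    echo "SKIP_REASON=${SKIP_REASON}" >> .ship-pipeline-state
  fi
}
```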
If SIMULATOR_AVAILABLE=false, present this to the user:

```
⚠ PHASE 5C: Native Simulator QA cannot run.
Reason: <SKIP_REASON>

This step is mandatory. However, it can be skipped with your explicit approval.
Do you want to skip native simulator testing for this run? (yes/no)
```
If the user approves the skip, record STEP_5C=skipped in the state file and continue to Phase 6.

Invoke: /everything-claude-code:flutter-test
This runs:

- `flutter test` — unit + widget tests
- `flutter drive` or `integration_test` on iOS Simulator / Android Emulator

For React Native / Expo, run Detox or Maestro on simulator:
```bash
# iOS Simulator
npx detox test --configuration ios.sim.release
# or Maestro
maestro test .maestro/

# Android Emulator
npx detox test --configuration android.emu.release
```
Run Playwright E2E in a WebView context, then verify native plugin behavior on simulator.
What this checks that browser-only testing misses:
If build/lint/types fail:
If browser-qa finds issues:
If native simulator tests fail:
Record learnings in .claude/ship-learnings/verify.md. Include build errors, type issues, and simulator findings.

Purpose: Validate that the feature actually does what was asked — not just that code compiles and tests pass.
Read .claude/ship-learnings/eval.md before starting. Check for recurring gaps between acceptance criteria and implementation.

Invoke: /everything-claude-code:eval
Using the acceptance criteria from the Phase 1 plan, construct evals:
```bash
# Example: verify the new endpoint exists and returns expected shape
curl -s http://localhost:3000/api/new-endpoint | jq '.fieldName' && echo "PASS" || echo "FAIL"

# Example: verify the function handles edge case
npm test -- --testPathPattern="new-feature.edge" && echo "PASS" || echo "FAIL"
```
```
[MODEL GRADER]
Given the acceptance criteria: "<criteria from plan>"
And the implementation diff: "<git diff>"
Does the implementation satisfy the criteria?
Score: PASS / FAIL
Reasoning: [explanation]
```
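The scripted checks can be aggregated into a single pass/fail gate. A sketch, assuming a `name|command` line format for eval definitions (both the `run_evals` helper and the format are assumptions, not pipeline API):

```shell
# Sketch: run each "name|command" eval line; fail if any command fails.
run_evals() {
  local failed=0 name cmd
  while IFS='|' read -r name cmd; do
    [[ -z "$name" ]] && continue
    if bash -c "$cmd" >/dev/null 2>&1; then
      echo "PASS: $name"
    else
      echo "FAIL: $name"
      failed=1
    fi
  done
  return "$failed"
}
```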
If any eval fails:
Record learnings in .claude/ship-learnings/eval.md. Note gaps between what was planned and what was actually built.

Purpose: Commit and create a PR with full traceability.
Read .claude/ship-learnings/deliver.md before starting. Check for past PR description issues or commit message patterns.

Invoke: /everything-claude-code:prp-commit
Commit message should reference what was built and that it passed all quality gates.
Invoke: /everything-claude-code:prp-pr
PR description should include:
Then remove the pipeline state file:

```bash
rm -f .ship-pipeline-state
```
Record learnings in .claude/ship-learnings/deliver.md.

After all 7 phases pass, report:
```markdown
## /ship Complete

Task: <original task description>
Branch: <branch name>
PR: <PR URL>

### Phase Summary
| Phase | Status | Attempts |
|-------|--------|----------|
| 1. Plan | PASS | 1 |
| 2. TDD | PASS | 1 |
| 3. Review | PASS | 2 (fixed: missing null check) |
| 4. Test | PASS | 1 |
| 5. Verify | PASS | 1 |
| 6. Eval | PASS | 1 |
| 7. Deliver | PASS | 1 |

### Quality Evidence
- Tests: X passed, 0 failed, Y% coverage
- Reviewers: Claude (0 P0/P1), Codex (0 P0/P1), Security (0 P0/P1)
- Build: clean
- Evals: X/X acceptance criteria satisfied
```
| Anti-Pattern | Why It's Wrong |
|---|---|
| Skip Phase 1 for "simple" changes | Simple changes have hidden complexity. The plan catches it. |
| Write implementation before tests (Phase 2) | TDD is non-negotiable. Tests first. |
| Fix a P1 review finding and skip re-review | The fix might introduce a new issue. Re-run all reviewers. |
| Mark a test as `.skip` to pass Phase 4 | That's not fixing, that's hiding. |
| Add `@ts-ignore` to pass Phase 5 | Fix the type error properly. |
| Loosen an eval assertion to pass Phase 6 | The eval reflects the requirement. Fix the code, not the eval. |
| Proceed after hard stop without user input | Hard stop means STOP. Ask the user. |
| Retry the same fix twice | Each attempt must try a different approach. |
- /everything-claude-code:prp-plan — Phase 1 planning
- /everything-claude-code:tdd — Phase 2 test-driven development
- /everything-claude-code:code-review — Phase 3 code review
- /everything-claude-code:security-review — Phase 3 security review
- /codex:adversarial-review — Phase 1 design challenge
- /codex:review — Phase 3 implementation review
- /everything-claude-code:e2e — Phase 4 end-to-end tests
- /everything-claude-code:verify — Phase 5 verification
- /everything-claude-code:browser-qa — Phase 5 UI spot-check
- /everything-claude-code:flutter-test — Phase 5C Flutter simulator testing
- /everything-claude-code:eval — Phase 6 acceptance evaluation
- /everything-claude-code:prp-commit — Phase 7 commit
- /everything-claude-code:prp-pr — Phase 7 pull request