Standard Operating Procedure for /implement phase. Covers TDD workflow, anti-duplication checks, task execution, and continuous testing.
Training Guide: Step-by-step procedures for executing the /implement command with strict TDD discipline and anti-duplication focus.
Supporting references:
Purpose: Execute tasks from tasks.md using Test-Driven Development, preventing code duplication, and maintaining high quality through continuous testing.
Inputs:
- specs/NNN-slug/tasks.md - Task breakdown (20-30 tasks)
- specs/NNN-slug/plan.md - Implementation plan
Outputs:
- tasks.md (task statuses)
- workflow-state.yaml
Expected duration: Variable (2-10 days depending on feature complexity)
Environment checks:
- tasks.md exists (with 20-30 tasks)
- plan.md exists
Knowledge requirements:
Actions:
Read tasks.md to understand:
Identify starting tasks:
Quality check: Do you understand which tasks to start with?
For EVERY task, follow strict TDD process:
Actions:
Example:
# Task 8: Write unit tests for StudentProgressService
# 1. Create test file
touch api/app/tests/services/test_student_progress_service.py
# 2. Write test cases
# (Write 5 test cases covering happy path + edge cases)
# 3. Run tests → expect failures
pytest api/app/tests/services/test_student_progress_service.py
# Expected: 5 failed, 0 passed ✓
# 4. Commit
git add api/app/tests/services/test_student_progress_service.py
git commit -m "test: add failing tests for StudentProgressService
Task 8: Write unit tests for StudentProgressService
- calculateProgress() tests (3 cases)
- getRecentActivity() tests (2 cases)
All tests currently failing (RED phase)
"
Quality check: Tests fail for the right reason (not implemented yet, not syntax errors).
Actions:
Anti-duplication checklist BEFORE writing code:
Example:
# Task 9: Implement StudentProgressService
# 1. Check for code reuse opportunities
grep -r "def calculate" api/app/services/*.py
grep -r "BaseService" api/app/services/*.py
# 2. Implement service (reusing BaseService pattern)
# (Write StudentProgressService class)
# 3. Run tests → work toward passing
pytest api/app/tests/services/test_student_progress_service.py
# First run: 3 failed, 2 passed
# Second run: 1 failed, 4 passed
# Final run: 0 failed, 5 passed ✓
# 4. Commit
git add api/app/services/student_progress_service.py
git commit -m "feat: implement StudentProgressService
Task 9: Implement StudentProgressService
- calculateProgress() method
- getRecentActivity() method
- Reuses BaseService pattern
All 5 tests passing (GREEN phase)
"
Quality check: All tests pass, no code duplication, follows existing patterns.
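A minimal sketch of what the GREEN step might produce, assuming a hypothetical BaseService base class with an injected repository; every name here is illustrative, mirroring the example tasks rather than real project code:

```python
# Hypothetical GREEN-phase sketch: just enough code to pass the tests,
# reusing an existing BaseService pattern instead of re-inventing the wiring.

class BaseService:
    """Stand-in for the existing base class being reused (hypothetical)."""

    def __init__(self, repository):
        self.repository = repository


class StudentProgressService(BaseService):
    def calculate_progress(self, student_id: int) -> float:
        """Fraction of the student's lessons marked completed (0.0-1.0)."""
        lessons = self.repository.lessons_for(student_id)
        if not lessons:
            return 0.0
        completed = [lesson for lesson in lessons if lesson["completed"]]
        return len(completed) / len(lessons)


class FakeLessonRepository:
    """In-memory repository so the sketch runs without a database."""

    def __init__(self, lessons):
        self._lessons = lessons

    def lessons_for(self, student_id):
        return self._lessons
```

With two of four lessons completed in the fake repository, calculate_progress() returns 0.5; injecting the repository through the base class is what makes the fake swap in for tests.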
Actions:
Refactoring checklist:
Example:
# Task 10: Refactor StudentProgressService
# 1. Extract common calculation logic
# Before: completion_rate = len(completed) / len(total)
# After: completion_rate = calculate_percentage(completed, total)
# 2. Run tests after each change
pytest api/app/tests/services/test_student_progress_service.py
# Still: 0 failed, 5 passed ✓
# 3. Commit refactor
git add api/app/services/student_progress_service.py api/app/utils/math.py
git commit -m "refactor: extract percentage calculation to utility
Task 10: Refactor StudentProgressService
- Extracted calculate_percentage() to utils/math.py
- Added type hints to all methods
- Improved variable names (comp_rate → completion_rate)
All tests still passing (REFACTOR phase)
"
Quality check: Tests still pass, code is cleaner, no behavior changes.
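The extraction in the example above might look like the following sketch (a hypothetical utils/math.py helper, keeping the same name and rate semantics as the before/after shown):

```python
# Hypothetical sketch of the extracted helper (utils/math.py in the example):
# centralising the calculation gives every caller one divide-by-zero guard.

def calculate_percentage(completed, total):
    """Completion rate of `completed` items out of `total`; 0.0 when total is empty."""
    if not total:
        return 0.0
    return len(completed) / len(total)

# Before: completion_rate = len(completed) / len(total)
# After:  completion_rate = calculate_percentage(completed, total)
```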
Actions: After completing each task:
Mark task as complete in tasks.md:
### Task 9: Implement StudentProgressService ✅
**Status**: Completed
**Completion date**: 2025-10-21
**Actual time**: 6 hours (estimated: 6-8 hours)
**Acceptance criteria**:
- [x] All 5 unit tests from Task 8 pass
- [x] No code duplication (reuses BaseService)
- [x] Follows existing service patterns
- [x] Type hints on all methods
Commit task status update:
git add specs/NNN-slug/tasks.md
git commit -m "chore: mark Task 9 as complete"
Quality check: Task marked complete only when ALL acceptance criteria met.
Before implementing any new function/class, run these checks:
Search for similar code:
# Search by function name pattern
grep -r "def calculate" api/app/**/*.py
# Search by purpose keywords
grep -r "completion.*rate" api/app/**/*.py
# Search in plan.md for reuse strategy
grep "Reuse Strategy" specs/NNN-slug/plan.md
If similar code exists:
If creating new reusable pattern:
Quality check: No duplicate logic, high reuse rate (≥60% for standard features).
Test frequency guidelines:
After every code change (TDD discipline):
# Run specific test file
pytest api/app/tests/services/test_student_progress_service.py
# Run with coverage
pytest --cov=api/app/services api/app/tests/services/
After completing each task:
# Run all tests in category
pytest api/app/tests/services/ # All service tests
pytest api/app/tests/routes/ # All API tests
npm test src/components/ # All component tests
Before committing:
# Run full test suite
pytest
npm test
# Check coverage
pytest --cov=api --cov-report=term-missing
npm test -- --coverage
Quality check: All tests pass before committing, coverage ≥80% for business logic.
If task is blocked:
Identify blocker:
Document blocker in tasks.md:
### Task 15: Implement Progress API Endpoint
**Status**: Blocked
**Blocked by**: Task 9 (StudentProgressService not complete)
**Blocked date**: 2025-10-21
Move to next task:
Quality check: Blocked tasks documented, alternative tasks identified.
After completing related tasks, run integration tests:
Example (after API endpoint implemented):
# Task 16 complete → run integration tests
pytest api/app/tests/integration/test_student_progress_api.py
# Expected results:
# - 200 response with valid student ID ✓
# - 404 response with invalid ID ✓
# - 403 response without auth ✓
# - 400 response with invalid params ✓
# - Response time <500ms ✓
Quality check: All integration tests pass, response times meet targets.
After implementing UI components, run component tests:
Example:
# Task 23 complete → run component tests
npm test src/pages/ProgressDashboard.test.tsx
# Expected results:
# - Renders loading state ✓
# - Renders chart with data ✓
# - Handles error state ✓
# - Date filter works ✓
# - Keyboard navigation works ✓
# Run Lighthouse for accessibility
npx lighthouse http://localhost:3000/progress --only-categories=accessibility
# Expected: Score ≥95
Quality check: All component tests pass, accessibility score ≥95.
After completing all implementation tasks, run E2E tests:
Example:
# All tasks 1-26 complete → run E2E tests
npx playwright test e2e/progress-dashboard.spec.ts
# Expected user flow:
# 1. User logs in ✓
# 2. Navigates to student list ✓
# 3. Clicks "View Progress" ✓
# 4. Dashboard loads ✓
# 5. Filters by 7-day period ✓
# 6. Chart updates ✓
Quality check: Top 3 user journeys work end-to-end.
Before marking implementation phase complete, review:
Code quality:
Testing:
Performance:
Security:
Accessibility (if HAS_UI):
Quality check: All checklist items satisfied.
Actions:
Run full test suite:
# Backend tests
pytest --cov=api --cov-report=term-missing
# Frontend tests
npm test -- --coverage
# E2E tests
npx playwright test
Verify all tasks complete:
grep 'Status.*: Completed' specs/NNN-slug/tasks.md | wc -l
# Expected: 28 (all tasks)
Update workflow state:
# Update workflow-state.yaml
currentPhase: implementation
status: completed
completedAt: 2025-10-21T14:30:00Z
Quality check: All tests pass, all tasks complete, workflow state updated.
Actions:
git add .
git commit -m "feat: complete implementation for <feature-name>
Completed 28 tasks:
- Foundation: 3 tasks ✓
- Data layer: 4 tasks ✓
- Business logic: 6 tasks ✓
- API layer: 6 tasks ✓
- UI layer: 6 tasks ✓
- Testing: 3 tasks ✓
Test results:
- Unit tests: 45 passed
- Integration tests: 12 passed
- Component tests: 18 passed
- E2E tests: 3 passed
- Coverage: 87%
🤖 Generated with [Claude Code](https://claude.com/claude-code)
"
Quality check: Implementation committed with comprehensive summary.
Impact: Unverified code, regression risk, poor test coverage
Bad workflow:
1. Implement StudentProgressService
2. Write tests afterward (if at all)
Result: Tests don't catch real bugs, coverage gaps
Correct TDD workflow:
1. Write failing tests (RED)
2. Implement to pass tests (GREEN)
3. Refactor while tests pass (REFACTOR)
Result: High confidence, comprehensive coverage
Prevention:
Impact: Technical debt, maintenance burden, inconsistent behavior
Scenario:
Task: Implement calculateCompletion() for student progress
Action: Writes new function from scratch
Existing: Similar calculateProgress() exists in 3 other modules
Result: 4th duplicate implementation
Prevention:
Anti-duplication checklist:
# Before implementing, search for similar code
grep -r "def calculate.*completion" api/
grep -r "completion.*rate" api/
# Check plan.md reuse strategy
grep "Reuse Strategy" specs/NNN-slug/plan.md
Impact: Tests don't catch real bugs, false confidence
Why it's bad:
Prevention: Enforce RED → GREEN → REFACTOR discipline
Impact: Hard to review, hard to revert, unclear history
Bad commit:
git commit -m "implement feature"
# Changed: 15 files, +2000 lines, -500 lines
Good commits (small, focused):
git commit -m "test: add failing tests for StudentProgressService (Task 8)"
# Changed: 1 file, +85 lines
git commit -m "feat: implement StudentProgressService (Task 9)"
# Changed: 1 file, +120 lines
git commit -m "refactor: extract percentage calculation to utility (Task 10)"
# Changed: 2 files, +15 lines, -10 lines
Prevention: Commit after each task (or even more frequently)
Impact: Broken code committed, CI fails, blocks team
Prevention:
#!/bin/bash
# Pre-commit hook: save as .git/hooks/pre-commit and make it executable
pytest && npm test
if [ $? -ne 0 ]; then
  echo "Tests failed. Commit aborted."
  exit 1
fi
Impact: Tasks forgotten, feature incomplete
Prevention:
Perfect TDD execution:
## Task 8-10: StudentProgressService (TDD Triplet)
### Task 8: Write Tests (RED)
1. Create test file with 5 test cases
2. Run tests → all fail (expected)
3. Commit: "test: add failing tests for StudentProgressService"
### Task 9: Implement (GREEN)
1. Check for code reuse opportunities (BaseService, utilities)
2. Write minimal code to pass tests
3. Run tests frequently → work toward all passing
4. Commit: "feat: implement StudentProgressService"
### Task 10: Refactor (REFACTOR)
1. Extract common logic to utilities
2. Add type hints and docstrings
3. Improve names and structure
4. Run tests after EACH change → stay green
5. Commit: "refactor: extract percentage calculation"
Result: High confidence, clean code, comprehensive tests
Before writing ANY new function:
# 1. Search by name pattern
grep -r "def calculate" api/app/**/*.py
# 2. Search by purpose
grep -r "completion.*rate" api/
# 3. Search by similar functionality
grep -r "percentage" api/app/utils/*.py
# 4. Check plan.md reuse strategy
grep -A 10 "Existing patterns to leverage" specs/NNN-slug/plan.md
# 5. If found → reuse or extract
# If not found → implement and document as new reusable pattern
Test at multiple levels:
| When | What to Run | Why |
|---|---|---|
| After each small change | Single test file | Immediate feedback, stay in flow |
| After completing task | Test category (all service tests) | Verify task complete, no regressions |
| Before committing | Full test suite | Ensure nothing broken |
| Before PR | Full suite + E2E + coverage | Ready for review |
Use consistent format:
<type>: <short summary> (Task N)
<detailed description>
- Bullet point 1
- Bullet point 2
<test results or metrics>
Types:
- test: Test file additions/changes
- feat: New feature implementation
- refactor: Code cleanup without behavior change
- fix: Bug fix
- chore: Task status update, tooling
Example: