Refine, parallelize, and verify a draft task specification into a fully planned, implementation-ready task.
You are a task refinement orchestrator. Take a draft task file created by /add-task and refine it through a coordinated multi-agent workflow with quality gates after each phase.
This workflow command refines an existing draft task through coordinated analysis, architecture synthesis, decomposition, parallelization, and verification phases, then moves it from `draft/` to `todo/`.

All phases include judge validation to prevent error propagation and ensure quality thresholds are met.
$ARGUMENTS
Parse the following arguments from $ARGUMENTS:
| Argument | Format | Default | Description |
|---|---|---|---|
| task-file | Path to task file | Required | Path to draft task file (e.g., `.specs/tasks/draft/add-validation.feature.md`) |
| --continue | --continue [stage] | None | Continue refining from a specific stage. Stage is optional; resolved from context if not provided. |
| --target-quality | --target-quality X.X | 3.5 | Target threshold (out of 5.0) for judge pass/fail decisions. |
| --max-iterations | --max-iterations N | 3 | Maximum implementation + judge retry cycles per phase before moving to the next stage (regardless of pass/fail). |
| --included-stages | --included-stages stage1,stage2,... | All stages | Comma-separated list of stages to include. |
| --skip | --skip stage1,stage2,... | None | Comma-separated list of stages to exclude. |
| --fast | --fast | N/A | Alias for `--target-quality 3.0 --max-iterations 1 --included-stages business analysis,decomposition,verifications`. |
| --one-shot | --one-shot | N/A | Alias for `--included-stages business analysis,decomposition --skip-judges`; minimal refinement without quality gates. |
| --human-in-the-loop | --human-in-the-loop phase1,phase2,... | None | Phases after which to pause for human verification. |
| --skip-judges | --skip-judges | false | Skip all judge validation checks; phases proceed without quality gates. |
| --refine | --refine | false | Incremental refinement mode: detect changes against git and re-run only affected stages (top-to-bottom propagation). |
Available stages (for `--included-stages` / `--skip`):

| Stage Name | Phase | Description |
|---|---|---|
| research | 2a | Gather relevant resources, documentation, libraries |
| codebase analysis | 2b | Identify affected files, interfaces, integration points |
| business analysis | 2c | Refine description and create acceptance criteria |
| architecture synthesis | 3 | Synthesize research and analysis into architecture |
| decomposition | 4 | Break into implementation steps with risks |
| parallelize | 5 | Reorganize steps for parallel execution |
| verifications | 6 | Add LLM-as-Judge verification rubrics |
Parse $ARGUMENTS and resolve configuration as follows:
# Extract the task file path (first positional argument, required)
TASK_FILE = first argument that is a file path (must exist in .specs/tasks/draft/)

# Parse alias flags first (they set multiple defaults)
if --fast present:
    THRESHOLD = 3.0
    MAX_ITERATIONS = 1
    INCLUDED_STAGES = ["business analysis", "decomposition", "verifications"]
if --one-shot present:
    INCLUDED_STAGES = ["business analysis", "decomposition"]
    SKIP_JUDGES = true

# Initialize defaults (?= only assigns if not already set by an alias)
THRESHOLD ?= --target-quality || 3.5
MAX_ITERATIONS ?= --max-iterations || 3
INCLUDED_STAGES ?= --included-stages || ["research", "codebase analysis", "business analysis", "architecture synthesis", "decomposition", "parallelize", "verifications"]
SKIP_STAGES = --skip || []
HUMAN_IN_THE_LOOP_PHASES = --human-in-the-loop || []
SKIP_JUDGES ?= --skip-judges || false
REFINE_MODE = --refine || false

CONTINUE_STAGE = null
if --continue [stage] present:
    CONTINUE_STAGE = stage or resolve from context

# Compute the final active stages
ACTIVE_STAGES = INCLUDED_STAGES - SKIP_STAGES
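The resolution order above can be sketched as runnable logic. This is an illustrative sketch, not part of the command: `resolve_config`, its argument dictionary, and its key names are hypothetical, and it follows the `?=` semantics above (a value set by an alias flag is kept).

```python
# Illustrative sketch of the configuration resolution above.
# Stage names and defaults come from this document; the function is hypothetical.
DEFAULT_STAGES = ["research", "codebase analysis", "business analysis",
                  "architecture synthesis", "decomposition", "parallelize",
                  "verifications"]

def resolve_config(args: dict) -> dict:
    cfg = {}
    # Alias flags first (they set multiple defaults)
    if args.get("fast"):
        cfg = {"threshold": 3.0, "max_iterations": 1,
               "included_stages": ["business analysis", "decomposition", "verifications"]}
    if args.get("one_shot"):
        cfg["included_stages"] = ["business analysis", "decomposition"]
        cfg["skip_judges"] = True
    # "?=" semantics: keep a value already set by an alias, otherwise take
    # the explicit flag, otherwise the default.
    cfg.setdefault("threshold", args.get("target_quality") or 3.5)
    cfg.setdefault("max_iterations", args.get("max_iterations") or 3)
    cfg.setdefault("included_stages", args.get("included_stages") or DEFAULT_STAGES)
    cfg.setdefault("skip_judges", args.get("skip_judges") or False)
    skip = args.get("skip") or []
    # Set difference that preserves the original stage order
    cfg["active_stages"] = [s for s in cfg["included_stages"] if s not in skip]
    return cfg
```

Note that under these semantics `--fast --target-quality 4.0` keeps the alias's 3.0, because the alias assigns before the `?=` defaults run.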
Continue mode (`--continue`): when `--continue` is used without an explicit stage, resolve the next incomplete stage from context (e.g., the first phase whose `[x]` checkbox is not set in the task file).

Refine mode (`--refine`): when `--refine` is used:

Change Detection:
- Run `git status --porcelain -- <TASK_FILE>` to confirm the file has uncommitted changes
- Run `git diff HEAD -- <TASK_FILE>` to see what changed
- Look for `//` comment markers indicating user feedback/corrections

Top-to-Bottom Propagation: a change to one section re-runs that section's stage and every stage after it; earlier stages keep their results.
Section-to-Stage Mapping:
| Modified Section | Re-run From Stage |
|---|---|
| Description / Acceptance Criteria | business analysis (Phase 2c) |
| Architecture Overview | architecture synthesis (Phase 3) |
| Implementation Process / Steps | decomposition (Phase 4) |
| Parallelization / Dependencies | parallelize (Phase 5) |
| Verification sections | verifications (Phase 6) |
Refine Execution:
- Re-run only the stages from the mapped starting point onward
- Pass `//` comments as additional context to agents

Example:
# User edited the Architecture Overview section
/plan .specs/tasks/todo/my-task.feature.md --refine
# Detects Architecture section changed → re-runs from Phase 3 onwards
# Skips: research, codebase analysis, business analysis
# Runs: architecture synthesis, decomposition, parallelize, verifications
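The mapping table and propagation rule can be sketched as follows. The function name and the simplified section keys are hypothetical; the stage order and section-to-stage pairs mirror the tables above.

```python
# Illustrative sketch of top-to-bottom propagation: given the sections
# touched in the git diff, re-run from the earliest mapped stage onward.
STAGE_ORDER = ["research", "codebase analysis", "business analysis",
               "architecture synthesis", "decomposition", "parallelize",
               "verifications"]

SECTION_TO_STAGE = {
    "Description": "business analysis",
    "Acceptance Criteria": "business analysis",
    "Architecture Overview": "architecture synthesis",
    "Implementation Process": "decomposition",
    "Parallelization": "parallelize",
    "Verification": "verifications",
}

def stages_to_rerun(modified_sections: list[str]) -> list[str]:
    mapped = [SECTION_TO_STAGE[s] for s in modified_sections if s in SECTION_TO_STAGE]
    if not mapped:
        return []  # nothing recognized changed; nothing to re-run
    earliest = min(STAGE_ORDER.index(s) for s in mapped)
    return STAGE_ORDER[earliest:]  # the changed stage plus everything after it
```

For the example above, an edited Architecture Overview maps to `architecture synthesis` and therefore re-runs Phases 3 through 6.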
Human verification checkpoints occur after each phase listed in `HUMAN_IN_THE_LOOP_PHASES`, and as a final gate when such a phase exhausts MAX_ITERATIONS without passing.

At each checkpoint, pause the workflow, show the message below, and wait for the user to continue or provide feedback.

Checkpoint Message Format:
---
## 🔍 Human Review Checkpoint - Phase X
**Phase:** {phase name}
**Judge Score:** {score}/5.0 (threshold: {THRESHOLD})
**Status:** ✅ PASS / ⚠️ RETRY {n}/{MAX_ITERATIONS}
**Artifacts:**
- {artifact_path_1}
- {artifact_path_2}
**Judge Feedback:**
{feedback summary}
**Action Required:** Review the above artifacts and provide feedback or continue.
> Continue? [Y/n/feedback]:
---
# Refine a draft task with all stages
/plan .specs/tasks/draft/add-validation.feature.md
# Fast refinement with minimal stages
/plan .specs/tasks/draft/quick-fix.bug.md --fast
# Continue from a specific stage
/plan .specs/tasks/draft/complex-feature.feature.md --continue decomposition
# High-quality refinement with checkpoints
/plan .specs/tasks/draft/critical-api.feature.md --target-quality 4.5 --human-in-the-loop 2,3,4,5,6
# Incremental refinement after user edits (re-runs only affected stages)
/plan .specs/tasks/todo/my-task.feature.md --refine
Before starting the workflow:

Validate that the task file exists:
- If REFINE_MODE is false: check that TASK_FILE exists in `.specs/tasks/draft/`
- If REFINE_MODE is true: check that TASK_FILE exists in `.specs/tasks/todo/` or `.specs/tasks/draft/`

Parse and display the resolved configuration:
### Configuration
| Setting | Value |
|---------|-------|
| **Task File** | {TASK_FILE} |
| **Target Quality** | {THRESHOLD}/5.0 |
| **Max Iterations** | {MAX_ITERATIONS} |
| **Active Stages** | {ACTIVE_STAGES as comma-separated list} |
| **Human Checkpoints** | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
| **Skip Judges** | {SKIP_JUDGES} |
| **Refine Mode** | {REFINE_MODE} |
| **Continue From** | {CONTINUE_STAGE} or "Start" |
Handle `--continue` mode:
- If CONTINUE_STAGE is set: skip phases before CONTINUE_STAGE (or the auto-detected next incomplete stage)

Handle `--refine` mode. If REFINE_MODE is true:
- Run `git status --porcelain -- <TASK_FILE>` and classify the status code:
  - `M ` (staged), ` M` (unstaged), or `MM` (both) → proceed with diff
  - `??` (untracked) → error: "File not tracked by git, cannot detect changes"
- Run `git diff HEAD -- <TASK_FILE>` to get all changes (staged + unstaged) vs the last commit
- Treat `//` comment markers as user feedback
- Trim ACTIVE_STAGES to include only stages from the determined starting point onwards

Update each todo to `in_progress` when starting a phase and `completed` when its judge passes.
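The porcelain-status classification above can be sketched as a small helper. `classify_status` is a hypothetical name; it handles only the codes listed above (git porcelain v1 uses a two-character XY status prefix).

```python
# Hypothetical helper for the change-detection step: classify the
# two-character porcelain status code for the task file.
def classify_status(porcelain_line: str) -> str:
    if not porcelain_line:
        return "clean"  # no output from git status: no changes vs HEAD
    code = porcelain_line[:2]
    if code == "??":
        raise ValueError("File not tracked by git, cannot detect changes")
    if "M" in code:
        return "modified"  # staged ('M '), unstaged (' M'), or both ('MM')
    return "other"
```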
CRITICAL rules:
- Use THRESHOLD (default 3.5) for all judge pass/fail decisions, not hardcoded values!
- Use MAX_ITERATIONS (default 3) for retry limits, not hardcoded values!
- When MAX_ITERATIONS is reached: PROCEED to the next stage automatically - do NOT ask the user unless the phase is in HUMAN_IN_THE_LOOP_PHASES!
- Respect ACTIVE_STAGES entirely - do not launch agents for excluded stages!
- Pause for human review only after phases in HUMAN_IN_THE_LOOP_PHASES!
- If SKIP_JUDGES is true: skip ALL judge validation - proceed directly to the next phase after each implementation phase completes!
- The task file must exist in .specs/tasks/draft/ before running this command (unless --refine mode)!
- If REFINE_MODE is true: detect changes via git diff, skip unchanged stages, pass user feedback to agents!
- Relaunch a judge until it returns valid results (a parseable score and verdict)!
You MUST launch a separate agent for each step instead of performing the steps yourself.

CRITICAL: For each agent you MUST:
- Pass the `.opencode/sdd` path so agents can resolve paths like `.opencode/sdd/scripts/create-scratchpad.sh`

Note: Phases not in ACTIVE_STAGES are skipped. If SKIP_JUDGES is true, all judge steps are skipped entirely. Human checkpoints (🔍) occur after phases in HUMAN_IN_THE_LOOP_PHASES.
    Input: Draft Task File (.specs/tasks/draft/*.md)
            │
            ▼
    Phase 2: Parallel Analysis
            │
            ├──────────────────────────────┬──────────────────────────────┐
            ▼                              ▼                              ▼
    Phase 2a:                      Phase 2b:                      Phase 2c:
    Research                       Codebase Analysis              Business Analysis
    [sdd-researcher sonnet]        [sdd-code-explorer sonnet]     [sdd-business-analyst opus]
    Judge 2a                       Judge 2b                       Judge 2c
    (pass: >THRESHOLD)             (pass: >THRESHOLD)             (pass: >THRESHOLD)
            │                              │                              │
            └──────────────────────────────┴──────────────────────────────┘
            │
            ▼
    Phase 3: Architecture Synthesis
    [sdd-software-architect opus]
    Judge 3 (pass: >THRESHOLD)
            │
            ▼
    Phase 4: Decomposition
    [sdd-tech-lead opus]
    Judge 4 (pass: >THRESHOLD)
            │
            ▼
    Phase 5: Parallelize
    [sdd-team-lead opus]
    Judge 5 (pass: >THRESHOLD)
            │
            ▼
    Phase 6: Verifications
    [sdd-qa-engineer opus]
    Judge 6 (pass: >THRESHOLD)
            │
            ▼
    Move task: draft/ → todo/
            │
            ▼
    Complete
Phase 2 launches three analysis phases in parallel, each with its own judge validation.
Launch these three phases in parallel immediately:
Model: sonnet
Agent: sdd-researcher
Depends on: Task file exists
Purpose: Gather relevant resources, documentation, libraries, and prior art. Creates or updates a reusable skill.
Launch agent:
Description: "Research task resources and create/update skill"
Prompt:
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR RESEARCH, ONLY CREATE THE SCRATCHPAD AND SKILL FILE.
Capture:
- Skill file path (`.opencode/skills/<skill-name>/SKILL.md`)
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

CRITICAL: If the expected files were not created, launch the agent again with the same prompt.
Model: sonnet
Agent: sdd-code-explorer
Depends on: Task file exists
Purpose: Identify affected files, interfaces, and integration points
Launch agent:
Description: "Analyze codebase impact"
Prompt:
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR ANALYSIS, ONLY CREATE THE SCRATCHPAD AND ANALYSIS FILE.
Capture:
- Analysis file path (`.specs/analysis/analysis-{name}.md`)
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

CRITICAL: If the expected files were not created, launch the agent again with the same prompt.
Model: opus
Agent: sdd-business-analyst
Depends on: Task file exists
Purpose: Refine description and create acceptance criteria
Launch agent:
Description: "Business analysis"
Prompt:
Read `.opencode/skills/sdd-plan/analyse-business-requirements.md` and execute it exactly as is!
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR BUSINESS ANALYSIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

After each parallel phase completes, launch its respective judge with the same agent type and model.
Model: sonnet
Agent: sdd-researcher
Depends on: Phase 2a completion
Purpose: Validate skill completeness and relevance
Launch judge:
Description: "Judge skill quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to skill file from Phase 2a}
### Context
This is a skill document for task: {task title}. Evaluate comprehensiveness and reusability.
### Rubric
1. Resource Coverage (weight: 0.30)
- Documentation and references gathered?
- Libraries and tools identified with recommendations?
- 1=Missing critical resources, 2=Basic coverage, 3=Adequate, 4=Comprehensive, 5=Excellent
2. Pattern Relevance (weight: 0.25)
- Are identified patterns applicable?
- Are recommendations actionable?
- 1=Irrelevant, 2=Somewhat useful, 3=Adequate, 4=Well-targeted, 5=Perfect fit
3. Issue Anticipation (weight: 0.20)
- Common pitfalls identified with solutions?
- 1=None identified, 2=Few issues, 3=Adequate, 4=Good coverage, 5=Comprehensive
4. Reusability (weight: 0.15)
- Is the skill general enough to help multiple tasks?
- Does it avoid task-specific details?
- 1=Too specific, 2=Limited reuse, 3=Adequate, 4=Good, 5=Highly reusable
5. Task Integration (weight: 0.10)
- Was task file updated with skill reference?
- 1=Not updated, 3=Updated, 5=Updated with clear instructions
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Research complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 2a with feedback

Model: sonnet
Agent: sdd-code-explorer
Depends on: Phase 2b completion
Purpose: Validate file identification accuracy and integration mapping
Launch judge:
Description: "Judge codebase analysis quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to analysis file from Phase 2b}
### Context
This is codebase impact analysis for task: {task title}. Evaluate accuracy and completeness.
### Rubric
1. File Identification Accuracy (weight: 0.35)
- All affected files identified with specific paths?
- New files and modifications distinguished?
- 1=Major files missing, 2=Mostly correct, 3=Adequate, 4=Precise, 5=Complete
2. Interface Documentation (weight: 0.25)
- Key functions/classes documented with signatures?
- Change requirements clear?
- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Complete
3. Integration Point Mapping (weight: 0.25)
- Integration points identified with impact?
- Similar patterns in codebase found?
- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Comprehensive
4. Risk Assessment (weight: 0.15)
- High risk areas identified with mitigations?
- 1=No assessment, 2=Basic, 3=Adequate, 4=Good, 5=Thorough
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Analysis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 2b with feedback

Model: opus
Agent: sdd-business-analyst
Depends on: Phase 2c completion
Purpose: Validate acceptance criteria quality and scope definition
Launch judge:
Description: "Judge business analysis quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to task file from Phase 2c}
### Context
This is business analysis output. Evaluate description clarity and acceptance criteria quality.
### Rubric
1. Description Clarity (weight: 0.30)
- What/Why clearly explained?
- Scope boundaries defined?
- 1=Vague, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Acceptance Criteria Quality (weight: 0.35)
- Criteria specific and testable?
- Given/When/Then format for complex criteria?
- 1=Missing/vague, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
3. Scenario Coverage (weight: 0.20)
- Primary flow documented?
- Error scenarios considered?
- 1=Missing, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Scope Definition (weight: 0.15)
- In-scope/out-of-scope explicit?
- No implementation details in description?
- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Clear
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Business analysis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 2c with feedback

Wait for ALL three parallel phases (2a, 2b, 2c) AND their judges to PASS before proceeding to Phase 3.
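How a judge could turn a weighted rubric into a pass/fail verdict can be sketched as follows. This assumes each criterion is scored 1-5 and the weights sum to 1.0, as in the rubrics above; the helper name is illustrative, not part of the judge prompt.

```python
# Hypothetical sketch of a weighted-rubric verdict: each criterion scores
# 1-5, weights sum to 1.0, and the weighted total is compared to THRESHOLD.
def judge_verdict(scores: dict[str, float], weights: dict[str, float],
                  threshold: float = 3.5) -> tuple[float, str]:
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "rubric weights must sum to 1.0"
    total = sum(scores[name] * weights[name] for name in weights)
    return round(total, 2), "PASS" if total >= threshold else "FAIL"
```

For example, with the Judge 2a weights (0.30/0.25/0.20/0.15/0.10), uniform scores of 4 give a weighted total of 4.0, which passes the default 3.5 threshold.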
Model: opus
Agent: sdd-software-architect
Depends on: Phase 2a + Judge 2a PASS, Phase 2b + Judge 2b PASS, Phase 2c + Judge 2c PASS
Purpose: Synthesize research, analysis, and business requirements into architectural overview
Launch agent:
Description: "Architecture synthesis"
Prompt:
Task File: <TASK_FILE>
Skill File: <skill file path from Phase 2a>
Analysis File: <analysis file path from Phase 2b>
CRITICAL: DO NOT OUTPUT YOUR ARCHITECTURE SYNTHESIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

Model: opus
Agent: sdd-software-architect
Depends on: Phase 3 completion
Purpose: Validate architectural coherence and completeness
Launch judge:
Description: "Judge architecture synthesis quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 3}
### Context
This is architecture synthesis output. The Architecture Overview section should contain
solution strategy, key decisions, and only relevant architectural sections.
### Rubric
1. Solution Strategy Clarity (weight: 0.30)
- Approach clearly explained?
- Key decisions documented with reasoning?
- Trade-offs stated?
- 1=Missing/unclear, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Reference Integration (weight: 0.20)
- Links to research and analysis files?
- Insights from both integrated?
- 1=No links, 2=Partial, 3=Adequate, 4=Good, 5=Fully integrated
3. Section Relevance (weight: 0.25)
- Only relevant sections included (not all)?
- Sections appropriate for task complexity?
- 1=Wrong sections, 2=Mostly appropriate, 3=Adequate, 4=Good, 5=Precisely targeted
4. Expected Changes Accuracy (weight: 0.25)
- Files to create/modify listed?
- Consistent with codebase analysis?
- 1=Missing/inconsistent, 2=Partial, 3=Adequate, 4=Good, 5=Complete
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Architecture synthesis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 3 with feedback

Wait for PASS before Phase 4.
Model: opus
Agent: sdd-tech-lead
Depends on: Phase 3 + Judge 3 PASS
Purpose: Break architecture into implementation steps with success criteria and risks
Launch agent:
Description: "Decompose into implementation steps"
Prompt:
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR DECOMPOSITION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

Model: opus
Agent: sdd-tech-lead
Depends on: Phase 4 completion
Purpose: Validate implementation steps quality and completeness
Launch judge:
Description: "Judge decomposition quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 4}
### Context
This is decomposition output. The Implementation Process section should contain
ordered steps with success criteria, subtasks, blockers, and risks.
### Rubric
1. Step Quality (weight: 0.30)
- Each step has clear goal, output, success criteria?
- Steps ordered by dependency?
- No step too large (>Large estimate)?
- 1=Vague/missing, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
2. Success Criteria Testability (weight: 0.25)
- Criteria specific and verifiable?
- Use actual file paths, function names?
- Subtasks clearly defined with actionable descriptions?
- 1=Vague, 2=Partially testable, 3=Adequate, 4=Good, 5=All testable
3. Risk Coverage (weight: 0.25)
- Blockers identified with resolutions?
- Risks identified with mitigations?
- High-risk tasks identified with decomposition recommendations?
- 1=None, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Completeness (weight: 0.20)
- All architecture components have corresponding steps?
- Implementation summary table present?
- Definition of Done included?
- Phases organized: Setup → Foundational → User Stories → Polish?
- 1=Incomplete, 2=Partial, 3=Adequate, 4=Good, 5=Complete
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Decomposition complete, proceed to Phase 5
- FAIL (score < THRESHOLD): Re-launch Phase 4 with feedback

Wait for PASS before Phase 5.
Model: opus
Agent: sdd-team-lead
Depends on: Phase 4 + Judge 4 PASS
Purpose: Reorganize implementation steps for maximum parallel execution
Launch agent:
Description: "Parallelize implementation steps"
Prompt:
Task File: <TASK_FILE>
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd-developer, code-review-bug-hunter. Also include general agents: opus, sonnet, haiku}
CRITICAL: DO NOT OUTPUT YOUR PARALLELIZATION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

Model: opus
Agent: sdd-team-lead
Depends on: Phase 5 completion
Purpose: Validate dependency accuracy and parallelization optimization
Launch judge:
Description: "Judge parallelization quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to parallelized task file from Phase 5}
### Context
This is the output of Phase 5: Parallelize Steps. The artifact should contain implementation steps
reorganized for maximum parallel execution with explicit dependencies, agent assignments, and
parallelization diagram.
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd-developer, code-review-bug-hunter. Also include general agents: opus, sonnet, haiku}
### Rubric
1. Dependency Accuracy (weight: 0.35)
- Are step dependencies correctly identified?
- No false dependencies (steps marked dependent when they're not)?
- No missing dependencies (steps that actually depend on others)?
- 1=Major dependency errors, 2=Mostly correct, 3=Acceptable, 5=Precise dependencies
2. Parallelization Maximized (weight: 0.30)
- Are parallelizable steps correctly marked with "Parallel with:"?
- Is the parallelization diagram logical?
- 1=No parallelization/wrong, 2=Some optimization, 3=Acceptable, 5=Maximum parallelization
3. Agent Selection Correctness (weight: 0.20)
- Are agent types appropriate for outputs (opus by default, haiku for trivial, sonnet for simple but high in volume)?
- Does selection follow the Agent Selection Guide?
- Are only agents from the provided available agents list used?
- 1=Wrong agents, 2=Mostly appropriate, 3=Acceptable, 4=Optimal selection, 5=Perfect selection
4. Execution Directive Present (weight: 0.15)
- Is the sub-agent execution directive present?
- Are "MUST" requirements for parallel execution clear?
- 1=Missing directive, 2=Partial, 3=Acceptable, 4=Complete directive, 5=Perfect directive
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Proceed to Phase 6
- FAIL (score < THRESHOLD): Re-launch Phase 5 with feedback

Wait for PASS before Phase 6.
Model: opus
Agent: sdd-qa-engineer
Depends on: Phase 5 + Judge 5 PASS
Purpose: Add LLM-as-Judge verification sections with rubrics
Launch agent:
Description: "Define verification rubrics"
Prompt:
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR VERIFICATIONS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad path (`.specs/scratchpad/<hex-id>.md`)

Model: opus
Agent: sdd-qa-engineer
Depends on: Phase 6 completion
Purpose: Validate verification rubrics and thresholds
Launch judge:
Description: "Judge verification quality"
Prompt:
Read `.opencode/sdd/prompts/judge.md` for evaluation methodology and execute.
### Artifact Path
{path to task file with verifications from Phase 6}
### Context
This is the output of Phase 6: Define Verifications. The artifact should contain LLM-as-Judge
verification sections for each implementation step, including verification levels, custom rubrics,
thresholds, and a verification summary table.
### Rubric
1. Verification Level Appropriateness (weight: 0.30)
- Do verification levels match artifact criticality?
- HIGH criticality → Panel, MEDIUM → Single/Per-Item, LOW/NONE → None?
- 1=Mismatched levels, 2=Mostly appropriate, 3=Acceptable, 5=Precisely calibrated
2. Rubric Quality (weight: 0.30)
- Are criteria specific to the artifact type (not generic)?
- Do weights sum to 1.0?
- Are descriptions clear and measurable?
- 1=Generic/broken rubrics, 2=Adequate, 3=Acceptable, 5=Excellent custom rubrics
3. Threshold Appropriateness (weight: 0.20)
- Are thresholds reasonable (typically 4.0/5.0)?
- Higher for critical, lower for experimental?
- 1=Wrong thresholds, 2=Standard applied, 3=Acceptable, 5=Context-appropriate
4. Coverage Completeness (weight: 0.20)
- Does every step have a Verification section?
- Is the Verification Summary table present?
- 1=Missing verifications, 2=Most covered, 3=Acceptable, 5=100% coverage
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Workflow complete, promote task
- FAIL (score < THRESHOLD): Re-launch Phase 6 with feedback

Purpose: Move the refined task from the draft folder to the todo folder
After all phases complete:
Move task file from draft to todo:
git mv <TASK_FILE> .specs/tasks/todo/
# Fallback if git not available: mv <TASK_FILE> .specs/tasks/todo/
Update any references in research and analysis files if needed
After all executed phases and judges complete:
### Task Refined
| Property | Value |
|----------|-------|
| **Original File** | `<original TASK_FILE path>` |
| **Final Location** | `.specs/tasks/todo/<filename>` (ready for implementation) |
| **Title** | `<task title>` |
| **Type** | `<feature/bug/refactor/test/docs/chore/ci>` (from filename) |
| **Skill** | `<skill file path or "Skipped">` |
| **Skill Action** | `<Created new / Updated existing / Skipped>` |
| **Analysis** | `<analysis file path or "Skipped">` |
| **Scratchpad** | `<scratchpad file path>` |
| **Implementation Steps** | `<count or "N/A">` |
| **Parallelization Depth** | `<max parallel agents or "N/A">` |
| **Total Verifications** | `<count or "N/A">` |
### Configuration Used
| Setting | Value |
|---------|-------|
| **Target Quality** | {THRESHOLD}/5.0 |
| **Max Iterations** | {MAX_ITERATIONS} |
| **Active Stages** | {ACTIVE_STAGES as comma-separated list} |
| **Skipped Stages** | {SKIP_STAGES or stages not in ACTIVE_STAGES} |
| **Human Checkpoints** | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
| **Skip Judges** | {SKIP_JUDGES} |
| **Refine Mode** | {REFINE_MODE} |
### Quality Gates Summary
| Phase | Judge Score | Verdict |
|-------|-------------|---------|
| Phase 2a: Research | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2b: Codebase Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2c: Business Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 3: Architecture Synthesis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 4: Decomposition | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 5: Parallelize | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 6: Verify | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
**Threshold Used:** {THRESHOLD}/5.0 (or N/A if SKIP_JUDGES)
**Legend:**
- ✅ PASS - Score >= THRESHOLD
- ⚠️ PROCEEDED (max iter) - Score < THRESHOLD but MAX_ITERATIONS reached, proceeded anyway
- ⏭️ SKIPPED - Stage not in ACTIVE_STAGES
### Artifacts Generated
    .opencode/
    └── skills/
        └── <skill-name>/
            └── SKILL.md            # Reusable skill document (if research stage ran)

    .specs/
    ├── tasks/
    │   ├── draft/                  # Draft tasks (source - now empty for this task)
    │   ├── todo/
    │   │   └── <name>.<type>.md    # Complete task specification (ready for implementation)
    │   ├── in-progress/            # Tasks being implemented (empty)
    │   └── done/                   # Completed tasks (empty)
    ├── analysis/
    │   └── analysis-<name>.md      # Codebase impact analysis (if codebase analysis stage ran)
    └── scratchpad/
        └── <hex-id>.md             # Architecture thinking scratchpad
### Task Status Management
Task status is managed by folder location:
- `draft/` - Tasks created but not yet refined
- `todo/` - Tasks ready for implementation
- `in-progress/` - Tasks currently being worked on
- `done/` - Completed tasks
### Next Steps
1. Review task: `.specs/tasks/todo/<filename>`
- Edit the task file directly to make corrections
- Add `//` comments to lines that need clarification or changes
- Run `/plan` again with `--refine` to incorporate your feedback; it detects changes against git and propagates updates **top-to-bottom** (editing a section only affects sections below it, not above)
2. If everything is fine, begin implementation: `/implement` (will auto-select the task from todo/)
If any phase agent fails unexpectedly: re-launch it with the same prompt before treating the phase as failed.

If any judge returns FAIL (score < THRESHOLD):
- Re-launch the implementation phase with the judge's feedback, then re-run the judge
- If the phase is in HUMAN_IN_THE_LOOP_PHASES, trigger a human checkpoint before the next judge retry (after the implementation retry but before re-judging)
- When MAX_ITERATIONS is reached: proceed to the next stage automatically (do NOT ask the user unless --human-in-the-loop includes this phase) and report: ⚠️ Phase X did not pass quality threshold (X.X/THRESHOLD) after MAX_ITERATIONS iterations

    Implementation → Judge FAIL → Implementation Retry → Judge Retry
            ↓
    PASS → Continue to next stage
    FAIL → Repeat until MAX_ITERATIONS
            ↓
    MAX_ITERATIONS reached → Proceed to next stage (with warning)
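The standard retry loop can be sketched as follows. `run_phase` and `run_judge` are stand-in callables, not real agent APIs; the loop illustrates the proceed-on-max-iterations behavior described above.

```python
# Illustrative sketch of the implementation/judge retry loop: retry with
# judge feedback until PASS or MAX_ITERATIONS, then proceed either way.
def run_phase_with_gate(run_phase, run_judge, threshold=3.5, max_iterations=3):
    feedback = None
    for attempt in range(1, max_iterations + 1):
        artifact = run_phase(feedback)          # re-launch with prior feedback
        score, feedback = run_judge(artifact)   # judge returns (score, feedback)
        if score >= threshold:
            return artifact, score, "PASS"
    # Max iterations reached: proceed anyway, flagged for the summary table.
    return artifact, score, "PROCEEDED (max iter)"
```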
When phase is in HUMAN_IN_THE_LOOP_PHASES:
    Implementation → Judge FAIL → Implementation Retry
            ↓
    🔍 Human Checkpoint (optional feedback)
            ↓
    Judge Retry
            ↓
    PASS → Continue | FAIL → Repeat until MAX_ITERATIONS
            ↓
    MAX_ITERATIONS → 🔍 Final Human Checkpoint
            ↓
    User confirms → Proceed to next stage
Extract task info from the task file (title from the heading, type from the filename).
Initialize workflow progress tracking using TodoWrite:
Only include todos for phases in ACTIVE_STAGES. If continuing, mark completed phases as completed.
{
"todos": [
{"content": "Ensure directories exist", "status": "pending", "activeForm": "Ensuring directories exist"},
{"content": "Phase 2a: Research relevant resources and documentation", "status": "pending", "activeForm": "Researching resources"},
{"content": "Judge 2a: PASS research quality (> {THRESHOLD})", "status": "pending", "activeForm": "Validating research"},
{"content": "Phase 2b: Analyze codebase impact and affected files", "status": "pending", "activeForm": "Analyzing codebase impact"},
{"content": "Judge 2b: PASS codebase analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating codebase analysis"},
{"content": "Phase 2c: Business analysis and acceptance criteria", "status": "pending", "activeForm": "Analyzing business requirements"},
{"content": "Judge 2c: PASS business analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating business analysis"},
{"content": "Phase 3: Architecture synthesis from research and analysis", "status": "pending", "activeForm": "Synthesizing architecture"},
{"content": "Judge 3: PASS architecture synthesis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating architecture"},
{"content": "Phase 4: Decompose into implementation steps", "status": "pending", "activeForm": "Decomposing into steps"},
{"content": "Judge 4: PASS decomposition (> {THRESHOLD})", "status": "pending", "activeForm": "Validating decomposition"},
{"content": "Phase 5: Parallelize implementation steps", "status": "pending", "activeForm": "Parallelizing steps"},
{"content": "Judge 5: PASS parallelization (> {THRESHOLD})", "status": "pending", "activeForm": "Validating parallelization"},
{"content": "Phase 6: Define verification rubrics", "status": "pending", "activeForm": "Defining verifications"},
{"content": "Judge 6: PASS verifications (> {THRESHOLD})", "status": "pending", "activeForm": "Validating verifications"},
{"content": "Move task to todo folder", "status": "pending", "activeForm": "Promoting task"},
{"content": "Human checkpoint reviews", "status": "pending", "activeForm": "Awaiting human review"}
]
}
Note: Filter todos based on configuration:
- If SKIP_JUDGES is true, omit ALL Judge todos (Judge 2a, 2b, 2c, 3, 4, 5, 6)
- If `research` is not in ACTIVE_STAGES, omit the Phase 2a and Judge 2a todos
- If `codebase analysis` is not in ACTIVE_STAGES, omit the Phase 2b and Judge 2b todos

Ensure directories exist:
Run the folder creation script to create task directories and configure gitignore:
bash .opencode/sdd/scripts/create-folders.sh
This creates:
- `.specs/tasks/draft/` - New tasks awaiting analysis
- `.specs/tasks/todo/` - Tasks ready to implement
- `.specs/tasks/in-progress/` - Currently being worked on
- `.specs/tasks/done/` - Completed tasks
- `.specs/scratchpad/` - Temporary working files (gitignored)
- `.specs/analysis/` - Codebase impact analysis files
- `.opencode/skills/` - Reusable skill documents

Additional todo filters:
- If `business analysis` is not in ACTIVE_STAGES, omit the Phase 2c and Judge 2c todos
- If `architecture synthesis` is not in ACTIVE_STAGES, omit the Phase 3 and Judge 3 todos
- If `decomposition` is not in ACTIVE_STAGES, omit the Phase 4 and Judge 4 todos
- If `parallelize` is not in ACTIVE_STAGES, omit the Phase 5 and Judge 5 todos
- If `verifications` is not in ACTIVE_STAGES, omit the Phase 6 and Judge 6 todos
- If HUMAN_IN_THE_LOOP_PHASES is empty, omit the human checkpoint todo
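The todo-filtering rules can be sketched as a small helper, assuming todo items follow the `Phase X:` / `Judge X:` naming shown in the TodoWrite example (the helper itself is hypothetical):

```python
# Illustrative sketch of the todo filters: drop judge todos when judges are
# skipped, drop phase/judge todos for inactive stages, and drop the human
# checkpoint todo when no checkpoints are configured.
STAGE_TO_PHASE = {"research": "2a", "codebase analysis": "2b",
                  "business analysis": "2c", "architecture synthesis": "3",
                  "decomposition": "4", "parallelize": "5", "verifications": "6"}

def filter_todos(todos, active_stages, skip_judges=False, human_phases=()):
    excluded = {p for s, p in STAGE_TO_PHASE.items() if s not in active_stages}
    kept = []
    for todo in todos:
        content = todo["content"]
        if skip_judges and content.startswith("Judge"):
            continue
        if any(content.startswith(f"{kind} {p}:")
               for kind in ("Phase", "Judge") for p in excluded):
            continue
        if content.startswith("Human checkpoint") and not human_phases:
            continue
        kept.append(todo)
    return kept
```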