Generate task-specific evaluation criteria from an implementation plan. Use after writing a plan (via writing-plans skill) or manually on any plan file. Produces a scored eval file that the eval-verifier agent uses post-implementation.
Announce: "I'm using the eval-criteria skill to generate evaluation criteria from the plan."
Generate task-specific, taste-informed evaluation criteria from an implementation plan. Criteria cover functional requirements (pass/fail), code quality (scored), UX/craft (scored), and completeness across personas.
If a plan path is provided as argument, use it.
Otherwise, find the most recent .md file in docs/superpowers/plans/.
Read the plan file fully before proceeding.
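The lookup rule above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the function name `resolve_plan_path` is hypothetical, and "most recent" is assumed to mean the latest file modification time.

```python
from pathlib import Path
from typing import Optional

PLANS_DIR = Path("docs/superpowers/plans")


def resolve_plan_path(arg: Optional[str] = None, plans_dir: Path = PLANS_DIR) -> Path:
    """Return the explicit plan path if given, else the newest .md file in plans_dir."""
    if arg:
        return Path(arg)
    # Assumption: "most recent" means most recently modified on disk.
    candidates = sorted(plans_dir.glob("*.md"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        raise FileNotFoundError(f"no .md plan files found in {plans_dir}")
    return candidates[-1]
```

Raising when no plan exists keeps the failure visible instead of generating criteria from nothing.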
Also read:
- CLAUDE.md, if it exists (for project-specific conventions)
- ~/.claude/taste/ (for taste-informed criteria): architecture.md, product.md, ux.md, code.md, process.md, communication.md

Extract from the plan:
Count unique files across all tasks (Create + Modify, excluding Test files):
Scan file paths and task descriptions for these patterns to auto-select relevant code quality categories:
| Pattern in files/descriptions | Criteria added | Owner dimension |
|---|---|---|
| auth, permission, login, session, token, password | Security | security |
| database, migration, query, SQL, prisma, drizzle | Performance + Architecture | performance, architecture |
| api, endpoint, route, controller, handler | Error handling + Security | error-handling, security |
| component, page, layout, form, modal, .tsx, .jsx, .css | UX/Craft | ux-reviewer, visual-reviewer, content-reviewer |
| test, spec, .test., .spec. | Testing | testing |
| config, env, infra, deploy, docker, ci | Architecture | architecture |
If no pattern matches, default to architecture and error-handling.
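The mapping above could be applied roughly as in the sketch below. The rule list is transcribed from the table; the function name `select_owners` and the matching strategy (whole-word matching for plain keywords, substring matching for dotted patterns such as `.tsx`) are assumptions, since the table does not say how strictly a pattern must match.

```python
import re

# (patterns, owner dimensions), transcribed from the table above.
RULES = [
    (("auth", "permission", "login", "session", "token", "password"), ("security",)),
    (("database", "migration", "query", "sql", "prisma", "drizzle"), ("performance", "architecture")),
    (("api", "endpoint", "route", "controller", "handler"), ("error-handling", "security")),
    (("component", "page", "layout", "form", "modal", ".tsx", ".jsx", ".css"),
     ("ux-reviewer", "visual-reviewer", "content-reviewer")),
    (("test", "spec", ".test.", ".spec."), ("testing",)),
    (("config", "env", "infra", "deploy", "docker", "ci"), ("architecture",)),
]
DEFAULT_OWNERS = ["architecture", "error-handling"]


def select_owners(texts):
    """Return owner dimensions triggered by file paths and task descriptions."""
    blob = " ".join(texts).lower()
    words = set(re.findall(r"[a-z0-9]+", blob))
    owners = []
    for patterns, dims in RULES:
        # Assumption: dotted patterns match as substrings, plain keywords as whole words.
        if any(p in blob if p.startswith(".") else p in words for p in patterns):
            for dim in dims:
                if dim not in owners:
                    owners.append(dim)
    return owners or list(DEFAULT_OWNERS)
```

Whole-word matching avoids false positives such as `ci` inside "specific", at the cost of missing plurals like "migrations"; a real implementation may prefer a looser match.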
Functional requirements (pass/fail): one criterion per plan task outcome or requirement. Each includes:
Code quality (scored): categories are selected by the auto-select mapping above. Each includes:
Apply taste principles: Read ~/.claude/taste/code.md and ~/.claude/taste/architecture.md. For each HIGH confidence principle relevant to this plan, add it as a specific criterion or weave it into an existing category.
Examples of taste-informed criteria:
- "No `any` or `as` assertions; use `unknown` + type guards" (from taste/code.md)

UX/Craft criteria are only generated if UI work is detected. Each includes category, owner, verify.
Apply taste principles: Read ~/.claude/taste/ux.md. For HIGH confidence principles relevant to this plan, add specific criteria:
For each affected persona, generate a completeness check:
## Completeness (pass/fail)
- [ ] End User: can complete the full flow without help
  - verify: walk through the user journey start to finish
- [ ] Admin: can manage this feature via UI (not database)
  - verify: check CRUD operations available in admin panel
- [ ] API: endpoints are complete (not just Read: Create/Update/Delete if applicable)
  - verify: check all CRUD endpoints exist and return proper responses
- [ ] System: migration is reversible, existing data handled
  - verify: run migration up and down, check existing records
Only include personas actually affected by the plan. Skip personas the plan doesn't touch.
Apply taste: "A feature without management UI is not a feature" (taste/product.md, HIGH confidence). If the plan creates a new entity/feature but has no admin UI task, FLAG it:
- [ ] FLAGGED: {entity} has no admin management UI in this plan. Intentional?
  - verify: confirm admin can create/edit/delete {entity} via UI, not just database
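As a rough illustration of the completeness rules above, the section could be assembled like this. The check and verify strings are adapted from the template; the function name `completeness_section` and its inputs are hypothetical, and deciding which personas are affected or which entities lack an admin UI is assumed to happen elsewhere.

```python
# Persona -> (check, verify), adapted from the completeness template above.
PERSONA_CHECKS = {
    "End User": ("can complete the full flow without help",
                 "walk through the user journey start to finish"),
    "Admin": ("can manage this feature via UI (not database)",
              "check CRUD operations available in admin panel"),
    "API": ("endpoints are complete (not just Read: Create/Update/Delete if applicable)",
            "check all CRUD endpoints exist and return proper responses"),
    "System": ("migration is reversible, existing data handled",
               "run migration up and down, check existing records"),
}


def completeness_section(affected, entities_without_admin_ui=()):
    """Render checks only for affected personas, plus FLAGGED items."""
    lines = ["## Completeness (pass/fail)"]
    for persona, (check, verify) in PERSONA_CHECKS.items():
        if persona in affected:  # skip personas the plan does not touch
            lines.append(f"- [ ] {persona}: {check}")
            lines.append(f"  - verify: {verify}")
    for entity in entities_without_admin_ui:
        lines.append(f"- [ ] FLAGGED: {entity} has no admin management UI in this plan. Intentional?")
        lines.append(f"  - verify: confirm admin can create/edit/delete {entity} via UI, not just database")
    return "\n".join(lines)
```

For example, `completeness_section({"Admin", "API"}, ["Coupon"])` would emit the Admin and API checks followed by one flagged item for a hypothetical `Coupon` entity.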
Apply relevant HIGH confidence taste principles as an additional scored section. Select only principles that directly relate to what the plan builds:
## Taste (scored 1-5)
- **Simplicity**: no over-engineering, minimal states, simple infrastructure
  - owner: architecture
  - verify: check for unnecessary abstractions, premature optimization, unused flexibility
  - taste: architecture.md#simplicity-over-infrastructure, architecture.md#fewer-states
- **User language**: UI copy uses domain terms, not technical jargon
  - owner: ux-reviewer
  - verify: check labels, error messages, empty states for plain language
  - taste: product.md#use-domain-language, communication.md#always-ask-whats-simpler
Only include taste criteria with relevant principles. Don't add generic taste checks.
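One way to picture the selection rule above is the sketch below. Everything about the data shape is an assumption: the `Principle` class, its `confidence` and `keywords` fields, and keyword overlap as a stand-in for "directly relates to what the plan builds" are hypothetical, because the format of the taste files is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Principle:
    # Hypothetical shape; the real taste files may be structured differently.
    name: str
    summary: str
    owner: str
    verify: str
    source: str        # e.g. "product.md#use-domain-language"
    confidence: str    # e.g. "HIGH"
    keywords: tuple    # hypothetical relevance hints


def taste_section(principles, plan_text):
    """Render only HIGH confidence principles whose keywords appear in the plan."""
    text = plan_text.lower()
    picked = [p for p in principles
              if p.confidence == "HIGH" and any(k in text for k in p.keywords)]
    if not picked:
        return ""  # no relevant principles: omit the section entirely
    lines = ["## Taste (scored 1-5)"]
    for p in picked:
        lines += [f"- **{p.name}**: {p.summary}",
                  f"  - owner: {p.owner}",
                  f"  - verify: {p.verify}",
                  f"  - taste: {p.source}"]
    return "\n".join(lines)
```

Returning an empty string when nothing qualifies mirrors the instruction not to add generic taste checks.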
mkdir -p docs/superpowers/evals/
Write to docs/superpowers/evals/YYYY-MM-DD-{feature}-eval.md:
# Evaluation Criteria: {feature_name}
---
plan_path: {relative path to plan file}
plan_modified: {ISO timestamp of plan file's last modification}