Use after /fpt-reconstruct to evaluate a first-principles cycle. Triggers on "evaluate the decomposition", "how did my FPT go", "rate the analysis", "eval". Do NOT use for evaluating things unrelated to a first-principles cycle. Output: self-eval + human-eval records + updated eval-log.json.
Evaluates a completed first-principles thinking cycle in two phases: a structured self-evaluation that scores each artifact against its quality criteria, and a human evaluation that captures implementation outcomes and narrative feedback. Both phases produce artifacts in the project directory and append a structured entry to first-principals-thinking/evals/eval-log.json, a running JSON record that enables pattern analysis across projects over time.
This skill is what turns individual FPT cycles into a learning system.
first-principals-thinking/projects/{project-name}/
/fpt-decompose and /fpt-reconstruct:
01-problem-statement.md through 06-decision-brief.md
first-principals-thinking/evals/eval-log.json
Look up the project directory at first-principals-thinking/projects/{project-name}/.
If the user didn't provide a project name, list available projects under first-principals-thinking/projects/ and ask which one to evaluate.
Read the artifact chain. Determine which evaluation phase to run:
Score each artifact against its quality criteria. Use the eval criteria from references/evaluation-guide.md § Artifact Quality Criteria.
Eval 1: Problem Statement (5 criteria, /5)
Eval 2: Assumptions Map (5 criteria, /5)
Eval 3: Challenge Log (5 criteria, /5)
Eval 4: First Principles Foundation (5 criteria, /5)
Eval 5: Reconstructed Solution (5 criteria, /5)
Compute total: __/25. Identify weakest and strongest artifacts. State the primary gap.
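The totaling step above can be sketched in a few lines of Python. This is only an illustration: the score values are made up, and the `summarize` helper and its returned keys are hypothetical names, not part of the skill or its templates.

```python
# Hypothetical per-artifact scores, each out of 5 (one point per criterion met).
scores = {
    "Problem Statement": 4,
    "Assumptions Map": 3,
    "Challenge Log": 5,
    "First Principles Foundation": 4,
    "Reconstructed Solution": 2,
}


def summarize(scores):
    """Return the total plus the weakest and strongest artifact names."""
    for name, value in scores.items():
        if not 0 <= value <= 5:
            raise ValueError(f"{name}: score {value} is outside 0-5")
    return {
        "total": sum(scores.values()),
        "max": 5 * len(scores),
        # On ties, min/max return the first artifact in insertion order.
        "weakest": min(scores, key=scores.get),
        "strongest": max(scores, key=scores.get),
    }


summary = summarize(scores)
print(f"Total: {summary['total']}/{summary['max']}")
print(f"Weakest: {summary['weakest']}, strongest: {summary['strongest']}")
```

The primary gap is still a judgment call; the weakest artifact is usually the place to start looking for it.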
Save self-eval to first-principals-thinking/projects/{project-name}/07-self-eval.md using assets/self-eval-template.md
Walk the user through structured feedback capture:
Implementation status: Not implemented / Partially / Fully / Abandoned / Modified significantly
Outcome: Did the solution achieve the functional need? Fully / Partially / No / Too early
Principle validity: Did the first principles turn out correct? All / Mostly / Some wrong / Fundamentally flawed
Six quality dimensions (1–5 scale each, /30 total):
Narrative feedback: What worked, what fell short, process changes, artifact feedback
Lessons learned: Extract 2–3 key takeaways
Save human-eval to first-principals-thinking/projects/{project-name}/08-human-eval.md using assets/human-eval-template.md
Read first-principals-thinking/evals/eval-log.json. Append a new entry (or update an existing one for this project) with all self-eval and human-eval data. Follow the schema documented in assets/eval-log-schema.json.
Generate the next eval_id by incrementing from the last entry (format: fpt-NNN).
Populate the artifact_paths object with the actual paths inside the project directory.
Write the updated JSON back to first-principals-thinking/evals/eval-log.json.
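A minimal sketch of the log update, under stated assumptions: the log is taken to be a top-level JSON array of entry objects, each with `eval_id` and `project` keys. The actual structure is defined by assets/eval-log-schema.json, which is not reproduced here, so the key names and the helpers `next_eval_id`, `upsert_entry`, and `update_log` are hypothetical.

```python
import json
import re
from pathlib import Path

LOG_PATH = Path("first-principals-thinking/evals/eval-log.json")


def next_eval_id(entries):
    """Return the next id in fpt-NNN format, e.g. fpt-007 -> fpt-008."""
    numbers = [
        int(match.group(1))
        for entry in entries
        if (match := re.fullmatch(r"fpt-(\d+)", entry.get("eval_id", "")))
    ]
    # Uses the highest existing number, which equals the last entry's
    # number when the log is append-only.
    return f"fpt-{max(numbers, default=0) + 1:03d}"


def upsert_entry(entries, project, data):
    """Update the existing entry for this project, or append a new one."""
    for entry in entries:
        if entry.get("project") == project:
            entry.update(data)
            return entry
    entry = {"eval_id": next_eval_id(entries), "project": project, **data}
    entries.append(entry)
    return entry


def update_log(path, project, data):
    """Read the log, upsert the entry, and write the JSON back."""
    # Assumption: an empty list stands in for a missing log; the real
    # initial structure should follow assets/eval-log-schema.json.
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    entry = upsert_entry(entries, project, data)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
    return entry
```

Calling `update_log(LOG_PATH, "my-project", {...})` with the self-eval and human-eval data, including the `artifact_paths` object, would then cover the read, append-or-update, and write-back steps in one pass.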
If the eval log has 3 or more entries, run a brief pattern analysis:
primary_gap values recurring?
time_justified ratings?
Report findings inline.
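The pattern check could look roughly like the sketch below. It assumes each log entry is a dict carrying a `primary_gap` string and a numeric `time_justified` rating. Those field names appear in this skill, but their exact types and placement in the schema are assumptions, and the sample log and the `analyze_patterns` helper are invented for illustration.

```python
from collections import Counter
from statistics import mean


def analyze_patterns(entries, min_entries=3):
    """Return recurring primary_gap values and the mean time_justified rating."""
    if len(entries) < min_entries:
        return None  # Too few cycles to call anything a pattern.
    gap_counts = Counter(e["primary_gap"] for e in entries if e.get("primary_gap"))
    ratings = [e["time_justified"] for e in entries if "time_justified" in e]
    return {
        # Keep only gaps that show up in more than one cycle.
        "recurring_gaps": {gap: n for gap, n in gap_counts.items() if n > 1},
        "mean_time_justified": mean(ratings) if ratings else None,
    }


# Invented sample log with three entries.
log = [
    {"primary_gap": "weak assumptions map", "time_justified": 4},
    {"primary_gap": "vague problem statement", "time_justified": 3},
    {"primary_gap": "weak assumptions map", "time_justified": 5},
]

patterns = analyze_patterns(log)
print(patterns)
```

With fewer than three entries the function returns `None`, mirroring the threshold above.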
Two artifacts added to the project directory:
1. 07-self-eval.md (Phase 1)
2. 08-human-eval.md (Phase 2, if applicable)
Plus:
3. Updated first-principals-thinking/evals/eval-log.json
4. Inline summary with scores, key findings, and pattern analysis (if 3+ log entries)
/fpt-decompose
first-principals-thinking/evals/eval-log.json doesn't exist: create it with the schema from assets/eval-log-schema.json