Analyze the current session for agent efficiency, quality, and actionable improvements. Use this skill when the user says things like "learn from this session", "session retro", "what went well", "how did the agent do", "what should I improve", "review this session", "session review", or any request to reflect on how the current session went. Also use when the user wants to extract learnings, identify friction, or improve their workflow based on what just happened.
You are reviewing the current Claude Code session. Your job is to surface the 2-3 most impactful findings — things that would actually change how the next session goes — not to produce an exhaustive report card.
Be candid. A session where everything scores 4/5 but you have nothing concrete to suggest is a wasted review. Prioritize specificity over coverage: one sharp observation beats five generic ones.
Analyze the session transcript and produce the following:
Evaluate how well the agent used tools and how much human course-correction was needed.
Score each dimension 1-5. Calibration: 1=actively harmful or completely wrong approach, 2=significant waste or frequent missteps, 3=adequate but with clear room for improvement, 4=good with minor issues, 5=genuinely impressive and hard to improve on. Reserve 5 for sessions that would make you say "I wish the agent always worked like this."
Provide a brief narrative (2-3 sentences) explaining the scores with specific examples from the session.
Evaluate how well the agent followed project coding standards and CLAUDE.md rules.
Use the same 1-5 calibration as above.
Provide a brief narrative (2-3 sentences) explaining the scores with specific examples.
Surface things that would save time if known at the start of the next session. The bar for inclusion: "would I tell a teammate about this before they start working on this codebase?" Skip categories with nothing worth noting.
This is the most important section. Produce recommendations in these categories, but only if genuinely actionable — a suggestion the user can apply in under 5 minutes. Skip any category with nothing worth doing.
Do not suggest rules that duplicate existing automated enforcement. If a lint rule, pre-commit hook, CI check, or other tooling already catches an issue, documenting it in CLAUDE.md/AGENTS.md is redundant. Before suggesting a rule, check whether the session transcript shows the error was already caught and blocked by automated tooling (e.g. a pre-commit hook rejected the commit, a linter flagged the issue). If so, skip it — the tooling is already doing its job.
For each suggestion, specify whether it belongs in:
- Project-level memory (`./CLAUDE.md` or `./AGENTS.md`)
- User-level memory (`~/.claude/CLAUDE.md`)

Format each as a ready-to-paste diff:
+ [the addition - keep it brief]
Suggest new or revised hooks, skills, or agents that would have prevented issues or improved the workflow.
One concrete thing the engineer could do differently in their prompts to get better results (be specific, not generic).
One observation about how the agent behaved that the team should know about — a strength to replicate or a weakness to work around.
After the human-readable review, emit a fenced JSON block that a scraper can parse:
```json
{
  "session_review": {
    "version": "1.1",
    "timestamp": "<ISO 8601>",
    "efficiency": {
      "tool_precision": <1-5>,
      "iteration_efficiency": <1-5>,
      "context_utilization": <1-5>,
      "autonomy_level": <1-5>,
      "autonomy_span": <1-5>,
      "friction_events": <count of rejections/aborts/course-corrections>,
      "total_tool_calls": <count>,
      "failed_tool_calls": <count>
    },
    "quality": {
      "claude_md_compliance": <1-5>,
      "code_pattern_adherence": <1-5>,
      "test_coverage_intent": <1-5>,
      "pr_hygiene": <1-5>,
      "documentation_awareness": <1-5>,
      "violations": ["<brief description of each CLAUDE.md violation>"]
    },
    "improvements": {
      "claude_md_rules": ["<each suggested rule, ready to paste>"],
      "hooks_skills_agents": ["<each suggestion>"],
      "prompt_technique": "<the suggestion>",
      "agent_pattern_note": "<the observation>"
    }
  }
}
```
Keep the JSON valid and self-contained so downstream tooling can parse it (e.g. when the review is posted via `gh pr comment` with `--body-file`).
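As a sketch of what a downstream scraper might do with that block (the `extract_session_review` helper and the sample text are hypothetical, not part of any existing tool):

```python
import json
import re

def extract_session_review(markdown: str) -> dict:
    """Pull the fenced JSON block out of a session review and parse it.

    Assumes the review ends with a json-fenced block containing the
    "session_review" object described above.
    """
    # `{3} matches a run of three backticks without embedding a literal fence.
    match = re.search(r"`{3}json\s*\n(.*?)\n`{3}", markdown, re.DOTALL)
    if match is None:
        raise ValueError("no fenced JSON block found")
    data = json.loads(match.group(1))
    review = data["session_review"]
    # Sanity-check that every 1-5 score is actually in range;
    # raw counts are excluded from the range check.
    counts = ("friction_events", "total_tool_calls", "failed_tool_calls")
    for section in ("efficiency", "quality"):
        for key, value in review.get(section, {}).items():
            if isinstance(value, int) and key not in counts:
                assert 1 <= value <= 5, f"{section}.{key} out of range"
    return review

FENCE = "`" * 3  # built at runtime to keep this example fence-safe

sample = (
    "Great session overall.\n"
    + FENCE + "json\n"
    + '{"session_review": {"version": "1.1", '
      '"efficiency": {"tool_precision": 4, "friction_events": 2}, '
      '"quality": {"claude_md_compliance": 5, "violations": []}, '
      '"improvements": {"claude_md_rules": []}}}\n'
    + FENCE + "\n"
)

review = extract_session_review(sample)
print(review["efficiency"]["tool_precision"])  # → 4
```

Invalid or missing JSON fails loudly here rather than silently producing an empty review, which is the behavior a scraper feeding a dashboard or PR comment would want.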