After solving a non-trivial problem, detect generalizable learnings and propose skill updates so future interactions benefit automatically. Always active — applies to every interaction.
Skills improve through a three-phase lifecycle. The agent operates in one phase at a time depending on whether ground truth is available.
You MUST evaluate whether to enter the skill evolution workflow when ANY of these events occur during a conversation:
When a trigger fires: Finish solving the user's problem first, then evaluate whether the learning is generalizable (not user-specific) before entering Phase 1 or Phase 2.
Do NOT trigger for: Trivial typos, user-specific data/paths, one-off configuration issues, or problems already covered by existing skills.
Enter this phase when you can score your output — a ground truth answer exists, a test suite passes/fails, or a known-correct result can be compared against.
Inside the learning phase, run an evolutionary loop before proposing anything:
Work against a sandbox copy of the relevant skill files (skills/*/SKILL.md). The sandbox is conceptual for interactive agents (Cursor, Claude Code): iterate internally before presenting to the user, and do not propose on the first attempt if the score failed. For CI/batch contexts, the sandbox is literal — experimental skill modifications are made in a temp directory, validated by running tests, then promoted.
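For the literal CI/batch case, the loop can be sketched in Python. The `candidate_edits` callables and the `run_tests` scorer are placeholders standing in for whatever edit and validation machinery the pipeline actually uses, not a real API:

```python
# Sketch of the literal sandbox loop: copy the skill into a temp
# directory, apply one candidate edit, score it, and promote only
# the first edit that passes. The original skill is never mutated.
import shutil
import tempfile
from pathlib import Path


def sandbox_iterate(skill_dir: Path, candidate_edits, run_tests):
    """Try each candidate edit in an isolated copy; return the first that scores."""
    for edit in candidate_edits:
        with tempfile.TemporaryDirectory() as tmp:
            work = Path(tmp) / skill_dir.name
            shutil.copytree(skill_dir, work)  # sandbox copy of the skill
            edit(work)                        # mutate only the sandboxed copy
            if run_tests(work):               # score against ground truth
                return edit                   # promote this edit
    return None                               # nothing passed; do not propose
```

Returning `None` on failure is the key property: a failed loop produces no proposal at all, matching the rule above about not proposing on a failed score.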
Use whatever ground truth is available:
| Ground truth | How to score |
|---|---|
| Behavioral tests | must_include / must_not_include patterns pass |
| Code execution | solution.py runs without error, produces expected output |
| Solver status | cuOpt returns Optimal / FeasibleFound / SUCCESS |
| Constraint satisfaction | All constraints in the formulation are met |
| Known answer | Output matches the expected value within tolerance |
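The behavioral-tests row can be sketched as a small scorer; the function name and signature are illustrative, not an existing API:

```python
# Hypothetical scorer for the "behavioral tests" ground truth: an
# output passes when every must_include pattern matches and no
# must_not_include pattern does.
import re


def score_behavioral(output: str, must_include=(), must_not_include=()):
    """Return True when all inclusion patterns match and no exclusion does."""
    if any(re.search(p, output) is None for p in must_include):
        return False
    if any(re.search(p, output) for p in must_not_include):
        return False
    return True
```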
If no ground truth is available, you are in Phase 2 (inference), not Phase 1.
When the score passes, distill the learning into a skill artifact. Two types:
Markdown (SKILL.md patches) — gotchas, patterns, examples, table rows. Add these where an existing skills/*/SKILL.md would benefit.
Code (assets/*.py) — reusable helper functions, reference solutions. Place these in skills/*/assets/ alongside existing assets and validate them with ci/test_skills_assets.sh.
Always place the learning in the single skill where it has the widest effect. Do NOT duplicate the same content across multiple skills.
Choose the target using this priority:
1. Common formulation skills (lp-milp-formulation, routing-formulation, cuopt-user-rules) — if the learning applies regardless of language or interface, put it here. All downstream API skills already read the common skill.
2. API-specific skills (cuopt-lp-milp-api-python, cuopt-routing-api-python) — if the learning is specific to one API or language.

If a gotcha affects both Python and C users but is about the solver behavior (not the API), it belongs in the common formulation skill, not in both api-python and api-c.
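Under the assumption of the skill names mentioned in this section, the priority can be sketched as a small routing helper; the parameters and the fallback name are hypothetical:

```python
# Illustrative chooser for the single widest-effect target skill:
# interface-agnostic learnings go to the common formulation skill,
# API-specific ones to the matching API skill.
def choose_target_skill(topic: str, api_specific: bool, api_name: str = "") -> str:
    """Return the one skill directory that gives the widest effect."""
    common = {
        "lp-milp": "lp-milp-formulation",
        "routing": "routing-formulation",
    }
    if not api_specific:
        # Solver-behavior learnings land here even when observed via one API.
        return common.get(topic, "cuopt-user-rules")
    return f"cuopt-{topic}-api-{api_name}"
```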
Present to the user as:
Skill update proposal:
Skill: skills/<name>/SKILL.md (or skills/<name>/assets/<file>.py)
Type: markdown | code
Phase: learning (scored)
Section: <where it goes>
Trigger: <what happened that surfaced this>
Score: <how it was validated — e.g. "solver returned Optimal", "test passed">
Change: <the exact content to add or modify>
Only apply after the user approves. If the user declines, do not persist.
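One way to keep the template consistent across proposals is to render it from a small data structure. This sketch simply mirrors the fields above; it is not a prescribed format:

```python
# Minimal data structure for a scored (Phase 1) skill update proposal,
# with a render() that reproduces the template field-for-field.
from dataclasses import dataclass


@dataclass
class SkillUpdateProposal:
    skill: str      # e.g. "skills/<name>/SKILL.md"
    type: str       # "markdown" | "code"
    phase: str      # "learning (scored)"
    section: str
    trigger: str
    score: str
    change: str

    def render(self) -> str:
        return (
            "Skill update proposal:\n"
            f"Skill: {self.skill}\n"
            f"Type: {self.type}\n"
            f"Phase: {self.phase}\n"
            f"Section: {self.section}\n"
            f"Trigger: {self.trigger}\n"
            f"Score: {self.score}\n"
            f"Change: {self.change}"
        )
```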
Enter this phase during normal user interactions where no ground truth exists to score against.
Read and apply skills (including any content added by prior learning phases) to solve the user's problem.
While solving, note insights — observations that could not be scored but may be valuable.
Present insights to the user as lower-confidence proposals, clearly marked:
Skill insight (unscored):
Skill: skills/<name>/SKILL.md
Type: markdown | code
Phase: inference (unscored)
Section: <where it goes>
Trigger: <what happened>
Change: <the exact content to add or modify>
Note: This was not validated against ground truth. Review carefully.
The user may approve, decline, or defer for offline reflection.
After inference interactions, review accumulated insights to find patterns.
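A minimal way to surface such patterns is to count repeated proposal targets across sessions; the `(skill, section)` tuple shape here is an assumption about how insights might be logged:

```python
# Flag any insight target proposed more than once across sessions as
# a candidate for a reflection-phase proposal.
from collections import Counter


def recurring_insights(insights, min_count=2):
    """insights: iterable of (skill, section) tuples from prior sessions."""
    counts = Counter(insights)
    return [target for target, n in counts.items() if n >= min_count]
```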
Promote patterns that recur across multiple insights as proposals marked Phase: reflection (pattern-validated).

Every change made through skill evolution MUST be tagged so its origin is traceable.
Wrap added content with start and end boundary markers so it is easy to locate, review, and remove:
<!-- skill-evolution:start — <short trigger description> -->
<added content>
<!-- skill-evolution:end -->
For example, a new table row:
<!-- skill-evolution:start — large objective recursion fix -->
| Maximum recursion depth | Building big expr with chained `+` | Use `LinearExpression(vars_list, coeffs_list, constant)` |
<!-- skill-evolution:end -->
Or a new subsection:
<!-- skill-evolution:start — warmstart gotcha -->
### Warmstart gotcha
Content here...
<!-- skill-evolution:end -->
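The marker convention above lends itself to small helpers for wrapping new content and locating previously tagged blocks for review or removal. This is a sketch built on the exact marker strings shown in the examples, not part of the skill format itself:

```python
# Wrap content in skill-evolution boundary markers, and extract all
# tagged blocks from a SKILL.md body as (description, content) pairs.
import re

START = "<!-- skill-evolution:start — {desc} -->"
END = "<!-- skill-evolution:end -->"


def wrap(content: str, desc: str) -> str:
    """Surround added content with start/end boundary markers."""
    return f"{START.format(desc=desc)}\n{content}\n{END}"


def tagged_blocks(text: str):
    """Return (description, body) pairs for every skill-evolution block."""
    pattern = re.compile(
        r"<!-- skill-evolution:start — (.*?) -->\n(.*?)\n<!-- skill-evolution:end -->",
        re.DOTALL,
    )
    return pattern.findall(text)
```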
When skill evolution creates an entirely new skill directory, add `origin: skill-evolution` to the YAML frontmatter:
---