Iteratively optimise a system prompt so that a model generates the correct <think>...</think> and <answer>...</answer> format on a given reasoning-gym task. Loads the model once, evaluates each candidate prompt on N real samples (no API key needed), and reports the best one found. Use when try-model-on-env shows format_reward = 0.0.
Evaluate a pre-planned set of candidate system prompts so a model reliably outputs
<think>...</think><answer>...</answer> format on a reasoning task. No API key required.
Arguments:
- $ARGUMENTS[0] — env_name (required): reasoning-gym task name, e.g. countdown, maze, gsm8k. If missing, ask the user before proceeding.
- $ARGUMENTS[1] — model_name (optional): HuggingFace ID or local checkpoint path. Default: Qwen/Qwen2.5-0.5B-Instruct

Run the local optimizer script from the project root with a 15-minute timeout:
python .claude/skills/get-sys-prompt/scripts/optimize_prompt_local.py $ARGUMENTS
Show the full output, calling out each phase clearly: model loading, each candidate-prompt evaluation round (per-sample format checks reported as think=✓/✗ answer=✓/✗), and the final best-prompt report.

Then give a brief plain-English summary:
- If format_reward improved, suggest updating DEFAULT_SYSTEM_PROMPT in try-model-on-env/scripts/diagnostic.py to the winning prompt.
- If it stayed at 0.0, note that SFT warm-up is likely needed before prompt-tuning helps.

Flags:

| Flag | Default | Purpose |
|---|---|---|
| --samples N | 3 | Samples evaluated per round |
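The per-sample think=✓/✗ answer=✓/✗ check can be sketched as a regex test. This is a hypothetical helper for illustration, not necessarily how the script implements it:

```python
import re

def check_format(completion: str) -> dict:
    """Report whether a completion contains well-formed <think> and <answer> blocks."""
    return {
        "think": re.search(r"<think>.*?</think>", completion, re.DOTALL) is not None,
        "answer": re.search(r"<answer>.*?</answer>", completion, re.DOTALL) is not None,
    }
```

A completion like `<think>2+2=4</think><answer>4</answer>` passes both checks; a missing or unclosed tag fails that check.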
Example with flags:
python .claude/skills/get-sys-prompt/scripts/optimize_prompt_local.py countdown Qwen/Qwen2.5-0.5B-Instruct --samples 5
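Programmatically, the invocation above and its 15-minute budget can be wrapped like this. A sketch only: build_cmd and run_optimizer are hypothetical wrappers around the real script:

```python
import subprocess

def build_cmd(env_name, model_name="Qwen/Qwen2.5-0.5B-Instruct",
              samples=3, system_prompt=None):
    """Assemble the optimizer command line; flag names follow the table above."""
    cmd = ["python", ".claude/skills/get-sys-prompt/scripts/optimize_prompt_local.py",
           env_name, model_name, "--samples", str(samples)]
    if system_prompt is not None:
        cmd += ["--system-prompt", system_prompt]
    return cmd

def run_optimizer(env_name, **kwargs):
    # timeout=900 enforces the 15-minute budget; raises TimeoutExpired on overrun
    return subprocess.run(build_cmd(env_name, **kwargs),
                          capture_output=True, text=True, timeout=900)
```

Passing timeout to subprocess.run keeps a stuck model load or generation loop from hanging the session indefinitely.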
Notes:
- An optional mode uses a Claude model (claude-sonnet-4-6) to generate new candidate prompts dynamically; it requires ANTHROPIC_API_KEY. The default local mode does not.
- Helper functions (build_prompt_string, run_inference, eval_rewards, load_model_and_tokenizer) are imported directly rather than reimplemented.
- The script also accepts a --system-prompt flag for one-off manual tests.
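The two summary outcomes described earlier (improved vs. stuck at 0.0) reduce to a threshold check on the best format_reward found; a minimal sketch:

```python
def recommend(best_format_reward: float) -> str:
    """Turn the best format_reward found into a next-step suggestion."""
    if best_format_reward > 0.0:
        return ("Improved: consider updating DEFAULT_SYSTEM_PROMPT in "
                "try-model-on-env/scripts/diagnostic.py with the winning prompt.")
    return "Stuck at 0.0: SFT warm-up is likely needed before prompt-tuning helps."
```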