Design a new infra-bench task for a specific dataset. Use when asked to propose, plan, outline, or prepare a Harbor-compatible benchmark task for datasets such as kubernetes-core, terraform-core, or future infra-bench datasets, especially when the task must follow dataset-specific rules from docs/task-rules.
Use this skill to design, not implement, an infra-bench task unless the user
explicitly asks for file creation.
Identify the target dataset.
Accept any of these forms: kubernetes-core, kubeply/kubernetes-core, or
datasets/kubernetes-core.

Read the required repo docs before proposing the task:
- datasets/<dataset-name>/dataset.toml
- docs/conventions.md
- docs/harbor.md
- docs/task-design.md
- docs/task-rules/<domain>.md

Derive <domain> from the dataset name.
- kubernetes-core -> kubernetes
- terraform-core -> terraform
- observability-core -> observability

If docs/task-rules/<domain>.md does not exist, stop and report the gap. Propose creating the domain rule document before designing tasks for that dataset. Do not invent undocumented task rules.

Inspect existing tasks in datasets/<dataset-name>/ to avoid duplicate task
names, repeated scenarios, or misleading difficulty counts.
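
The lookup steps above can be sketched in Python. This is a minimal illustration, not part of the repo: the helper names are hypothetical, and the rule that a domain is the dataset name without its trailing "-core" is an assumption inferred from the three examples given.

```python
from __future__ import annotations

from pathlib import Path


def normalize_dataset(name: str) -> str:
    """Reduce 'kubeply/kubernetes-core' or 'datasets/kubernetes-core' to 'kubernetes-core'."""
    return name.strip().rstrip("/").split("/")[-1]


def derive_domain(dataset: str) -> str:
    # Assumption: the domain is the dataset name minus a trailing "-core".
    suffix = "-core"
    return dataset[: -len(suffix)] if dataset.endswith(suffix) else dataset


def rule_doc(repo_root: Path, dataset: str) -> Path:
    """Path of the domain rule document that must exist before designing a task."""
    return repo_root / "docs" / "task-rules" / f"{derive_domain(dataset)}.md"


def existing_tasks(repo_root: Path, dataset: str) -> list[str]:
    """Names of task directories already present in the dataset (empty if none)."""
    dataset_dir = repo_root / "datasets" / dataset
    if not dataset_dir.is_dir():
        return []
    return sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())
```

A caller would stop and report the gap when `rule_doc(...).is_file()` is false, and compare a proposed task name against `existing_tasks(...)` before writing the brief.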
Produce a task design brief.
Include these fields:
- kubeply/<dataset-name>
- datasets/<dataset-name>/<task-name>
- kubeply/<task-name>
- easy, medium, or hard
- canary

Every published task must include the same canary string in two places:
- instruction.md
- [metadata].canary in task.toml

Generate the canary with:
python3 -c 'import uuid; print(f"<infra-bench-canary: {uuid.uuid4()}>")'
Do not reuse canaries across tasks.
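
As a rough sketch of how the two-place rule could be checked, the snippet below generates a canary in the same format and compares the canaries found in the two files' text. The function names are hypothetical, and the check is a simplification: it scans for the canary pattern with a regular expression and does not parse task.toml to confirm the value sits under [metadata].canary.

```python
import re
import uuid

# Matches the canary format produced by the one-liner: a lowercase UUID in angle brackets.
CANARY_RE = re.compile(
    r"<infra-bench-canary: [0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}>"
)


def new_canary() -> str:
    """Generate a fresh canary string; call once per task and never reuse it."""
    return f"<infra-bench-canary: {uuid.uuid4()}>"


def canaries_match(instruction_md: str, task_toml: str) -> bool:
    """True when both texts contain exactly one canary and it is the same one."""
    in_instruction = set(CANARY_RE.findall(instruction_md))
    in_task_toml = set(CANARY_RE.findall(task_toml))
    return len(in_instruction) == 1 and in_instruction == in_task_toml
```

Passing the contents of instruction.md and task.toml to `canaries_match` gives a quick consistency check before the task is published.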
For a designed task, list the checks that must pass once implemented:
./scripts/validate-structure.sh
uvx --from harbor harbor sync datasets/<dataset-name>
uvx --from harbor harbor run -p datasets/<dataset-name>/<task-name> -a oracle
Only include the oracle run when a task implementation exists.
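
One way to assemble that list for a brief is sketched below. The function name and the `implemented` flag are hypothetical; the command strings are the ones listed above with the placeholders filled in.

```python
from __future__ import annotations


def planned_checks(dataset: str, task: str, implemented: bool) -> list[str]:
    """Return the validation commands to list in a task design brief."""
    checks = [
        "./scripts/validate-structure.sh",
        f"uvx --from harbor harbor sync datasets/{dataset}",
    ]
    # The oracle run only makes sense once the task files exist.
    if implemented:
        checks.append(
            f"uvx --from harbor harbor run -p datasets/{dataset}/{task} -a oracle"
        )
    return checks
```

For a design-only brief, call it with `implemented=False` so the oracle run is left out.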