Name: Prompt Tuning
Author: tommoseley

# Run WP baseline with active prompt version
cd ~/dev/TheCombine && python3 ops/scripts/wp_baseline_runner.py

# Run WP baseline with specific prompt version (A/B test)
cd ~/dev/TheCombine && python3 ops/scripts/wp_baseline_runner.py 1.1.1

# Results are saved to:
#   docs/audits/wp-baseline-v1.0.json

python3 ops/scripts/wp_baseline_runner.py

combine-config/prompts/tasks/{artifact_type}/releases/{new_version}/task.prompt.txt

python3 ops/scripts/wp_baseline_runner.py {new_version}

Activate in combine-config/_active/active_releases.json (tasks section)

Sync package-local prompt copy:

combine-config/document_types/{type}/releases/{version}/prompts/task.prompt.txt

Run registry integrity tests: python3 -m pytest tests/tier1/config/test_registry_integrity.py -v
Commit with evidence
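
The sync step above could be scripted rather than done by hand. This is a minimal sketch, not the project's own tooling: the function name `sync_package_local_prompt` is hypothetical, and it assumes the package-local file is meant to be a byte-for-byte copy of the global prompt. The caller supplies both paths, since the source does not say how the package release version relates to the prompt version.

```python
import shutil
from pathlib import Path


def sync_package_local_prompt(global_prompt: Path, package_prompt: Path) -> bool:
    """Copy the global task prompt over the package-local copy.

    Returns True when the two files are byte-identical afterwards.
    Assumption: the package-local prompt should mirror the global one exactly.
    """
    # Create the package release directory if it does not exist yet.
    package_prompt.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(global_prompt, package_prompt)
    return global_prompt.read_bytes() == package_prompt.read_bytes()
```

Running the registry integrity tests afterwards remains the authoritative check; the boolean here only confirms that the copy itself succeeded.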

WP Evaluator

Check ID	Type	Description
governance_pins_populated	structural	ta_version_id and policy_refs present
required_sections_present	structural	wp_id, title, rationale, governance_pins, scope
policy_floor_present	structural	POL-ADR-EXEC-001 in policy_refs
contradiction_disclosure_present	structural	contradiction_notes field present with content
ws_index_present	structural	ws_index exists (advisory only)
semantic_rationale	semantic	Rationale is meaningful (≥15 chars, not placeholder)
semantic_scope	semantic	Scope items are meaningful (≥20 chars, not placeholder)
semantic_definition_of_done	semantic	DoD items are meaningful (≥20 chars, not placeholder)
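
The semantic checks in the table above combine a minimum length with a placeholder test. The sketch below shows one way such a rule could look. It is an illustration only: the placeholder patterns, the function names, and the decision to fail an empty scope list are all assumptions, since the source gives only the character thresholds.

```python
import re

# Hypothetical placeholder patterns; the real classifier's list is not shown in the source.
PLACEHOLDER_RE = re.compile(r"^(tbd|todo|n/a|placeholder)\b|lorem ipsum", re.IGNORECASE)


def is_meaningful(text: str, min_chars: int) -> bool:
    """True if text meets the length floor and does not look like a placeholder."""
    cleaned = text.strip()
    if len(cleaned) < min_chars:
        return False
    return PLACEHOLDER_RE.search(cleaned) is None


def check_semantic_rationale(rationale: str) -> bool:
    """semantic_rationale: at least 15 characters, not a placeholder."""
    return is_meaningful(rationale, 15)


def check_semantic_scope(scope_items: list[str]) -> bool:
    """semantic_scope: every item at least 20 characters, not a placeholder.

    Assumption: an empty scope list counts as a failure.
    """
    return bool(scope_items) and all(is_meaningful(item, 20) for item in scope_items)
```

For example, `check_semantic_rationale("TBD: fill in the rationale later")` fails despite its length, because the text starts with a placeholder marker.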

WS Evaluator

Check ID	Type	Description
governance_pins_populated	structural	ta_version_id and policy_refs present
required_sections_present	structural	All 7 required WS sections present
tests_before_implementation	structural	Test steps precede implementation steps
step_grounding_heuristic	advisory	Steps reference upstream artifacts
contradiction_disclosure_present	structural	contradiction_notes present with content
semantic_objective	semantic	Objective is meaningful (≥15 chars, not placeholder)
semantic_scope	semantic	Scope items are meaningful (≥20 chars, not placeholder)
semantic_procedure	semantic	Procedure steps are meaningful (≥30 chars, not placeholder)
semantic_verification_criteria	semantic	Criteria are meaningful (≥20 chars, not placeholder)

Artifact	Active Release Config	Global Prompt	Package-Local Prompt
WP	`combine-config/_active/active_releases.json` → tasks.work_package	`combine-config/prompts/tasks/work_package/releases/{v}/task.prompt.txt`	`combine-config/document_types/work_package/releases/{v}/prompts/task.prompt.txt`
WS	`combine-config/_active/active_releases.json` → tasks.work_statement	`combine-config/prompts/tasks/work_statement/releases/{v}/task.prompt.txt`	`combine-config/document_types/work_statement/releases/{v}/prompts/task.prompt.txt`
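
The path templates in the table above follow a regular pattern, so they can be built from an artifact type and a version. This is a small sketch using `pathlib`; the helper names are hypothetical, and the paths are relative to the repository root.

```python
from pathlib import Path

CONFIG_ROOT = Path("combine-config")

# Shared by all artifact types; the per-artifact entry lives under its "tasks" section.
ACTIVE_RELEASES = CONFIG_ROOT / "_active" / "active_releases.json"


def global_prompt_path(artifact_type: str, version: str) -> Path:
    """Global prompt for an artifact type such as 'work_package' or 'work_statement'."""
    return (CONFIG_ROOT / "prompts" / "tasks" / artifact_type
            / "releases" / version / "task.prompt.txt")


def package_local_prompt_path(artifact_type: str, version: str) -> Path:
    """Package-local prompt copy under document_types/."""
    return (CONFIG_ROOT / "document_types" / artifact_type
            / "releases" / version / "prompts" / "task.prompt.txt")
```

Keeping both builders side by side makes it harder to update the global prompt and forget the package-local copy.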

Version	Defect Targeted	Result	Commit
WS v1.1.0	governance_pins empty (7/7 fail)	7→0 failures	63f2b92
WP v1.1.0	governance_pins empty (2/2 fail)	2→0 failures	63f2b92
WP v1.1.1	semantic_scope WEAK (1/5 fail)	1→0 failures	c490ae3

Check	Must Confirm
Target defect	Reduced or eliminated
All other checks	No regressions (fail count same or lower)
New critical defects	None introduced
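
The three conditions in the table above can be checked mechanically once failure counts per check ID are available for the baseline and the replay. The sketch below assumes the caller has already extracted those counts into plain dictionaries; the layout of the baseline JSON file is not shown in the source, so no parsing is attempted, and the function name is hypothetical. It also treats any check that fails in the replay but is absent from the baseline as a new defect, which is a simplification of "new critical defects".

```python
def verify_replay(baseline: dict[str, int], replay: dict[str, int], target: str) -> list[str]:
    """Compare failure counts per check ID; an empty result means the replay passes."""
    problems = []

    # Target defect must be reduced or eliminated.
    if replay.get(target, 0) >= baseline.get(target, 0):
        problems.append(f"target {target} not reduced")

    # All other checks: fail count must be the same or lower.
    for check, before in baseline.items():
        if check != target and replay.get(check, 0) > before:
            problems.append(f"regression in {check}")

    # Checks that fail only in the replay count as newly introduced defects.
    for check, after in replay.items():
        if check not in baseline and after > 0:
            problems.append(f"new defect {check}")

    return problems
```

An empty list supports promoting the new prompt version; any entry is a reason to revert.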

Prompt Tuning

Prompt Tuning Skill

When to Use

When NOT to Use

Quick Commands

Process Steps

Step 1 — Baseline

Step 2 — Select One Defect

Step 3 — Draft Micro-WS

Step 4 — Modify Prompt

Step 5 — Replay

Step 6 — Verify

Step 7 — Promote or Revert

Evaluators

WP Evaluator

WS Evaluator

Shared Classifier

Prompt Locations

Evidence Trail

Validation History