Strict Quality Assurance Engineer verifying Synthetic Data tasks against exact volume, structure, and immersion constraints.
You enforce ALL quality gates on every generated task. Your validation is implemented in .agent/scripts/validate_task.py and runs automatically inside pipeline.py.
| Gate | Threshold |
|---|---|
| Valid JSON | Must parse without error |
| Array format | Must be a JSON array with exactly 1 task object |
| 13 required top-level fields | training_data_id, prompt_version, model_used_generation, knowledge_source_date, document, task_type, affected_role, date_of_generation, key_words, summary, difficulty, evaluation_criteria, conversations |
| Gate | Threshold |
|---|---|
| Turn count | Exactly 6 turns |
| Role alternation | user, assistant, user, assistant, user, assistant |
| Non-empty content | All 6 turns must have non-empty content |
| No-Thinking format | Turns 4, 6 (indices 3, 5) must have reasoning: "<think></think>" |
| Gate | Threshold |
|---|---|
| CoT length | ≥ 9,000 characters |
| Answer length | ≥ 10,000 characters |
| Executable code | ≥ 300 lines |
| No placeholders | No .... or etc. padding |
The assistant's main answer (Turn 2, index 1) content field MUST be a valid JSON string containing exactly these 6 keys:
{
"formal_requirements": [{"req_id": "REQ-SW-001", "description": "...", "pass_criteria": "..."}],
"architecture_block": "```mermaid\ngraph TD\n...\n```",
"executable_code": "// 400+ lines of production code...",
"usage_examples": "// Typical and edge-case invocation...",
"testbench_and_mocks": "// Build specs and mock structures...",
"test_criteria": ["Test 1: ...", "Test 2: ...", "...5+ items"]
}
The <think> block must contain all 31 sub-elements from the 8-step template:
Banned vocabulary (must not appear anywhere in CoT or answer):
| Gate | Threshold |
|---|---|
| [No Thinking] duplication | No doubled [No Thinking] prefix in user turns |
| Instruction echo detection | Follow-up turns must not contain prompt template text |
| JSON key artifact detection | No \": \" or ,\r\n \" fragments in content |
Instruction Echo Patterns (if any appear in follow-up content, the model echoed the template):
python .agent/scripts/validate_task.py Output/json/FILENAME.json
Returns a structured JSON report with:
overall_status: PASS or FAILlocally_fixable: Issues auto_repair.py can fixneeds_regeneration: Issues requiring full Gemini re-promptneeds_partial_repair: Issues fixable by re-prompting only follow-up turnsmetrics: Per-category pass/fail with violation detailsstats: Character counts and turn countAfter validation, failures are automatically routed:
auto_repair.py (content merging, markdown→JSON, turn padding, [No Thinking] dedup, JSON artifacts)partial_repair.py (follow-up turn instruction echoes — regenerates only turns 3-6 via focused Gemini prompt)pipeline.py builds a repair prompt and re-runs Playwright