Structured visual QA verdict for screenshot-to-reference comparisons
<Use_When>
<Output_Contract> Return JSON only with this exact shape:
{
"score": 0,
"verdict": "revise",
"category_match": false,
"differences": ["..."],
"suggestions": ["..."],
"reasoning": "short explanation"
}
Rules:
score: integer 0-100verdict: short status (pass, revise, or fail)category_match: true when the generated screenshot matches the intended UI category/styledifferences[]: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)suggestions[]: actionable next edits tied to the differencesreasoning: 1-2 sentence summary<Threshold_And_Loop>
score < 90, continue editing and rerun $visual-verdict before any further code edits in the next iteration..omx/state/{scope}/ralph-progress.json with both:
score, threshold pass/fail)reasoning, suggestions, next_actions)
</Threshold_And_Loop><Debug_Visualization> When mismatch diagnosis is hard:
$visual-verdict as the authoritative decision.differences[] and suggestions[] updates.
</Debug_Visualization>