Parallel adversarial review protocol that launches two independent blind judge sub-agents simultaneously to review the same target, synthesizes their findings, applies fixes, and re-judges until both pass or escalates after 2 iterations. Trigger: When user says "judgment day", "judgment-day", "review adversarial", "dual review", "doble review", "juzgar", "que lo juzguen".
Follow the Skill Resolver Protocol (_shared/skill-resolver.md) before launching ANY sub-agent:
- mem_search(query: "skill-registry", project: "{project}") → fallback to `.atl/skill-registry.md` from the project root → skip if none.
- Match skills to the target by file extension and task context (e.g., `.go` → go-testing; `.tsx` → react-19, typescript).
- Build a `## Project Standards (auto-resolved)` block with the matching compact rules.

This ensures judges review against project-specific standards, not just generic best practices.
If no registry exists: warn the user ("No skill registry found — judges will review without project-specific standards. Run skill-registry to fix this.") and proceed with generic review only.
Launch both judges via delegate (async, parallel — never sequential). The orchestrator (NOT a sub-agent) compares results after both delegation_read calls return:
- **Confirmed** → found by BOTH agents → high confidence, fix immediately
- **Suspect A** → found ONLY by Judge A → needs triage
- **Suspect B** → found ONLY by Judge B → needs triage
- **Contradiction** → agents DISAGREE on the same finding → flag for manual decision
Present findings as a structured verdict table (see Output Format).
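The synthesis step above can be sketched as a pure merge of the two judges' reports. A minimal Python sketch, assuming each judge's findings arrive as a dict mapping a finding key (e.g. `"auth.go:42 missing nil check"`) to a severity string; the function name and data shape are illustrative, not part of the protocol:

```python
def synthesize(findings_a, findings_b):
    """Merge two blind judges' findings into a verdict.

    Keys reported by both judges with the same severity are confirmed;
    keys reported by only one judge are suspects needing triage; keys
    reported by both with different severities are contradictions that
    need a manual decision.
    """
    keys_a, keys_b = set(findings_a), set(findings_b)
    verdict = {"confirmed": {}, "suspect_a": {}, "suspect_b": {}, "contradiction": {}}
    for key in keys_a & keys_b:
        if findings_a[key] == findings_b[key]:
            verdict["confirmed"][key] = findings_a[key]
        else:
            verdict["contradiction"][key] = (findings_a[key], findings_b[key])
    for key in keys_a - keys_b:
        verdict["suspect_a"][key] = findings_a[key]
    for key in keys_b - keys_a:
        verdict["suspect_b"][key] = findings_b[key]
    return verdict
```

Matching on an exact finding key is a simplification; in practice the orchestrator must judge whether two differently-worded findings describe the same issue.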
Judges MUST classify every WARNING into one of two sub-types:
- **WARNING (real)** → Causes a bug, data loss, security hole, or incorrect behavior in a realistic production scenario. Fix required.
- **WARNING (theoretical)** → Requires a contrived scenario, corrupted input, or conditions that cannot arise through normal usage. Report but do NOT block.
How to classify: ask "Can a normal user, using the tool as intended, trigger this?" If YES → real. If it requires a malicious manifest, renamed home dir, two clicks in <1ms, or Windows volume root edge case → theoretical.
Theoretical warnings are reported as INFO in the verdict table. They are NOT fixed, do NOT trigger re-judgment, and do NOT count toward the convergence threshold. The orchestrator includes them in the final report for awareness.
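The demotion rule above amounts to splitting the confirmed findings into an actionable set and an INFO set. A minimal sketch, assuming the same finding-key → severity dict shape used elsewhere (the function name is illustrative):

```python
def triage_warnings(findings):
    """Split findings into actionable ones and theoretical/INFO ones.

    Theoretical warnings are surfaced in the final report but are never
    fixed and never counted toward re-judgment or convergence.
    """
    actionable, info = {}, {}
    for key, severity in findings.items():
        bucket = info if severity == "WARNING (theoretical)" else actionable
        bucket[key] = severity
    return actionable, info
```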
Round 1: Present the verdict table to the user. ASK: "These are the confirmed issues. Want me to fix them?" Only fix after user confirms. Then re-judge with full scope.
Round 2+: Only re-judge if confirmed CRITICALs remain. For anything else, apply the APPROVED criteria:
APPROVED criteria after Round 1: 0 confirmed CRITICALs + 0 confirmed real WARNINGs = APPROVED. Theoretical warnings and suggestions may remain.
This prevents the diminishing-returns cycle where each fix round introduces minor artifacts that trigger another round of nit-picking.
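The convergence check itself is mechanical. A minimal sketch over the confirmed-findings dict (illustrative name, not part of the protocol):

```python
def is_approved(confirmed):
    """APPROVED when no confirmed CRITICALs and no confirmed real
    WARNINGs remain; theoretical warnings and suggestions may persist."""
    blocking = ("CRITICAL", "WARNING (real)")
    return not any(severity in blocking for severity in confirmed.values())
```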
User asks for "judgment day"
│
├── Target is specific files/feature/component?
│ ├── YES → continue
│ └── NO → ask user to specify scope before proceeding
│
▼
Resolve skills (Pattern 0): read registry → match by code + task context → build Project Standards block
▼
Launch Judge A + Judge B in parallel (delegate, async) — with Project Standards injected
▼
Wait for both to complete (delegation_read both)
▼
Synthesize verdict
│
├── No issues found?
│ └── JUDGMENT: APPROVED ✅ (stop here)
│
├── Issues found (confirmed, suspect, or contradictions)?
│ └── Present verdict table to user
│ ▼
│        ASK: "Fix the confirmed issues?"
│ ▼
│ ├── User says YES → Delegate Fix Agent with confirmed issues list
│ ├── User says NO → JUDGMENT: ESCALATED (user chose not to fix)
│ └── User gives specific feedback → adjust fix list accordingly
│ ▼
│ Wait for Fix Agent to complete
│ ▼
│ Re-launch Judge A + Judge B in parallel (Round 2)
│ ▼
│ Synthesize verdict
│ │
│ ├── Clean → JUDGMENT: APPROVED ✅
│ │
│ └── Still issues → Delegate Fix Agent again (Round 3 / iteration 2)
│ ▼
│ Re-launch Judge A + Judge B in parallel (Round 3)
│ ▼
│ Synthesize verdict
│ │
│ ├── Clean → JUDGMENT: APPROVED ✅
│ └── Still issues → ASK USER: "Issues remain after 2 iterations. Continue iterating?"
│
├── User says YES → repeat fix + judge cycle (no limit)
└── User says NO → JUDGMENT: ESCALATED ⚠️ (report to user)
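The decision tree above condenses into a loop. A minimal Python sketch under stated assumptions: `run_judges`, `run_fixer`, and `ask_user` are hypothetical stand-ins for the delegate()/delegation_read() tool calls and the user prompt, and re-judging always runs both judges (the real protocol also checks the Round 2+ CRITICAL-only rule):

```python
def judgment_day(run_judges, run_fixer, ask_user, max_iterations=2):
    """Judge in parallel, fix confirmed issues, re-judge until clean.

    run_judges() returns the confirmed findings for the current code;
    run_fixer(keys) applies fixes; ask_user(question) returns True/False.
    """
    iteration = 0
    while True:
        confirmed = run_judges()
        blocking = {k: sev for k, sev in confirmed.items()
                    if sev in ("CRITICAL", "WARNING (real)")}
        if not blocking:
            return "APPROVED"
        if iteration >= max_iterations:
            # Escalation gate: after 2 fix iterations, the user decides.
            if not ask_user("Issues remain after 2 iterations. Continue iterating?"):
                return "ESCALATED"
        elif iteration == 0:
            # Round 1 gate: never fix without user confirmation.
            if not ask_user("Fix confirmed issues?"):
                return "ESCALATED"
        run_fixer(sorted(blocking))
        iteration += 1
```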
## Judge Prompt Template (identical for Judge A and Judge B)

You are an adversarial code reviewer. Your ONLY job is to find problems.
## Target
{describe target: files, feature, architecture, component}
{if compact rules were resolved in Pattern 0, inject the following block — otherwise OMIT this entire section}
## Project Standards (auto-resolved)
{paste matching compact rules blocks from the skill registry}
## Review Criteria
- Correctness: Does the code do what it claims? Are there logical errors?
- Edge cases: What inputs or states aren't handled?
- Error handling: Are errors caught, propagated, and logged properly?
- Performance: Any N+1 queries, inefficient loops, unnecessary allocations?
- Security: Any injection risks, exposed secrets, improper auth checks?
- Naming & conventions: Does it follow the project's established patterns AND the Project Standards above?
{if user provided custom criteria, add here}
## Return Format
Return a structured list of findings ONLY. No praise, no approval.
Each finding:
- Severity: CRITICAL | WARNING (real) | WARNING (theoretical) | SUGGESTION
- File: path/to/file.ext (line N if applicable)
- Description: What is wrong and why it matters
- Suggested fix: one-line description of the fix (not code, just intent)
**WARNING classification rule**: Ask "Can a normal user, using the tool as intended, trigger this?"
- YES → `WARNING (real)` — e.g., silent error on disk full, data corruption on normal input
- NO → `WARNING (theoretical)` — e.g., requires malicious manifest, renamed home dir, race condition in <1ms, OS-specific edge case that doesn't apply to the project's target platforms
Always include at the end: **Skill Resolution**: {injected|fallback-registry|fallback-path|none} — {details}
If you find NO issues, return:
VERDICT: CLEAN — No issues found.
## Instructions
Be thorough and adversarial. Assume the code has bugs until proven otherwise.
Your job is to find problems, NOT to approve. Do not summarize. Do not praise.
## Fix Agent Prompt Template

You are a surgical fix agent. You apply ONLY the confirmed issues listed below.
## Confirmed Issues to Fix
{paste the confirmed findings table from the verdict synthesis}
{if compact rules were resolved in Pattern 0, inject the following block — otherwise OMIT this entire section}
## Project Standards (auto-resolved)
{paste matching compact rules blocks from the skill registry}
## Context
- Original review criteria: {paste same criteria used for judges}
- Target: {same target description}
## Instructions
- Fix ONLY the confirmed issues listed above
- Do NOT refactor beyond what is strictly needed to fix each issue
- Do NOT change code that was not flagged
- **Scope rule**: If you fix a pattern in one file (e.g., add error logging for a silent discard), search for the SAME pattern in ALL other files touched by this change and fix them ALL. Inconsistent fixes across files are the #1 cause of unnecessary re-judge rounds.
- After each fix, note: file changed, line changed, what was done
Return a summary:
## Fixes Applied
- [file:line] — {what was fixed}
**Skill Resolution**: {injected|fallback-registry|fallback-path|none} — {details}
## Judgment Day — {target}
### Round {N} — Verdict
| Finding | Judge A | Judge B | Severity | Status |
|---------|---------|---------|----------|--------|
| Missing null check in auth.go:42 | ✅ | ✅ | CRITICAL | Confirmed |
| Race condition in worker.go:88 | ✅ | ❌ | WARNING (real) | Suspect (A only) |
| Windows volume root edge case | ❌ | ✅ | WARNING (theoretical) | INFO — reported |
| Naming mismatch in handler.go:15 | ❌ | ✅ | SUGGESTION | Suspect (B only) |
| Error swallowed in db.go:201 | ✅ | ✅ | WARNING (real) | Confirmed |
**Confirmed issues**: 1 CRITICAL, 1 WARNING (real)
**Suspect issues**: 1 WARNING (real), 1 SUGGESTION
**Contradictions**: none
### Fixes Applied (Round {N})
- `auth.go:42` — Added nil check before dereferencing user pointer
- `db.go:201` — Propagated error instead of silently returning nil
### Round {N+1} — Re-judgment
- Judge A: PASS ✅ — No issues found
- Judge B: PASS ✅ — No issues found
---
### JUDGMENT: APPROVED ✅
Both judges pass clean. The target is cleared for merge.
## Judgment Day — {target}
### JUDGMENT: ESCALATED ⚠️
User chose to stop after {N} fix iterations. Issues remain.
Manual review required before proceeding.
### Remaining Issues
| Finding | Judge A | Judge B | Severity |
|---------|---------|---------|----------|
| {description} | ✅ | ✅ | CRITICAL |
### History
- Round 1: {N} confirmed issues found
- Fix 1: applied {list}
- Round 2: {N} issues remain
- Fix 2: applied {list}
- Round 3: {N} issues remain → escalated
Recommend: human review of the remaining issues above before re-running judgment day.
After every delegation that returns a result, check the **Skill Resolution** field in each judge/fix-agent response:
- `injected` → skills were passed correctly ✅
- `fallback-registry`, `fallback-path`, or `none` → skill cache was lost (likely compaction). Re-read the registry immediately and inject compact rules in all subsequent delegations.

This is a self-correction mechanism. Do NOT ignore fallback reports.
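Parsing the field out of an agent's report can be sketched as a small helper; a minimal example, assuming the agent appends the field in the exact `**Skill Resolution**: {mode} — {details}` shape shown above (the function name is illustrative):

```python
import re

# Modes the protocol defines; anything else means the field is malformed.
_PATTERN = re.compile(
    r"\*\*Skill Resolution\*\*:\s*"
    r"(injected|fallback-registry|fallback-path|none)\s*[—-]\s*(.*)"
)

def parse_skill_resolution(agent_output):
    """Return (mode, details) from a judge/fix-agent report,
    or (None, None) when the field is missing."""
    match = _PATTERN.search(agent_output)
    return (match.group(1), match.group(2).strip()) if match else (None, None)
```

A missing field should be treated the same as a fallback report: re-read the registry before the next delegation.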
These rules cannot be skipped, overridden, or deprioritized under any circumstances:
- Do NOT declare JUDGMENT: APPROVED until: Round 1 judges return CLEAN, OR Round 2 judges confirm 0 CRITICALs + 0 confirmed real WARNINGs (theoretical warnings and suggestions may remain).
- Do NOT run git push, git commit, or any code-modifying action after fixes until re-judgment completes.
- Before pushing, committing, summarizing, or telling the user "done", run through the verification checklist.
If ANY answer is "no" → you skipped a step. Go back and complete it before proceeding.
Launch judges with delegate (async) so they run in parallel.

# No CLI commands — this is a pure orchestration protocol.
# Execution happens via delegate() and delegation_read() tool calls.