Full adversarial review at top-venue standard (RSS/CoRL/IJRR/T-RO). Graduated pressure, six attack vectors from review_guideline §3.1-3.6, mechanical verdict via metrics.verdict. Use to review, critique, audit a paper, or self-review a draft.
Invoked for full adversarial review at the standard of top robotics venues
(RSS, CoRL, IJRR, T-RO, ICRA, IROS). This is the largest and most
judgment-heavy skill in the system. It composes three sub-skills via the
Task tool:
- `concurrent-work-check` (for the §3.1 scoop detection)
- `formalization-check` (for the §3.2 formalization attack)
- `experiment-audit` (for the §3.5 validation attack)

Maps to the VALIDATE stage of the research state machine (for self-review) or to the review-agent workflow (for external paper review).
Average reviewers score dimensions independently and average. Top reviewers trace the logical chain and find where it breaks:
SIGNIFICANCE → FORMALIZATION → CHALLENGE → APPROACH → VALIDATION
One broken link is a structural flaw that no score-averaging can compensate for. Search for breaks.
Extract the logical chain. Run the quick fatal-flaw scan per Appendix A.1 of review_guideline.md: stop on the first fatal flaw for first-pass efficiency. If none is found, proceed.
Apply ALL six attack vectors systematically (§3.1-3.6). Invoke sub-skills for the deepest checks. Produce findings with severity classification.
For each previous finding, track: addressed | partially | not | regressed.
Check for NEW weaknesses introduced by the revisions, and run a regression check on severity.
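The four-way tracking plus the severity regression check can be sketched as below. The severity ranking and the shape of the two inputs (finding id mapped to severity, for the previous and current review) are illustrative assumptions, not the repository's data model:

```python
# Hypothetical sketch: classify how each previous finding fared in a revision.
SEVERITY_RANK = {"minor": 0, "serious": 1, "fatal": 2}

def track(prev: dict, curr: dict) -> dict:
    """Map finding id -> 'addressed' | 'partially' | 'not' | 'regressed'."""
    status = {}
    for fid, old_sev in prev.items():
        new_sev = curr.get(fid)
        if new_sev is None:
            status[fid] = "addressed"    # finding no longer applies
        elif SEVERITY_RANK[new_sev] > SEVERITY_RANK[old_sev]:
            status[fid] = "regressed"    # revision made it worse
        elif SEVERITY_RANK[new_sev] < SEVERITY_RANK[old_sev]:
            status[fid] = "partially"    # downgraded but still present
        else:
            status[fid] = "not"          # unchanged
    return status
```

For example, `track({"f1": "serious", "f2": "minor"}, {"f2": "serious"})` yields `{"f1": "addressed", "f2": "regressed"}`.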
```bash
PYTHONPATH=src python -c "
from alpha_research.tools.paper_fetch import fetch_and_extract
import json, sys

c = fetch_and_extract(sys.argv[1])
print(json.dumps({
    'title': c.title,
    'abstract': c.abstract,
    'sections': c.sections,
    'extraction_quality': c.extraction_quality.overall,
    'math_preserved': c.extraction_quality.math_preserved,
}, indent=2, default=str))
" "<paper_id>"
```
If the artifact is a local draft, use Read directly.
Extract each link of the chain as ONE sentence:
Run the Appendix A.1 quick scan. If any check fails with severity fatal, return early with that single finding.
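The early-return behavior can be sketched as follows; the check names and callables are placeholders, since Appendix A.1 defines the real list:

```python
# Hypothetical sketch: first-pass scan that returns on the first fatal check.
def quick_scan(checks):
    """checks: iterable of (name, fn), where fn() -> 'fatal' or 'ok'.
    Return the single fatal finding, or None if the first pass is clean."""
    for name, fn in checks:
        if fn() == "fatal":
            return {"severity": "fatal", "check": name}  # stop: one finding
    return None
```

Later checks are never run once a fatal flaw is found, which is exactly the first-pass efficiency the step asks for.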
Invoke `concurrent-work-check` via Task tool.

Invoke `formalization-check` via Task tool. Import its findings as serious weaknesses if the formalization level is `prose_only` at a top venue or `absent` at any venue.

Invoke `experiment-audit` via Task tool (relevant triggers: t12, t15). Fold its findings into the review.
Per review_guideline.md §1.1, before attacking you must construct the STRONGEST version of the paper's argument. Write 3-5 sentences that re-express the paper's position "so clearly, vividly, and fairly that the authors would say 'I wish I'd put it that way'" (RSS).
This is not optional. A review without a substantive steel-man is an unfair review.
For each finding produced in Iteration 2, classify:
- `severity`: "fatal" | "serious" | "minor"
- `attack_vector`: "3.1" | "3.2" | ... | "3.6"
- `what_is_wrong`: str
- `why_it_matters`: str
- `what_would_fix_it`: str
- `falsification`: str — "If the authors showed X, this critique would be invalidated"
- `grounding`: str — specific section/figure/equation reference
- `fixable`: bool — can this be addressed in a revision?
- `maps_to_trigger`: "t2" | "t4" | ... | "t15" | null

Vague findings are PROHIBITED. Every finding must be specific, grounded, and falsifiable.
```bash
PYTHONPATH=src python -c "
from alpha_research.metrics.verdict import compute_verdict
from alpha_research.models.review import Finding, Severity
from alpha_research.models.blackboard import Venue
import json, sys

findings_json = json.loads(sys.argv[1])
findings = [Finding(**f) for f in findings_json]
venue = Venue[sys.argv[2]]
significance_score = int(sys.argv[3])
verdict = compute_verdict(findings, venue=venue, significance_score=significance_score)
print(json.dumps({'verdict': verdict.value if hasattr(verdict, 'value') else str(verdict)}))
" '<findings_json>' RSS 3
```
The verdict is computed per review_plan.md §1.9:
DO NOT form a gestalt judgment. Use the mechanical output.
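To illustrate "mechanical, not gestalt": a rule-based verdict looks like the sketch below. The thresholds here are invented for illustration only; the real rules live in review_plan.md §1.9 and `compute_verdict`.

```python
# Hypothetical rule set, NOT the real compute_verdict: thresholds are
# illustrative. The point is that the verdict is a pure function of the
# findings, with no room for gestalt adjustment.
def verdict_sketch(n_fatal: int, n_serious: int, significance_score: int) -> str:
    if n_fatal > 0:
        return "reject"          # any fatal finding is decisive
    if n_serious >= 3:
        return "weak_reject"     # too many serious flaws to fix in rebuttal
    if n_serious >= 1:
        # a fixable serious flaw is weighed against significance
        return "weak_reject" if significance_score < 4 else "weak_accept"
    return "accept"
```

Given the same findings, two runs produce the same verdict; that reproducibility is what the mechanical output buys.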
```bash
PYTHONPATH=src python -c "
from alpha_research.records.jsonl import append_record
from pathlib import Path
import json, sys

rid = append_record(Path(sys.argv[1]), 'review', json.loads(sys.stdin.read()))
print(rid)
" "<project_dir>" <<< '<review_json>'
```
```json
{
  "artifact_id": "arxiv:2501.12345",
  "venue": "RSS",
  "iteration": 2,
  "chain_extraction": {
    "task": "...",
    "problem": "...",
    "challenge": "...",
    "approach": "...",
    "contribution": "...",
    "chain_complete": true,
    "broken_links": []
  },
  "steel_man": "The paper's central insight is that ... This is non-obvious because ... The experimental result on X genuinely demonstrates ...",
  "findings": {
    "fatal": [],
    "serious": [
      {
        "severity": "serious",
        "attack_vector": "3.5",
        "what_is_wrong": "Only 6 trials per condition reported in Table 2",
        "why_it_matters": "Below RSS threshold of 20; variance estimates are unreliable at this sample size",
        "what_would_fix_it": "Rerun with 20+ trials per condition and report 95% CI",
        "falsification": "If authors show CI [.55,.80] after rerunning with n=20, this concern is addressed",
        "grounding": "Table 2, §5.1",
        "fixable": true,
        "maps_to_trigger": null
      }
    ],
    "minor": []
  },
  "verdict": "weak_reject",
  "confidence": 4,
  "questions_for_authors": [
    "Please provide trial counts and CI for Table 2 results.",
    "Did you compare against RT-2 fine-tuned on your task? If not, why not?"
  ],
  "what_would_increase_score": "Address the missing RT-2 baseline AND provide ≥20 trials per condition with CI. Both fixes are feasible in a rebuttal period.",
  "anti_patterns_avoided": ["dimension_averaging", "false_balance", "novelty_fetishism"]
}
```
You CAN assess with high confidence:
You CANNOT assess:
- defer to `formalization-check` and flag

Anti-patterns to avoid (review_guideline.md §5.4):
- guidelines/doctrine/review_guideline.md Part III — attack vectors §3.1-3.6 (primary)
- guidelines/doctrine/review_guideline.md Part IV — venue calibration
- guidelines/doctrine/review_guideline.md §5.4 — anti-patterns
- guidelines/spec/review_plan.md §1 — executable metrics for every finding
- guidelines/spec/review_plan.md §1.9 — verdict computation rules
- guidelines/spec/review_plan.md §3 — graduated pressure protocol
- skills/concurrent-work-check/SKILL.md — sub-skill
- skills/formalization-check/SKILL.md — sub-skill
- skills/experiment-audit/SKILL.md — sub-skill