Use this skill after deep-read-paper, when the user wants a critical audit of a paper — does the evidence actually support the claims, is the motivation strong, are the experiments well-designed, what's missing. Produces 04_critique.md.
This is the skill that does the hard work of journal club: not summarizing the paper, but stress-testing it. Most of what makes a good journal club presentation is the critic's read, not the rehash.
knowledge/{tag}/03_claims.md must exist (from deep-read-paper). If not, run deep-read-paper first.
Read all of:
- knowledge/{tag}/meta.yaml
- knowledge/{tag}/01_summary.md
- knowledge/{tag}/02_deep_notes.md
- knowledge/{tag}/03_claims.md
- knowledge/{tag}/07_open_questions.md

You may also need to re-open the source for specific sections; that's expected.
Audit the paper across six axes. For each, write a section in 04_critique.md with concrete, evidence-cited findings — not vibes. If an axis genuinely has no problems, say so plainly; don't manufacture criticism.
The paper exists because of some claimed gap, failure, or opportunity. Audit whether that gap is real and whether this paper actually addresses it. (If the claimed gap rests on prior work you haven't read, trace it with find-paper-context.) Verdict: strong / moderate / weak / manufactured, with one paragraph explaining why.
Tailor the audit to the paper type: for methods/empirical papers, scrutinize the experimental design and data; for theoretical papers, scrutinize the assumptions and the scope of what is actually proved. For each subpoint where you have a finding, cite the table/figure/section.
This is the centerpiece. Walk through 03_claims.md and, for each claim, ask: does the evidence cited actually support a claim of this scope?
Categorize each claim:
- supported — evidence matches the claim.
- overclaimed — evidence supports a narrower claim than the paper makes. (Most common failure mode.) Rewrite the claim narrower.
- underclaimed — evidence supports a stronger claim than the paper makes. (Rare, and good for the authors.)
- assumed — claim is made but no evidence is provided. May still be true; just note it.
- contradicted — evidence in the paper itself doesn't support the claim, or some other table/figure undercuts it.

For overclaimed/contradicted/assumed cases, write specifically what would have made the claim valid. This is the most useful output of the whole skill.
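As a sketch of what a useful overclaimed entry can look like (the claim, table, and evidence here are hypothetical, not from any real paper):

```markdown
C2 (as written): "Our regularizer improves robustness across modalities."
Status: overclaimed — Table 3 evaluates image benchmarks only.
Narrower claim the evidence supports: "improves robustness on the
three image benchmarks tested."
What would have made it valid: the same comparison on at least one
non-image modality.
```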
If the answer is no for most claims, the paper's empirical claims should be discounted accordingly.

Write 04_critique.md with this structure:
# Critique: {Title}
**Tag:** `{tag}`
**Critiqued on:** {YYYY-MM-DD}
## TL;DR critique
2–4 sentences. The single most important strength and the single most important weakness. This is the line you'd lead a journal club with.
## Axis 1 — Motivation strength
**Verdict:** strong / moderate / weak / manufactured
...
## Axis 2 — Hypothesis quality
...
## Axis 3 — Experimental design / data
...
## Axis 4 — Claims ↔ evidence audit
| Claim | Status | Notes |
|---|---|---|
| C1 | supported | ... |
| C2 | overclaimed | Evidence in Table 3 only covers setting X; claim is about settings X *and* Y. |
| ... | ... | ... |
## Axis 5 — Limitations & failure modes
**Acknowledged:** ...
**Unacknowledged:** ...
## Axis 6 — Reproducibility & rigor
...
## What I'd want to see in a follow-up
Concrete experiment / analysis that would resolve the biggest open question above.
## Most interesting question for journal club discussion
The single thing this paper should provoke an argument about. Not the most damning criticism — the most *interesting* one.
Append any new open questions surfaced during the audit to knowledge/{tag}/07_open_questions.md. Mark which ones could be answered by find-paper-reviews (the community has probably already debated them), which by find-paper-context (resolved by reading a follow-up), and which fall under "needs an experiment".
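For instance, appended entries might look like this (hypothetical questions; match whatever list format 07_open_questions.md already uses):

```markdown
- Does the headline gain survive giving the baseline an equal tuning
  budget? → find-paper-reviews (reviewers have likely already argued this)
- Why does the ablation in Table 4 hurt only the largest model?
  → needs an experiment
```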
Report to the user:
- 04_critique.md
- Whether to run find-paper-reviews and find-paper-context next (almost always yes — the audit will have generated questions only the community can answer)