Deep critical analysis of academic papers with structured methodology. Use this skill whenever the user wants to analyze a specific paper in depth — not just summarize it, but dissect its experimental setup, extract key numbers, evaluate its claims against a research hypothesis, and identify what the paper inadvertently proves. Triggers: 'analyze this paper', 'deep dive into', '分析这篇论文', '论文深度分析', arXiv ID with analysis intent, 'what does this paper really show', '这篇论文说明了什么', 'break down this paper', 'critical reading'. Also triggers when discussing related work and the user asks to look at a specific paper in detail. NOT for: paper discovery/filtering (use scholar-agent), quick summaries, literature surveys across many papers, or adding papers to NotebookLM.
A structured methodology for critical paper analysis. The goal is not summarization — it's understanding what a paper actually demonstrates (which may differ from what it claims), extracting the numbers that matter, and connecting findings to your research framework.
The primary reading channel is NotebookLM — it provides source-grounded answers from Gemini, handles tables/figures reliably, and supports iterative deep questioning. Every paper analyzed should end up in NotebookLM.
If the user specifies a notebook ID (e.g., "in notebook 9ef789c8"):
```
notebooklm use <notebook-id>
notebooklm source list   # check if the paper is already a source
```
If the paper is not yet in NotebookLM, add it:
```
# Prefer arXiv HTML — it preserves equations, tables, and structure
notebooklm source add "https://arxiv.org/html/<ID>"
# If HTML unavailable, fall back to the abstract page
notebooklm source add "https://arxiv.org/abs/<ID>"
```
If no notebook exists for this line of research, create one:
```
notebooklm create "Research Topic Name"
notebooklm source add "https://arxiv.org/html/<ID>"
```
Supplementary sources (only when NotebookLM is insufficient):
- Semantic Scholar metadata: `https://api.semanticscholar.org/graph/v1/paper/ArXiv:<ID>?fields=title,abstract,year,citationCount,authors`
- web-fetcher on arXiv HTML — fallback if NotebookLM source processing fails

How to use NotebookLM during analysis: use `notebooklm ask` throughout all steps — for understanding the paper's narrative (Step 2), extracting numbers (Step 3), and verifying claims. Example questions:
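A few illustrative `notebooklm ask` invocations. The questions are generic templates, not tied to any specific paper; adapt them to the paper at hand:

```
notebooklm ask "What is the paper's central claim, in the authors' own words?"
notebooklm ask "What exact numbers support the main result, and under which conditions?"
notebooklm ask "Which ablations did the authors run, and what did each one change?"
```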
Before applying any external framework, respect the authors' story. Map out the paper's central claims, the evidence offered for each, and the narrative connecting them.
This step matters because jumping straight to criticism without understanding the paper on its own terms leads to shallow analysis. The interesting insights come from the gap between what a paper shows and what it claims to show — and you can only see that gap if you first understand the claims.
For every core metric the paper introduces or relies on, record a plain-language definition, the conditions under which it was measured, and the actual reported values.
Format as a table when there are multiple conditions:
| Condition | Metric A | Metric B | Notes |
|-----------|----------|----------|-------|
| Baseline | X% | Y% | ... |
| Proposed | X'% | Y'% | ... |
The standard for definitions is: a reader should understand what the number means without looking at the original paper.
This is the core output. The structure follows a causal reasoning chain — each section flows from the previous one:
### Paper-Name (arXiv-ID) — "one-line characterization"
**现象**:
- The key experimental finding, with actual numbers
- What's surprising or counterintuitive about it
- Note: lead with what they *saw*, not what they *did*
**实验设置**:Base model | Method | Data | Key metrics — how the above phenomenon was observed
**归因**:Authors' explanation for why the phenomenon occurs.
**解法**(if applicable):Based on the attribution above, what they propose and how it addresses the cause.
The causal chain matters because it exposes logical gaps: if the 解法 doesn't actually follow from the 归因, or if the 归因 doesn't fully explain the 现象, that's where the interesting analysis lives.
If a taxonomy is configured (i.e., references/taxonomy.md exists), add a tag line at the top of each paper's analysis, mapping it to the classification framework:
### Paper-Name (arXiv-ID) — "大白话一句话概括"
> 📍 诊断:[用什么方法发现的] | 归因:[为什么会这样] | 修复:[怎么修的/没修]
The tag line serves two purposes: it makes each paper scannable at a glance, and it hangs the paper into the shared classification framework so papers can be compared along the same dimensions.
If no taxonomy is configured, skip the tag line — Steps 1-4 already produce a complete analysis.
Skip this step if no research framework is configured (i.e., references/hypothesis.md doesn't exist). Steps 1-4 already produce a complete, standalone analysis.
If the user has defined a research framework — their own hypothesis, theoretical lens, or set of claims they're building evidence for — this step connects the paper's findings to that framework. The framework is stored in references/hypothesis.md and is entirely user-defined. It could be a hypothesis, a taxonomy, a set of open questions, or anything else that gives structure to a body of related work.
Mapping: How does this paper's evidence relate to the framework?
Reverse Challenge: For any "fix" the paper proposes, ask:
"Does this fix inadvertently prove the framework's claims?"
The logic: if a paper identifies a problem and proposes a fix, and the fix addresses a symptom predicted by your framework, then the fix itself becomes evidence — even if the authors don't frame it that way. Push hard on whether the paper's narrative holds up, and whether its evidence points somewhere the authors didn't look.
If the user is building a body of related work, note how this paper's findings confirm, contradict, or extend the papers already analyzed.
All output from this skill must follow these rules. The standard is: a reader should understand every sentence without looking up the original paper.
Always use plain language first, then note the paper's original term in parentheses for traceability.
This applies to metric names, method names, and any term the paper coins.
Do NOT de-jargon universally understood ML terms (RL, SFT, GRPO, CoT, attention, loss, etc.) or proper nouns (benchmark names, model names).
Every abbreviation must be expanded on first use with a one-line explanation, e.g., "PRM (process reward model: a model that scores each intermediate reasoning step)".
The paper title's one-line characterization (after the em-dash) must be in plain language, not the paper's own jargon:
Bad (the paper's own jargon):

#### GeoEyes — "Tool Usage Homogenization"

Good (plain language):

#### GeoEyes — "工具调用千篇一律:每题恰好一次 zoom 的机械行为"

The primary output is the causal chain analysis (Step 4), plus tri-dimensional tagging (Step 4.5) when a taxonomy is configured, plus research framework mapping (Step 5) when a hypothesis is configured. This goes into the user's related work document.
For a complete analysis, produce:
- the written analysis, added to the user's related-work document (thinking-with-image.md or equivalent doc)

If the user has a research hypothesis, theoretical lens, or set of claims they're building evidence for, they can create references/hypothesis.md (see references/hypothesis.example.md for the expected format). This file is entirely user-defined — it could be a hypothesis, a set of open questions, or any high-level narrative that organizes related work.
- `references/hypothesis.md` exists → read it before analysis, apply Step 5

references/taxonomy.md defines the multi-dimensional classification framework for the research area. Each paper gets "hung" into this framework via the tri-dimensional tag (Step 4.5). See references/taxonomy.example.md for the expected format.
- `references/taxonomy.md` exists → apply Step 4.5 tagging

This skill depends on the notebooklm skill (notebooklm-py CLI). Every paper analyzed gets added to NotebookLM as a source. The typical notebook organization is one notebook per research topic (e.g., "CoT Faithfulness Survey"), with multiple papers as sources.
If the user has an existing notebook, they'll usually specify a notebook ID or name. If not specified, ask which notebook to use or whether to create a new one.
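When no notebook is specified and the user opts to create one, a typical session looks like the sketch below. The notebook name is an example and the IDs are placeholders:

```
notebooklm create "CoT Faithfulness Survey"
notebooklm use <new-notebook-id>
notebooklm source add "https://arxiv.org/html/<ID>"
notebooklm ask "What is the main experimental finding, with exact numbers?"
```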