Deep-read and explain ML/CV/AI research papers from arxiv or uploaded PDFs. Use this skill whenever the user shares a paper link or PDF and asks about its content — including equations, methods, contributions, comparisons to prior work, or conceptual questions. Also trigger for follow-up questions like "how does equation N work?", "explain this intuitively", or "show me a diagram of this". Always use this skill when a paper URL or identifier (e.g. arxiv:XXXX.XXXXX) appears in the user's message alongside any reading/explaining request.
A skill for reading, deeply understanding, and explaining ML/CV/AI research papers. Handles equations, method comparisons, visual diagrams, and multi-turn conceptual Q&A.
Try fetching in this priority order until you have readable content:

1. `https://arxiv.org/html/{id}v{N}` — full text, equations as MathML, figures described.
2. `https://arxiv.org/pdf/{id}v{N}` — use web_fetch with `web_fetch_pdf_extract_text: true`.
3. `https://arxiv.org/abs/{id}` — falls back to the abstract plus any search results for supplementary detail.
4. Search `"{title}" arxiv site:arxiv.org` and fetch the HTML version from the result URL directly.

If the HTML URL returns a rate-limit error, wait a moment and retry once, then fall back to the abstract plus a web search for details.

Never tell the user you cannot read the paper without exhausting all four options.
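The fallback order above can be sketched as a small helper. web_fetch and web search are environment tools, so this sketch only builds the candidates in priority order; the arxiv id, version, and title used here are placeholders:

```python
# Build the fetch candidates for an arxiv id, in the priority order above.
# Items 1-3 are direct URLs; item 4 (title search) has no fixed URL and is
# represented here only as the search query string to run.

def arxiv_fetch_plan(arxiv_id: str, version: int = 1, title: str = "") -> list[str]:
    v = f"v{version}"
    plan = [
        f"https://arxiv.org/html/{arxiv_id}{v}",  # full text, MathML equations
        f"https://arxiv.org/pdf/{arxiv_id}{v}",   # extract text from the PDF
        f"https://arxiv.org/abs/{arxiv_id}",      # abstract-only fallback
    ]
    if title:
        plan.append(f'"{title}" arxiv site:arxiv.org')  # web-search query
    return plan

plan = arxiv_fetch_plan("2501.12345", version=2, title="Fast Weights")
assert plan[0] == "https://arxiv.org/html/2501.12345v2"
assert len(plan) == 4
```

Each entry is tried in order until one yields readable content, with the single retry on rate-limit errors applied to the HTML URL.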
After fetching, classify what the user is asking:
| Request type | How to handle |
|---|---|
| "Explain the main contribution" | See §3 |
| "Explain equation N" | See §4 |
| "How does X compare to Y?" | See §5 |
| "I don't understand X, explain simply" | See §6 |
| "Show me a diagram" | See §7 |
A single message may combine multiple types — address them in order.
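The classification step can be sketched as simple keyword dispatch. The §3–§7 targets are the document's own; the keyword heuristics are illustrative assumptions, not a specified matching rule:

```python
# Map request phrases to the handling sections from the table above.
# A single message may match several types; results come back in table order.

ROUTES = [
    ("contribution", "§3"),  # "Explain the main contribution"
    ("equation",     "§4"),  # "Explain equation N"
    ("compare",      "§5"),  # "How does X compare to Y?"
    ("simpl",        "§6"),  # "explain simply" / "simpler"
    ("diagram",      "§7"),  # "Show me a diagram"
]

def classify(message: str) -> list[str]:
    msg = message.lower()
    matches = [section for keyword, section in ROUTES if keyword in msg]
    return matches or ["§3"]  # assumed default: treat as a summary request

assert classify("Explain equation 4 and show me a diagram") == ["§4", "§7"]
```

A combined message thus yields multiple handlers, addressed one after the other.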
Structure the summary around these four questions:
Keep the summary under 400 words. Add diagrams (§7) proactively if the method has a non-trivial architecture or data flow.
For each equation:

- Transcribe it in LaTeX (`$...$`). Identify its number and section in the paper.

When asked to compare paper A to paper B (or to prior work):
When the user says they are confused or asks for a simpler explanation:
Always offer or proactively draw a diagram when:
Use show_widget (Visualizer tool). Call read_me with `["diagram", "interactive"]` before the first diagram in a session.
For architecture diagrams (write/read pipelines, encoder-decoder flows): use c-blue for input tokens, c-coral or c-amber for the memory / state component, c-teal for query/virtual tokens, and c-green for outputs.

For method comparison (two or three methods side by side): end with ✓ / ✗ lines summarising the trade-offs.

For intuition diagrams (explaining how attention, fast weights, or associative memory work):
General rules:

- Avoid back-to-back show_widget calls without prose between them.

After the initial explanation, remain in "paper expert" mode for follow-up questions. Common patterns and how to handle them:
| Follow-up | Approach |
|---|---|
| "Does this really use TTT?" | Audit whether slow weights update at inference. Answer honestly. |
| "Why K ≠ V in self-attention?" | Explain role separation: K = discriminative addressing, V = rich content payload. Forced sharing is a conflicting objective. |
| "What happens if the state is saturated?" | Explain finite-capacity interference: outer-product superposition, Hopfield-style recall degradation. |
| "How does this compare to [other paper]?" | Fetch the other paper if not already in context, then build a comparison table (§5). |
| "Can you make a diagram?" | Jump to §7. |
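The "saturated state" row can be made concrete with a toy linear associative memory. This is an illustrative sketch of outer-product superposition and recall interference, not any specific paper's method:

```python
# Toy linear associative memory: the state M accumulates outer products
# value * key^T, and recall for a key k is M @ k. While the stored keys are
# orthonormal, recall is exact; once a new key overlaps an old one
# (capacity exceeded), recalls for the overlapping keys interfere.
import math

def outer(v, k):
    return [[vi * kj for kj in k] for vi in v]

def add(m, p):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(m, p)]

def recall(m, k):
    return [sum(mij * kj for mij, kj in zip(row, k)) for row in m]

d = 3
keys = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # orthonormal basis keys
vals = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]
M = [[0.0] * d for _ in range(d)]
for k, v in zip(keys, vals):
    M = add(M, outer(v, k))

assert recall(M, [1, 0, 0]) == [2, 0, 0]   # exact recall while orthonormal

# Exceed capacity: a 4th key overlapping e1 and e2 corrupts their recalls,
# while e3 (still orthogonal to everything stored) stays clean.
k4 = [1 / math.sqrt(2), 1 / math.sqrt(2), 0]
M = add(M, outer([5, 5, 5], k4))
assert recall(M, [1, 0, 0])[0] != 2        # interference on e1
assert recall(M, [0, 0, 1]) == [0.0, 0.0, 4.0]  # e3 unaffected
```

The same picture underlies Hopfield-style recall degradation: superposed patterns share one finite-capacity state, so overlap between keys shows up directly as crosstalk in the readout.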
Before delivering any explanation, verify: