Paper Reading - Research Paper Summarization

Overview

A structured approach to reading and summarizing scientific research papers. Automatically identifies paper type (empirical/theoretical/survey/systems), selects the appropriate template, screenshots important figures, and embeds them in the summary document.

When to Use

User provides a paper (PDF path, URL, or pasted content) and asks for a summary
User asks to "read", "summarize", or "analyze" a research paper
User wants to understand a paper's contribution quickly
Literature review tasks

Not for: Tutorial papers, textbooks, or non-research documents

Workflow

digraph paper_reading {
    rankdir=TB;
    "Receive paper" -> "Get PDF file";
    "Get PDF file" -> "Read PDF content";
    "Read PDF content" -> "Identify paper type";
    "Identify paper type" -> "Prepare output directory";
    "Prepare output directory" -> "Extract figures (pymupdf4llm)";
    "Extract figures (pymupdf4llm)" -> "Filter & rename images";
    "Filter & rename images" -> "Fill type-specific template";
    "Fill type-specific template" -> "Write markdown file";
}

## Basic Information
[shared section]

## Research Problem
[shared section]
- **Mathematical formulation:** (optional)

## Key Insight
> Distill the paper's core new idea in 2-3 sentences. Not "what was done", but "what insight makes this method work".
> Example: Rather than predicting frame-by-frame, first establish long-term 3D point tracking, then leverage temporal consistency for joint optimization.

## Technical Method

### Overall Framework and Principles
- Overall system architecture description
- Modules/components and their responsibilities
- Signal/data flow direction
- **Why this design?** Advantages over the intuitive/naive approach

### Core Component Details
- Model/algorithm architecture details (layers, dimensions, input/output)
- Training objective and loss function (write key equations)
- Training data source (synthetic/real/mixed, dataset names and scale)
- Key tricks and design decisions
- **Motivation for each design choice:** Why use A instead of B? Does the paper provide justification?

## Experimental Results

### Results (Facts)
- **Experimental setup:** Environment, hardware, hyperparameters
- **Baselines compared:** List specific method names and sources
- **Key results:** Quantitative improvement margins (specific numbers + percentages)
- **Ablation study:** Component contributions (removing X decreases performance by Y%)
- **Surprising findings:** Any counterintuitive results

### Analysis (Interpretation)
- Authors' explanation and attribution of results
- Which scenarios/datasets show best performance? Worst?
- Root cause of performance gains (authors' claims vs actual evidence)

## Critical Analysis

### Strengths
- Specific improvements over prior work (not just "good results")

### Limitations
- **Acknowledged by authors:**
- **My observations:** Issues not mentioned in the paper
  - Do assumptions hold in practice?
  - Are compute/data requirements reasonable?
  - Are evaluation metrics comprehensive?

### Reproducibility Assessment
- Is code open-sourced? Is data available?
- Are key implementation details sufficiently described?

## Summary
[shared section]

Source	Detection	Action
Local PDF	File path ends with `.pdf`	Use directly
arXiv URL	Contains `arxiv.org`	Extract paper ID → download `https://arxiv.org/pdf/XXXX.XXXXX`
Other URL	Default	Try downloading as PDF; if not a PDF, use WebFetch for text
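
The detection rules in the table above can be sketched as a small resolver. This is a minimal illustration rather than part of the skill itself: the function name `resolve_source`, the returned labels, and the ID regex (which only covers new-style arXiv IDs such as `2303.08774`) are assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Assumption: only new-style arXiv IDs (YYMM.NNNNN, optional version suffix).
ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def resolve_source(source: str) -> tuple[str, str]:
    """Map a user-supplied source to (kind, target).

    kind is one of 'local_pdf', 'arxiv', 'other_url' (hypothetical labels).
    """
    s = source.strip()
    is_url = urlparse(s).scheme in ("http", "https")

    # Local PDF: a non-URL path ending in .pdf is used directly.
    if not is_url and s.lower().endswith(".pdf"):
        return ("local_pdf", s)

    # arXiv URL: extract the paper ID and build the PDF download URL.
    if "arxiv.org" in s:
        match = ARXIV_ID.search(s)
        if match:
            return ("arxiv", f"https://arxiv.org/pdf/{match.group(0)}")

    # Default: the caller tries a PDF download, then falls back to text fetch.
    return ("other_url", s)
```

For example, `resolve_source("https://arxiv.org/abs/2303.08774v2")` yields the `arxiv` kind with the target `https://arxiv.org/pdf/2303.08774v2`. Old-style IDs (for example `hep-th/9901001`) fall through to the default branch in this sketch.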

Type	Identification Signals
Empirical	Proposes new method/model, has experimental comparisons, includes baselines
Theoretical	Theorem/proof-driven, math-heavy derivations, few or no experiments
Survey	Many citations (>100), taxonomy/classification, "survey"/"review" keywords
Systems	System design, engineering implementation, benchmarks, deployment experience
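
As a rough illustration of how the signals above might be combined, the sketch below scores each type by keyword counts and breaks ties in favor of Empirical. The keyword lists, the bonus of 5 for more than 100 references, and the name `classify_paper` are all assumptions for the example; in practice the classification is a judgment made after reading the abstract and introduction.

```python
# Hypothetical keyword lists derived from the signal table; tune as needed.
# Empirical is listed first so that ties (including "no signal") default to it.
SIGNALS = {
    "Empirical": ["baseline", "ablation", "experiment", "outperform"],
    "Theoretical": ["theorem", "lemma", "proof", "corollary"],
    "Survey": ["survey", "review", "taxonomy"],
    "Systems": ["deployment", "system design", "throughput", "latency"],
}


def classify_paper(text: str, num_references: int = 0) -> str:
    """Return the paper type with the highest keyword score."""
    lowered = text.lower()
    scores = {
        paper_type: sum(lowered.count(keyword) for keyword in keywords)
        for paper_type, keywords in SIGNALS.items()
    }
    # Assumed weight: a long reference list is a strong survey signal.
    if num_references > 100:
        scores["Survey"] += 5
    # max() returns the first maximal key, so dict order decides ties.
    return max(scores, key=scores.get)
```

A call such as `classify_paper(abstract_and_intro, num_references=len(refs))` gives a first guess that should still be checked against the full signal table.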

Priority	Figure Type	When to Capture
Must	System architecture / overall framework	If available
Must	Main experiment results table/chart	If available
Recommended	Core algorithm flowchart	If available
Recommended	Ablation study charts	If available
Optional	Visualization / qualitative results	If space allows
Optional	Auxiliary illustrations	As needed
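
One way to apply the priority guide is to rank candidate figures by tier and cap the optional ones. The sketch below assumes each candidate has already been labeled with a kind; the kind names, the `max_optional` parameter, and `select_figures` are hypothetical and only illustrate the ordering.

```python
# Tier 0 = Must, 1 = Recommended, 2 = Optional (kind names are assumptions).
PRIORITY = {
    "architecture": 0,
    "main_results": 0,
    "algorithm": 1,
    "ablation": 1,
    "qualitative": 2,
    "auxiliary": 2,
}


def select_figures(candidates: list[tuple[str, str]], max_optional: int = 2) -> list[str]:
    """Return figure names ordered by tier, keeping at most max_optional Optional ones.

    candidates is a list of (name, kind) pairs; unknown kinds count as Optional.
    """
    # sorted() is stable, so figures keep their paper order within a tier.
    ranked = sorted(candidates, key=lambda item: PRIORITY.get(item[1], 2))
    kept: list[str] = []
    optional_used = 0
    for name, kind in ranked:
        if PRIORITY.get(kind, 2) == 2:
            if optional_used >= max_optional:
                continue
            optional_used += 1
        kept.append(name)
    return kept
```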

Shallow writing (prohibited)	Deep writing (required)
"Proposes a new method"	"Addresses bottleneck Y in problem X via mechanism Z"
"Achieves SOTA results"	"Improves X% over method B on dataset A, primarily because of Y"
"Uses a Transformer"	"Uses L-layer Transformer with input dim D, H attention heads, key modification is Z"
"Has some limitations"	"Only validated in scenario X, does not account for distribution shift Y, assumption Z may not hold in practice"

Mistake	Correction
Copying abstract verbatim	Synthesize in your own words, distill the key insight
Missing key assumptions	Explicitly state what the method assumes
Vague architecture description	Include specific dimensions and layer types
Ignoring failure cases	Note where method underperforms and on which datasets
Skipping mathematical notation	Include key LaTeX equations when available
Not screenshotting paper figures	Must capture architecture and main result figures
Rendering entire PDF pages as images	Use pymupdf4llm write_images=True for automatic precise extraction
Misplaced image insertion	Images should be adjacent to corresponding text
Vague critiques	Must name specific limitations (scenario, data, assumptions)
Wrong paper type classification	Read abstract and intro fully before classifying; default to Empirical
Giving up after screenshot failure	Use pymupdf4llm auto-extraction first, fall back to manual pymupdf clip
Writing only "what" without "why"	Every design choice should explain the motivation and justification
Mixing results and conclusions	Separate experimental facts (Results) from author interpretation (Analysis)
Missing related work positioning	Must compare against 2-3 closest prior works
Key Insight too vague or missing	Key Insight must be a specific, actionable new idea
Evaluation missing three perspectives	Separately write authors' conclusion, personal assessment, overall evaluation
Not distinguishing author-acknowledged vs self-discovered limitations	Critical Analysis must separate the two types of limitations
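
The two extraction rows above (automatic extraction first, manual clip as fallback) might look like the following. This is a sketch under stated assumptions: the function names, the default DPI of 200, and the PNG format are choices made for the example, and the bounding box for the fallback must be supplied by the caller in PDF points. It relies on `pymupdf4llm.to_markdown` with `write_images=True` and on PyMuPDF's `Page.get_pixmap` with a `clip` rectangle.

```python
from pathlib import Path


def extract_figures(pdf_path: str, out_dir: str, dpi: int = 200) -> list[Path]:
    """Primary path: let pymupdf4llm write the images it detects into out_dir."""
    import pymupdf4llm  # lazy import; requires the pymupdf4llm package

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # The returned markdown text is ignored here; only the image files matter.
    pymupdf4llm.to_markdown(
        pdf_path,
        write_images=True,
        image_path=str(out),
        image_format="png",
        dpi=dpi,
    )
    # Lists every PNG in out_dir, including any that were already there.
    return sorted(out.glob("*.png"))


def clip_region(pdf_path: str, page_index: int,
                bbox: tuple[float, float, float, float],
                out_path: str, dpi: int = 200) -> Path:
    """Fallback: render one rectangle (x0, y0, x1, y1) of a page to a PNG."""
    import pymupdf  # lazy import; older PyMuPDF releases use `import fitz`

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with pymupdf.open(pdf_path) as doc:
        page = doc[page_index]  # zero-based page index
        pix = page.get_pixmap(clip=pymupdf.Rect(*bbox), dpi=dpi)
        pix.save(str(out))
    return out
```

If `extract_figures` returns no usable image for a Must-priority figure, `clip_region` can be called with a hand-picked rectangle instead of rendering the whole page.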

Step 1: PDF Acquisition

Download Flow

Read Content

Step 2: Paper Type Identification

Step 3: Figure & Table Extraction (pymupdf4llm)

1. Prepare Output Directory

2. Screenshot Priority Guide

3. Automated Extraction with pymupdf4llm

4. Filter & Rename Extracted Images

5. Fallback: Manual pymupdf clip extraction

File Naming

Step 4: Fill Template

After identifying paper type, select the corresponding template

Writing Principles (Critical)

Template A: Empirical Paper

Template B: Theoretical Paper

Template C: Survey Paper

Template D: Systems Paper

Shared Summary Section

Section Writing Guidelines

Basic Information

Research Problem

Key Insight

Technical Method (Empirical)

Experimental Results

Critical Analysis

Summary and Evaluation

Common Mistakes

Language
