Read and analyze arXiv papers. Given an arXiv link or ID, download the paper, extract all figures first, generate structured reading notes in both Chinese and English with convincing figures, and push to GitHub.
Read arXiv papers and generate structured reading notes with figures, then push to GitHub.
This skill runs autonomously from start to finish. DO NOT ask the user "Do you want to proceed?" or any similar confirmation questions.
Execute ALL steps in sequence without pausing:
Never stop to ask for confirmation. Complete the entire workflow automatically.
Use /read-paper followed by an arXiv link or ID.
Accepts any of these formats:
- 2301.07041
- https://arxiv.org/abs/2301.07041
- https://arxiv.org/pdf/2301.07041.pdf

When triggered with an arXiv paper:
Extract the arXiv ID from the input. Examples:
- 2301.07041 → 2301.07041
- https://arxiv.org/abs/2301.07041 → 2301.07041
- https://arxiv.org/pdf/2301.07041.pdf → 2301.07041

Use the arXiv API to get paper metadata (title, authors, last updated date):
curl -s "https://export.arxiv.org/api/query?id_list=ARXIV_ID"
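The ID normalization shown in the examples above could be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper name `extract_arxiv_id` is hypothetical, it only handles new-style IDs such as `2301.07041`, and dropping a trailing version suffix (`v2`) is an assumption.

```python
import re

# Matches new-style arXiv IDs such as 2301.07041, optionally followed by a version (v2).
# Old-style IDs (e.g. hep-th/9901001) are deliberately not handled in this sketch.
_ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def extract_arxiv_id(text: str) -> str:
    """Return the bare arXiv ID from a raw ID, an /abs/ link, or a /pdf/ link."""
    match = _ARXIV_ID.search(text.strip())
    if match is None:
        raise ValueError(f"No arXiv ID found in {text!r}")
    # Assumption: the version suffix is dropped so all input forms map to one ID.
    return match.group(1)


for raw in (
    "2301.07041",
    "https://arxiv.org/abs/2301.07041",
    "https://arxiv.org/pdf/2301.07041.pdf",
):
    print(raw, "->", extract_arxiv_id(raw))
```

The returned value is what replaces `ARXIV_ID` in the API query.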
Extract from the XML response:
- `<title>` - Paper title
- `<author>` - Authors
- `<updated>` - Last edited date (format: YYYY-MM-DD from the full timestamp)

cd ~/projects/claude-skill-read-paper
mkdir -p "paper-YYYY-MM-DD-short-title"
curl -L -o "paper-YYYY-MM-DD-short-title/paper.pdf" "https://arxiv.org/pdf/ARXIV_ID.pdf"
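Reading the metadata fields out of the API response and building the folder name could look roughly like the sketch below. It assumes the response is an Atom feed whose `<entry>` holds the paper's `<title>`, `<author><name>`, and `<updated>` elements. The `SAMPLE` document is a made-up placeholder, not a real API response, and `parse_metadata` and `folder_name` are hypothetical helpers. How the short title is derived is also an assumption; in practice a shorter hand-picked slug may be preferable.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Placeholder data for illustration only; not a real arXiv response.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Query results</title>
  <entry>
    <title>An Example
      Paper Title</title>
    <updated>2023-02-11T10:00:00Z</updated>
    <author><name>First Author</name></author>
    <author><name>Second Author</name></author>
  </entry>
</feed>"""


def parse_metadata(xml_text: str) -> Dict[str, object]:
    """Extract title, authors, and the YYYY-MM-DD update date from the first entry."""
    root = ET.fromstring(xml_text)
    entry = root.find("atom:entry", ATOM)
    if entry is None:
        raise ValueError("No <entry> element in the API response")
    # Titles may be wrapped across lines, so collapse all whitespace runs.
    title = " ".join(entry.findtext("atom:title", default="", namespaces=ATOM).split())
    authors = [
        a.findtext("atom:name", default="", namespaces=ATOM).strip()
        for a in entry.findall("atom:author", ATOM)
    ]
    # Keep only the date part of the full timestamp.
    updated = entry.findtext("atom:updated", default="", namespaces=ATOM)[:10]
    return {"title": title, "authors": authors, "updated": updated}


def folder_name(updated: str, title: str, max_words: int = 4) -> str:
    """Build paper-YYYY-MM-DD-short-title; the word limit is an arbitrary choice."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return f"paper-{updated}-" + "-".join(words[:max_words])


meta = parse_metadata(SAMPLE)
print(folder_name(meta["updated"], meta["title"]))
```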
IMPORTANT: Extract all figures FIRST, before reading the paper. This allows you to review and select the most convincing figures for the notes.
cd "paper-YYYY-MM-DD-short-title"
mkdir -p figures
Method 1: PyMuPDF (Recommended)
python3.8 << 'EOF'
import fitz  # PyMuPDF
doc = fitz.open("paper.pdf")
img_count = 0
for page_num in range(len(doc)):
    page = doc[page_num]
    # Each entry describes one embedded image; its first field is the xref.
    images = page.get_images(full=True)
    for img in images:
        xref = img[0]
        base_image = doc.extract_image(xref)
        image_bytes = base_image["image"]
        image_ext = base_image["ext"]
        # Save with a zero-padded running index, e.g. figures/fig-000.png
        with open(f"figures/fig-{img_count:03d}.{image_ext}", "wb") as f:
            f.write(image_bytes)
        print(f"Extracted: fig-{img_count:03d}.{image_ext} (page {page_num + 1})")
        img_count += 1
print(f"Total: {img_count} figures extracted")
EOF
Method 2: pdfimages (if PyMuPDF unavailable)
pdfimages -png paper.pdf figures/fig
Method 3: pdftoppm (convert pages to images)
pdftoppm -png -r 150 paper.pdf figures/page
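Choosing among the three methods could be automated along the lines below. This is only a sketch: the function names are hypothetical, and treating Method 3 as the last fallback is an assumption based on the order in which the methods are listed.

```python
import importlib.util
import shutil


def choose_extraction_method(has_pymupdf: bool, has_pdfimages: bool, has_pdftoppm: bool) -> str:
    """Pick the first available method, in the order the methods are listed."""
    if has_pymupdf:
        return "pymupdf"    # Method 1 (recommended)
    if has_pdfimages:
        return "pdfimages"  # Method 2
    if has_pdftoppm:
        return "pdftoppm"   # Method 3 (renders whole pages rather than figures)
    raise RuntimeError("No extraction tool found: install PyMuPDF or poppler-utils")


def detect_and_choose() -> str:
    """Probe the local environment and return the method to use."""
    return choose_extraction_method(
        has_pymupdf=importlib.util.find_spec("fitz") is not None,
        has_pdfimages=shutil.which("pdfimages") is not None,
        has_pdftoppm=shutil.which("pdftoppm") is not None,
    )
```

Calling `detect_and_choose()` before extraction returns `"pymupdf"`, `"pdfimages"`, or `"pdftoppm"`, or raises if none of the tools is installed.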
CRITICAL STEP: Before reading the paper, review ALL extracted figures to understand their content:
This step ensures you can select the most convincing and relevant figures when writing notes.
Use the Read tool to read the PDF file. When reading:
After reading the paper and reviewing figures, generate TWO separate markdown files.
Figure Selection Criteria - Include figures that:
Figure Placement:
- `# Insight` section: Overview/architecture diagram
- `# Contribution` section: Method illustrations for each contribution
- `# Experiments` section: Results charts, tables, ablation figures
- `## Limitation` section: Failure cases or statistical figures if relevant

Figure Reference Format:

*Figure X: Caption explaining what the figure shows and why it's important*
Save the following files in the paper folder:
- paper.pdf - Original paper (already downloaded in Step 3)
- notes_zh.md - Chinese notes with figure references
- notes_en.md - English notes with figure references
- figures/ - Directory containing extracted figures

cd ~/projects/claude-skill-read-paper
git add .
git commit -m "Add notes: Paper Title (arXiv:XXXX.XXXXX)"
git push
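If the push step is driven from a script rather than typed by hand, the same three commands could be assembled as in this sketch. The helper names are hypothetical; the commit message follows the pattern shown above, and the repository path is the one used throughout this document.

```python
import os
import subprocess
from typing import List

REPO = os.path.expanduser("~/projects/claude-skill-read-paper")


def commit_message(title: str, arxiv_id: str) -> str:
    """Format the commit message in the pattern used by this skill."""
    return f"Add notes: {title} (arXiv:{arxiv_id})"


def git_commands(title: str, arxiv_id: str) -> List[List[str]]:
    """Return the add/commit/push commands as argument lists (no shell quoting needed)."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", commit_message(title, arxiv_id)],
        ["git", "push"],
    ]


def push_notes(title: str, arxiv_id: str, repo: str = REPO) -> None:
    """Run the commands in the repository; check=True stops at the first failure."""
    for cmd in git_commands(title, arxiv_id):
        subprocess.run(cmd, cwd=repo, check=True)
```

Passing the message as a separate argument avoids quoting problems when a paper title contains quotes or other shell metacharacters.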
Return the GitHub folder URL to the user.
Each markdown file should follow this structure:
# Quick View
**Title**: [Paper title]
**Authors**: [Author list]
**arXiv**: [arXiv ID with link]
**Year**: [Publication year]
# Question
[What research question does this paper address?]
# Task
[What specific task is this paper trying to solve?]
# Challenge
[What technical challenges did previous methods face? Why is this problem hard?]
# Insight
[What is the core insight or key idea that solves the challenge? One sentence high-level thought, NOT specific technical contribution.]

*Figure X: Brief description of the main architecture or approach*
# Contribution
[List each technical contribution with:]
1. **[Contribution Name]**
- **Approach**: [How is it done?]
- **Technical Advantage**: [Why is this better?]

*Figure X: Diagram showing the method*
2. **[Contribution Name]**
- **Approach**: [How is it done?]
- **Technical Advantage**: [Why is this better?]
# Experiments
## Core Contribution Impact (Ablation Studies)
[What is the impact of each core contribution on performance?]

*Figure/Table X: Key experimental results*
## Limitation
[What are the failure cases? On what kind of data does it fail?]

*Figure X: Statistics or failure cases (if applicable)*
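A quick structural check of a generated notes file against the template above might look like this. It is a sketch under one notable assumption: that both `notes_en.md` and `notes_zh.md` keep the template's section headings in English. The helper name `missing_headings` is hypothetical.

```python
from typing import List

# Section headings taken from the template, in order.
REQUIRED_HEADINGS = [
    "# Quick View",
    "# Question",
    "# Task",
    "# Challenge",
    "# Insight",
    "# Contribution",
    "# Experiments",
    "## Core Contribution Impact (Ablation Studies)",
    "## Limitation",
]


def missing_headings(markdown_text: str) -> List[str]:
    """Return the template headings that do not appear as a line of their own."""
    present = {
        line.strip()
        for line in markdown_text.splitlines()
        if line.lstrip().startswith("#")
    }
    return [h for h in REQUIRED_HEADINGS if h not in present]
```

An empty list means every section of the template is present; anything else names the sections still to be written.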
- notes_zh.md (Chinese) and notes_en.md (English)
- paper.pdf in the folder

For paper 2301.07041 (Verifiable Fully Homomorphic Encryption, updated 2023-02-11):
~/projects/claude-skill-read-paper/
├── paper-2023-02-11-verifiable-fhe/
│   ├── paper.pdf          # Original paper
│   ├── notes_zh.md        # Chinese notes (with figures)
│   ├── notes_en.md        # English notes (with figures)
│   └── figures/           # ALL extracted figures
│       ├── fig-000.png    # Review each to understand content
│       ├── fig-001.png
│       └── ...
├── README.md
└── SKILL.md
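Before committing, the paper folder can be checked against the layout above. The following is a minimal sketch with a hypothetical helper name; it only checks that the expected entries exist, not their contents.

```python
from pathlib import Path
from typing import List

REQUIRED_FILES = ("paper.pdf", "notes_zh.md", "notes_en.md")


def missing_outputs(folder) -> List[str]:
    """Return the expected files or directories that are absent from the paper folder."""
    folder = Path(folder)
    missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
    if not (folder / "figures").is_dir():
        missing.append("figures/")
    return missing
```

For the example above, `missing_outputs("paper-2023-02-11-verifiable-fhe")` returns an empty list once all four entries exist.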
When reviewing extracted figures, categorize them:
| Category | Where to Use | Priority |
|---|---|---|
| Architecture/Overview | Insight section | High |
| Method diagrams | Contribution section | High |
| Results tables/charts | Experiments section | High |
| Ablation figures | Ablation subsection | High |
| Comparison charts | Experiments section | Medium |
| Statistics/distributions | Limitation section | Medium |
| Small icons (< 5KB) | Skip | Low |
| Decorative images | Skip | Low |
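The size rule in the table (skip small icons under 5KB) is easy to apply mechanically before reviewing figures by eye. The sketch below assumes 5KB means 5 × 1024 bytes; the helper name is hypothetical. The other categories still need a visual review, since file size says nothing about whether an image is an architecture diagram or a decorative logo.

```python
from pathlib import Path
from typing import List, Tuple

MIN_BYTES = 5 * 1024  # assumption: "5KB" is read as 5120 bytes


def split_by_size(figures_dir, min_bytes: int = MIN_BYTES) -> Tuple[List[str], List[str]]:
    """Split extracted figures into (keep, skip) lists by file size."""
    keep, skip = [], []
    for path in sorted(Path(figures_dir).iterdir()):
        if not path.is_file():
            continue
        # Files below the threshold are treated as icons and skipped.
        if path.stat().st_size >= min_bytes:
            keep.append(path.name)
        else:
            skip.append(path.name)
    return keep, skip
```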
After completing all steps, display:
https://github.com/0xPabloxx/Paper-Skill/tree/main/paper-YYYY-MM-DD-short-title
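Filling in the URL pattern above for a concrete folder could be done as in this small sketch. The helper name `folder_url` is hypothetical, and it assumes notes are always pushed to the `main` branch, as in the URL shown.

```python
from urllib.parse import quote

BASE_URL = "https://github.com/0xPabloxx/Paper-Skill/tree/main"


def folder_url(folder_name: str) -> str:
    """Return the GitHub URL for a paper folder, percent-encoding unusual characters."""
    return f"{BASE_URL}/{quote(folder_name)}"


print(folder_url("paper-2023-02-11-verifiable-fhe"))
```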