Paper Compass Learnpath. Build a prerequisite learning path before reading a paper: extract concepts, anchor evidence to sections, rank order and difficulty, and recommend resources. Use when the user gives an arXiv ID/link/PDF and asks what to learn first.
Do one thing: produce an actionable prerequisite learning path before the user reads a paper.
- lang=zh|en; if lang is not provided, output in Chinese.
- lang=en: output all report sections and notes in English.
- lang=zh: all report section titles must be Chinese.
- Anchor evidence as [Section] "short quote".
- When a quote is not direct, mark evidence=indirect and lower confidence.
- Rate each Must Learn item from ⭐ to ⭐⭐⭐⭐⭐.
- If memory=<path> is given, read that file first; default is ~/Documents/know/memory.md.
- Output 信息不足 (for zh) or insufficient information (for en) when needed.
- Mark low-confidence when certainty is limited.
- Mark venue待验证 or citations待验证, NOT "preprint/unpublished".

Report header fields:

- 标题: {title}
- 作者: {authors}
- 年份: {year}
- 发表信息与venue: {venue_name} | JCR 分区: {Q1/Q2/Q3/Q4/N/A} | CCF 等级: {A/B/C/N/A}
- 来源: {paper_url_or_path}
- **影响力**: {citation_count_and_awards_if_known_or_search_online}
- 关键实验结论: {1-3 sentences summarizing key findings}

## 7. **Sources**: followed by reference links

Input handling:

| User Input | Rule |
|---|---|
| 2010.11929 or arxiv:2010.11929 | Convert to arXiv ID, use multi-source fallback |
| https://arxiv.org/abs/... | Extract ID, use multi-source fallback |
| https://arxiv.org/pdf/... | Extract ID, use multi-source fallback |
| https://arxiv.org/html/... | Extract ID, use multi-source fallback |
| Local PDF path | Parse directly with Read tool |
| Other paper URL | Fetch and parse if readable |
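The conversion rules in the table can be sketched as a small normalizer. The helper name and the exact regex are illustrative, not part of this skill:

```python
import re
from typing import Optional

def normalize_arxiv_id(user_input: str) -> Optional[str]:
    """Extract a bare arXiv ID from the accepted input forms; None if not arXiv."""
    s = user_input.strip()
    # Bare ID or arxiv:ID, e.g. "2010.11929" or "arxiv:2010.11929v2"
    m = re.fullmatch(r"(?:arxiv:)?(\d{4}\.\d{4,5})(?:v\d+)?", s, re.IGNORECASE)
    if m:
        return m.group(1)
    # abs/pdf/html URLs, e.g. https://arxiv.org/abs/2010.11929
    m = re.search(r"arxiv\.org/(?:abs|pdf|html)/(\d{4}\.\d{4,5})(?:v\d+)?", s)
    if m:
        return m.group(1)
    return None  # local PDF path or other URL: handle via the other table rows
```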
arXiv resources (PDF, HTML, abs) are PRIMARY - always use first:
- https://export.arxiv.org/api/query?id_list={id} → get title, authors, abstract, year
- Use curl with redirect-following, retry, timeout, and a browser-like user agent
- https://arxiv.org/html/{id} when available
- curl -s "https://api.semanticscholar.org/graph/v1/paper/ARXIV:{id}?fields=venue,citationCount,tldr" → get venue, citations

Priority Order:
Priority 1: arXiv API → metadata (title, authors, abstract, year)
Priority 2: Stable PDF download + Read tool → full paper content (sections, quotes, methods)
Priority 3: arXiv HTML fallback → recover readable sections when PDF fails
Priority 4: Semantic Scholar API via curl → venue, citations, TLDR (supplement)
CRITICAL:
- semantic-scholar is only a metadata supplement, not the primary full-text path
- Prefer https://arxiv.org/pdf/{id} over hardcoding a .pdf suffix when downloading
- Avoid /tmp or other system temp paths; always save files under the current working directory in ./papers/
- Read ./papers/{id}.pdf or ./papers/{id}.html from the current workspace so Windows path resolution does not fail
- Use Bash + Python/curl for arXiv access and full-text retrieval (no WebSearch, no WebFetch for arxiv.org):
# 1. Get arXiv metadata
python3 -c "
import urllib.request, xml.etree.ElementTree as ET
NS = 'http://www.w3.org/2005/Atom'
url = 'https://export.arxiv.org/api/query?id_list=ARXIV_ID'
with urllib.request.urlopen(url, timeout=30) as r:
    root = ET.fromstring(r.read())
# Extract title, authors, abstract, year, pdf_url...
"
# 2. Always download to the current workspace under ./papers/
mkdir -p papers
curl -L --retry 5 --retry-delay 2 --retry-all-errors \
--connect-timeout 15 --max-time 120 \
-A "Mozilla/5.0" \
-o ./papers/ARXIV_ID.pdf https://arxiv.org/pdf/ARXIV_ID
# 3. Verify local PDF exists and is non-trivial
test -s ./papers/ARXIV_ID.pdf
# 4. If PDF failed, try HTML fallback
curl -L --retry 3 --retry-delay 2 --retry-all-errors \
--connect-timeout 15 --max-time 60 \
-A "Mozilla/5.0" \
-o ./papers/ARXIV_ID.html https://arxiv.org/html/ARXIV_ID
# 5. Get Semantic Scholar metadata (venue, citations)
curl -s "https://api.semanticscholar.org/graph/v1/paper/ARXIV:ARXIV_ID?fields=venue,citationCount,publicationVenue,tldr"
Then read content:
# Prefer Read tool on ./papers/ARXIV_ID.pdf
# If the PDF download failed but HTML exists, parse ./papers/ARXIV_ID.html instead
# Do not read /tmp/... paths on Windows
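For step 5, the Semantic Scholar response can be read defensively so that a failed or empty reply degrades to the 待验证 markers. The field names (venue, citationCount, tldr) come from the API query above; the helper is a sketch:

```python
import json

def extract_s2_metadata(raw: str) -> dict:
    """Parse a Semantic Scholar reply; fall back to 待验证 markers on failure."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        data = {}
    return {
        'venue': data.get('venue') or 'venue待验证',
        'citations': data.get('citationCount', 'citations待验证'),
        'tldr': (data.get('tldr') or {}).get('text', ''),
    }
```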
Extract and record:
- title, authors, abstract, year
- venue (else N/A)
- citations (else N/A)

When download or APIs fail:
- Output 信息不足 if neither PDF nor HTML is accessible
- Mark 待验证 if the Semantic Scholar API fails
- Use /semantic-scholar only as a metadata cross-check for venue, citations, DOI, and TLDR

User knowledge (memory.md): read references/memory-format.md, then classify user knowledge into:
- mastered
- familiar
- basic
- unknown

If a concept is not present, default to unknown.
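A minimal sketch of the default-to-unknown rule, assuming memory.md has already been parsed into a concept → level mapping (the parsing itself is out of scope here):

```python
VALID_LEVELS = {"mastered", "familiar", "basic", "unknown"}

def classify(concept: str, memory: dict) -> str:
    """Look up a concept's level; anything absent or malformed becomes unknown."""
    level = memory.get(concept, "unknown")
    return level if level in VALID_LEVELS else "unknown"
```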
Use these signals:
Tag each candidate:
- role=must: essential to understand the main method
- role=bridge: helps with key details
- role=optional: useful but not required

For each must/bridge concept, bind at least one evidence item:
- section: section name or number
- quote: short original quote
- usage: one-sentence explanation of usage in this paper

If the concept appears across multiple sections, include multiple evidence anchors.
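One concept record with its evidence bindings might look like this; every value below is an invented placeholder, shown only to fix the shape:

```python
# Illustrative only: the name, sections, quotes, and usage notes are placeholders
concept = {
    "name": "multi-head self-attention",
    "role": "must",
    "evidence": [
        {"section": "3.1 Model",
         "quote": "short original quote",
         "usage": "Core mixing operation in every encoder block."},
        {"section": "4.2 Ablations",
         "quote": "short original quote",
         "usage": "Head count is ablated to justify the design."},
    ],
}
```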
Topologically sort by dependencies to produce order.
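The ordering step maps directly onto the standard library's topological sorter; the dependency map below is hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical map: concept -> set of prerequisite concepts
deps = {
    "vision transformer": {"self-attention", "image patching"},
    "self-attention": {"dot-product attention"},
    "dot-product attention": set(),
    "image patching": set(),
}
# static_order() yields prerequisites before the concepts that need them
order = list(TopologicalSorter(deps).static_order())
```

graphlib raises CycleError on circular dependencies, which doubles as a sanity check on the extracted concept graph.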
Difficulty levels:
- 1: term-level, ~30-60 minutes
- 2: standard module-level, ~1-2 hours
- 3: mechanism reasoning-level, ~2-4 hours
- 4: implementation/theory detail-level, ~4-8 hours
- 5: cross-paper synthesis-level, >8 hours

For each concept provide:
- minimum_goal: what "good enough" means
- estimated_time: suggested time investment

Adjust by the user's knowledge level:

- mastered and not central to this paper's novel delta: downgrade to skip/review-optional
- familiar: keep a minimal refresher path
- basic/unknown: keep in primary path

Explicitly state:
For each must concept recommend:
Follow references/resource-sourcing.md.
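The knowledge-level adjustment above can be condensed into one decision helper. The function name and return labels are illustrative, and treating a mastered-but-central concept as a refresher is my reading rather than something the rules state:

```python
def place_in_path(level: str, central_to_delta: bool) -> str:
    """Decide where a concept lands given the user's knowledge level."""
    if level == "mastered" and not central_to_delta:
        return "skip/review-optional"
    if level in ("mastered", "familiar"):
        return "refresher"
    return "primary"  # basic or unknown
```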
When lang=zh:
Select template by language:
- lang=zh -> references/template.zh.md
- lang=en -> references/template.en.md
- otherwise -> references/template.md

Read selected template and write:
- {timestamp}--paper-compass-learnpath-{short-title}__learnpath.md (under ./)

After writing, report the absolute output path to the user.
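A sketch of the filename construction; the timestamp format and slug rules are assumptions, since the template only names the placeholders:

```python
import re
from datetime import datetime

def output_filename(short_title: str, now: datetime) -> str:
    """Build the report filename from the pattern above (format is assumed)."""
    slug = re.sub(r"[^a-z0-9]+", "-", short_title.lower()).strip("-")
    ts = now.strftime("%Y%m%d-%H%M%S")
    return f"{ts}--paper-compass-learnpath-{slug}__learnpath.md"
```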
Every Must Learn item includes section-grounded evidence.