Deep literature survey pipeline: semantic search via local vec-db, web search for arXiv papers, download PDFs, parallel subagent reading, and structured report generation. Use this skill whenever the user asks to survey, research, or find related papers on a topic, especially when they mention vec-db, arXiv, literature review, paper search, or updating an idea/survey note. Also trigger when the user says things like "调研", "查论文", "找相关工作", "literature survey", "related work", "search papers", or "read papers and summarize". This skill orchestrates the full pipeline from search to structured markdown report.
Orchestrate a full literature survey: vec-db semantic search → web search → download → parallel subagent reading → structured report.
This skill automates the research-paper discovery and analysis workflow. It combines a local vector database of roughly 60K indexed top-venue papers with web search for broader coverage, then deploys parallel subagents to deep-read each paper and produce a structured comparative analysis anchored to the user's research idea.
The goal is not just to list papers, but to produce actionable competitive intelligence: how each paper relates to the user's proposed work, what can be borrowed, and what differentiates the user's approach.
/home/vla-reasoning/proj/litian-research/vec-db/
cd /home/vla-reasoning/proj/litian-research/vec-db && npx tsx src/cli.ts search "<query>" --top <N>
cd /home/vla-reasoning/proj/litian-research/vec-db && npx tsx src/cli.ts status
wget https://arxiv.org/pdf/<id> -O <id>.pdf
Before searching, gather from the conversation or by asking:
Run 5-8 diverse semantic queries against the vec-db to maximize coverage. Each query approaches the topic from a different angle.
Query design principles:
Use --top 15 per query to get broad coverage.
CRITICAL: Run all queries in parallel — issue multiple Bash tool calls in a single message. Do NOT run them sequentially.
Example for a "3D point cloud + VLA" topic:
# Angle 1: Direct topic
npx tsx src/cli.ts search "3D point cloud reconstruction for vision-language-action robot manipulation" --top 15
# Angle 2: Method-focused
npx tsx src/cli.ts search "depth prediction auxiliary task for imitation learning robot policy" --top 15
# Angle 3: Application-focused
npx tsx src/cli.ts search "vision language action model with 3D spatial understanding" --top 15
# Angle 4: Competing approach
npx tsx src/cli.ts search "implicit 3D grounding spatial representation for robotic grasping" --top 15
# Angle 5: Adjacent technique
npx tsx src/cli.ts search "auxiliary reconstruction loss for robot learning visual representation" --top 15
# Angle 6: Specific method name (if user mentions one)
npx tsx src/cli.ts search "spatial forcing depth estimation multi-view geometry for VLA" --top 15
Complement vec-db with Semantic Scholar to catch papers outside the indexed conferences:
# Keyword search (high citation)
curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=<URL_ENCODED_KEYWORDS>&limit=20&fields=title,year,authors,citationCount,externalIds,abstract&sort=citationCount:desc"
# Recent papers (last 2 years)
curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=<KEYWORDS>&limit=20&fields=title,year,authors,citationCount,externalIds,abstract&year=2024-2026"
Run 2-3 keyword variants. Semantic Scholar covers 200M+ papers including workshops, journals, and preprints that vec-db may miss. Use citation counts to prioritize.
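To triage the Semantic Scholar response, a small `jq` filter can flatten each hit to a tab-separated row and sort by citation count. A minimal sketch assuming `jq` is installed; the heredoc sample below is a stand-in for a real API response, and the file path is illustrative:

```shell
# Stand-in for a real Semantic Scholar /paper/search response
cat > /tmp/s2_sample.json <<'EOF'
{"data":[
  {"title":"Paper A","year":2024,"citationCount":120,"externalIds":{"ArXiv":"2401.00001"}},
  {"title":"Paper B","year":2023,"citationCount":45,"externalIds":{}}
]}
EOF

# One row per paper: title, year, citations, arXiv ID ("-" if absent),
# sorted by citation count descending
jq -r '.data[] | [.title, .year, .citationCount, (.externalIds.ArXiv // "-")] | @tsv' \
  /tmp/s2_sample.json | sort -t$'\t' -k3 -rn
```

The `// "-"` alternative keeps rows aligned when a paper has no arXiv preprint, which also flags candidates that will need replacement later.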
From the combined vec-db + Semantic Scholar results, select ~10 papers that are:
For papers without arXiv IDs in vec-db results, check Semantic Scholar's externalIds.ArXiv field first, then fall back to WebSearch:
WebSearch: "<paper title>" arXiv <year>
allowed_domains: ["arxiv.org"]
Run multiple WebSearch calls in parallel for different papers.
If a paper has no arXiv preprint, replace it with another candidate that does.
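Before final selection, the vec-db and Semantic Scholar candidate lists can be merged and deduplicated by arXiv ID with `sort -u`. A minimal sketch; the file names and IDs are placeholders standing in for real search output:

```shell
# One arXiv ID per line, one file per search source (placeholder data)
printf '2401.00001\n2402.00002\n' > /tmp/ids_vecdb.txt
printf '2402.00002\n2403.00003\n' > /tmp/ids_s2.txt

# Union of both lists with duplicates removed
sort -u /tmp/ids_vecdb.txt /tmp/ids_s2.txt > /tmp/ids_all.txt
cat /tmp/ids_all.txt
```

The deduplicated list is what the ~10-paper selection should be drawn from, so the same paper surfaced by both sources is not counted twice.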
Prefer AlphaXiv over downloading PDFs — it's faster and returns structured Markdown:
# Try AlphaXiv overview first (fast, structured)
WebFetch: https://alphaxiv.org/overview/<ARXIV_ID>.md
# If overview lacks detail, get full text
WebFetch: https://alphaxiv.org/abs/<ARXIV_ID>.md
Only download PDFs if AlphaXiv returns 404:
cd <papers-dir>/raw
for id in <id1> <id2> ...; do
[ -f "${id}.pdf" ] && echo "SKIP ${id}" || \
(wget -q "https://arxiv.org/pdf/${id}" -O "${id}.pdf" && echo "OK ${id}" || echo "FAIL ${id}")
done
Verify all downloads succeeded before proceeding.
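One way to verify the downloads is to check that each file is non-empty and starts with the PDF magic bytes; a truncated or HTML-error download fails both checks. A sketch using `/tmp/papers` as a placeholder for `<papers-dir>/raw`, with dummy files standing in for real downloads:

```shell
# Dummy files stand in for real downloads (empty.pdf simulates a failed fetch)
mkdir -p /tmp/papers
printf '%%PDF-1.4 dummy body' > /tmp/papers/good.pdf
: > /tmp/papers/empty.pdf

# Flag any file that is empty or lacks the %PDF header
for f in /tmp/papers/*.pdf; do
  if [ -s "$f" ] && head -c 4 "$f" | grep -q '%PDF'; then
    echo "OK   $(basename "$f")"
  else
    echo "BAD  $(basename "$f")"
  fi
done
```

Any `BAD` file should be re-downloaded or its paper swapped for another candidate before launching the reading agents.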
CRITICAL REQUIREMENT: Launch ALL reading agents in a SINGLE message.
This is non-negotiable. Do NOT launch agents one by one. Do NOT wait for one to finish before starting the next. Issue all Agent tool calls in one response so they run truly in parallel.
For 10 papers, launch 5 agents (each reading 2 papers), or 10 agents (each reading 1 paper) if the system supports that many. The key is: one message, all agents, true parallelism.
Agent prompt template:
Read the paper at <pdf_path> (<paper_title>, <venue>).
This is for a literature survey on "<user's research topic>".
Our proposed idea: <1-2 sentence description of user's work>
Provide a structured analysis:
1. **Core Method**: What does this paper do and how? (2-3 sentences)
2. **Architecture**: Key components, modules, design choices
3. **Training**: Losses, data requirements, does it need depth/3D at inference?
4. **Key Results**: Performance numbers on main benchmarks. Key ablation findings.
5. **Relation to Our Work**:
- How does this paper's approach compare to ours?
- What can we learn from or differentiate against?
- Is this complementary or competing?
Be concise but thorough. Focus on technical details for comparison.
Parallelism patterns:
Pattern A — 10 agents, 1 paper each (preferred if feasible):
[Single message with 10 Agent tool calls]
Agent 1: Read paper_1.pdf → analysis
Agent 2: Read paper_2.pdf → analysis
...
Agent 10: Read paper_10.pdf → analysis
Pattern B — 5 agents, 2 papers each (if agent count is limited):
[Single message with 5 Agent tool calls]
Agent 1: Read paper_1.pdf AND paper_2.pdf → two analyses
Agent 2: Read paper_3.pdf AND paper_4.pdf → two analyses
...
Agent 5: Read paper_9.pdf AND paper_10.pdf → two analyses
NEVER do this (anti-pattern):
# BAD: Sequential launches — defeats the purpose
Agent 1: Read paper_1.pdf → wait → get result
Agent 2: Read paper_2.pdf → wait → get result # WRONG: should be parallel
After ALL agents return, synthesize findings into the target note:
Writing style:
git add <note_path>
git commit -m "Update survey with N new top-venue papers
Added analysis of:
- Paper1 (Venue): one-line summary
- Paper2 (Venue): one-line summary
...
Co-Authored-By: Claude <[email protected]>"
git push origin main