Step 3 of the PaperOrchestra pipeline (arXiv:2604.05018). Execute the literature search strategy from outline.json — discover candidate papers via web search, verify them through Semantic Scholar (Levenshtein > 70 fuzzy title match, temporal cutoff, dedup by paperId), build a BibTeX file, and draft Introduction + Related Work using ≥90% of the verified pool. Runs in parallel with the plotting-agent. TRIGGER when the orchestrator delegates Step 3 or when the user asks to "find citations for my paper", "draft the related work", or "build the bibliography".
Faithful implementation of the Hybrid Literature Agent from PaperOrchestra (Song et al., 2026, arXiv:2604.05018, §4 Step 3, App. D.3, App. F.1 p.46).
Cost: ~20–30 LLM calls. This is one of the two longest steps (the other is plotting). Wall-time floor is set by Semantic Scholar's 1 QPS verification limit.
Inputs:
- workspace/outline.json — specifically intro_related_work_plan with the Introduction search directions and the 2-4 Related Work methodology clusters
- workspace/inputs/conference_guidelines.md — used to derive cutoff_date
- workspace/inputs/idea.md, workspace/inputs/experimental_log.md — for framing the Intro and grounding the Related Work positioning

Outputs:
- workspace/citation_pool.json — verified Semantic Scholar metadata for every paper that survived verification
- workspace/refs.bib — BibTeX file generated from the verified pool
- workspace/drafts/intro_relwork.tex — drafted Introduction and Related Work sections, written into the template, with the rest of the template preserved verbatim

PHASE 1 — Parallel Candidate Discovery
For each search direction in introduction_strategy.search_directions and for each limitation_search_query in each related_work cluster:
- Use the host's web search tool to discover up to ~10 candidate papers.
- Run up to 10 discovery queries in parallel (host permitting).
- Collect (title, snippet, url) tuples — no verification yet.
→ PRE-DEDUP before Phase 2 (see Step 1.5 below)
PHASE 2 — Sequential Citation Verification (1 QPS, with cache)
For each candidate (after pre-dedup), sequentially:
0. Check s2_cache.json first (scripts/s2_cache.py --check).
If HIT: use cached response, skip live S2 call. No throttle needed.
If MISS: proceed with live request below.
1. Query Semantic Scholar by title:
GET https://api.semanticscholar.org/graph/v1/paper/search?query=<title>
&fields=title,abstract,year,authors,venue,externalIds&limit=5
(Public endpoint, no key. Throttle to 1 QPS for live requests only.)
2. Store the S2 response in cache: s2_cache.py --store.
3. Pick the top hit. Check Levenshtein title ratio against the original
candidate title. If ratio < 70: discard.
4. Bonus: if year and venue exactly align with hints, add a +5 point
match-quality bonus.
5. Require: abstract is non-empty.
6. Require: paper.year (or month if known) strictly predates cutoff_date.
Months default to day-1: e.g., "October 2024" → 2024-10-01.
7. If all checks pass, add to verified pool.
After all candidates are verified, dedup by Semantic Scholar paperId.
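The per-candidate checks in steps 3, 5, and 6 can be sketched in plain Python. This is an illustrative sketch, not the bundled helper: the pure-Python levenshtein_ratio and the year-only cutoff comparison are assumptions modeled on the thresholds stated above (S2 search results here carry a year but not always a month).

```python
from datetime import date

def levenshtein_ratio(a: str, b: str) -> float:
    """0-100 similarity ratio based on Levenshtein edit distance."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 100.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return 100.0 * (1 - prev[-1] / max(len(a), len(b)))

def passes_verification(candidate_title: str, hit: dict, cutoff: date) -> bool:
    """Apply steps 3, 5, and 6 to the top Semantic Scholar hit."""
    if levenshtein_ratio(candidate_title, hit.get("title", "")) < 70:
        return False                     # step 3: fuzzy title match
    if not hit.get("abstract"):
        return False                     # step 5: abstract required
    year = hit.get("year")
    # step 6: publication must strictly predate cutoff_date; with only a
    # year available, default to January 1 of that year (day-1 rule).
    return year is not None and date(year, 1, 1) < cutoff
```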
The host agent does the LLM/web work; the deterministic helpers in scripts/
do the math.
cutoff_date
Parse conference_guidelines.md for the submission deadline. The paper aligns the research cutoff with the venue submission deadline (App. D.1):
| Venue | Cutoff |
|---|---|
| CVPR 2025 | Nov 2024 |
| ICLR 2025 | Oct 2024 |
| Other | One month before the stated submission deadline |
Encode as YYYY-MM-DD. Months default to day-1 (e.g., 2024-10-01).
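The day-1 defaulting can be sketched as below. This is a minimal sketch assuming full English month names; the bundled scripts may parse the guidelines differently.

```python
from datetime import date

MONTHS = {m: i for i, m in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], 1)}

def encode_cutoff(text: str) -> str:
    """Turn a phrase like 'October 2024' into the YYYY-MM-DD form, day defaulting to 1."""
    month_name, year = text.strip().split()
    return date(int(year), MONTHS[month_name.lower()], 1).isoformat()
```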
From outline.json:
- introduction_strategy.search_directions (3-5 queries)
- related_work_strategy.subsections:
  - sota_investigation_mission becomes a search query
  - limitation_search_queries (1-3 each)

For each query, use your host's web search tool (e.g., WebSearch in
Claude Code, @web in Cursor, the search tool in Antigravity). Collect the
top ~10 candidates per query: title, abstract snippet, source URL.
If your host supports parallel sub-tasks, fire up to 10 concurrent search queries. If not, run sequentially — slower but functionally equivalent.
If your host has no native web search, OR you want a research-paper-focused
backend with better signal-to-noise, you can use Exa via
the bundled scripts/exa_search.py helper. It is opt-in and reads
EXA_API_KEY from the environment — the repo never commits a key.
export EXA_API_KEY="your-key-here" # get one at https://dashboard.exa.ai/
python skills/literature-review-agent/scripts/exa_search.py \
--query "Sparse attention long context transformers" \
--num-results 15 \
--discovered-for "related_work[2.1]"
Output is a normalized candidate list ready to merge into
raw_candidates.json. Phase 2 verification (Semantic Scholar fuzzy match,
cutoff, dedup) is unchanged. See references/exa-search-cookbook.md for
the full recipe, query patterns, cost estimates, and security notes.
Combine all discovered candidates into a single working list. Tag each with the originating query ID so you can later attribute it to "intro" vs "related_work[i]".
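Tagging can be as simple as attaching the query ID when each (title, snippet, url) tuple is collected. A sketch, where the discovered_for field name is an assumption chosen to match the exa_search.py flag:

```python
def tag_candidates(results, query_id):
    """Attach the originating query ID so intro vs related_work[i] attribution survives the merge."""
    return [
        {"title": title, "snippet": snippet, "url": url, "discovered_for": query_id}
        for (title, snippet, url) in results
    ]

# Merge discovery results from several queries into one working list.
merged = (tag_candidates([("Paper A", "snippet a", "https://example.org/a")], "intro[0]")
          + tag_candidates([("Paper B", "snippet b", "https://example.org/b")], "related_work[2.1]"))
```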
Always run this before starting Phase 2. Multiple search queries routinely return the same papers (e.g., "Attention is All You Need" appears in almost every NLP discovery query). Verifying duplicates wastes 30-40% of S2 quota at 1 QPS.
python skills/literature-review-agent/scripts/pre_dedup_candidates.py \
--in workspace/raw_candidates.json \
--out workspace/deduped_candidates.json
# Prints: "150 candidates → 97 unique (53 duplicates removed)"
Use workspace/deduped_candidates.json as input to Phase 2.
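A sketch of the kind of title normalization pre_dedup_candidates.py presumably performs; the exact rules are an assumption, and only the CLI invocation above is authoritative.

```python
import re

def norm_title(title: str) -> str:
    """Case-fold and collapse punctuation/whitespace so near-identical titles collide."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def pre_dedup(candidates):
    """Keep the first occurrence of each normalized title, drop later duplicates."""
    seen, unique = set(), []
    for c in candidates:
        key = norm_title(c["title"])
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique
```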
For each candidate in deduped_candidates.json, in sequential order:
Step A — check cache first (no S2 call, no throttle needed):
python skills/literature-review-agent/scripts/s2_cache.py \
--cache workspace/cache/s2_cache.json \
--check "<candidate title>"
# exit 0 + prints JSON → use cached response, skip Step B
# exit 1 → proceed to Step B
Step B — live S2 request (cache MISS only, throttle to 1 QPS):
Preferred: use the bundled scripts/s2_search.py helper — it handles
auth, retries, and 429 back-off automatically:
python skills/literature-review-agent/scripts/s2_search.py \
--query "<URL-decoded candidate title>" --limit 5
# If SEMANTIC_SCHOLAR_API_KEY is set the key is forwarded automatically.
# If not, the public unauthenticated endpoint is used (≤1 QPS, still works).
Check whether the key is configured before starting Phase 2:
python skills/literature-review-agent/scripts/s2_search.py --check-key
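Putting Steps A and B together, the cache-then-throttle flow looks roughly like this. A sketch only: the subprocess commands mirror the invocations shown above, the 1-second sleep enforces the 1 QPS limit for live requests only, and the injectable run/sleep parameters are an assumption added to keep the sketch testable.

```python
import json
import subprocess
import time

SCRIPTS = "skills/literature-review-agent/scripts"

def lookup(title, run=subprocess.run, sleep=time.sleep):
    """Cache-first S2 lookup: Step A (no throttle) then, on MISS, Step B (1 QPS)."""
    hit = run(["python", f"{SCRIPTS}/s2_cache.py",
               "--cache", "workspace/cache/s2_cache.json", "--check", title],
              capture_output=True, text=True)
    if hit.returncode == 0:      # cache HIT: cached JSON on stdout, skip Step B
        return json.loads(hit.stdout)
    sleep(1.0)                   # cache MISS: throttle live requests to 1 QPS
    live = run(["python", f"{SCRIPTS}/s2_search.py",
                "--query", title, "--limit", "5"],
               capture_output=True, text=True)
    return json.loads(live.stdout)
```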
Fallback: if you prefer your host's URL fetch tool, GET: