Verification-first literature review workflow that prevents LLM citation hallucination. Uses strict retrieval-only citations: all references must come from verified PubMed/CrossRef sources, never from free-form LLM generation. Use for: (1) Systematic literature reviews requiring verified citations only, (2) Academic writing where citation accuracy is critical, (3) Literature synthesis with complete traceability to source data.
Only cite what has been retrieved. Phase 1 retrieves → Verified Database → Phase 2-5 use ONLY those citations → Phase 5 verifies.
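When using this skill for a literature review, the following workflow must be followed:
1. Generate a detailed, systematic todo list
↓
2. Execute a single task
↓
3. Go back and verify that task's result
↓
4. Once verification passes, move on to the next task
↓
5. Repeat steps 2-4 until all tasks are complete
Workflow requirements:
Example todo structure:
- Phase 1: Run the PubMed search
- Phase 1: Verify - check data completeness (abstracts, DOIs, authors)
- Phase 2: Generate the citation database
- Phase 2: Verify - check citekey format and uniqueness
- Phase 3: Generate the BibTeX file
- Phase 3: Verify - check BibTeX entry completeness
...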
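The execute-then-verify loop above can be sketched in a few lines of Python. This is only an illustration of the control flow, not code shipped with the skill: the `Task` class, `run_todo_list`, and the sample records are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """One todo item: a unit of work plus the check that must pass afterwards."""
    name: str
    run: Callable[[], object]
    verify: Callable[[object], bool]


def run_todo_list(tasks: List[Task]) -> List[str]:
    """Run tasks in order and stop at the first one whose verification fails."""
    completed = []
    for task in tasks:
        result = task.run()
        if not task.verify(result):
            raise RuntimeError(f"Verification failed for: {task.name}")
        completed.append(task.name)
    return completed


# Hypothetical stand-ins for the real phases (assumed record fields).
records = [{"pmid": "1", "abstract": "text", "doi": "10.1000/x"}]
tasks = [
    Task("Phase 1: retrieve", lambda: records,
         lambda r: all(a["abstract"] and a["doi"] for a in r)),
    Task("Phase 2: citekeys", lambda: ["smith2020_effects"],
         lambda keys: len(keys) == len(set(keys))),
]
print(run_todo_list(tasks))
```

The point of the design is that a task never counts as done until its own check passes, so an incomplete retrieval cannot silently feed later phases.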
# Phase 0: Generate search strategy from topic
python scripts/generate_search_strategy.py "the relationship between smoking and cognitive impairment" -o process/search_mapping.json
# Phase 1: Retrieve from PubMed (with batching and rate limiting)
python scripts/pubmed_batch_retrieval.py process/search_mapping.json
# Phase 1: Verify data completeness
python scripts/verify_phase1_data.py process/phase1_pubmed_results_*.json
# Phase 2: Screen papers with study type detection and journal ranking
python scripts/screen_papers.py process/phase1_pubmed_results.json \
--top-journals-only \
--sort-by evidence \
-o process/phase2_screened.json
# Alternative: LLM-assisted analysis and writing
python scripts/analyze_and_write.py process/phase2_screened.json \
--mode full \
--topic "the relationship between smoking and cognitive impairment" \
-o review.md
# Phase 3: Generate citation database and BibTeX
python scripts/build_citation_db.py process/phase2_screened.json -o process/citation_db.json
python scripts/generate_bibtex.py process/phase2_screened.json
# Phase 4: Verify citations in final document
python scripts/verify_bibtex_citations.py chapterXX/X.X-final.md process/references.bib
# Phase 5: Generate PDF (Final Step)
python scripts/generate_pdf.py chapterXX/X.X-final.md process/references.bib
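Phase 0 → Phase 1 → Phase 2 → Phase 3 → Phase 4 → Phase 5
- Phase 0: AI-assisted search strategy
- Phase 1: PubMed (automatic)
- Phase 2: Smart screening (study type, journal ranking)
- Phase 3: Writing/LLM (interactive)
- Phase 4: Verification (automatic)
- Phase 5: PDF (rendering)

- Automated phases: Phase 1-2, 4-5 (scripts provided by this skill)
- Manual phase: Phase 0 (requires user intervention)
- LLM-assisted mode: Phase 3 (the LLM reads the data directly and writes)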
python scripts/generate_search_strategy.py "your topic" [--dry-run]
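Interactive AI-assisted mode:
The script generates a structured MeSH search strategy based on the PICO elements of the topic:
Output: search_mapping.json, containing multiple search themes, each with its corresponding PubMed query.
Manual adjustment: you can edit the generated search_mapping.json directly to adjust the queries. See phase1-retrieval.md for the format.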
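To make the idea of a per-theme PubMed query concrete, here is a minimal sketch of how concept blocks might be combined. It is an assumption about the general shape of such queries, not the output of generate_search_strategy.py: the helper names and the example terms are illustrative, and only the `[MeSH Terms]` and `[Title/Abstract]` field tags are standard PubMed syntax.

```python
def build_concept_block(mesh_terms, free_text_terms):
    """OR together the synonyms for one concept, tagged with PubMed field tags."""
    parts = [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    parts += [f'"{term}"[Title/Abstract]' for term in free_text_terms]
    return "(" + " OR ".join(parts) + ")"


def combine_concepts(*concept_blocks):
    """AND the concept blocks together so every concept must be present."""
    return " AND ".join(concept_blocks)


# Illustrative exposure and outcome concepts for a smoking/cognition topic.
exposure = build_concept_block(["Smoking"], ["cigarette smoking"])
outcome = build_concept_block(["Cognitive Dysfunction"], ["cognitive impairment"])
query = combine_concepts(exposure, outcome)
print(query)
```

Synonyms are joined with OR inside a concept and concepts are joined with AND, which is the usual way to trade recall within a concept against precision across concepts.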
Batch retrieval with curl, rate limiting (3 req/sec), resume capability, complete data extraction (100% abstract coverage).
# Run retrieval
python scripts/pubmed_batch_retrieval.py process/search_mapping.json
# Resume from checkpoint if interrupted
python scripts/pubmed_batch_retrieval.py process/search_mapping.json --resume
# Verify data completeness after retrieval
python scripts/verify_phase1_data.py process/phase1_pubmed_results_*.json
Data Quality: 100% abstract coverage using itertext() for nested XML markup, 98%+ DOI coverage.
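The reason itertext() matters is that PubMed abstracts can contain inline markup such as `<sub>` or `<i>`, and reading only `.text` stops at the first nested tag. The sketch below shows the difference on a hand-written XML fragment; the `abstract_text` helper is a hypothetical name, not the function used in pubmed_batch_retrieval.py.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment imitating a PubMed abstract with inline markup.
SAMPLE = """<Abstract>
  <AbstractText Label="RESULTS">Levels of CO<sub>2</sub> rose in <i>all</i> groups.</AbstractText>
</Abstract>"""


def abstract_text(abstract_elem):
    """Join every AbstractText section, keeping text inside nested tags."""
    sections = []
    for node in abstract_elem.findall("AbstractText"):
        text = "".join(node.itertext()).strip()
        label = node.get("Label")
        sections.append(f"{label}: {text}" if label else text)
    return " ".join(sections)


root = ET.fromstring(SAMPLE)
node = root.find("AbstractText")
print(repr(node.text))      # 'Levels of CO' (truncated at the first child tag)
print(abstract_text(root))  # RESULTS: Levels of CO2 rose in all groups.
```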
See phase1-retrieval.md for details.
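As a rough illustration of rate limiting combined with resume capability, the following sketch spaces out calls and records finished batches in a checkpoint file. It is written under assumptions: the class and function names, the checkpoint layout (a JSON list of batch indices), and the `fetch` callback are all hypothetical, and the real script may store results differently.

```python
import json
import time
from pathlib import Path


class RateLimiter:
    """Spaces calls so that at most max_per_sec happen per second."""

    def __init__(self, max_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_sec
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


def retrieve_batches(batches, fetch, checkpoint_path, limiter):
    """Fetch each batch once; finished batch indices survive interruptions."""
    path = Path(checkpoint_path)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    results = []
    for index, batch in enumerate(batches):
        if index in done:
            continue  # fetched in an earlier run, so skip it on resume
        limiter.wait()
        results.extend(fetch(batch))
        done.add(index)
        path.write_text(json.dumps(sorted(done)))  # checkpoint after each batch
    return results
```

A call such as `retrieve_batches(id_batches, fetch_fn, "process/phase1_checkpoint.json", RateLimiter(3))` would then stay at or below three requests per second. Note that this sketch only skips finished batches on resume; persisting the results themselves is left out for brevity.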
Two screening options:
Use the built-in script for smart screening, which supports:
# Sort by evidence level and keep high-quality studies
python scripts/screen_papers.py process/phase1_pubmed_results.json \
--include "dietary" "cognitive" \
--exclude "animal" \
--abstract-only \
--sort-by evidence \
-o process/phase2_screened.json
# Keep only top journals (Tier 1-3)
python scripts/screen_papers.py process/phase1_pubmed_results.json \
--top-journals-only \
--sort-by journal \
-o process/phase2_screened.json
# Keep only specific study types
python scripts/screen_papers.py process/phase1_pubmed_results.json \
--study-types "Meta-Analysis" "Randomized Controlled Trial" \
-o process/phase2_screened.json
Suitable for: quick scoping reviews, prioritizing high-quality evidence
Full PRISMA screening process:
For the detailed screening methodology, see: phase2-screening.md
Suitable for: systematic reviews, publication-grade research
Post-screening processing:
# Screening output is compatible with the Phase 1 format
python scripts/build_citation_db.py process/phase2_screened.json
python scripts/generate_bibtex.py process/phase2_screened.json
Note: if you skip screening, you can use the Phase 1 results directly in the subsequent steps.
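The keyword and study-type screening described above can be approximated as follows. This is a sketch under stated assumptions, not the logic of screen_papers.py: the record fields (`title`, `abstract`, `study_type`), the any-match treatment of include keywords, and the evidence ranking are all hypothetical.

```python
# Hypothetical evidence ordering; lower rank sorts first.
EVIDENCE_RANK = {
    "Meta-Analysis": 0,
    "Systematic Review": 1,
    "Randomized Controlled Trial": 2,
}


def screen(articles, include=(), exclude=(), study_types=None):
    """Keep articles that match the filters, sorted by assumed evidence level."""
    kept = []
    for art in articles:
        text = f"{art.get('title', '')} {art.get('abstract', '')}".lower()
        if include and not any(k.lower() in text for k in include):
            continue  # none of the include keywords appear
        if any(k.lower() in text for k in exclude):
            continue  # an exclude keyword appears
        if study_types and art.get("study_type") not in study_types:
            continue
        kept.append(art)
    unranked = len(EVIDENCE_RANK)
    return sorted(kept, key=lambda a: EVIDENCE_RANK.get(a.get("study_type"), unranked))


articles = [
    {"title": "Dietary patterns and cognitive decline",
     "abstract": "A randomized trial in older adults.",
     "study_type": "Randomized Controlled Trial"},
    {"title": "Dietary fat in animal models",
     "abstract": "Mice were fed a high-fat diet.",
     "study_type": "Other"},
    {"title": "Cognitive outcomes of diet: a meta-analysis",
     "abstract": "Pooled estimates from cohort studies.",
     "study_type": "Meta-Analysis"},
]
screened = screen(articles, include=["dietary", "cognitive"], exclude=["animal"])
print([a["study_type"] for a in screened])
```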
Core workflow: the LLM reads the screened literature data (phase2_screened.json) directly, analyzes its content, and writes the review.
# Generate the citekey database
python scripts/build_citation_db.py process/phase2_screened.json
# Generate the BibTeX file
python scripts/generate_bibtex.py process/phase2_screened.json
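How to run: ask the LLM to read the following files and write the review
- process/citation_db.json - complete literature data and citekeys (including abstracts, authors, journals, etc.)
- process/phase2_screened.json - the screened raw literature data (optional, for cross-checking)
LLM writing process:
- All literature fields in phase2_screened.json (title, authors, abstract, journal, year, etc.)
- Cite literature in [@citekey] format, with citekeys taken from citation_db.json
Writing rules:
- Use the [@citekey] citation format, with citekeys taken from citation_db.json
- Do not use the (Author, Year) format
Citation example:
Several studies have reported positive effects of exercise on cognitive function. For example, transcranial direct current stimulation combined with aerobic exercise can improve working memory in patients with post-stroke cognitive impairment [@zhou2025_exploring]. Another study found that long-term resistance exercise can improve cognitive deficits in APP/PS1 mice [@azevedo2025_distinct].
analyze_and_write.py is an auxiliary tool, suited to scenarios that need a more structured analysis:
# Full pipeline - analysis + writing
python scripts/analyze_and_write.py phase2_screened.json --mode full --topic "exercise and cognitive function" -o review.md
Suitable scenarios:
Note: for most scenarios, having the LLM read the data directly and write the review (step 2) is more efficient.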
Verify all citekeys exist in BibTeX file.
python scripts/verify_bibtex_citations.py chapterXX/X.X-final.md process/references.bib [--verify-doi]
See phase5-verification.md for details.
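The core of this check is a set difference between the citekeys used in the Markdown and the keys defined in the .bib file. The sketch below shows one way to do that with regular expressions; it is not the implementation of verify_bibtex_citations.py, the function names are hypothetical, and the patterns assume Pandoc-style `[@key; @key]` citations and conventional `@type{key,` BibTeX entries.

```python
import re

# A bracketed citation group such as [@a2020_x; @b2021_y].
CITE_GROUP = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# A single key inside a group.
CITE_KEY = re.compile(r"@([\w-]+)")
# The key of a BibTeX entry, e.g. "@article{yu2021_association,".
BIB_ENTRY = re.compile(r"^@\w+\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def cited_keys(markdown_text):
    """Collect every citekey that appears inside a bracketed citation."""
    keys = set()
    for group in CITE_GROUP.findall(markdown_text):
        keys.update(CITE_KEY.findall(group))
    return keys


def bibtex_keys(bib_text):
    """Collect the entry keys defined in a BibTeX file."""
    return set(BIB_ENTRY.findall(bib_text))


def missing_citations(markdown_text, bib_text):
    """Return citekeys that are cited but have no BibTeX entry."""
    return sorted(cited_keys(markdown_text) - bibtex_keys(bib_text))


md = ("Multiple studies documented this [@bold2023_smartphone; "
      "@jones2024_potentially]. See also [@ghost2020_missing].")
bib = """@article{bold2023_smartphone,
  title = {Example},
}
@article{jones2024_potentially,
  title = {Example},
}"""
print(missing_citations(md, bib))  # ['ghost2020_missing']
```

Any key reported here is cited in the document but has no BibTeX entry, which means it did not come from the retrieved set.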
Render Markdown to final PDF with formatted citations.
python scripts/generate_pdf.py chapterXX/X.X-final.md process/references.bib
Features:
- [@citekey] to formatted citations
Requirements:
- brew install pandoc
- brew install mactex or tlmgr install scheme-full
Output formats: PDF, HTML, Word (configurable)
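For orientation, rendering a Markdown file with [@citekey] markers typically comes down to one Pandoc call with citation processing enabled. The sketch below builds such a command; it is an assumption about what generate_pdf.py does internally, the helper names are hypothetical, and only the Pandoc options shown (`--citeproc`, `--bibliography`, `-o`) are standard.

```python
import shutil
import subprocess


def pandoc_command(markdown_path, bib_path, output_path):
    """Build a Pandoc invocation that resolves [@citekey] markers via citeproc."""
    return [
        "pandoc", str(markdown_path),
        "--citeproc",
        "--bibliography", str(bib_path),
        "-o", str(output_path),  # Pandoc infers the format from the extension
    ]


def render(markdown_path, bib_path, output_path):
    """Run Pandoc, failing early if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(markdown_path, bib_path, output_path), check=True)


print(" ".join(pandoc_command("chapter.md", "process/references.bib", "chapter.pdf")))
```

Changing the output extension to .html or .docx would select the other output formats listed above, while PDF output additionally needs a LaTeX installation.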
Multiple studies documented this [@bold2023_smartphone; @jones2024_potentially].
NOT: (Bold et al., 2023; Jones et al., 2024)
{first_author_surname}{year}_{first_word}
Examples: yu2021_association, jones2024_potentially
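A citekey in this format can be derived mechanically from an article record. The following is a minimal sketch, assuming the key uses the literal first word of the title and ASCII-folded lowercase text; the `make_citekey` helper is hypothetical, and build_citation_db.py may handle stop words, accents, or collisions differently.

```python
import re
import unicodedata


def _ascii_lower(text):
    """Strip accents and lowercase, e.g. 'Müller' becomes 'muller'."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii").lower()


def make_citekey(first_author_surname, year, title):
    """Build {first_author_surname}{year}_{first_word} from record fields."""
    surname = re.sub(r"[^a-z]", "", _ascii_lower(first_author_surname))
    words = re.findall(r"[a-z0-9]+", _ascii_lower(title))
    first_word = words[0] if words else "untitled"
    return f"{surname}{year}_{first_word}"


print(make_citekey("Yu", 2021, "Association of smoking with cognition"))
print(make_citekey("Jones", 2024, "Potentially modifiable risk factors"))
```

Two papers by the same first author in the same year whose titles start with the same word would collide under this scheme, which is why the workflow includes a separate uniqueness check.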
project/
├── process/
│ ├── search_mapping.json # Phase 0: Search strategy
│ ├── phase1_pubmed_results_*.json # Phase 1: Retrieved articles
│ ├── phase1_checkpoint.json # Resume capability
│ ├── phase2_screened.json # Phase 2: Screened articles
│ ├── phase2_analyzed.json # Phase 3: LLM analysis results (optional)
│ ├── citation_db.json # Citation database with citekeys
│ └── references.bib # BibTeX file
└── chapterXX/
└── X.X-final.md # Uses [@citekey] markers
Testing & Validation: See testing-results.md for test results and data structure specifications.
Troubleshooting: See testing-results.md for common issues and solutions.
| Script | Purpose |
|---|---|
| generate_search_strategy.py | Topic → MeSH queries |
| pubmed_batch_retrieval.py | PubMed retrieval with curl, batching, rate limiting |
| verify_phase1_data.py | Verify Phase 1 data completeness (abstracts, DOIs, duplicates) |
| screen_papers.py | Literature screening with study type detection, journal ranking |
| analyze_and_write.py | LLM-assisted literature analysis and writing (interactive) |
| build_citation_db.py | Generate citekey database |
| generate_bibtex.py | Create .bib file from articles |
| verify_bibtex_citations.py | Verify citations in markdown |
| generate_pdf.py | Render Markdown to PDF |
Phase Guides:
Writing Resources:
Additional Reference: