Conduct literature searches for meta-analysis using Python with uv, query PubMed and other databases, deduplicate results, and store round-based bibliographies with notes. Use when building or updating the evidence corpus.
Run reproducible database searches, capture the search strategy, and produce versioned .bib files.
01_protocol/search-plan.md01_protocol/pico.yaml02_search/round-01/queries.txt02_search/round-01/results.bib02_search/round-01/dedupe.bib02_search/round-01/log.mdqueries.txt.
01_protocol/pico.yaml (L1-8: population, intervention, comparison fields)02_search/round-01/queries.txttooling/python/uv inituv adduv run from tooling/python/ (avoid direct python3 calls).
scripts/pubmed_fetch.py (L105-155: fetch_pubmed_records function)scripts/scopus_fetch.py for Scopus APIuv tool for any one-off CLI utilities (do not install them globally)..bib and record the run date, database, and query hash in log.md.
scripts/pubmed_fetch.py (L54-103: build_bib_entry function)02_search/round-01/results.bib02_search/round-01/log.mddedupe.bib.
scripts/dedupe_bib.py02_search/round-01/dedupe.bibgenerate_prisma_flowchart.R (see below).
scripts/generate_prisma_flowchart.R02_search/prisma-flow.png (300 DPI minimum)scripts/pubmed_fetch.py for the default PubMed pipeline with uv run.references/pubmed-eutils.md for a compact tutorial and API notes..env in the project root.scripts/pubmed_fetch.py fetches PubMed records and writes BibTeX.scripts/dedupe_bib.py removes duplicate records based on DOI, PMID, or title.scripts/build_queries.py builds multi-DB queries from pico.yaml.scripts/mesh_expand.py expands terms via the MeSH RDF lookup service.scripts/expand_terms.py expands PICO terms using MeSH and optional Emtree synonyms.scripts/run_multi_db_search.py runs multi-DB search, merge, and counts.scripts/multi_db_dedupe.py merges and deduplicates multiple BibTeX files.scripts/db_counts.py summarizes per-database counts for PRISMA.scripts/search_report.py generates a per-database query + count report.scripts/search_audit.py generates a JSON audit with query hashes and parameters.scripts/scopus_fetch.py fetches Scopus Search API results.scripts/embase_fetch.py fetches Embase Search API results.scripts/cochrane_fetch.py fetches Cochrane ReviewDB API results.scripts/bib_subset_by_ids.py extracts a BibTeX subset from CSV record IDs.scripts/zotero_fetch.py fetches records from a Zotero collection.scripts/zotero_sync.py syncs a .bib file back to a Zotero collection.scripts/env_utils.py loads .env credentials.scripts/generate_prisma_flowchart.R generates PRISMA 2020 compliant flow diagrams in PNG/PDF/SVG/HTML formats.references/pubmed-eutils.md summarizes the E-utilities workflow.references/database-auth.md summarizes authentication per database.references/emtree-synonyms-template.csv provides a template for Emtree synonyms.references/prisma-flowchart-guide.md provides complete PRISMA 2020 flowchart generation guide..bib files..bib entry for the round (example: note = {round-01} ).Purpose: Generate publication-quality PRISMA 2020 flow diagrams automatically.
When: After completing database searches and deduplication.
Command:
cd ma-search-bibliography/scripts
Rscript generate_prisma_flowchart.R \
DB_RECORDS \
SCREENED \
EXCLUDED \
FULLTEXT \
INCLUDED \
[PARTICIPANTS] \
[OUTPUT_DIR]
Example (after search completion, before screening):
# Count database records
DB_RECORDS=$(wc -l < ../../projects/<project-name>/02_search/round-01/dedupe.bib | xargs)
# Generate initial flowchart (screening numbers TBD)
Rscript generate_prisma_flowchart.R $DB_RECORDS 0 0 0 0 NA ../../projects/<project-name>/figures/
Outputs:
prisma_flowchart.png (300 DPI, for manuscript)prisma_flowchart.pdf (vector, for publication)prisma_flowchart.svg (scalable, for presentations)prisma_flowchart_interactive.html (interactive, for supplementary materials)Time: 30 seconds to 2 minutes
See: references/prisma-flowchart-guide.md for complete documentation
| Step | Skill | Stage |
|---|---|---|
| Prev | /ma-topic-intake | 01 Protocol & PICO |
| Next | /ma-screening-quality | 03 Screening & Quality |
| All | /ma-end-to-end | Full pipeline orchestration |