Comprehensive citation management for academic research; use when you need to discover papers (Google Scholar/PubMed), extract/verify metadata (DOI/PMID/arXiv/URL), and produce validated, clean BibTeX for manuscripts.
- .bib file before submission.
- .bib files for LaTeX/Overleaf workflows and complements literature review pipelines.

- requests>=2.31.0
- scholarly>=1.7.11 (optional; required only for Google Scholar automation)

A complete, end-to-end workflow that searches, extracts metadata, formats, deduplicates, and validates a bibliography:
# 1) Search PubMed (biomedical focus)
python scripts/search_pubmed.py \
--query '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
--date-start 2020-01-01 \
--date-end 2024-12-31 \
--limit 200 \
--output crispr_pubmed.json
# 2) Search Google Scholar (broad coverage)
python scripts/search_google_scholar.py "CRISPR gene editing therapeutics" \
--year-start 2020 \
--year-end 2024 \
--limit 100 \
--output crispr_scholar.json
# 3) Extract metadata from search outputs (or mixed identifiers)
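# Note: cat writes the two JSON documents back to back, which is not a single valid
# JSON document; this assumes extract_metadata.py accepts that. If both files hold a
# top-level array, `jq -s 'add' crispr_pubmed.json crispr_scholar.json` merges them instead.
cat crispr_pubmed.json crispr_scholar.json > combined_results.json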
python scripts/extract_metadata.py \
--input combined_results.json \
--output combined.bib
# 4) Add known papers by DOI (append)
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> combined.bib
python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> combined.bib
# 5) Format + deduplicate + sort (newest first)
python scripts/format_bibtex.py combined.bib \
--deduplicate \
--sort year \
--descending \
--output formatted.bib
# 6) Validate + auto-fix common issues + emit report
python scripts/validate_citations.py formatted.bib \
--auto-fix \
--report validation.json \
--output final_references.bib
# 7) Inspect validation results
cat validation.json
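
Step 5 relies on format_bibtex.py to deduplicate and sort, but how the script matches duplicates is not described here. The following is a minimal sketch of one common approach, under stated assumptions: entries are plain dicts, a DOI (compared case-insensitively) identifies a paper, and a normalized title is the fallback when no DOI is present. The function names and the sample records are hypothetical and are not taken from the script.

```python
import re


def normalized_title(title: str) -> str:
    """Lowercase the title and drop braces, punctuation, and extra spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def dedup_key(entry: dict) -> str:
    """Prefer the DOI as identity; fall back to the normalized title."""
    doi = entry.get("doi", "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:" + normalized_title(entry.get("title", ""))


def deduplicate(entries: list[dict]) -> list[dict]:
    """Keep the first entry seen for each key, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        key = dedup_key(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


def sort_newest_first(entries: list[dict]) -> list[dict]:
    """Sort by year, newest first; entries without a year go last."""
    return sorted(entries, key=lambda e: int(e.get("year", 0)), reverse=True)


# Made-up sample records, for illustration only.
sample = [
    {"title": "{CRISPR} Gene Editing", "doi": "10.1000/ABC", "year": "2021"},
    {"title": "CRISPR gene editing", "doi": "10.1000/abc", "year": "2021"},
    {"title": "Base editing: a review", "year": "2023"},
    {"title": "Base Editing - A Review", "year": "2023"},
]
cleaned_sample = sort_newest_first(deduplicate(sample))
```

One limitation of this sketch: a record that carries a DOI and a second record for the same paper without one get different keys, so they are not merged. A real implementation would likely need a second pass on titles.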
Google Scholar (scripts/search_google_scholar.py)
- "deep learning"), author filters (author:LeCun), title-only (intitle:"neural networks"), exclusions (-survey), and year ranges.
- --year-start, --year-end: constrain publication years
- --limit: cap results
- --sort-by citations: prioritize highly cited papers (when supported by the script)

PubMed (scripts/search_pubmed.py)
- --query: supports MeSH terms, field tags, and Boolean logic
- --date-start, --date-end: publication date filtering
- --publication-types: e.g., Clinical Trial,Review
- --format: JSON or BibTeX output (if supported)

(See: references/google_scholar_search.md, references/pubmed_search.md)
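
How scripts/search_pubmed.py talks to PubMed is not shown here. As a rough sketch, assuming it goes through the NCBI E-utilities `esearch` endpoint, the query, date range, and limit options could map onto a request URL as follows. The function name `build_esearch_url` is hypothetical; the endpoint and the `db`, `term`, `retmax`, `retmode`, `datetype`, `mindate`, and `maxdate` parameters are part of E-utilities.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, date_start=None, date_end=None, limit=20):
    """Build an esearch URL for PubMed (hypothetical helper, not the script's API)."""
    params = {
        "db": "pubmed",
        "term": query,       # MeSH terms, field tags, Boolean logic
        "retmax": limit,     # cap on the number of returned PMIDs
        "retmode": "json",
    }
    # E-utilities expects mindate and maxdate together, written as YYYY/MM/DD.
    if date_start and date_end:
        params["datetype"] = "pdat"  # filter on publication date
        params["mindate"] = date_start.replace("-", "/")
        params["maxdate"] = date_end.replace("-", "/")
    return ESEARCH_URL + "?" + urlencode(params)


example_url = build_esearch_url(
    '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]',
    date_start="2020-01-01",
    date_end="2024-12-31",
    limit=200,
)
```

The sketch only builds the URL; sending the request and handling rate limits or API keys is left out.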
- author, title, year
- journal, volume, number, pages, doi
- booktitle, pages
- eprint, archivePrefix)

(See: references/metadata_extraction.md)
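
Before any metadata can be fetched, mixed input (DOI, PMID, arXiv ID, or URL) has to be told apart. The sketch below shows one way to classify an identifier string with regular expressions. It is an illustration only: the patterns are simplified assumptions (for example, old-style arXiv IDs such as `hep-th/...` are not handled), and `classify_identifier` is a hypothetical name, not a function from extract_metadata.py.

```python
import re

DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(?:arxiv:)?\s*(\d{4}\.\d{4,5})(?:v\d+)?$", re.IGNORECASE)
PMID_RE = re.compile(r"^(?:pmid:)?\s*(\d{1,8})$", re.IGNORECASE)
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def classify_identifier(raw: str) -> tuple[str, str]:
    """Return (kind, value), where kind is doi, arxiv, pmid, url, or unknown."""
    text = raw.strip()

    # Strip a resolver URL or "doi:" prefix before testing the DOI pattern.
    candidate = text
    for prefix in DOI_PREFIXES:
        if text.lower().startswith(prefix):
            candidate = text[len(prefix):].strip()
            break
    if DOI_RE.match(candidate):
        return ("doi", candidate)

    match = ARXIV_RE.match(text)
    if match:
        return ("arxiv", match.group(1))  # version suffix (v2, ...) is dropped

    match = PMID_RE.match(text)
    if match:
        return ("pmid", match.group(1))

    if text.lower().startswith(("http://", "https://")):
        return ("url", text)
    return ("unknown", text)
```

The DOI test runs first because a bare run of digits would otherwise be ambiguous; a plain number is treated as a PMID only after the DOI and arXiv patterns have failed.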
- @article, @inproceedings, @book, @misc.
- -- (e.g., 123--145)
- {CRISPR})
- Last, First and Last, First)
- FirstAuthorYearKeyword)

(See: references/bibtex_formatting.md)
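
Two of the conventions above, `--` in page ranges and `FirstAuthorYearKeyword` citation keys, are easy to illustrate. The helpers below are a minimal sketch, assuming authors are already in `Last, First and Last, First` form and that the keyword is the first title word that is not a stopword. The names `normalize_pages` and `citation_key`, the stopword list, and the sample author and title are hypothetical and do not describe what format_bibtex.py does internally.

```python
import re

STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "to", "with"}


def normalize_pages(pages: str) -> str:
    """Rewrite '123-145', '123 – 145', or '123--145' as '123--145'."""
    # Any run of hyphens, en dashes, or em dashes becomes a BibTeX double hyphen.
    return re.sub(r"\s*[-\u2013\u2014]+\s*", "--", pages.strip())


def citation_key(author: str, year: str, title: str) -> str:
    """Build a FirstAuthorYearKeyword key from BibTeX-style fields."""
    first_author = author.split(" and ")[0]
    last_name = first_author.split(",")[0]
    # Keep ASCII letters only; accented names would need transliteration first.
    last_name = re.sub(r"[^A-Za-z]", "", last_name)
    words = re.findall(r"[A-Za-z]+", title)
    keyword = next((w for w in words if w.lower() not in STOPWORDS), "")
    return f"{last_name}{year}{keyword.capitalize()}"


example_key = citation_key(
    "Doe, Jane and Roe, Richard", "2022", "A survey of {CRISPR} delivery"
)
```

Keys built this way can collide (two papers by the same first author in one year with the same leading keyword), so a real formatter would likely need a disambiguating suffix.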
Validation typically checks:
- doi.org and matches CrossRef metadata.

Outputs may include a machine-readable report (e.g., JSON) with errors and warnings.
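
The DOI check mentioned above can be approximated in two stages: a cheap syntax test, then a network lookup. The sketch below assumes that doi.org answers a `HEAD` request for a registered DOI with an HTTP redirect and that an unregistered DOI does not; comparing against CrossRef metadata is not covered. Both function names and the simplified DOI pattern are assumptions and are not taken from validate_citations.py.

```python
import http.client
import re
from urllib.parse import quote

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def is_wellformed_doi(doi: str) -> bool:
    """Cheap offline check: does the string look like a DOI at all?"""
    return bool(DOI_PATTERN.match(doi.strip()))


def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Return True if doi.org answers a HEAD request with a redirect."""
    if not is_wellformed_doi(doi):
        return False
    conn = http.client.HTTPSConnection("doi.org", timeout=timeout)
    try:
        # http.client does not follow redirects, so the status is doi.org's own.
        conn.request("HEAD", "/" + quote(doi.strip(), safe="/"))
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return False  # network failure is reported as "could not confirm"
    finally:
        conn.close()
    return 300 <= status < 400
```

Stopping at the doi.org response, rather than following the redirect to the publisher, avoids false negatives from publisher sites that reject automated requests. A network failure returns `False` here, so a caller that needs to distinguish "unreachable" from "not registered" would have to handle the exception itself.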
(See: references/citation_validation.md)

Name: citation-management
License: MIT
Author: aipoch
Source: https://github.com/aipoch/medical-research-skills