Preprint server API for biology and medicine papers
bioRxiv (pronounced "bio-archive") is a free online archive and distribution service for unpublished preprints in the life sciences. Operated by Cold Spring Harbor Laboratory, it provides researchers with immediate access to the latest findings before formal peer review. The bioRxiv API enables programmatic access to preprint metadata, content details, and publication linkage data across biology and medical sciences.
The API serves researchers who need to track emerging research trends, monitor preprint activity in specific subfields, or build automated literature surveillance pipelines. It is particularly valuable for systematic reviewers who want to capture the latest evidence before journal publication, and for bibliometric analysts studying the preprint-to-publication pipeline.
bioRxiv hosts preprints across more than 25 subject areas including neuroscience, genomics, bioinformatics, cell biology, and many more. The API returns structured metadata including titles, authors, abstracts, DOIs, publication dates, and links to corresponding published journal articles when available.
No authentication required. The bioRxiv API is fully open and does not require any API key, token, or registration. All endpoints are publicly accessible, and no formal rate limits are documented.
Fetch detailed metadata for preprints posted to a given server (bioRxiv or medRxiv) within a specified date range.
GET https://api.biorxiv.org/details/{server}/{interval}/{cursor}

| Parameter | Type | Required | Description |
|---|---|---|---|
| server | string | Yes | Server name: biorxiv or medrxiv |
| interval | string | Yes | Date range in YYYY-MM-DD/YYYY-MM-DD format |
| cursor | int | No | Pagination cursor (default 0, increments of 100) |
curl "https://api.biorxiv.org/details/biorxiv/2024-01-01/2024-01-31/0"
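The URL pattern above can also be assembled in code. The following is a minimal sketch, not an official client: the function name `build_details_url` and its validation rules are assumptions made for illustration, and only the path layout and the two server names come from the parameter table.

```python
from datetime import date

BASE_URL = "https://api.biorxiv.org"
VALID_SERVERS = {"biorxiv", "medrxiv"}  # the two server names listed in the table


def build_details_url(server: str, start: date, end: date, cursor: int = 0) -> str:
    """Assemble a details URL from the documented path parameters.

    Hypothetical helper: the checks below are illustrative, not API rules.
    """
    if server not in VALID_SERVERS:
        raise ValueError(f"server must be one of {sorted(VALID_SERVERS)}")
    if end < start:
        raise ValueError("end date must not precede start date")
    if cursor < 0:
        raise ValueError("cursor must be non-negative")
    # date.isoformat() yields YYYY-MM-DD, the format the interval expects.
    interval = f"{start.isoformat()}/{end.isoformat()}"
    return f"{BASE_URL}/details/{server}/{interval}/{cursor}"


print(build_details_url("biorxiv", date(2024, 1, 1), date(2024, 1, 31)))
```

Building the interval from `date` objects rather than raw strings is one way to catch malformed or reversed ranges before any request is sent.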
Response fields: doi, title, authors, author_corresponding, date, category, abstract, published (journal DOI if available), and a jatsxml link.

Look up which preprints have been published in peer-reviewed journals. This endpoint provides the mapping between preprint DOIs and journal article DOIs.
GET https://api.biorxiv.org/pubs/{server}/{interval}/{cursor}

| Parameter | Type | Required | Description |
|---|---|---|---|
| server | string | Yes | Server name: biorxiv or medrxiv |
| interval | string | Yes | Date range in YYYY-MM-DD/YYYY-MM-DD format |
| cursor | int | No | Pagination cursor (default 0, increments of 100) |
curl "https://api.biorxiv.org/pubs/biorxiv/2024-01-01/2024-06-30/0"
Response fields: preprint_doi, published_doi, preprint_title, published_journal, published_date, and preprint_date.

No formal rate limits are documented for the bioRxiv API, but responsible use is expected. Results are paginated at 100 records per request; increment the cursor parameter by 100 to retrieve each additional page. Avoid excessive concurrent requests to ensure availability for all users.
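One way to follow this guidance is to fetch pages sequentially with a short pause between requests. The sketch below assumes a caller-supplied `fetch_page(cursor)` function (a hypothetical name) that returns the decoded JSON for one page; the one-second default delay is an arbitrary courtesy value, not a documented limit.

```python
import time
from typing import Callable, Iterator

PAGE_SIZE = 100  # page size stated in the text above


def iter_records(
    fetch_page: Callable[[int], dict],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records one page at a time, pausing between requests.

    fetch_page is a hypothetical callable: it takes a cursor and returns
    the decoded JSON body for that page.
    """
    cursor = 0
    while True:
        page = fetch_page(cursor)
        records = page.get("collection", [])
        if not records:
            return  # an empty page ends the iteration
        yield from records
        cursor += PAGE_SIZE
        sleep(delay_seconds)  # courtesy pause, an assumption rather than a rule
```

With the `requests` library, `fetch_page` could be as simple as `lambda c: requests.get(f"{base}/{c}", timeout=30).json()`, where `base` is an endpoint URL without the trailing cursor. Because requests are issued one after another, this also avoids concurrent load on the server.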
Retrieve the latest preprints and filter by category to track new submissions in your field:
# Fetch the first page (up to 100 records) for one week and keep the neuroscience preprints
curl "https://api.biorxiv.org/details/biorxiv/2024-06-01/2024-06-07/0" \
| jq '.collection[] | select(.category == "neuroscience")'
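The same filter can be applied in Python once a page has been decoded. This is a small sketch with made-up sample records: the helper name `filter_by_category` is hypothetical, and the case-insensitive comparison is a defensive assumption, since the source does not say how category values are capitalized.

```python
def filter_by_category(records, category):
    """Return the records whose category matches, ignoring case and padding."""
    wanted = category.strip().lower()
    return [
        rec for rec in records
        if str(rec.get("category", "")).strip().lower() == wanted
    ]


# Illustrative records only; the DOIs are placeholders, not real preprints.
sample = [
    {"doi": "10.1101/example.1", "category": "neuroscience"},
    {"doi": "10.1101/example.2", "category": "Genomics"},
    {"doi": "10.1101/example.3", "category": "Neuroscience "},
]

matches = filter_by_category(sample, "neuroscience")
print([rec["doi"] for rec in matches])
```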
Monitor which preprints in your area have been formally published:
# Check publication status for recent preprints
curl "https://api.biorxiv.org/pubs/biorxiv/2024-01-01/2024-06-30/0" \
| jq '.collection[] | select(.published_doi != "")'
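For downstream use, the filtered records can be turned into a lookup table from preprint DOI to journal DOI. The sketch below uses the `preprint_doi` and `published_doi` field names listed for this endpoint; the helper name `published_map` and the sample records are invented for illustration.

```python
def published_map(records):
    """Map each preprint DOI to its journal DOI, skipping unpublished entries."""
    mapping = {}
    for rec in records:
        preprint = rec.get("preprint_doi")
        published = rec.get("published_doi")
        # Skip records where either DOI is missing or an empty string.
        if preprint and published:
            mapping[preprint] = published
    return mapping


# Illustrative records only; the DOIs are placeholders.
sample_pubs = [
    {"preprint_doi": "10.1101/example.1", "published_doi": "10.1000/journal.1"},
    {"preprint_doi": "10.1101/example.2", "published_doi": ""},
    {"preprint_doi": "10.1101/example.3"},
]

links = published_map(sample_pubs)
print(links)
```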
Paginate through all results for a given date range to build a comprehensive alert feed:
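import requests

# The date range is fixed in the base URL; the cursor is appended per request.
base = "https://api.biorxiv.org/details/biorxiv/2024-06-01/2024-06-07"
cursor = 0
all_preprints = []

while True:
    resp = requests.get(f"{base}/{cursor}", timeout=30)
    resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    records = resp.json().get("collection", [])
    if not records:
        break  # an empty page means there is nothing left to fetch
    all_preprints.extend(records)
    cursor += 100  # results are paginated at 100 records per request

print(f"Total preprints retrieved: {len(all_preprints)}")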
To query medRxiv instead, use medrxiv as the server parameter.