Skill Profile

ClinicalTrials.gov API v2 Skill

Name: ClinicalTrials.gov API v2 Skill
Author: 1102tools

Query the ClinicalTrials.gov REST API v2 for clinical study data. Covers searching trials by condition, intervention, sponsor, location, and status; retrieving full study details by NCT ID; filtering by phase, study type, and date ranges. Trigger for any mention of clinical trials, ClinicalTrials.gov, NCT numbers, trial status, recruiting studies, clinical study data, trial phases, trial sponsors, interventional studies, observational studies, trial enrollment, eligibility criteria, trial results, or drug pipeline research. Also trigger when the user needs to find active trials for a condition, identify a sponsor's trial portfolio, look up a specific NCT study, compare trial designs, assess competitive pipelines, or support regulatory review with trial data. This skill serves scientists, reviewers, program staff, and anyone who works with clinical research data at FDA or in the broader biomedical community.

1102tools · 0 stars · April 10, 2026

Occupation
Category: Debugging

Skill Content

Overview

The ClinicalTrials.gov API v2 (https://clinicaltrials.gov/api/v2/) provides free, no-auth access to data on 575,000+ clinical studies registered worldwide. No API key, no registration, no auth headers. Just HTTP GET with JSON responses.

Base URL: https://clinicaltrials.gov/api/v2/
API Docs: https://clinicaltrials.gov/data-api/api
Web UI: https://clinicaltrials.gov

What this data is: Registration and results data for clinical studies conducted around the world. Includes study design, eligibility criteria, interventions, outcomes, sponsor information, recruitment status, enrollment, locations, and (for completed studies) results summaries. Data is submitted by study sponsors and investigators as required by FDAAA 801 and the Final Rule (42 CFR Part 11).


import urllib.request, urllib.parse, json

BASE = "https://clinicaltrials.gov/api/v2"

def ctgov_query(endpoint, params=None):
    """Query ClinicalTrials.gov API v2. Returns parsed JSON."""
    url = f"{BASE}/{endpoint}"
    if params:
        # Remove None values
        params = {k: v for k, v in params.items() if v is not None}
        url += "?" + urllib.parse.urlencode(params, safe=',|')
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=20) as resp:
        return json.loads(resp.read().decode())

Parameter	Description	Example
`query.cond`	Condition or disease	`query.cond=lung+cancer`
`query.intr`	Intervention or treatment	`query.intr=pembrolizumab`
`query.term`	General search term (any field)	`query.term=remdesivir+COVID`
`query.spons`	Sponsor or collaborator	`query.spons=Pfizer`
`query.locn`	Location (country, state, city, or facility)	`query.locn=Maryland`
`query.id`	NCT number(s), comma-separated	`query.id=NCT04280705,NCT05684276`

Parameter	Description	Values
`filter.overallStatus`	Recruitment status	RECRUITING, NOT_YET_RECRUITING, ACTIVE_NOT_RECRUITING, COMPLETED, ENROLLING_BY_INVITATION, SUSPENDED, TERMINATED, WITHDRAWN, AVAILABLE, NO_LONGER_AVAILABLE, TEMPORARILY_NOT_AVAILABLE, APPROVED_FOR_MARKETING, WITHHELD, UNKNOWN
`filter.ids`	Specific NCT IDs	Pipe-delimited: `NCT04280705|NCT05684276`
`filter.geo`	Geographic proximity	`distance(lat,lng,radius)` e.g., `distance(38.9,-77.0,50mi)`
`filter.advanced`	Advanced field-level filter using AREA[] syntax	See Advanced Filtering below

Parameter	Description	Default
`pageSize`	Results per page	10 (max 1000)
`pageToken`	Cursor for next page (from previous response)	None
`countTotal`	Include total count in response	false (set to `true` to get `totalCount`)
`format`	Response format	`json` (also supports `csv`)
`sort`	Sort order	Examples: `LastUpdatePostDate:desc`, `StudyFirstPostDate:asc`, `@relevance`
`fields`	Specific fields to return (comma-separated)	All fields (see Field Selection)
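
To make the parameter tables concrete, the sketch below builds a search URL without sending it. `build_search_url` is a hypothetical helper introduced only for illustration; it reuses the same `urlencode(..., safe=',|')` call as the core helper, so commas and pipes stay literal while spaces become `+`.

```python
import urllib.parse

BASE = "https://clinicaltrials.gov/api/v2"

def build_search_url(**params):
    """Return the /studies URL for the given API parameters (hypothetical helper)."""
    # Drop unset parameters, then encode. Commas and pipes are kept literal
    # because several parameters use them as list separators.
    clean = {k: v for k, v in params.items() if v is not None}
    return f"{BASE}/studies?" + urllib.parse.urlencode(clean, safe=",|")

# Parameter names contain dots, so they are passed through a dict.
url = build_search_url(**{
    "query.cond": "lung cancer",
    "filter.overallStatus": "RECRUITING|NOT_YET_RECRUITING",
    "pageSize": 5,
    "countTotal": "true",
    "pageToken": None,  # unset values are omitted from the URL
})
print(url)
```

Printing the URL before requesting it is a cheap way to check parameter spelling when the API answers with an "unknown parameter" error.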

def search_studies(condition=None, intervention=None, sponsor=None,
                   term=None, location=None, status=None,
                   advanced_filter=None, geo=None,
                   page_size=10, count_total=True, sort=None, fields=None,
                   page_token=None):
    """Search ClinicalTrials.gov studies.

    page_token is the nextPageToken value from a previous response;
    paginate_studies passes it to fetch the next page.
    """
    params = {
        "format": "json",
        "pageSize": page_size,
    }
    if count_total:
        params["countTotal"] = "true"
    if condition:
        params["query.cond"] = condition
    if intervention:
        params["query.intr"] = intervention
    if sponsor:
        params["query.spons"] = sponsor
    if term:
        params["query.term"] = term
    if location:
        params["query.locn"] = location
    if status:
        params["filter.overallStatus"] = status
    if advanced_filter:
        params["filter.advanced"] = advanced_filter
    if geo:
        params["filter.geo"] = geo
    if sort:
        params["sort"] = sort
    if fields:
        params["fields"] = fields
    if page_token:
        # Cursor for the next page, sent as the API's pageToken parameter
        params["pageToken"] = page_token
    return ctgov_query("studies", params)

def get_study(nct_id, fields=None):
    """Get full study details by NCT ID."""
    params = {"format": "json"}
    if fields:
        params["fields"] = fields
    return ctgov_query(f"studies/{nct_id}", params)

def get_version():
    return ctgov_query("version")
# Returns: {"apiVersion": "2.0.5", "dataTimestamp": "2026-03-12T10:00:04"}

def get_stats():
    return ctgov_query("stats/size")
# Returns: {"totalStudies": 575781, ...}

# Filter by study type
filter_advanced = "AREA[StudyType]INTERVENTIONAL"

# Filter by phase
filter_advanced = "AREA[Phase]PHASE3"

# Filter by date range
filter_advanced = "AREA[StartDate]RANGE[2024-01-01,2024-12-31]"

# Combine multiple advanced filters with AND
filter_advanced = "AREA[StudyType]INTERVENTIONAL AND AREA[Phase]PHASE3"

# Filter by sponsor type
filter_advanced = "AREA[LeadSponsorClass]INDUSTRY"

Field	Values	Description
`StudyType`	INTERVENTIONAL, OBSERVATIONAL, EXPANDED_ACCESS	Type of study
`Phase`	EARLY_PHASE1, PHASE1, PHASE2, PHASE3, PHASE4, NA	Trial phase
`LeadSponsorClass`	NIH, FED (other federal), INDUSTRY, NETWORK, OTHER	Sponsor category
`StartDate`	RANGE[YYYY-MM-DD,YYYY-MM-DD]	Study start date range
`CompletionDate`	RANGE[YYYY-MM-DD,YYYY-MM-DD]	Study completion date range
`ResultsFirstPostDate`	RANGE[YYYY-MM-DD,YYYY-MM-DD]	Results posting date range
`Sex`	MALE, FEMALE, ALL	Eligible sex
`AgeRange`	CHILD, ADULT, OLDER_ADULT	Eligible age group
`StudyFirstPostDate`	RANGE[YYYY-MM-DD,YYYY-MM-DD]	Registration date range
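
Longer `filter.advanced` expressions are easy to mistype by hand. The following is a minimal sketch of composing them from small pieces; `area`, `area_range`, and `all_of` are hypothetical helper names, and joining three terms with `AND` is an extrapolation from the two-term example above. The last lines show how `urlencode` percent-encodes the square brackets.

```python
import urllib.parse

def area(field, value):
    """Build a single AREA[field]value term."""
    return f"AREA[{field}]{value}"

def area_range(field, start, end):
    """Build an AREA[field]RANGE[start,end] term for YYYY-MM-DD dates."""
    return f"AREA[{field}]RANGE[{start},{end}]"

def all_of(*terms):
    """Join terms with AND."""
    return " AND ".join(terms)

expr = all_of(
    area("StudyType", "INTERVENTIONAL"),
    area("Phase", "PHASE3"),
    area_range("StartDate", "2024-01-01", "2024-12-31"),
)
print(expr)

# Brackets become %5B and %5D, spaces become +, commas stay literal.
encoded = urllib.parse.urlencode({"filter.advanced": expr}, safe=",|")
print(encoded)
```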

{
  "totalCount": 2279,
  "nextPageToken": "ZVNj7o2Elu8o3lpoUti54...",
  "studies": [
    {
      "protocolSection": {
        "identificationModule": { "nctId": "...", "briefTitle": "...", "officialTitle": "..." },
        "statusModule": { "overallStatus": "RECRUITING", "startDateStruct": {...}, ... },
        "sponsorCollaboratorsModule": { "leadSponsor": { "name": "...", "class": "INDUSTRY" }, ... },
        "descriptionModule": { "briefSummary": "...", "detailedDescription": "..." },
        "conditionsModule": { "conditions": ["..."], "keywords": ["..."] },
        "designModule": { "studyType": "INTERVENTIONAL", "phases": ["PHASE3"], "enrollmentInfo": {...}, ... },
        "armsInterventionsModule": { "armGroups": [...], "interventions": [...] },
        "outcomesModule": { "primaryOutcomes": [...], "secondaryOutcomes": [...] },
        "eligibilityModule": { "eligibilityCriteria": "...", "sex": "ALL", "minimumAge": "18 Years", ... },
        "contactsLocationsModule": { "locations": [...] },
        "referencesModule": { "references": [...] }
      },
      "hasResults": false
    }
  ]
}

Module	Key Fields
`identificationModule`	nctId, briefTitle, officialTitle, organization
`statusModule`	overallStatus, startDateStruct, completionDateStruct, lastUpdateSubmitDate
`sponsorCollaboratorsModule`	leadSponsor (name, class), collaborators
`descriptionModule`	briefSummary, detailedDescription
`conditionsModule`	conditions (array), keywords
`designModule`	studyType, phases (array), enrollmentInfo (count, type), designInfo
`armsInterventionsModule`	armGroups (label, type, description), interventions (name, type, description)
`outcomesModule`	primaryOutcomes, secondaryOutcomes (measure, description, timeFrame)
`eligibilityModule`	eligibilityCriteria (full text), sex, minimumAge, maximumAge, healthyVolunteers
`contactsLocationsModule`	locations (facility, city, state, country, status)
`referencesModule`	references (pmid, citation, type)

# Minimal fields for a summary listing
fields = "NCTId,BriefTitle,OverallStatus,LeadSponsorName,Phase"

# Fields for sponsor portfolio analysis
fields = "NCTId,BriefTitle,OverallStatus,Phase,Condition,InterventionName,EnrollmentCount,StartDate"
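
The names used in `fields` (for example `NCTId`) are not the keys that come back in the response (for example `protocolSection.identificationModule.nctId`). The sketch below maps a few request names to response paths taken from the module table above. `FIELD_PATHS`, `dig`, and `to_row` are hypothetical names, and the sample study is made-up data, not a real record.

```python
# Request-side field name -> dotted path inside one study object.
FIELD_PATHS = {
    "NCTId": "protocolSection.identificationModule.nctId",
    "BriefTitle": "protocolSection.identificationModule.briefTitle",
    "OverallStatus": "protocolSection.statusModule.overallStatus",
    "LeadSponsorName": "protocolSection.sponsorCollaboratorsModule.leadSponsor.name",
    "Phase": "protocolSection.designModule.phases",
}

def dig(obj, path, default=None):
    """Follow a dotted path through nested dicts; return default if any key is missing."""
    for key in path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj

def to_row(study, field_names):
    """Flatten one study into {request field name: value}."""
    return {name: dig(study, FIELD_PATHS[name]) for name in field_names}

# Made-up sample shaped like one entry of the "studies" array.
sample_study = {
    "protocolSection": {
        "identificationModule": {"nctId": "NCT00000000", "briefTitle": "Example study"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": "Example Sponsor"}},
        "designModule": {"phases": ["PHASE3"]},
    }
}

row = to_row(sample_study, ["NCTId", "BriefTitle", "OverallStatus", "LeadSponsorName", "Phase"])
print(row)
```

Returning a default for missing keys keeps the flattening step from failing when a study lacks a module or when a narrow `fields` selection leaves it out.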

import time

def paginate_studies(max_results=100, **search_kwargs):
    """Paginate through search results."""
    all_studies = []
    page_token = None
    
    while len(all_studies) < max_results:
        params = {**search_kwargs}
        if page_token:
            params["page_token"] = page_token
        
        r = search_studies(**params, count_total=(page_token is None))
        studies = r.get("studies", [])
        if not studies:
            break
        
        all_studies.extend(studies)
        page_token = r.get("nextPageToken")
        if not page_token:
            break
        time.sleep(0.5)
    
    return all_studies[:max_results]
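
The cursor logic can be checked without touching the network by separating it from the HTTP call. This is a sketch under that assumption: `iter_studies` is a hypothetical generator that takes any `fetch_page(token)` callable, and `FAKE_PAGES` is made-up data standing in for API responses.

```python
def iter_studies(fetch_page, max_results=100):
    """Yield studies page by page; fetch_page(token) returns one response dict."""
    yielded = 0
    token = None  # None requests the first page
    while yielded < max_results:
        page = fetch_page(token)
        studies = page.get("studies", [])
        if not studies:
            return
        for study in studies:
            if yielded >= max_results:
                return
            yield study
            yielded += 1
        token = page.get("nextPageToken")
        if not token:
            return  # last page reached

# Made-up pages keyed by the token that requests them.
FAKE_PAGES = {
    None: {"studies": [{"id": 1}, {"id": 2}], "nextPageToken": "t1"},
    "t1": {"studies": [{"id": 3}, {"id": 4}], "nextPageToken": "t2"},
    "t2": {"studies": [{"id": 5}]},
}

calls = []

def fake_fetch(token):
    calls.append(token)
    return FAKE_PAGES[token]

first_three = [s["id"] for s in iter_studies(fake_fetch, max_results=3)]
print(first_three, calls)
```

In real use, `fetch_page` would wrap a request that sends the token as the `pageToken` parameter, for example `lambda t: ctgov_query("studies", {"query.cond": "asthma", "pageToken": t})`. Because the generator stops as soon as `max_results` is reached, it never requests a page it does not need.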

def find_recruiting_trials(condition, page_size=10):
    """Find currently recruiting trials for a condition."""
    return search_studies(
        condition=condition,
        status="RECRUITING",
        page_size=page_size,
        sort="LastUpdatePostDate:desc")

def sponsor_pipeline(sponsor_name, page_size=20):
    """Get a sponsor's active trials (recruiting, not yet recruiting, or active but not recruiting)."""
    return search_studies(
        sponsor=sponsor_name,
        status="RECRUITING|NOT_YET_RECRUITING|ACTIVE_NOT_RECRUITING",
        page_size=page_size,
        sort="LastUpdatePostDate:desc")

def trials_near_location(condition, lat, lng, radius_miles=50):
    """Find recruiting trials near a geographic point."""
    return search_studies(
        condition=condition,
        status="RECRUITING",
        geo=f"distance({lat},{lng},{radius_miles}mi)",
        page_size=20)
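
A malformed `filter.geo` string is easier to catch before the request is sent. The helper below is a small sketch, not part of the API: `geo_distance` is a hypothetical name, and it only produces the `mi` form shown in the filter table.

```python
def geo_distance(lat, lng, radius_miles):
    """Return a filter.geo value such as distance(38.9,-77.0,50mi)."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lng <= 180:
        raise ValueError(f"longitude out of range: {lng}")
    if radius_miles <= 0:
        raise ValueError(f"radius must be positive: {radius_miles}")
    return f"distance({lat},{lng},{radius_miles}mi)"

geo = geo_distance(38.9, -77.0, 50)
print(geo)
```

The range checks catch impossible coordinates, but not a latitude and longitude that were swapped while both remain in range.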

def study_detail_summary(nct_id):
    """Get a structured summary of a study."""
    study = get_study(nct_id)
    proto = study.get("protocolSection", {})
    
    ident = proto.get("identificationModule", {})
    status = proto.get("statusModule", {})
    design = proto.get("designModule", {})
    sponsor = proto.get("sponsorCollaboratorsModule", {}).get("leadSponsor", {})
    arms = proto.get("armsInterventionsModule", {})
    outcomes = proto.get("outcomesModule", {})
    elig = proto.get("eligibilityModule", {})
    
    return {
        "nct_id": ident.get("nctId"),
        "title": ident.get("briefTitle"),
        "official_title": ident.get("officialTitle"),
        "status": status.get("overallStatus"),
        "phases": design.get("phases", []),
        "study_type": design.get("studyType"),
        "enrollment": design.get("enrollmentInfo", {}),
        "sponsor": sponsor.get("name"),
        "sponsor_class": sponsor.get("class"),
        "conditions": proto.get("conditionsModule", {}).get("conditions", []),
        "interventions": [{"name": i.get("name"), "type": i.get("type")} 
                         for i in arms.get("interventions", [])],
        "arms": [{"label": a.get("label"), "type": a.get("type")} 
                for a in arms.get("armGroups", [])],
        "primary_outcomes": [o.get("measure") for o in outcomes.get("primaryOutcomes", [])],
        "eligibility": {
            "min_age": elig.get("minimumAge"),
            "max_age": elig.get("maximumAge"),
            "sex": elig.get("sex"),
            "healthy_volunteers": elig.get("healthyVolunteers")
        },
        "has_results": study.get("hasResults", False)
    }

import time

def compare_conditions(conditions, status="RECRUITING"):
    """Compare recruiting trial counts across conditions."""
    results = {}
    for cond in conditions:
        r = search_studies(condition=cond, status=status, page_size=1, count_total=True)
        results[cond] = r.get("totalCount", 0)
        time.sleep(0.5)
    return results

def sponsor_class_breakdown(condition):
    """Compare industry vs NIH vs other sponsorship for a condition."""
    results = {}
    for cls in ["INDUSTRY", "NIH", "OTHER"]:
        r = search_studies(
            condition=condition,
            advanced_filter=f"AREA[LeadSponsorClass]{cls}",
            page_size=1, count_total=True)
        results[cls] = r.get("totalCount", 0)
        time.sleep(0.5)
    return results
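
Raw counts from `compare_conditions` or `sponsor_class_breakdown` are often easier to read as percentages. The sketch below converts a counts dictionary to shares; `to_shares` is a hypothetical helper and the numbers are made up for illustration.

```python
def to_shares(counts):
    """Convert {label: count} to {label: percent of the summed counts}."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Made-up counts, not real API results.
counts = {"INDUSTRY": 120, "NIH": 30, "OTHER": 50}
shares = to_shares(counts)
print(shares)
```

The shares are relative to the classes that were queried, so classes left out of the loop (such as `FED` or `NETWORK`) are not reflected in the denominator.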

RECRUITING                   - Actively enrolling participants
NOT_YET_RECRUITING           - Approved but not yet enrolling
ACTIVE_NOT_RECRUITING        - Ongoing but no longer enrolling
COMPLETED                    - Study finished
ENROLLING_BY_INVITATION      - Enrolling by invitation only
SUSPENDED                    - Temporarily halted
TERMINATED                   - Stopped early
WITHDRAWN                    - Withdrawn before enrollment
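AVAILABLE                    - Expanded access available
NO_LONGER_AVAILABLE          - Expanded access no longer available
TEMPORARILY_NOT_AVAILABLE    - Expanded access temporarily not available
APPROVED_FOR_MARKETING       - Expanded access product approved for marketing
WITHHELD                     - Record withheld from public posting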
UNKNOWN                      - Status not verified in 2+ years
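
Because the API rejects values such as `Recruiting`, it can help to normalize user input before building the request. This is a minimal sketch: `normalize_status` is a hypothetical helper, and `VALID_STATUSES` is copied from the `filter.overallStatus` values listed in the filter table.

```python
VALID_STATUSES = frozenset({
    "RECRUITING", "NOT_YET_RECRUITING", "ACTIVE_NOT_RECRUITING", "COMPLETED",
    "ENROLLING_BY_INVITATION", "SUSPENDED", "TERMINATED", "WITHDRAWN",
    "AVAILABLE", "NO_LONGER_AVAILABLE", "TEMPORARILY_NOT_AVAILABLE",
    "APPROVED_FOR_MARKETING", "WITHHELD", "UNKNOWN",
})

def normalize_status(status):
    """Turn 'Recruiting|not yet recruiting' into 'RECRUITING|NOT_YET_RECRUITING'.

    Raises ValueError for any value that is not a known enum member.
    """
    parts = []
    for raw in status.split("|"):
        value = raw.strip().upper().replace(" ", "_")
        if value not in VALID_STATUSES:
            raise ValueError(f"Unknown overallStatus value: {raw!r}")
        parts.append(value)
    return "|".join(parts)

normalized = normalize_status("Recruiting|not yet recruiting")
print(normalized)
```

Failing locally with a clear message is usually quicker to debug than an "Invalid value in parameter overallStatus" response from the server.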

EARLY_PHASE1    - Early Phase 1 (formerly Phase 0)
PHASE1          - Phase 1
PHASE2          - Phase 2
PHASE3          - Phase 3
PHASE4          - Phase 4 (post-marketing)
NA              - Not Applicable (non-drug studies)

INTERVENTIONAL   - Tests an intervention (drug, device, procedure)
OBSERVATIONAL    - Observes outcomes without assigning interventions
EXPANDED_ACCESS  - Treatment use outside of clinical trials

NIH       - National Institutes of Health
FED       - Other federal agency (FDA, CDC, DoD, VA, etc.)
INDUSTRY  - Pharmaceutical/biotech/device company
NETWORK   - Cooperative group or network
OTHER     - Academic, hospital, individual, other

Error	Cause	Fix
"is unknown parameter"	Invalid parameter name	Check spelling; use `filter.advanced` with AREA[] for field-level filters, not `filter.studyType` or `filter.phase`
"Invalid value in parameter overallStatus"	Bad enum value	Use exact values from enum list (e.g., RECRUITING not Recruiting)
No `totalCount` in response	Forgot `countTotal=true`	Always include `countTotal=true` when you need the count
Empty studies array	No matches	Broaden search terms or check status filter
Timeout on large queries	Response too large	Reduce `pageSize`, add `fields` param to limit payload
429 / throttled	Too many requests	Add delays between requests; no official rate limit published
"contains invalid CSV column name"	Used `fields` param with `format=csv`	Remove `fields` param; CSV uses its own default column set
"Unknown sort field"	Invalid sort field name	Use `LastUpdatePostDate`, `StudyFirstPostDate`, or `@relevance` only

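For the 429 row above, one common approach is to retry with exponential backoff. The following is a sketch under assumptions, not documented API behavior: `with_retry` is a hypothetical wrapper, and the delay values are arbitrary since no official rate limit is published.

```python
import time
import urllib.error

def with_retry(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on HTTP 429, wait and retry with exponential backoff.

    Other HTTP errors, and a 429 on the final attempt, are re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Waits base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)

# Example use with the core helper (requires network access):
# data = with_retry(lambda: ctgov_query("studies", {"query.cond": "asthma"}))
```

Passing `sleep` as an argument keeps the wrapper testable without real waiting.
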
ClinicalTrials.gov API v2 Skill

Overview

Rate Limits

Core Helper Function

Endpoints

1. Search Studies (PRIMARY WORKHORSE)

2. Get Single Study

3. API Version

4. Database Statistics

Advanced Filtering with AREA[] Syntax

Response Structure

Search Response

Key Response Fields

Field Selection

Pagination

Common Workflows

Search for Recruiting Trials by Condition

Sponsor Pipeline Analysis

Find Trials Near a Location

Get Study Details with Arms and Outcomes

Compare Trial Activity Across Conditions

Industry vs. NIH Sponsor Breakdown

Gotchas and Best Practices

1. countTotal Must Be Explicitly Set (MOST COMMON MISTAKE)

2. filter.overallStatus is the Only Top-Level Status Filter

3. Pagination is Cursor-Based, Not Offset-Based

4. Field Names Differ Between Query and Response

5. No API Key Required, But Be Courteous

6. URL-Encode Brackets in AREA[] Filters

7. Results Data Is Sparse

8. Status Values Are Enumerated

9. CSV Format Uses Different Column Names

10. Limited Sort Fields

Enum Reference

Overall Status Values

Phase Values (for AREA[Phase] filter)

Study Type Values (for AREA[StudyType] filter)

Sponsor Class Values (for AREA[LeadSponsorClass] filter)

Troubleshooting
