Define extraction schema, extract study data from full texts, and store it in a structured database for meta-analysis. Use when moving from full-text collection to statistical analysis.
Extract consistent data, capture provenance, and build a clean analysis dataset.
Inputs:

- 04_fulltext/manifest.csv
- 01_protocol/outcomes.md

Outputs:

- 05_extraction/extraction.sqlite
- 05_extraction/extraction.csv
- 05_extraction/llm_suggestions.jsonl (optional)
- 05_extraction/data-dictionary.md
- 05_extraction/extraction-log.md
- 05_extraction/study_map.csv (optional if record_id is not in the extraction CSV)
- 05_extraction/source.csv (optional source references)
- 05_extraction/source_validation.md (optional)

⚠️ Default approach: run web-based extraction FIRST, then use PDFs only for gaps.
1. Create the data dictionary at 05_extraction/data-dictionary.md (use references/data-dictionary-template.md).
2. Initialize the database: run scripts/init_extraction_db.py via uv run. This creates 05_extraction/extraction.sqlite.
3. Populate the studies table in 05_extraction/extraction.sqlite from 03_screening/round-01/included.bib or 04_fulltext/manifest.csv, marking any web-sourced values with [web] in the notes column.
4. Run scripts/llm_extract.py via uv run on available PDFs.
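As an illustration, a minimal sketch of what the schema initialization could look like — the table and column names here are assumptions, not the actual schema produced by scripts/init_extraction_db.py:

```python
import sqlite3


def init_extraction_db(path: str = "05_extraction/extraction.sqlite") -> sqlite3.Connection:
    """Create a minimal extraction schema (illustrative tables/columns)."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS studies (
            study_id     TEXT PRIMARY KEY,
            record_id    TEXT,
            first_author TEXT,
            year         INTEGER,
            notes        TEXT  -- e.g. 'n_total from PubMed abstract [web]'
        );
        CREATE TABLE IF NOT EXISTS outcomes (
            study_id            TEXT REFERENCES studies(study_id),
            outcome             TEXT,
            n_total             INTEGER,
            events_intervention INTEGER,
            events_control      INTEGER,
            mean                REAL,
            sd                  REAL
        );
    """)
    conn.commit()
    return conn
```

Keeping one row per study in `studies` and one row per outcome in `outcomes` avoids duplicating study-level fields across outcomes.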
scripts/llm_extract.py reads 04_fulltext/*.pdf and writes suggestions to 05_extraction/llm_suggestions.jsonl. Review each suggestion before accepting it into 05_extraction/extraction.sqlite, and record decisions in 05_extraction/extraction-log.md.
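The suggestion file format is not specified here; a plausible JSONL record shape (the field names are assumptions, not the actual output of scripts/llm_extract.py) could be appended like this:

```python
import json


def write_suggestion(jsonl_path, study_id, field, value, confidence, quote):
    """Append one extraction suggestion as a JSON line for later human review."""
    record = {
        "study_id": study_id,
        "field": field,              # e.g. "n_total"
        "value": value,
        "confidence": confidence,    # model's self-reported confidence, 0-1
        "supporting_quote": quote,   # verbatim PDF text, for human verification
    }
    with open(jsonl_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Carrying a verbatim supporting quote with every suggestion makes the human-review step much faster than re-reading the PDF.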
Document extraction decisions in 05_extraction/extraction-log.md, export 05_extraction/extraction.csv from SQLite, then optionally record source references in 05_extraction/source.csv and validate with scripts/validate_sources.py.
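Exporting the analysis CSV from SQLite needs nothing beyond the standard library; a sketch (the table name is an assumption about the schema):

```python
import csv
import sqlite3


def export_table_to_csv(db_path: str, table: str, csv_path: str) -> int:
    """Dump one SQLite table to CSV with a header row; return the row count."""
    conn = sqlite3.connect(db_path)
    # Table name must come from trusted config, never user input (f-string SQL).
    cur = conn.execute(f"SELECT * FROM {table}")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])
        rows = cur.fetchall()
        writer.writerows(rows)
    conn.close()
    return len(rows)
```

Usage might look like `export_table_to_csv("05_extraction/extraction.sqlite", "studies", "05_extraction/extraction.csv")`.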
- 05_extraction/source.csv (use references/source-template.csv)
- scripts/validate_sources.py → 05_extraction/source_validation.md
- scripts/init_extraction_db.py initializes a standard extraction schema.
- scripts/llm_extract.py provides LLM-assisted extraction suggestions.
- scripts/validate_sources.py validates extraction against source references.
- references/data-dictionary-template.md provides a dictionary scaffold.
- references/study-map-template.csv maps record_id to study_id if needed.
- references/source-template.csv provides a template for source references.
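In miniature, "validates extraction against source references" might look like the following — the column names and the (study_id, field) matching rule are assumptions, not the actual behavior of scripts/validate_sources.py:

```python
def validate_sources(extraction_rows, source_rows):
    """Flag extracted values that disagree with their recorded source value.

    Each row is a dict; rows are matched on (study_id, field).
    Returns a list of human-readable discrepancy messages.
    """
    sources = {(r["study_id"], r["field"]): r["value"] for r in source_rows}
    problems = []
    for row in extraction_rows:
        key = (row["study_id"], row["field"])
        if key in sources and sources[key] != row["value"]:
            problems.append(
                f"{row['study_id']}.{row['field']}: extracted {row['value']!r} "
                f"!= source {sources[key]!r}"
            )
    return problems
```

Fields with no source reference are simply skipped, matching the "leave NULL, document gap" rule rather than failing validation.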
Note: llm_extract.py requires a PDF parser such as pdfplumber or pypdf (install via uv add).

⚠️ Web-based extraction is the DEFAULT first step — Claude Code should run it BEFORE attempting PDF-based extraction. No scripts or API keys required.
1. Scan extraction.csv for NULL or empty cells in critical columns (e.g., n_total, events_intervention, events_control, mean, sd).
2. Use WebSearch with one of the query patterns:
   - "<first_author> <year> <journal> <intervention> <outcome> results"
   - "<DOI>"
   - "PMID:<pmid> abstract"
3. Use WebFetch on high-value URLs:
   - https://pubmed.ncbi.nlm.nih.gov/<pmid>/
   - https://clinicaltrials.gov/study/<nct_id>
   - https://europepmc.org/article/MED/<pmid>
4. Record recovered values in extraction.csv, marking each with [web] in the notes column (e.g., n_total from PubMed abstract [web]).

Assign a confidence to each web-sourced value:

| Source | Confidence | Action |
|---|---|---|
| PubMed structured abstract | 0.90 | Accept |
| ClinicalTrials.gov registry | 0.85 | Accept |
| Journal webpage / press release | 0.70 | Accept with note |
| Conference abstract only | 0.60 | Flag for verification |
| No source found | — | Leave NULL, document gap |
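The gap scan and the source-confidence rules above can be sketched together — the CSV column names and the source labels are illustrative, not a fixed vocabulary:

```python
import csv

CRITICAL_COLUMNS = ["n_total", "events_intervention", "events_control", "mean", "sd"]

# Confidence/action table from above; the keys are illustrative source labels.
SOURCE_RULES = {
    "pubmed_abstract":     (0.90, "accept"),
    "clinicaltrials_gov":  (0.85, "accept"),
    "journal_webpage":     (0.70, "accept with note"),
    "conference_abstract": (0.60, "flag for verification"),
}


def find_gaps(csv_path):
    """Return (study_id, missing columns) for rows with NULL/empty critical cells."""
    gaps = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            missing = [c for c in CRITICAL_COLUMNS
                       if (row.get(c) or "").strip() in ("", "NULL")]
            if missing:
                gaps.append((row.get("study_id", "?"), missing))
    return gaps


def action_for(source_label):
    """Map a web source to (confidence, action); unknown sources leave the cell NULL."""
    return SOURCE_RULES.get(source_label, (None, "leave NULL, document gap"))
```

Each gap then drives one WebSearch/WebFetch round, and `action_for` decides whether the recovered value is written back with a [web] note.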
Run scripts/validate_sources.py when sources are available.

Pipeline navigation:

| Step | Skill | Stage |
|---|---|---|
| Prev | /ma-fulltext-management | 04 Full-text Management |
| Next | /ma-meta-analysis | 06 Statistical Analysis |
| All | /ma-end-to-end | Full pipeline orchestration |