Collect and manage full-text PDFs for included studies, track provenance, and prepare documents for extraction. Use when moving from screening to data extraction.
Gather full texts, validate completeness, and prepare a clean manifest.
Inputs:
- 03_screening/round-01/included.bib

Outputs:
- 04_fulltext/manifest.csv
- 04_fulltext/unpaywall_results.csv (optional OA lookup)
- 04_fulltext/fulltext_decisions.csv (Stage 04b: full-text eligibility screening)
- 04_fulltext/ft_agreement.md (Stage 04b: full-text inter-rater agreement)
- 04_fulltext/README.md
- 04_fulltext/ PDF files
- 04_fulltext/previews/ (optional PDF image previews)

⚠️ Default approach: Web-based extraction first, PDF retrieval only for gaps.
Phase 1: web-based extraction

1. Create 04_fulltext/ and build manifest.csv with record_id, DOI, PMID, title, and access notes.
   - Input: 03_screening/round-01/included.bib
   - Template: references/manifest-template.csv
   - Output: 04_fulltext/manifest.csv (columns: record_id, DOI, PMID, title, access_method, confidence_score)
2. Extract study details with the WebSearch and WebFetch tools:
   - PubMed (https://pubmed.ncbi.nlm.nih.gov/<pmid>/)
   - ClinicalTrials.gov (https://clinicaltrials.gov/study/<nct_id>)
   - Score each record's confidence (see references/web-extraction.md for scoring).
   - Output: 04_fulltext/manifest.csv (confidence_score column)
3. Flag the gaps that still need a PDF.
   - Output: 04_fulltext/manifest.csv (needs_pdf = TRUE)

Phase 2: PDF retrieval for gaps

4. Query Unpaywall with scripts/unpaywall_fetch.py via uv run.
   - Script: scripts/unpaywall_fetch.py
   - Input: 04_fulltext/manifest.csv (needs_pdf = TRUE rows)
   - Output: 04_fulltext/unpaywall_results.csv
5. Download open-access PDFs with scripts/download_oa_pdfs.py.
   - Script: scripts/download_oa_pdfs.py
   - Output: 04_fulltext/<record_id>.pdf
6. Render previews with scripts/render_pdf_previews.py for visual QA.
   - Script: scripts/render_pdf_previews.py
   - Output: 04_fulltext/previews/<record_id>_page1.png
7. Record manually retrieved PDFs.
   - Output: 04_fulltext/manifest.csv (access_method = "manual")

Resources:
- references/manifest-template.csv provides a manifest header.
- scripts/unpaywall_fetch.py queries Unpaywall for open-access links.
- scripts/analyze_unpaywall.py analyzes Unpaywall results and generates summary statistics.
- scripts/download_oa_pdfs.py downloads open-access PDFs automatically from Unpaywall URLs.
- scripts/render_pdf_previews.py renders PDF pages to PNG previews.
Note: Unpaywall requires UNPAYWALL_EMAIL in .env.
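For orientation, here is a minimal sketch of what a single Unpaywall lookup involves. It assumes the public v2 REST endpoint and its best_oa_location / url_for_pdf response fields. It is not the contents of scripts/unpaywall_fetch.py: the function names and the sample DOI are hypothetical, and reading .env is replaced by a plain environment lookup.

```python
import os
from urllib.parse import quote

# Assumption: the public Unpaywall v2 endpoint; the project script may differ.
UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def build_unpaywall_url(doi, email):
    """Return the lookup URL for one DOI; the API expects an email parameter."""
    if not email:
        raise ValueError("UNPAYWALL_EMAIL is not set")
    return f"{UNPAYWALL_API}{quote(doi, safe='/')}?email={quote(email, safe='@')}"


def pick_pdf_url(payload):
    """Return the best open-access PDF link from a decoded JSON response, or None."""
    location = payload.get("best_oa_location") or {}
    return location.get("url_for_pdf")


# Hypothetical usage: the DOI is a placeholder, not a real record.
email = os.environ.get("UNPAYWALL_EMAIL") or "you@example.org"
lookup_url = build_unpaywall_url("10.1000/example", email)
```

A record whose response yields no url_for_pdf would stay a gap and fall through to manual retrieval.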
Note: PDF previews require pdftoppm or mutool installed.

⚠️ MANDATORY: PRISMA 2020 requires reporting the number of full-text articles excluded with reasons.
After completing full-text retrieval (Phases 1-2 above), re-screen all included studies against the full text to confirm eligibility. This step catches issues not visible at the abstract stage (e.g., wrong population subgroup, insufficient sample size, protocol-only publications).
Run AI full-text screening (Reviewer 1):
uv run tooling/python/ai_screen.py --project <project-name> --stage fulltext --reviewer 1
Run AI full-text screening (Reviewer 2) for dual review:
uv run tooling/python/ai_screen.py --project <project-name> --stage fulltext --reviewer 2
Compute full-text inter-rater agreement (Cohen's kappa):
uv run ma-screening-quality/scripts/dual_review_agreement.py \
--file projects/<project-name>/04_fulltext/fulltext_decisions.csv \
--col-a FT_Reviewer1_Decision --col-b FT_Reviewer2_Decision \
--out projects/<project-name>/04_fulltext/ft_agreement.md
Resolve conflicts (if any) — update FT_Final_Decision and FT_Exclusion_Code columns.
Only studies with FT_Final_Decision = include proceed to Stage 05 (data extraction).
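To make the agreement statistic in the steps above concrete, the following is a small sketch of Cohen's kappa for two reviewers. It illustrates the formula only and is not the code in dual_review_agreement.py; the function name and the sample decisions are made up for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical decisions."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of records where both reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each reviewer's label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Illustrative decisions only, not project data.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)  # observed 0.75, chance 0.5, kappa 0.5
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because include/exclude decisions are often heavily unbalanced.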
Columns (fulltext_decisions.csv):

| Column | Description |
|---|---|
| record_id | Matches manifest.csv and screening decisions |
| title | Study title |
| doi | Digital Object Identifier |
| pmid | PubMed ID |
| FT_Reviewer1_Decision | include / exclude |
| FT_Reviewer1_Reason | Reason with exclusion code reference |
| FT_Reviewer2_Decision | include / exclude |
| FT_Reviewer2_Reason | Reason with exclusion code reference |
| FT_Final_Decision | include / exclude (resolved) |
| FT_Exclusion_Code | Exclusion code (P1, S2, etc.) or NONE |
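The column layout above lends itself to a quick sanity check before extraction. The sketch below is one possible way to do it with the standard csv module. The function name is hypothetical, and the rule that an excluded record must carry a code other than NONE is an assumption rather than something the pipeline is known to enforce.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "record_id", "title", "doi", "pmid",
    "FT_Reviewer1_Decision", "FT_Reviewer1_Reason",
    "FT_Reviewer2_Decision", "FT_Reviewer2_Reason",
    "FT_Final_Decision", "FT_Exclusion_Code",
]
ALLOWED_DECISIONS = {"include", "exclude"}


def find_problems(csv_text):
    """Return a list of human-readable problems found in the decisions CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    for row in reader:
        decision = (row["FT_Final_Decision"] or "").strip().lower()
        code = (row["FT_Exclusion_Code"] or "").strip().upper()
        if decision not in ALLOWED_DECISIONS:
            problems.append(f"{row['record_id']}: unresolved FT_Final_Decision")
        elif decision == "exclude" and code in {"", "NONE"}:
            # Assumption: every exclusion needs a reportable reason code.
            problems.append(f"{row['record_id']}: excluded without an exclusion code")
    return problems
```

Passing the file contents, for example via Path(...).read_text(), keeps the check easy to run on a small sample before pointing it at a real project.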
Reuses standard codes from ma-screening-quality/references/screening-labels.md:
P1/P2 (population), I1/I2 (intervention), C1 (comparator), S1-S4 (study design),
O1/O2 (outcomes), T1/T2 (time), L1 (language), D1 (duplicate).
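Because excluded full texts have to be reported with reasons, a tally by code family is a natural by-product of these codes. The sketch below groups final exclusions by the leading letter of the code, using the families listed above. The function name and the sample rows are hypothetical, and grouping by first letter is an assumption about how the codes are structured.

```python
from collections import Counter

# Code families as listed above, keyed by the leading letter of the code.
CODE_CATEGORIES = {
    "P": "population", "I": "intervention", "C": "comparator",
    "S": "study design", "O": "outcomes", "T": "time",
    "L": "language", "D": "duplicate",
}


def tally_exclusions(rows):
    """Count finally excluded records per code family."""
    counts = Counter()
    for row in rows:
        if row.get("FT_Final_Decision", "").strip().lower() != "exclude":
            continue
        code = row.get("FT_Exclusion_Code", "").strip().upper()
        # Unknown or missing codes are kept visible rather than dropped.
        counts[CODE_CATEGORIES.get(code[:1], "uncategorised")] += 1
    return dict(counts)


# Illustrative rows only, not project data.
sample_rows = [
    {"record_id": "r1", "FT_Final_Decision": "exclude", "FT_Exclusion_Code": "P1"},
    {"record_id": "r2", "FT_Final_Decision": "exclude", "FT_Exclusion_Code": "P2"},
    {"record_id": "r3", "FT_Final_Decision": "exclude", "FT_Exclusion_Code": "S3"},
    {"record_id": "r4", "FT_Final_Decision": "include", "FT_Exclusion_Code": "NONE"},
]
reason_counts = tally_exclusions(sample_rows)
```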
FT_Exclusion_Code feeds directly into PRISMA flow diagram item 16.

Validation:
- record_id continuity with screening decisions.
- fulltext_decisions.csv exists before proceeding to Stage 05.
- FT_Final_Decision values are resolved (no blanks) before extraction.

| Step | Skill | Stage |
|---|---|---|
| Prev | /ma-screening-quality | 03 Screening & Quality |
| 04b | (this skill) | Full-text eligibility screening |
| Next | /ma-data-extraction | 05 Data Extraction |
| All | /ma-end-to-end | Full pipeline orchestration |
Edit PDFs with natural-language instructions using the nano-pdf CLI.