Name: Systematic Review & Meta-Analysis (Autonomous)
Author: chunchiehfan


Structure the user's question using PICO(S):
- P — Population (age, diagnosis, setting, comorbidities)
- I — Intervention (drug, procedure, exposure)
- C — Comparison (placebo, standard care, another intervention)
- O — Outcome(s): primary first, then secondary
- S — Study design (RCTs, observational, all)
Select MeSH terms + free-text synonyms for each PICO element.
Build a Boolean PubMed search string.
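
The last two steps can be sketched as a small helper. This is a minimal illustration, not the skill's own code: `build_pubmed_query` and the input layout are hypothetical, and the example terms are only placeholders for whatever MeSH headings and synonyms fit the user's question. Terms inside one PICO element are OR-ed, and the elements are AND-ed.

```python
def build_pubmed_query(concepts):
    """Combine PICO concepts into one Boolean PubMed query string.

    `concepts` maps a PICO element to {"mesh": [...], "text": [...]}.
    This input layout is an assumption made for the sketch.
    """
    blocks = []
    for element, terms in concepts.items():
        parts = [f'"{t}"[MeSH]' for t in terms.get("mesh", [])]
        parts += [f'"{t}"[tiab]' for t in terms.get("text", [])]
        if parts:  # elements without search terms (often C or S) are skipped
            blocks.append("(" + " OR ".join(parts) + ")")
    return " AND ".join(blocks)


# Illustrative terms only
example = {
    "P": {"mesh": ["Diabetes Mellitus, Type 2"], "text": ["type 2 diabetes", "T2DM"]},
    "I": {"mesh": ["Sodium-Glucose Transporter 2 Inhibitors"], "text": ["empagliflozin"]},
    "O": {"mesh": ["Heart Failure"], "text": ["heart failure hospitalization"]},
}
query = build_pubmed_query(example)
print(query)
```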

Log all decisions:

logger.log(phase=1, key="pico_definition",
           value={"P": "...", "I": "...", "C": "...", "O": "...", "S": "..."},
           rationale="Derived from user prompt: ...", confidence="auto")
logger.log(phase=1, key="search_string",
           value="(MeSH[MeSH] OR synonym[tiab]) AND ...",
           rationale="Combined MeSH + free-text for each PICO element", confidence="auto")
logger.log(phase=1, key="study_design_filter",
           value="RCTs and cohort studies",
           rationale="User asked about effectiveness, included both for breadth", confidence="auto")
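
The `logger` object used in these calls is not defined in this section. As a rough sketch of what it might look like, assuming it only needs `log()` and `save()` and that the two confidence tiers seen here (`"auto"` and `"needs_review"`) are the only ones, a stand-in could be written as follows. The class name `DecisionLogger` and the timestamp field are hypothetical.

```python
import json
from datetime import datetime, timezone


class DecisionLogger:
    """Hypothetical stand-in for the skill's logger; keeps entries in memory."""

    TIERS = ("auto", "needs_review")  # assumed from the calls in this document

    def __init__(self, path="decisions_log.json"):
        self.path = path
        self.entries = []

    def log(self, phase, key, value, rationale, confidence="auto"):
        if confidence not in self.TIERS:
            raise ValueError(f"unknown confidence tier: {confidence!r}")
        self.entries.append({
            "phase": phase,
            "key": key,
            "value": value,
            "rationale": rationale,
            "confidence": confidence,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })

    def save(self):
        """Write all collected decisions to the JSON log file."""
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.entries, fh, indent=2)
```

With `logger = DecisionLogger()` in scope, the `logger.log(...)` calls in this document run unchanged against the sketch.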

If the user provided an NCBI API key, use it. Otherwise proceed without one; NCBI then limits E-utilities requests to 3 per second instead of 10.
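
For illustration, a client could respect the two rate limits (10 requests per second with a key, 3 without) using a simple throttle. This is only a sketch; `Throttle` and `min_request_interval` are hypothetical names, and the skill's scripts may handle pacing differently.

```python
import time


def min_request_interval(api_key=None):
    """Seconds between E-utilities calls: 10 per second with a key, 3 without."""
    return 1 / 10 if api_key else 1 / 3


class Throttle:
    """Blocks just long enough to keep calls under the allowed rate."""

    def __init__(self, api_key=None):
        self.interval = min_request_interval(api_key)
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
```

Calling `throttle.wait()` before each HTTP request is enough to stay within the limit in a single-threaded script.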

Run PubMed search:

python $SKILL_DIR/scripts/pubmed_search.py "<search_string>" \
  --max-results 500 --api-key <KEY_OR_OMIT> --output pubmed_results.json

If the search returns more than 500 results, make a judgment call: either narrow the query or raise --max-results. Log the decision and the reason for it.

Fetch PMC full text for all retrieved articles:

python $SKILL_DIR/scripts/pmc_fulltext.py pubmed_results.json \
  --output-dir fulltext_data/ --api-key <KEY_OR_OMIT>

Log decisions:

logger.log(phase=2, key="search_executed",
           value={"query": "...", "results": N, "max_results": 500},
           rationale="...", confidence="auto")
logger.log(phase=2, key="no_fulltext_papers",
           value=["PMID1", "PMID2", ...],  # from fulltext_data/no_fulltext.json
           rationale="These PMIDs are not available in PMC. Screened on abstract only.",
           confidence="needs_review")

Write search_log.md with date, query, record count, API key status, PMC coverage.
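
One possible way to assemble that file is shown below. The layout of the log is an assumption, as is the function name `render_search_log`; only the five reported fields come from the instruction above.

```python
from datetime import date


def render_search_log(query, record_count, api_key_used, n_with_fulltext):
    """Return the text of search_log.md (field layout is an assumption)."""
    coverage = n_with_fulltext / record_count if record_count else 0.0
    lines = [
        "# Search log",
        f"- Date: {date.today().isoformat()}",
        f"- Query: `{query}`",
        f"- Records retrieved: {record_count}",
        f"- NCBI API key: {'provided' if api_key_used else 'not provided'}",
        f"- PMC full text: {n_with_fulltext}/{record_count} ({coverage:.0%})",
    ]
    return "\n".join(lines) + "\n"
```

The caller would write the returned string to `search_log.md` once the PMC fetch has finished and the coverage count is known.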

[
  {
    "pmid": "12345678",
    "title": "...",
    "authors": "Smith et al.",
    "year": "2020",
    "decision": "include",
    "confidence": "auto",
    "reason": "RCT of empagliflozin vs placebo in T2DM, reports HF hospitalization"
  },
  {
    "pmid": "87654321",
    "title": "...",
    "decision": "exclude",
    "confidence": "auto",
    "reason": "Pediatric population — outside PICO"
  },
  {
    "pmid": "11111111",
    "title": "...",
    "decision": "include",
    "confidence": "uncertain",
    "reason": "Abstract unclear on comparator arm; included based on full-text review showing placebo control"
  }
]

logger.log(phase=3, key="auto_excluded",
           value={"count": N, "top_reasons": [{"reason": "...", "n": N}, ...]},
           rationale="High-confidence exclusions", confidence="auto")
logger.log(phase=3, key="auto_included",
           value={"count": N, "pmids": [...]},
           rationale="High-confidence inclusions", confidence="auto")
logger.log(phase=3, key="uncertain_decisions",
           value=[{"pmid": "...", "title": "...", "decision": "include/exclude", "reason": "..."}],
           rationale="Best-guess decisions for ambiguous papers",
           confidence="needs_review")
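
The three logged values could be derived from the screening records along these lines. This is a sketch that assumes each record carries the `pmid`, `title`, `decision`, `confidence`, and `reason` fields shown in the JSON example, and that any confidence other than `"auto"` counts as uncertain; `summarize_screening` is a hypothetical helper.

```python
from collections import Counter


def summarize_screening(records, top_n=3):
    """Split screening records into the three groups that get logged."""
    auto_inc = [r for r in records
                if r["confidence"] == "auto" and r["decision"] == "include"]
    auto_exc = [r for r in records
                if r["confidence"] == "auto" and r["decision"] == "exclude"]
    uncertain = [r for r in records if r["confidence"] != "auto"]
    reasons = Counter(r["reason"] for r in auto_exc)
    return {
        "auto_excluded": {
            "count": len(auto_exc),
            "top_reasons": [{"reason": k, "n": n}
                            for k, n in reasons.most_common(top_n)],
        },
        "auto_included": {
            "count": len(auto_inc),
            "pmids": [r["pmid"] for r in auto_inc],
        },
        "uncertain_decisions": [
            {"pmid": r["pmid"], "title": r["title"],
             "decision": r["decision"], "reason": r["reason"]}
            for r in uncertain
        ],
    }
```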

Effect measure reporting across included studies:
  OR (odds ratio): 7 studies — reported directly
  Raw events (convertible to OR/RR): 5 studies — event counts in results tables
  HR (hazard ratio): 2 studies — Cox regression results
  Adjusted OR: 3 studies — from multivariate logistic regression

logger.log(phase=4, key="effect_measure_frequency",
           value={"OR": 7, "raw_events": 5, "HR": 2, "adjusted_OR": 3},
           rationale="Surveyed all included studies' results sections and tables",
           confidence="auto")
logger.log(phase=4, key="effect_measure_chosen",
           value="OR",
           rationale="Binary outcome (HF hospitalization). 7/12 studies report OR natively. "
                     "5 more have raw events convertible to OR. HR studies excluded from primary "
                     "analysis (addressed in sensitivity analysis).",
           confidence="auto")
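
The conversion from raw event counts to an odds ratio is the standard 2x2 calculation: OR = (a·d)/(b·c), with the standard error of log(OR) equal to sqrt(1/a + 1/b + 1/c + 1/d). A minimal sketch follows. The function name is hypothetical, and the 0.5 continuity correction for zero cells is one common choice, not necessarily what the skill's scripts apply.

```python
import math


def odds_ratio_from_counts(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    Adds 0.5 to every cell if any cell is zero (an assumed convention).
    """
    a, b = events_trt, n_trt - events_trt   # treatment: events, non-events
    c, d = events_ctl, n_ctl - events_ctl   # control: events, non-events
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return {
        "or": math.exp(log_or),
        "ci": (math.exp(log_or - z * se), math.exp(log_or + z * se)),
        "log_or": log_or,
        "se": se,
    }
```

Keeping `log_or` and `se` in the result is convenient because pooling is normally done on the log scale.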

Drug stratification:
  Empagliflozin 10mg: 4 studies, N=12,450
  Empagliflozin 25mg: 2 studies, N=6,200
  Dapagliflozin 10mg: 3 studies, N=8,200
  Canagliflozin 300mg: 2 studies, N=4,100
  Mixed/class-level (no specific drug): 3 studies, N=6,800

logger.log(phase=4, key="drug_stratification",
           value={"drugs": [...], "approach": "individual_drug", "class_level_studies": [...]},
           rationale="Sufficient per-drug studies for individual pooling. Class-level studies "
                     "kept as separate subgroup.",
           confidence="auto")
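
The stratification decision could be sketched as below. The threshold `min_studies`, the `"class_level"` fallback value, and the record fields (`study`, `drug`, `n`) are all assumptions for illustration; the document only shows the `"individual_drug"` approach and does not state a cutoff.

```python
from collections import defaultdict


def stratify_by_drug(studies, min_studies=2):
    """Group studies by drug and choose a pooling approach.

    Studies with no specific drug are treated as class-level.
    `min_studies` is an illustrative threshold, not one given by the skill.
    """
    groups = defaultdict(list)
    for s in studies:
        groups[s.get("drug") or "class_level"].append(s)
    class_level = groups.pop("class_level", [])
    per_drug = {
        drug: {"k": len(members), "n": sum(m["n"] for m in members)}
        for drug, members in groups.items()
    }
    poolable = bool(per_drug) and all(
        g["k"] >= min_studies for g in per_drug.values()
    )
    return {
        "drugs": per_drug,
        "approach": "individual_drug" if poolable else "class_level",
        "class_level_studies": [s["study"] for s in class_level],
    }
```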

logger.log(phase=4, key="data_extraction_summary",
           value={"studies_extracted": N, "conversions_applied": [...],
                  "data_source_tracking": [{"study": "Smith 2020", "table": "Table 2", "values": {...}}]},
           rationale="Extracted from full-text tables where available, abstract for PMIDs without PMC access",
           confidence="auto")

# Overall
python $SKILL_DIR/scripts/meta_analysis.py extracted_data.csv \
  --measure OR --output meta_results.json

# Per-drug (filter CSV to each drug first)
python $SKILL_DIR/scripts/meta_analysis.py extracted_data_empagliflozin.csv \
  --measure OR --output meta_results_empagliflozin.json
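
The per-drug run expects a CSV that has already been filtered to one drug. A minimal sketch of that filtering step is shown here, assuming `extracted_data.csv` has a column named `drug` (the column name and the helper `split_csv_by_drug` are hypothetical).

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_csv_by_drug(src="extracted_data.csv", column="drug"):
    """Write one <stem>_<drug>.csv next to `src` for each distinct drug."""
    src_path = Path(src)
    with open(src_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        fieldnames = reader.fieldnames
        rows_by_drug = defaultdict(list)
        for row in reader:
            # Normalize so "Empagliflozin" and "empagliflozin" share a file
            drug = row[column].strip().lower().replace(" ", "_")
            rows_by_drug[drug].append(row)

    written = []
    for drug, rows in rows_by_drug.items():
        out = src_path.with_name(f"{src_path.stem}_{drug}.csv")
        with open(out, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(str(out))
    return written
```

Each returned path can then be passed to `meta_analysis.py` in place of `extracted_data_empagliflozin.csv`.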

python $SKILL_DIR/scripts/generate_figures.py meta_results.json \
  --forest forest_plot.png --funnel funnel_plot.png \
  --prisma prisma_data.json --prisma-out prisma_diagram.png \
  --title "<Review Title>"

logger.log(phase=5, key="primary_analysis",
           value={"pooled_effect": 0.72, "ci": [0.61, 0.85], "I2": 42.1, "k": 12},
           rationale="DerSimonian-Laird random effects", confidence="auto")
logger.log(phase=5, key="sensitivity_exclude_high_rob",
           value={"pooled_effect": 0.71, "ci": [0.59, 0.86], "k": 10,
                  "interpretation": "Direction and significance unchanged"},
           rationale="Removed 2 high-RoB studies", confidence="auto")
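
For reference, DerSimonian-Laird pooling can be written in a few lines. This sketch works on log-scale effects (for example log odds ratios) and their standard errors, and returns fields named like the logged values. It is a generic textbook version, not a copy of `meta_analysis.py`, and the function name is hypothetical.

```python
import math


def dersimonian_laird(effects, std_errors, z=1.96):
    """Random-effects pooling of log-scale effects (DerSimonian-Laird)."""
    k = len(effects)
    w = [1 / se ** 2 for se in std_errors]               # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "pooled_effect": math.exp(pooled),
        "ci": [math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)],
        "I2": i2,
        "tau2": tau2,
        "k": k,
    }
```

A sensitivity analysis such as excluding high-RoB studies amounts to calling the same function on the reduced list of effects and comparing the two results.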

Create systematic_review_report.md following the standard PRISMA structure (see the report template in references/extraction_templates.md — sections: Executive Summary, Background, Methods, Results, Discussion, Conclusion, Included Studies).
Include GRADE assessment for each primary outcome.
Save the decision log:
```
logger.save()
```

Generate the draft review with checkpoints:

python $SKILL_DIR/scripts/generate_review_report.py \
  --decisions decisions_log.json \
  --report systematic_review_report.md \
  --output draft_review.md

Present draft_review.md to the user and explain:
- Pre-checked items (✓) are decisions Claude is confident about — skim and move on
- Unchecked items (☐) need their attention
- "Change to" fields are where they write overrides
- When done reviewing, tell Claude to finalize or rerun

Parse the user's modifications to identify changed decisions.

Determine rerun scope:

python $SKILL_DIR/scripts/rerun_from_changes.py \
  --original decisions_log.json \
  --modified decisions_modified.json

Rerun only affected phases (from earliest change through Phase 6).
Regenerate draft_review.md with updated decisions and results.
If no changes were made, finalize the report as-is.
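
The rerun rule can be illustrated with a short comparison of the two decision logs. This sketch assumes both files are lists of entries with `phase`, `key`, and `value` fields; the real `rerun_from_changes.py` may use a different structure, and both function names here are hypothetical.

```python
def earliest_changed_phase(original, modified):
    """Return the lowest phase with a changed or new decision, else None."""
    before = {(e["phase"], e["key"]): e["value"] for e in original}
    changed = [
        e["phase"] for e in modified
        if (e["phase"], e["key"]) not in before
        or before[(e["phase"], e["key"])] != e["value"]
    ]
    return min(changed) if changed else None


def phases_to_rerun(original, modified, last_phase=6):
    """Phases from the earliest change through the final phase."""
    start = earliest_changed_phase(original, modified)
    return [] if start is None else list(range(start, last_phase + 1))
```

An empty list corresponds to the case where the user changed nothing and the report is finalized as-is.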

| Phase | Key files | Decision logging |
|---|---|---|
| 1. Topic refinement | `pico.md` | PICO, search string, design filter |
| 2. Literature search | `pubmed_results.json`, `fulltext_data/`, `search_log.md` | Query, result count, no-fulltext PMIDs |
| 3. Screening | `screening_results.json`, `prisma_data.json` | Auto-exclude/include/uncertain counts |
| 4. Data extraction | `extracted_data.csv`, `study_characteristics.csv` | Effect measure, drug stratification, source tracking |
| 5. Meta-analysis | `meta_results.json`, PNG figures | Pooled results, sensitivity analyses |
| 6. Report | `draft_review.md`, `decisions_log.json` | All above consolidated |
