Skill content

An end-to-end clinical meta-analysis pipeline that follows Cochrane Handbook methodology. It runs in three phases: literature identification (PICO search and screening), critical appraisal (data extraction and RoB 2.0 assessment), and deterministic statistical synthesis (DerSimonian-Laird pooling and GRADE). It outputs a Cochrane-style Markdown report and structured JSON.

Setup

Install Python dependencies before running any pipeline phase:

pip install scipy statsmodels numpy

Verify all pipeline modules load correctly (run from the skill directory):

cd ${CLAUDE_SKILL_DIR} && python3 -c "from pipeline.effect_sizes import compute_log_or; from pipeline.pooling import pool_random_effects_dl; from pipeline.grade import compute_grade; from pipeline.report import assemble_report; print('OK')"

Important: All Python commands must be run from the skills/meta-analyst/ directory so that `from pipeline.X import ...` resolves correctly. Always prefix Python calls with `cd ${CLAUDE_SKILL_DIR} &&`.

User question (free-text clinical question)
  │
  ▼
PHASE 1 — IDENTIFY
  Stage 1.1  PICO Formalization          [LLM: extract P, I, C, O with MeSH + synonyms]
  Stage 1.2  Search Strategy             [LLM: Cochrane MECIR + RCT filter → human ✓]
  Stage 1.3  Search Execution            [API: PubMed E-utilities, CENTRAL, CTG v2 → dedup]
  Stage 1.4  Abstract Screening          [LLM: include/exclude/uncertain → human ✓]
  Stage 1.5  PRISMA Flow                 [Python: prisma_flow_svg()]
  │
  ▼
PHASE 2 — APPRAISE
  Stage 2.1  Data Extraction             [LLM: 3× majority-vote extraction from full text]
  Stage 2.2  Risk of Bias Assessment     [Evidence Evaluator + fallback RoB 2.0 checklist]
  Stage 2.3  Characteristics Table       [Python: format_characteristics_table()]
                                         [→ human ✓ before synthesis]
  │
  ▼
PHASE 3 — SYNTHESIZE
  Stage 3.1  Effect Size Computation     [Python: effect_sizes.py]
  Stage 3.2  Meta-Analytic Pooling       [Python: pooling.py — DL default, FE comparison]
  Stage 3.3  Heterogeneity              [Python: heterogeneity.py — Q, I², tau², PI]
  Stage 3.4  Sensitivity Analyses        [Python: sensitivity.py — LOO, high-RoB, FvR]
  Stage 3.5  Publication Bias            [Python: publication_bias.py — Egger's (k≥10)]
  Stage 3.6  GRADE Certainty             [Python + LLM: 5 domains]
  Stage 3.7  Report Assembly             [Python: report.py — Markdown + JSON]
  │
  ▼
Output: meta_analysis_[slug].md + meta_analysis_[slug].json

from pipeline.visualizations import prisma_flow_svg

# Replace each n with the integer count recorded during Phase 1.
svg = prisma_flow_svg({
    "db_pubmed": n, "db_central": n, "db_ctgov": n,
    "identified": n, "duplicates_removed": n, "screened": n,
    "excluded_screening": n, "eligible": n,
    "excluded_eligibility": n, "included": n,
})
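Before rendering the diagram, it can help to confirm that the counts are arithmetically consistent. The sketch below assumes the keys relate as in a standard PRISMA flow (the three database counts sum to `identified`, and each later box equals the previous box minus its exclusions). Those relations and the helper name `check_prisma_counts` are assumptions for illustration, not part of the pipeline.

```python
def check_prisma_counts(c):
    """Return a list of human-readable inconsistencies (empty if none)."""
    # Assumed relations between the boxes of the flow diagram.
    expected = {
        "identified": c["db_pubmed"] + c["db_central"] + c["db_ctgov"],
        "screened": c["identified"] - c["duplicates_removed"],
        "eligible": c["screened"] - c["excluded_screening"],
        "included": c["eligible"] - c["excluded_eligibility"],
    }
    problems = []
    for key, value in expected.items():
        if c[key] != value:
            problems.append(f"{key}: expected {value}, got {c[key]}")
    return problems


# Illustrative numbers only, not results from a real search.
counts = {
    "db_pubmed": 120, "db_central": 80, "db_ctgov": 20,
    "identified": 220, "duplicates_removed": 40, "screened": 180,
    "excluded_screening": 150, "eligible": 30,
    "excluded_eligibility": 18, "included": 12,
}
print(check_prisma_counts(counts))  # prints []
```

An empty list means the counts can be passed on as they are; any message points at the box whose number needs rechecking.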

npx skills add SciSpark-ai/evidence_evaluator

from pipeline.visualizations import rob_traffic_light_svg

svg = rob_traffic_light_svg([
    {"study": "McMurray 2019", "domains": [
        {"domain": "D1: Randomisation", "judgment": "low"},
        {"domain": "D2: Deviations",    "judgment": "low"},
        {"domain": "D3: Missing data",  "judgment": "low"},
        {"domain": "D4: Measurement",   "judgment": "low"},
        {"domain": "D5: Reporting",     "judgment": "some concerns"},
    ]}
])
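The per-domain judgments above can be collapsed into an overall rating per study. The sketch below applies only the mechanical part of the RoB 2 rule (any "high" domain gives high; otherwise any "some concerns" gives some concerns; otherwise low). RoB 2 guidance also lets an assessor rate a study high when several domains raise some concerns, which this sketch does not attempt. The function name `overall_rob` is hypothetical.

```python
VALID = {"low", "some concerns", "high"}


def overall_rob(domains):
    """Derive an overall judgment from a list of per-domain judgment dicts."""
    judgments = {d["judgment"].strip().lower() for d in domains}
    unknown = judgments - VALID
    if unknown:
        raise ValueError(f"unrecognised judgment(s): {sorted(unknown)}")
    if "high" in judgments:
        return "high"
    if "some concerns" in judgments:
        return "some concerns"
    return "low"


study = {"study": "McMurray 2019", "domains": [
    {"domain": "D1: Randomisation", "judgment": "low"},
    {"domain": "D2: Deviations",    "judgment": "low"},
    {"domain": "D3: Missing data",  "judgment": "low"},
    {"domain": "D4: Measurement",   "judgment": "low"},
    {"domain": "D5: Reporting",     "judgment": "some concerns"},
]}
print(overall_rob(study["domains"]))  # prints: some concerns
```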

from pipeline.report import format_characteristics_table

table_md = format_characteristics_table(table_studies)
# table_studies: list of dicts with keys:
# first_author, year, n_intervention, n_control,
# intervention_description, comparator_description,
# followup_duration, rob_overall

from pipeline.effect_sizes import compute_log_or, compute_md, compute_smd, zero_cell_correction
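For orientation, the standard log odds ratio from a 2×2 table can be sketched as follows. This is not the pipeline's `compute_log_or` or `zero_cell_correction`; the function name, argument order, and the choice to add 0.5 to every cell when any cell is zero are assumptions based on the common continuity correction.

```python
import math


def log_odds_ratio(a, b, c, d, correction=0.5):
    """Log OR and its standard error.

    a, b: events / non-events in the intervention arm
    c, d: events / non-events in the control arm
    """
    if 0 in (a, b, c, d):
        # Continuity correction: add a constant to all four cells.
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


log_or, se = log_odds_ratio(10, 90, 20, 80)
print(round(log_or, 4), round(se, 4))
```

Working on the log scale keeps the sampling distribution approximately normal, which is why pooling is done on log OR and the result is exponentiated afterwards.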

from pipeline.pooling import pool_random_effects_dl, pool_fixed_effect_iv, pool_mantel_haenszel
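As a rough illustration of what DerSimonian-Laird pooling involves, the sketch below estimates the between-study variance from Cochran's Q and reweights each study by the inverse of its total variance. It uses only the standard library; the function name `dl_pool` and the returned key names (borrowed from the JSON export fields) are assumptions, and the real `pool_random_effects_dl` may differ in signature and output.

```python
import math
from statistics import NormalDist


def dl_pool(effects, ses, level=0.95):
    """DerSimonian-Laird random-effects pooling of effects with standard errors."""
    k = len(effects)
    w = [1 / se**2 for se in ses]                       # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    # Method-of-moments estimate, truncated at zero.
    tau_sq = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    w_re = [1 / (se**2 + tau_sq) for se in ses]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(pooled) / se_pooled))
    return {
        "pooled": pooled,
        "ci_lower": pooled - z_crit * se_pooled,
        "ci_upper": pooled + z_crit * se_pooled,
        "p_value": p_value,
        "tau_sq": tau_sq,
    }


# Illustrative inputs only (log odds ratios and their standard errors).
print(dl_pool([-0.5, -0.3, -0.4], [0.1, 0.1, 0.1]))
```

When the estimated between-study variance is zero, the random-effects weights equal the fixed-effect weights and the two models give the same pooled estimate.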

from pipeline.heterogeneity import cochrans_q, i_squared, tau_squared_dl, prediction_interval
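Cochran's Q and I² follow directly from the inverse-variance weights. The sketch below is a standard-library illustration under the usual definitions (Q as the weighted sum of squared deviations from the fixed-effect estimate, I² as the share of Q in excess of its degrees of freedom); the function name `q_and_i2` is hypothetical and is not one of the pipeline functions imported above.

```python
def q_and_i2(effects, ses):
    """Return (Q, degrees of freedom, I-squared in percent)."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q falls below its expectation.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Illustrative inputs only.
q, df, i2 = q_and_i2([0.0, 1.0], [0.1, 0.1])
print(round(q, 2), df, round(i2, 1))
```

A prediction interval additionally needs a t quantile with k−2 degrees of freedom, which is available from `scipy.stats.t.ppf` once scipy is installed.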

from pipeline.sensitivity import leave_one_out, exclude_high_rob, fixed_vs_random_comparison
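The idea behind a leave-one-out analysis is to re-pool the data once per study, omitting that study each time, and see whether any single study drives the result. The sketch below uses fixed-effect inverse-variance pooling for brevity rather than the pipeline's DL default; the names `iv_pool` and `leave_one_out_sketch` are hypothetical.

```python
def iv_pool(effects, ses):
    """Fixed-effect inverse-variance pooled estimate."""
    w = [1 / se**2 for se in ses]
    return sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)


def leave_one_out_sketch(labels, effects, ses):
    """Re-pool k times, each time omitting one study."""
    results = []
    for i, label in enumerate(labels):
        rest_y = effects[:i] + effects[i + 1:]
        rest_se = ses[:i] + ses[i + 1:]
        results.append({"omitted": label, "pooled": iv_pool(rest_y, rest_se)})
    return results


# Illustrative inputs only.
for row in leave_one_out_sketch(["A", "B", "C"], [0.1, 0.2, 0.9], [0.1, 0.1, 0.1]):
    print(row["omitted"], round(row["pooled"], 3))
```

A large shift in the pooled estimate when one study is omitted (study C in this toy input) is the signal worth flagging in the report.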

from pipeline.publication_bias import eggers_test, funnel_plot_data
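Egger's test regresses each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below computes the intercept and its standard error by ordinary least squares and declines to run with fewer than ten studies, matching the k≥10 rule used in this pipeline. The function name `egger_intercept` and its return shape are assumptions, not the pipeline's `eggers_test`.

```python
import math


def egger_intercept(effects, ses, min_k=10):
    """Egger's regression of y/se on 1/se; returns None when k < min_k."""
    k = len(effects)
    if k < min_k:
        return None  # too few studies for the test to be informative
    x = [1 / se for se in ses]                       # precision
    z = [y / se for y, se in zip(effects, ses)]      # standardized effect
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; regression is undefined")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_int = math.sqrt(sse / (k - 2) * (1 / k + x_bar**2 / sxx))
    return {"intercept": intercept, "se": se_int, "df": k - 2}


# Illustrative inputs only: smaller studies (larger SE) show larger effects.
ses = [0.1 * i for i in range(1, 11)]
effects = [0.3 + se for se in ses]
print(egger_intercept(effects, ses))
```

The two-sided p-value comes from comparing `intercept / se` with a t distribution on `df` degrees of freedom, for example via `scipy.stats.t.sf`.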

from pipeline.grade import (
    assess_risk_of_bias, assess_inconsistency, assess_imprecision,
    assess_publication_bias, compute_grade, grade_summary_row
)

| Domain | Method | Function |
| --- | --- | --- |
| Risk of bias | Deterministic: proportion of high-RoB studies | `assess_risk_of_bias(rob_ratings)` |
| Inconsistency | Deterministic: I² thresholds | `assess_inconsistency(i_squared)` |
| Indirectness | LLM reasoning: PICO match assessment | Agent judgment → integer 0/−1/−2 |
| Imprecision | Deterministic: CI crosses null + OIS | `assess_imprecision(ci_lower, ci_upper, null_value, ois, total_n)` |
| Publication bias | Deterministic: Egger's p (k≥10 only) | `assess_publication_bias(eggers_p, k)` |

certainty = compute_grade("High", {
    "rob": rob_downgrade,
    "inconsistency": inconsistency_downgrade,
    "indirectness": indirectness_downgrade,   # agent-assigned
    "imprecision": imprecision_downgrade,
    "publication_bias": pub_bias_downgrade,
})
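The arithmetic behind a GRADE rating can be sketched as starting from a level and stepping down once per downgrade point, never going below Very Low. The sketch assumes each domain value is 0, −1, or −2, as in the indirectness row of the domain table; the function name `grade_certainty` and the use of ⊝ for unfilled circles are assumptions, and the real `compute_grade` may behave differently.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High"]


def grade_certainty(start, downgrades):
    """Apply summed downgrades (non-positive integers) to a starting level."""
    total = sum(downgrades.values())
    # Clamp to the valid range so extra downgrades stop at Very Low.
    idx = min(3, max(0, LEVELS.index(start) + total))
    symbols = "⊕" * (idx + 1) + "⊝" * (3 - idx)
    return LEVELS[idx], symbols


# Illustrative downgrades only.
level, symbols = grade_certainty("High", {
    "rob": -1,
    "inconsistency": 0,
    "indirectness": 0,
    "imprecision": -1,
    "publication_bias": 0,
})
print(level, symbols)  # prints: Low ⊕⊕⊝⊝
```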

from pipeline.report import assemble_report, format_characteristics_table, format_grade_sof_table

report = assemble_report(
    pico=pico_json,
    search={"query": query, "date_range": date_range, "databases": [...]},
    prisma_counts=prisma_counts,
    characteristics_table=format_characteristics_table(table_studies),
    rob_summary="Narrative RoB summary.",
    outcomes=[...],          # list of outcome dicts with pooling, heterogeneity, grade
    sensitivity={...},       # dict keyed by outcome name
    publication_bias={...},  # dict keyed by outcome name
    prisma_svg=None,         # auto-generated if None
    rob_svg=rob_svg,
)

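import json

# UTF-8 keeps the ⊕ certainty symbols intact on every platform.
with open("meta_analysis_output.md", "w", encoding="utf-8") as f:
    f.write(report["markdown"])
with open("meta_analysis_output.json", "w", encoding="utf-8") as f:
    json.dump(report["json"], f, indent=2)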
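The pipeline names its outputs `meta_analysis_[slug].md` and `meta_analysis_[slug].json`. How the slug is derived is not specified here, so the sketch below is only one plausible approach: lowercase the clinical question, replace runs of non-alphanumeric characters with underscores, and cap the length. The function name `slugify`, the underscore separator, and the 60-character limit are all assumptions.

```python
import re


def slugify(text, max_len=60):
    """Turn a free-text question into a filesystem-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    return slug[:max_len].rstrip("_") or "untitled"


slug = slugify("SGLT2 inhibitors in heart failure?")
print(f"meta_analysis_{slug}.md")  # prints: meta_analysis_sglt2_inhibitors_in_heart_failure.md
```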

| Checkpoint | Phase/Stage | What the user reviews | What to do next |
| --- | --- | --- | --- |
| #1: Query Review | Phase 1, Stage 1.2 | Proposed search queries for all three databases | Execute searches only after approval |
| #2: Shortlist Review | Phase 1, Stage 1.4 | Included + uncertain papers from abstract screening | Begin Phase 2 data extraction only after approval |
| #3: Extraction + RoB Review | Phase 2, Stage 2.3 | Extraction data, RoB judgments, flagged low-confidence fields | Begin Phase 3 synthesis only after approval |

{
  "pico":             { "population": "...", "intervention": "...", "comparator": "...", "outcome": "..." },
  "search":           { "query": "...", "date_range": "...", "databases": [...] },
  "prisma":           { "identified": N, "included": N, ... },
  "studies":          [ { ... } ],
  "outcomes": [
    {
      "name":          "...",
      "pooling":       { "pooled": ..., "ci_lower": ..., "ci_upper": ..., "p_value": ... },
      "heterogeneity": { "i2": ..., "q": ..., "tau_sq": ..., "prediction_lower": ..., "prediction_upper": ... },
      "sensitivity":   { ... },
      "grade":         { "certainty": "High|Moderate|Low|Very Low", "certainty_symbols": "⊕⊕⊕⊕", "downgrade_reasons": [...] }
    }
  ],
  "grade_summary":    [ { "outcome": "...", "certainty": "...", "symbols": "..." } ],
  "publication_bias": { ... }
}
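A downstream consumer may want to confirm that an export has the expected shape before reading it. The sketch below checks only for the presence of the keys shown in the structure above; the helper name `missing_keys` is hypothetical, and treating every listed key as required is an assumption.

```python
REQUIRED_TOP = {
    "pico", "search", "prisma", "studies",
    "outcomes", "grade_summary", "publication_bias",
}
REQUIRED_OUTCOME = {"name", "pooling", "heterogeneity", "sensitivity", "grade"}


def missing_keys(export):
    """List keys absent from the export, at top level and per outcome."""
    problems = [
        f"missing top-level key: {key}"
        for key in sorted(REQUIRED_TOP - export.keys())
    ]
    for i, outcome in enumerate(export.get("outcomes", [])):
        for key in sorted(REQUIRED_OUTCOME - outcome.keys()):
            problems.append(f"outcomes[{i}] missing key: {key}")
    return problems


# Illustrative export with one incomplete outcome entry.
export = {key: {} for key in REQUIRED_TOP}
export["outcomes"] = [
    {"name": "example outcome", "pooling": {}, "heterogeneity": {}, "sensitivity": {}}
]
print(missing_keys(export))  # prints: ['outcomes[0] missing key: grade']
```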

npx skills add SciSpark-ai/evidence_evaluator

| Document | When to read |
| --- | --- |
| `references/formulas.md` | Before Stage 3.1: all statistical formulas with notation and Python function cross-references |
| `references/phase_1_identify.md` | Before Phase 1: PICO, search strategy, execution, screening, PRISMA |
| `references/phase_2_appraise.md` | Before Phase 2: extraction form, RoB checklist, characteristics table |
| `references/phase_3_synthesize.md` | Before Phase 3: pooling, heterogeneity, sensitivity, GRADE, report assembly with full code examples |

Meta-Analyst Skill | Skills Pool
