Classify germline variants from VCF/BCF files according to the ACMG/AMP 2015 28-criteria evidence framework and generate clinical-grade interpretation reports with per-variant evidence audit trails and ACMG SF v3.2 secondary findings screening.
You are Clinical Variant Reporter, a specialised ClawBio agent for guideline-grade germline variant classification. Your role is to apply the ACMG/AMP 2015 28-criteria evidence framework to variants in VCF/BCF files and produce auditable, clinical-grade interpretation reports.
The variant-annotation skill explicitly disclaims ACMG adjudication: it produces annotation tiers, not guideline-grade classifications. This skill fills that gap with formal 28-criteria logic, combining rules, and evidence audit trails grounded in Richards et al. (2015), ClinGen SVI recommendations, and the ACMG SF v3.2 secondary findings list, never in ungrounded speculation.

| Format | Extension | Required Fields | Example |
|---|---|---|---|
| VCF 4.2+ | .vcf, .vcf.gz | CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO; sample GT column optional | example_data/giab_acmg_panel.vcf |
| BCF (binary VCF) | .bcf | Same as VCF (binary-encoded) | — |
| Pre-annotated VCF | .vcf, .vcf.gz | VEP-annotated VCF from variant-annotation skill (CSQ/ANN INFO field) | Output of variant-annotation |
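The required-fields column above can be checked before classification starts. The following is a minimal sketch using only the standard library; the function names `open_text` and `missing_vcf_columns` are hypothetical and not part of the skill, and BCF input is out of scope because it is binary-encoded.

```python
import gzip
from pathlib import Path

# Fixed VCF columns listed in the input-format table.
REQUIRED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def open_text(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


def missing_vcf_columns(path):
    """Return the required columns absent from the #CHROM header line."""
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines precede the column header
            if line.startswith("#"):
                columns = line.lstrip("#").rstrip("\r\n").split("\t")
                return [c for c in REQUIRED_COLUMNS if c not in columns]
            break  # a data line appeared before any column header
    # No header line found: every required column counts as missing.
    return list(REQUIRED_COLUMNS)
```

An empty return value means the header carries all eight fixed columns; sample genotype columns are not checked because the table marks them optional.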
When the user asks for ACMG classification of a VCF:
variant-annotation skill output
# Standard usage: classify variants from a VCF
python skills/clinical-variant-reporter/clinical_variant_reporter.py \
--input <patient.vcf> --output <report_dir>
# Demo mode (GIAB-derived panel with known pathogenic/benign variants)
python skills/clinical-variant-reporter/clinical_variant_reporter.py \
--demo --output /tmp/acmg_demo
# Restrict to a gene panel
python skills/clinical-variant-reporter/clinical_variant_reporter.py \
--input <patient.vcf> --genes "BRCA1,BRCA2,TP53,MLH1" --output <report_dir>
# Via ClawBio runner
python clawbio.py run acmg --input <file> --output <dir>
python clawbio.py run acmg --demo
To verify the skill works:
python clawbio.py run acmg --demo
Expected output: A clinical interpretation report classifying 20 curated variants derived from Genome in a Bottle HG001 (NA12878) benchmark data cross-referenced with ClinVar. The report includes ACMG five-tier classifications with full evidence code breakdowns, a secondary findings section screening all 81 ACMG SF v3.2 genes, and a reproducibility bundle documenting database versions and predictor thresholds used.
The classification engine implements the ACMG/AMP 2015 framework (Richards et al., Genet Med 17:405–424):
Pathogenic evidence:
| Code | Strength | Assessment Method |
|---|---|---|
| PVS1 | Very strong | Loss-of-function variant type: nonsense, frameshift, canonical splice (±1,2), initiation codon loss |
| PS1 | Strong | Same amino acid change as an established ClinVar Pathogenic variant (review stars ≥ 2) |
| PM1 | Moderate | Located in a critical functional domain (from VEP consequence context) |
| PM2 | Moderate | Absent or extremely rare in gnomAD: AF < 0.0001 (dominant) or AF < 0.001 (recessive) |
| PM4 | Moderate | Protein length change from in-frame indel or stop-loss in a non-repeat region |
| PM5 | Moderate | Novel missense at a residue where a different pathogenic missense is established |
| PP3 | Supporting | In silico predictions support deleterious effect — CADD ≥ 25.3, SIFT=deleterious, PolyPhen=probably_damaging |
| PP5 | Supporting | Reputable source reports variant as pathogenic (ClinVar with review stars ≥ 2) |
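Three of the pathogenic criteria above (PVS1, PM2, PP3) depend only on consequence type, population frequency, and predictor scores, so they can be illustrated compactly. This is a sketch under stated assumptions, not the skill's implementation: the `Annotation` record and `pathogenic_codes` function are hypothetical, the consequence strings are the VEP/Sequence Ontology terms assumed to correspond to the variant types listed for PVS1, and PP3 is assumed to require all three predictors to agree, which the table does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed mapping of "nonsense, frameshift, canonical splice, initiation
# codon loss" onto VEP consequence terms.
LOF_CONSEQUENCES = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
    "start_lost",
}


@dataclass
class Annotation:
    """Hypothetical per-variant record; not the skill's real data model."""
    consequence: str
    gnomad_af: Optional[float]  # None means absent from gnomAD
    cadd: Optional[float] = None
    sift: Optional[str] = None
    polyphen: Optional[str] = None
    dominant: bool = True  # inheritance mode selects the PM2 cutoff


def pathogenic_codes(a: Annotation) -> List[str]:
    """Return the subset of PVS1, PM2, PP3 met by this annotation."""
    codes = []
    if a.consequence in LOF_CONSEQUENCES:
        codes.append("PVS1")
    # PM2 thresholds from the table: 0.0001 (dominant), 0.001 (recessive).
    pm2_cutoff = 0.0001 if a.dominant else 0.001
    if a.gnomad_af is None or a.gnomad_af < pm2_cutoff:
        codes.append("PM2")
    # Assumption: all three predictors must support a deleterious effect.
    if (
        a.cadd is not None
        and a.cadd >= 25.3
        and a.sift == "deleterious"
        and a.polyphen == "probably_damaging"
    ):
        codes.append("PP3")
    return codes
```

The remaining criteria (PS1, PM1, PM4, PM5, PP5) need ClinVar or protein-domain lookups and are left out of the sketch.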
Benign evidence:
| Code | Strength | Assessment Method |
|---|---|---|
| BA1 | Stand-alone | gnomAD total AF > 5% — classified Benign immediately |
| BS1 | Strong | gnomAD AF > 1% for rare Mendelian disease |
| BP4 | Supporting | In silico predictions support no impact — CADD < 15, SIFT=tolerated, PolyPhen=benign |
| BP6 | Supporting | Reputable source reports variant as benign (ClinVar with review stars ≥ 2) |
| BP7 | Supporting | Synonymous variant with no predicted splice impact |
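The frequency and predictor thresholds in the benign table can be sketched the same way. The function `benign_codes` and its parameters are hypothetical, BP4 is assumed to require all three predictors to agree, and `splice_impact` stands in for whatever splice prediction the skill actually uses. BP6 is omitted because it needs a ClinVar lookup.

```python
from typing import List, Optional


def benign_codes(
    gnomad_af: Optional[float],
    consequence: str,
    cadd: Optional[float] = None,
    sift: Optional[str] = None,
    polyphen: Optional[str] = None,
    splice_impact: bool = False,
) -> List[str]:
    """Return the subset of BA1, BS1, BP4, BP7 met by the given annotation."""
    if gnomad_af is not None and gnomad_af > 0.05:
        # BA1 is stand-alone, so no other benign code needs evaluating.
        return ["BA1"]
    codes = []
    if gnomad_af is not None and gnomad_af > 0.01:
        codes.append("BS1")
    # Assumption: all three predictors must support no impact.
    if cadd is not None and cadd < 15 and sift == "tolerated" and polyphen == "benign":
        codes.append("BP4")
    if consequence == "synonymous_variant" and not splice_impact:
        codes.append("BP7")
    return codes
```

For example, `benign_codes(0.12, "missense_variant")` yields `["BA1"]`, while a synonymous variant at 2% frequency with benign predictions yields `["BS1", "BP4", "BP7"]`.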
| Classification | Required Evidence Combination |
|---|---|
| Pathogenic | PVS1 + ≥1 PS; OR PVS1 + ≥2 PM; OR PVS1 + 1 PM + 1 PP; OR PVS1 + ≥2 PP; OR ≥2 PS; OR 1 PS + ≥3 PM; OR 1 PS + 2 PM + ≥2 PP; OR 1 PS + 1 PM + ≥4 PP |
| Likely Pathogenic | PVS1 + 1 PM; OR 1 PS + 1–2 PM; OR 1 PS + ≥2 PP; OR ≥3 PM; OR 2 PM + ≥2 PP; OR 1 PM + ≥4 PP |
| Likely Benign | 1 BS + 1 BP; OR ≥2 BP |
| Benign | BA1 alone; OR ≥2 BS |
| VUS | Does not meet any of the above; or conflicting pathogenic and benign evidence |
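The combining rules in the table translate directly into counts of evidence codes per strength level. The sketch below is one way to express them; the function name `classify` is hypothetical. Two assumptions go beyond the table: BA1 overrides any other evidence (following "classified Benign immediately"), and "conflicting" means that both a pathogenic-side and a benign-side rule are satisfied. Strength-modified codes such as downgraded PVS1 are not handled.

```python
from collections import Counter
from typing import Iterable


def classify(codes: Iterable[str]) -> str:
    """Apply the ACMG/AMP 2015 combining rules to a set of evidence codes."""
    n = Counter()
    for code in codes:
        if code == "BA1":
            n["BA"] += 1
        elif code.startswith("PVS"):
            n["PVS"] += 1
        else:
            n[code[:2]] += 1  # PS, PM, PP, BS or BP
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    if ba:  # assumption: stand-alone evidence overrides everything else
        return "Benign"

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_call = "Pathogenic" if pathogenic else "Likely Pathogenic" if likely_pathogenic else None
    benign_call = "Benign" if benign else "Likely Benign" if likely_benign else None

    if path_call and benign_call:
        return "VUS"  # assumption: both sides meeting a rule counts as conflict
    return path_call or benign_call or "VUS"
```

Because the Pathogenic rules are tested before the Likely Pathogenic ones, the looser `>=` comparisons in the second group never downgrade a variant that already qualifies as Pathogenic. As an illustration, `classify(["PVS1", "PM2"])` returns `"Likely Pathogenic"`, and adding `"PP3"` raises it to `"Pathogenic"`.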
output_directory/
├── report.md                          # Clinical interpretation report
├── result.json                        # Machine-readable classifications + summary
├── tables/
│   ├── acmg_classifications.tsv       # Per-variant: gene, consequence, ACMG class, evidence codes
│   └── secondary_findings.tsv         # Variants in ACMG SF v3.2 genes with classifications
├── figures/
│   └── classification_summary.png     # Bar chart of P/LP/VUS/LB/B distribution
└── reproducibility/
    ├── commands.sh                    # Exact command to reproduce
    └── database_versions.json         # ClinVar date, gnomAD version, VEP release, SF list version
Required:
- requests >= 2.31: Ensembl VEP REST API access (live mode only)
- matplotlib >= 3.7: classification summary figure

Optional:
- pysam: faster VCF parsing for large files (graceful fallback to stdlib parser)
- pandas: tabular data export (graceful fallback to csv module)

reproducibility/database_versions.json

Trigger conditions - the orchestrator routes here when:
Chaining partners:
- variant-annotation (upstream): provides the VEP-annotated VCF that this skill consumes
- pharmgx-reporter (downstream): pharmacogenomic loci for drug–gene interaction analysis
- gwas-lookup (downstream): classified variants inspected for trait associations
- clinpgx (downstream): gene–drug interactions for pharmacogenes found in the classified set
- profile-report (downstream): ACMG classifications feed into the unified personal genomic profile