Bio Variant Calling Clinical Interpretation | Skills Pool
Clinical variant interpretation using ClinVar, ACMG guidelines, and pathogenicity predictors. Prioritize variants for diagnostic and research applications. Use when interpreting the clinical significance of variants.
mdbabumiamssm · 24 stars · February 4, 2026
Clinical Variant Interpretation
Prioritize and interpret variants for clinical significance using databases and ACMG/AMP guidelines.
Interpretation Framework
Annotated VCF
│
├── Database Lookup
│ ├── ClinVar (clinical assertions)
│ ├── OMIM (disease associations)
│ └── gnomAD (population frequency)
│
├── Computational Predictions
│ ├── SIFT, PolyPhen-2
│ ├── CADD, REVEL
│ └── SpliceAI
│
├── ACMG Classification
│ └── Pathogenic → Likely Pathogenic → VUS → Likely Benign → Benign
│
└── Prioritized Variant List
ClinVar Annotation
Download ClinVar
wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz
wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz.tbi
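# Copy ClinVar INFO fields onto matching records in the input VCF.
# Contig names must match between the two files: ClinVar uses "1", not "chr1".
bcftools annotate \
  -a clinvar.vcf.gz \
  -c INFO/CLNSIG,INFO/CLNDN,INFO/CLNREVSTAT \
  input.vcf.gz -Oz -o with_clinvar.vcf.gz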
Filter Pathogenic Variants
# Pathogenic or Likely pathogenic
bcftools view -i 'INFO/CLNSIG~"Pathogenic" || INFO/CLNSIG~"Likely_pathogenic"' \
with_clinvar.vcf.gz -Oz -o pathogenic.vcf.gz
# Exclude benign
bcftools view -e 'INFO/CLNSIG~"Benign" || INFO/CLNSIG~"Likely_benign"' \
with_clinvar.vcf.gz -Oz -o not_benign.vcf.gz
ClinVar Significance Levels
CLNSIG | Meaning | Action
Pathogenic | Disease-causing | Report
Likely_pathogenic | Probably disease-causing | Report with caveat
Uncertain_significance | VUS | May report, needs follow-up
Likely_benign | Probably not disease-causing | Usually exclude
Benign | Not disease-causing | Exclude
Conflicting | Multiple interpretations | Manual review
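The table above can be turned into a small lookup. The following is a minimal sketch, not part of the original skill: the function name `clnsig_action`, the "No ClinVar record" label, and the rule that the less certain term wins when a record carries several terms (for example `Pathogenic/Likely_pathogenic`) are all assumptions made for illustration.

```python
import re

# Action per CLNSIG term, taken from the significance table.
ACTION_BY_TERM = {
    "Pathogenic": "Report",
    "Likely_pathogenic": "Report with caveat",
    "Uncertain_significance": "May report, needs follow-up",
    "Likely_benign": "Usually exclude",
    "Benign": "Exclude",
}

# Assumption: when several terms are present, the less certain one wins.
PRECEDENCE = [
    "Uncertain_significance",
    "Likely_pathogenic",
    "Likely_benign",
    "Pathogenic",
    "Benign",
]


def clnsig_action(clnsig):
    """Map a raw CLNSIG string to a triage action."""
    if not clnsig or clnsig == ".":
        return "No ClinVar record"  # hypothetical label for missing values
    if clnsig.startswith("Conflicting"):
        return "Manual review"
    # Terms are split on "/" and "|" and matched exactly, not by substring.
    terms = set(re.split(r"[/|]", clnsig))
    for term in PRECEDENCE:
        if term in terms:
            return ACTION_BY_TERM[term]
    return "Manual review"  # unrecognized term, e.g. drug_response


print(clnsig_action("Pathogenic/Likely_pathogenic"))
```

Exact term matching avoids the substring pitfalls of regex filters, at the cost of sending any unrecognized term to manual review.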
ClinVar Review Status
CLNREVSTAT | Stars | Meaning
practice_guideline | 4 | Practice guideline
reviewed_by_expert_panel | 3 | Reviewed by expert panel
criteria_provided,_multiple_submitters,_no_conflicts | 2 | Multiple submitters with criteria, consistent assertions
criteria_provided,_single_submitter | 1 | One submitter with criteria
no_assertion_criteria_provided | 0 | No assertion criteria provided
# Filter for high-confidence assertions (2+ stars)
# The expression stays on one line: a backslash inside single quotes is passed to bcftools literally.
bcftools view \
  -i 'INFO/CLNREVSTAT~"multiple_submitters" || INFO/CLNREVSTAT~"expert_panel" || INFO/CLNREVSTAT~"practice_guideline"' \
  with_clinvar.vcf.gz -Oz -o high_confidence.vcf.gz
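The same star logic can be applied in Python when records are processed programmatically. This is a minimal sketch under stated assumptions: `review_stars` and `is_high_confidence` are hypothetical helper names, matching is by substring to mirror the `~` filters above, and conflicting records are counted as one star.

```python
# (substring, stars) pairs, checked in order; mirrors the bcftools "~" filters.
STAR_RULES = [
    ("practice_guideline", 4),
    ("expert_panel", 3),
    ("multiple_submitters", 2),
    ("single_submitter", 1),
    ("conflicting", 1),  # assumption: conflicting records count as one star
]


def review_stars(clnrevstat):
    """Return the ClinVar star level for a CLNREVSTAT value."""
    if clnrevstat is None:
        return 0
    # Some VCF parsers split the comma-containing value into a tuple.
    if isinstance(clnrevstat, (tuple, list)):
        clnrevstat = ",".join(clnrevstat)
    for needle, stars in STAR_RULES:
        if needle in clnrevstat:
            return stars
    return 0


def is_high_confidence(clnrevstat, min_stars=2):
    """True when the assertion has at least min_stars review stars."""
    return review_stars(clnrevstat) >= min_stars


print(review_stars("criteria_provided,_multiple_submitters,_no_conflicts"))
```

Joining tuples back into one string matters because CLNREVSTAT values contain commas, which VCF parsers may treat as value separators.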
InterVar (ACMG Classification)
Automated ACMG/AMP variant classification.
Installation
git clone https://github.com/WGLab/InterVar.git
cd InterVar
# Download databases per documentation
Run InterVar
python Intervar.py \
-i input.avinput \
-o output \
-b hg38 \
-d humandb/ \
--input_type=AVinput
From VCF
# Convert VCF to ANNOVAR format
convert2annovar.pl -format vcf4 input.vcf > input.avinput
# Run InterVar
python Intervar.py -i input.avinput -o intervar_results -b hg38
ACMG/AMP Criteria
Pathogenic Criteria
Code | Type | Description
PVS1 | Very Strong | Null variant in gene where LOF is disease mechanism
PS1-4 | Strong | Same AA change, functional studies, etc.
PM1-6 | Moderate | Hot spot, absent from controls, etc.
PP1-5 | Supporting | Co-segregation, computational evidence
Benign Criteria
Code | Type | Description
BA1 | Stand-alone | AF >5% in gnomAD
BS1-4 | Strong | AF greater than expected, functional studies
BP1-7 | Supporting | Missense in gene with truncating mechanism
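To show how evidence codes become one of the five classes, here is a minimal sketch of the combining rules from the 2015 ACMG/AMP guideline (Richards et al.). It is not InterVar's implementation. The function names `strength` and `combine` are hypothetical, and strength-modified codes such as `PM2_Supporting` are not handled.

```python
from collections import Counter


def strength(code):
    """Return the strength bucket of an ACMG/AMP code, e.g. 'PM2' -> 'PM'."""
    if code == "PVS1":
        return "PVS"
    if code == "BA1":
        return "BA"
    prefix = code[:2]
    if prefix in {"PS", "PM", "PP", "BS", "BP"}:
        return prefix
    raise ValueError(f"Unknown ACMG/AMP code: {code}")


def combine(codes):
    """Combine evidence codes into a five-tier classification."""
    n = Counter(strength(c) for c in set(codes))
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_call = ("Pathogenic" if pathogenic
                 else "Likely pathogenic" if likely_pathogenic else None)
    benign_call = ("Benign" if benign
                   else "Likely benign" if likely_benign else None)

    # Contradictory or insufficient evidence falls back to VUS.
    if path_call and benign_call:
        return "Uncertain significance"
    return path_call or benign_call or "Uncertain significance"


print(combine(["PVS1", "PM2"]))
```

For example, `combine(["PM1", "PM2", "PP3"])` stays at "Uncertain significance" because two moderate criteria plus one supporting criterion do not meet any likely pathogenic rule.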
Population Frequency Filtering
# Rare variants only (gnomAD AF < 0.01)
bcftools view -i 'INFO/gnomAD_AF<0.01 || INFO/gnomAD_AF="."' \
input.vcf.gz -Oz -o rare.vcf.gz
# Ultra-rare for dominant diseases (AF < 0.0001)
bcftools view -i 'INFO/gnomAD_AF<0.0001 || INFO/gnomAD_AF="."' \
input.vcf.gz -Oz -o ultrarare.vcf.gz
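Both filters above keep variants with no gnomAD frequency (`INFO/gnomAD_AF="."`), on the reasoning that a variant absent from gnomAD is likely rare. The same rule can be sketched in Python. The helper name `is_rare` and the constant names are hypothetical; the thresholds are the ones used in the commands above.

```python
RARE_AF = 0.01          # rare variants
ULTRA_RARE_AF = 0.0001  # ultra-rare, for dominant diseases


def is_rare(af, threshold=RARE_AF):
    """Keep a variant when its frequency is below threshold or unreported."""
    if af is None or af == ".":
        return True  # absent from gnomAD: treated as rare, as in the filters above
    return float(af) < threshold


print(is_rare(0.005), is_rare(0.005, ULTRA_RARE_AF), is_rare(None, ULTRA_RARE_AF))
```

Note that a missing frequency can also reflect poor gnomAD coverage at that position, so retained variants without a frequency may deserve a separate check.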
Pathogenicity Score Filtering
CADD Scores
# CADD > 20 (top 1% deleterious)
bcftools view -i 'INFO/CADD_PHRED>20' input.vcf.gz -Oz -o cadd_filtered.vcf.gz
# CADD > 30 (top 0.1%)
bcftools view -i 'INFO/CADD_PHRED>30' input.vcf.gz -Oz -o highly_deleterious.vcf.gz
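The "top 1%" and "top 0.1%" labels follow from the PHRED-like scaling of CADD scores, where the scaled score is -10 · log10 of the variant's rank fraction among all possible substitutions. A short sketch of the conversion, with hypothetical function names:

```python
import math


def top_fraction(phred):
    """Fraction of most-deleterious substitutions implied by a scaled CADD score."""
    return 10 ** (-phred / 10)


def phred_from_fraction(fraction):
    """Scaled CADD score corresponding to a given top fraction."""
    return -10 * math.log10(fraction)


for score in (10, 20, 30):
    print(score, f"{top_fraction(score):.4%}")
```

Each 10-point increase therefore narrows the selection by a factor of ten.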
REVEL Scores
# REVEL > 0.5 (likely pathogenic)
bcftools view -i 'INFO/REVEL>0.5' input.vcf.gz -Oz -o revel_filtered.vcf.gz
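Combined Filtering
# High computational score AND (ClinVar pathogenic/likely pathogenic OR no ClinVar record).
# "Likely_pathogenic" is spelled out because a bare "Likely" would also match Likely_benign.
bcftools view \
  -i '(INFO/CADD_PHRED>20 || INFO/REVEL>0.5) && (INFO/CLNSIG~"Pathogenic" || INFO/CLNSIG~"Likely_pathogenic" || INFO/CLNSIG=".")' \
  input.vcf.gz -Oz -o prioritized.vcf.gz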
Python: Clinical Prioritization
from cyvcf2 import VCF


def classify_variant(variant):
    """Assign a heuristic triage tier; this is not a formal ACMG/AMP classification."""
    # Assumes single-ALT records whose INFO fields hold one value each.
    # Missing scores default to 0, so unscored variants can land in LIKELY_BENIGN;
    # check annotation coverage before relying on that tier.
    clnsig = str(variant.INFO.get('CLNSIG') or '')
    af = variant.INFO.get('gnomAD_AF', 0) or 0
    cadd = variant.INFO.get('CADD_PHRED', 0) or 0
    revel = variant.INFO.get('REVEL', 0) or 0

    # Known pathogenic
    if 'Pathogenic' in clnsig:
        return 'PATHOGENIC'
    if 'Likely_pathogenic' in clnsig:
        return 'LIKELY_PATHOGENIC'

    # Known benign, or too common to be disease-causing (AF > 5%)
    if 'Benign' in clnsig or af > 0.05:
        return 'BENIGN'
    if 'Likely_benign' in clnsig:
        return 'LIKELY_BENIGN'

    # Computational prediction
    if cadd > 25 or revel > 0.7:
        if af < 0.0001:
            return 'LIKELY_PATHOGENIC'
        elif af < 0.01:
            return 'VUS_FAVOR_PATH'
    if cadd < 10 and revel < 0.3:
        return 'LIKELY_BENIGN'
    return 'VUS'


vcf = VCF('annotated.vcf.gz')
results = []
for variant in vcf:
    classification = classify_variant(variant)
    if classification in ('PATHOGENIC', 'LIKELY_PATHOGENIC', 'VUS_FAVOR_PATH'):
        # SYMBOL and Consequence are assumed to exist as separate INFO fields.
        gene = variant.INFO.get('SYMBOL', 'Unknown')
        consequence = variant.INFO.get('Consequence', 'Unknown')
        results.append({
            'chrom': variant.CHROM,
            'pos': variant.POS,
            'ref': variant.REF,
            'alt': variant.ALT[0],
            'gene': gene,
            'consequence': consequence,
            'classification': classification,
            'clnsig': variant.INFO.get('CLNSIG', '.'),
            'cadd': variant.INFO.get('CADD_PHRED', '.'),
            'af': variant.INFO.get('gnomAD_AF', '.')
        })

# Output prioritized variants
for r in results:
    print(f"{r['gene']}\t{r['chrom']}:{r['pos']}\t{r['consequence']}\t{r['classification']}")
Gene Panel Filtering
# Filter to gene panel
bcftools view -R gene_panel.bed input.vcf.gz -Oz -o panel_variants.vcf.gz
# Or by gene symbol (requires VEP annotation)
bcftools view -i 'INFO/CSQ~"BRCA1" || INFO/CSQ~"BRCA2"' \
input.vcf.gz -Oz -o brca_variants.vcf.gz
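The regex match on `INFO/CSQ` above is a substring test, so `"BRCA1"` also matches symbols such as `BRCA1P1`. An exact match can be sketched in Python by parsing the CSQ string. This is an illustration, not part of the original skill: the function names are hypothetical, and the field order must be read from the `Format:` part of the CSQ header line in your own VCF (the four-field layout below is only an example).

```python
def parse_csq_format(description):
    """Extract CSQ field names from the INFO header description text."""
    return description.split("Format:")[1].strip().strip('"').split("|")


def csq_symbols(csq, fields):
    """Return the set of gene symbols across all transcript entries in CSQ."""
    idx = fields.index("SYMBOL")
    symbols = set()
    for entry in csq.split(","):    # one entry per transcript consequence
        parts = entry.split("|")    # fields within an entry
        if idx < len(parts) and parts[idx]:
            symbols.add(parts[idx])
    return symbols


def in_panel(csq, fields, panel):
    """True when any annotated symbol is exactly one of the panel genes."""
    return bool(csq_symbols(csq, fields) & set(panel))


# Example header text and CSQ value (illustrative, not from a real file).
fields = parse_csq_format(
    "Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL"
)
csq = "A|missense_variant|MODERATE|BRCA1,A|upstream_gene_variant|MODIFIER|NBR2"
print(in_panel(csq, fields, {"BRCA1", "BRCA2"}))
```

With cyvcf2, the header description and per-record CSQ string would come from the opened VCF; that wiring is left out here to keep the sketch free of file dependencies.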
Disease-Specific Resources
Resource | Content | Use
ClinVar | Clinical assertions | Primary lookup
OMIM | Gene-disease relationships | Gene prioritization
HGMD | Published mutations | Literature evidence
gnomAD | Population frequencies | Rarity filtering
ClinGen | Gene validity/dosage | LOF interpretation
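Reporting Template
# The format string stays on one line: a backslash inside single quotes would end up in the output format.
bcftools query \
  -f '%CHROM\t%POS\t%REF\t%ALT\t%INFO/SYMBOL\t%INFO/Consequence\t%INFO/CLNSIG\t%INFO/CLNDN\t%INFO/gnomAD_AF\t%INFO/CADD_PHRED\n' \
  prioritized.vcf.gz > clinical_report.tsv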
Complete Workflow
#!/bin/bash
# Arguments: input VCF, ClinVar VCF, output prefix.
# Assumes the input already carries gnomAD_AF, CADD_PHRED, SYMBOL, and Consequence INFO fields.
set -euo pipefail
INPUT=$1
CLINVAR=$2
OUTPUT_PREFIX=$3
echo "=== Add ClinVar annotations ==="
bcftools annotate -a "$CLINVAR" \
  -c INFO/CLNSIG,INFO/CLNDN,INFO/CLNREVSTAT,INFO/CLNVC \
  "$INPUT" -Oz -o "${OUTPUT_PREFIX}_clinvar.vcf.gz"
echo "=== Filter rare variants ==="
bcftools view -i 'INFO/gnomAD_AF<0.01 || INFO/gnomAD_AF="."' \
  "${OUTPUT_PREFIX}_clinvar.vcf.gz" -Oz -o "${OUTPUT_PREFIX}_rare.vcf.gz"
echo "=== Extract pathogenic/likely pathogenic ==="
# Both terms are spelled out: a bare "athogenic" would also match conflicting "..._of_pathogenicity" records.
bcftools view -i 'INFO/CLNSIG~"Pathogenic" || INFO/CLNSIG~"Likely_pathogenic"' \
  "${OUTPUT_PREFIX}_rare.vcf.gz" -Oz -o "${OUTPUT_PREFIX}_pathogenic.vcf.gz"
echo "=== Extract high-impact VUS ==="
bcftools view -i 'INFO/CLNSIG~"Uncertain" && INFO/CADD_PHRED>20' \
  "${OUTPUT_PREFIX}_rare.vcf.gz" -Oz -o "${OUTPUT_PREFIX}_vus_review.vcf.gz"
echo "=== Generate report ==="
bcftools query -H \
  -f '%CHROM\t%POS\t%REF\t%ALT\t%INFO/SYMBOL\t%INFO/Consequence\t%INFO/CLNSIG\t%INFO/CLNDN\t%INFO/gnomAD_AF\t%INFO/CADD_PHRED\n' \
  "${OUTPUT_PREFIX}_pathogenic.vcf.gz" > "${OUTPUT_PREFIX}_report.tsv"
echo "=== Complete ==="
echo "Pathogenic: ${OUTPUT_PREFIX}_pathogenic.vcf.gz"
echo "VUS for review: ${OUTPUT_PREFIX}_vus_review.vcf.gz"
echo "Report: ${OUTPUT_PREFIX}_report.tsv"
Related Skills
variant-calling/variant-annotation - VEP/SnpEff annotation
variant-calling/filtering-best-practices - Quality filtering
database-access/entrez-fetch - Download ClinVar/OMIM data
pathway-analysis/go-enrichment - Gene set analysis