Call SNPs and indels from aligned reads using bcftools mpileup and call. Use when detecting variants from BAM files or generating VCF from alignments.
Reference examples tested with: bcftools 1.19+
Before using code patterns, verify that the installed versions match. If versions differ:
Run <tool> --version, then <tool> --help to confirm flags.
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Call SNPs and indels from aligned reads using bcftools.
BAM file + Reference FASTA
|
v
bcftools mpileup (generate pileup)
|
v
bcftools call (call variants)
|
v
VCF file
Goal: Detect SNPs and indels from aligned reads using the bcftools pileup-and-call pipeline.
Approach: Generate per-position pileup likelihoods with mpileup, then call genotypes with the multiallelic caller.
"Call variants from my BAM file" → Generate genotype likelihoods from aligned reads and identify variant sites using a Bayesian caller.
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz
bcftools mpileup -f reference.fa -r chr1:1000000-2000000 input.bam | \
bcftools call -mv -o region.vcf
bcftools mpileup -f reference.fa sample1.bam sample2.bam sample3.bam | \
bcftools call -mv -o variants.vcf
# bams.txt: one BAM path per line
bcftools mpileup -f reference.fa -b bams.txt | bcftools call -mv -o variants.vcf
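The file list passed to -b can be generated rather than typed by hand. A minimal sketch, assuming the BAMs sit in a hypothetical demo_bams/ directory; the directory and sample names are placeholders, and the empty files exist only so the snippet runs on its own:

```shell
# Hypothetical layout: a demo_bams/ directory with one BAM per sample.
mkdir -p demo_bams
# Empty placeholder files so the sketch runs standalone; real BAMs would already exist.
touch demo_bams/sample1.bam demo_bams/sample2.bam demo_bams/sample3.bam

# Write one path per line, in the shell's glob (lexicographic) order.
printf '%s\n' demo_bams/*.bam > bams.txt
cat bams.txt
```

Using printf with a glob avoids parsing ls output and copes with spaces in file names.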
Goal: Control pileup generation with quality thresholds, annotations, and region restrictions.
Approach: Set minimum mapping/base quality, request specific FORMAT/INFO tags, and restrict to target regions.
# -q: minimum mapping quality; -Q: minimum base quality
# (comments cannot follow a trailing backslash, so they are kept on their own line)
bcftools mpileup -f reference.fa \
-q 20 \
-Q 20 \
input.bam | bcftools call -mv -o variants.vcf
bcftools mpileup -f reference.fa -a DP,AD input.bam | bcftools call -mv -o variants.vcf
bcftools mpileup -f reference.fa \
-a FORMAT/DP,FORMAT/AD,FORMAT/ADF,FORMAT/ADR,INFO/AD \
input.bam | bcftools call -mv -o variants.vcf
bcftools mpileup -f reference.fa -R targets.bed input.bam | \
bcftools call -mv -o variants.vcf
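For the -R targets.bed example above, it may help to see what such a file contains. A small sketch with made-up intervals: bcftools reads files with a .bed suffix as 0-based, half-open, so the 1-based region chr1:1000000-2000000 becomes start 999999 and end 2000000. The interval values are illustrative only.

```shell
# Hypothetical target intervals. BED columns: chrom, start (0-based), end (exclusive), tab-separated.
printf 'chr1\t999999\t2000000\n'  > targets.bed
printf 'chr2\t0\t500000\n'       >> targets.bed

# Sanity check: total bases covered (sum of end - start over all rows).
awk -F'\t' '{ total += $3 - $2 } END { print total }' targets.bed
```

A quick size check like this catches off-by-one or swapped-column mistakes before a long calling run.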
bcftools mpileup -f reference.fa -d 1000 input.bam | bcftools call -mv -o variants.vcf
| Flag | Model | Use Case |
|---|---|---|
| -m | Multiallelic caller | Recommended for new analyses |
| -c | Consensus caller | Legacy, single sample |
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
# -v outputs variant sites only (not reference calls)
bcftools mpileup -f reference.fa input.bam | bcftools call -m -o all_sites.vcf
# Without -v, outputs all sites including reference
# Haploid calling
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy 1 -o variants.vcf
# Specify ploidy file
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy-file ploidy.txt -o variants.vcf
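The ploidy file used above is a space- or tab-delimited list of CHROM, FROM, TO, SEX, PLOIDY, with starred rows giving the default. A minimal sketch, assuming a genome with a haploid mitochondrial contig; the contig name chrM and its length are placeholders, and the sex codes M/F are expected to match whatever the samples file uses:

```shell
# Hypothetical ploidy definition: haploid chrM, diploid everywhere else.
# Columns: CHROM FROM TO SEX PLOIDY; "*" rows set the default for unlisted regions.
cat > ploidy.txt <<'EOF'
chrM 1 16569 M 1
chrM 1 16569 F 1
* * * M 2
* * * F 2
EOF

# Show the default (starred) rows.
grep '^\*' ploidy.txt
```

Listing both sex codes keeps the definition valid whether or not a samples file with a sex column is supplied.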
# Adjust variant prior (default 1.1e-3)
bcftools mpileup -f reference.fa input.bam | bcftools call -m -P 0.001 -o variants.vcf
Goal: Run production-ready variant calling workflows for single-sample and multi-sample analyses.
Approach: Chain mpileup and call with quality filters, annotations, and compressed output, optionally parallelized by chromosome.
bcftools mpileup -Ou -f reference.fa \
-q 20 -Q 20 \
-a FORMAT/DP,FORMAT/AD \
input.bam | \
bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz
bcftools mpileup -Ou -f reference.fa \
-a FORMAT/DP,FORMAT/AD \
sample1.bam sample2.bam sample3.bam | \
bcftools call -mv -Oz -o cohort.vcf.gz
bcftools index cohort.vcf.gz
bcftools mpileup -Ou -f reference.fa \
-R targets.bed \
-a FORMAT/DP,FORMAT/AD \
input.bam | \
bcftools call -mv -Oz -o targets.vcf.gz
for chr in chr1 chr2 chr3; do
bcftools mpileup -Ou -f reference.fa -r "$chr" input.bam | \
bcftools call -mv -Oz -o "${chr}.vcf.gz" &
done
wait
# Concatenate results in reference order (a chr*.vcf.gz glob would sort chr10 before chr2)
bcftools concat -Oz -o all.vcf.gz chr1.vcf.gz chr2.vcf.gz chr3.vcf.gz
bcftools index all.vcf.gz
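When there are many contigs, the chromosome list and the concat order can be derived from the FASTA index instead of being hard-coded. A sketch under stated assumptions: demo.fa.fai is a placeholder index with made-up lengths and offsets (a real one comes from indexing the reference), and chroms.txt and concat_list.txt are hypothetical file names.

```shell
# Placeholder FASTA index so the sketch runs standalone; values are made up.
# .fai columns: name, length, offset, bases per line, bytes per line.
printf 'chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\nchr10\t500\t1850\t60\t61\n' > demo.fa.fai

# Contig names in reference order (first column of the index).
cut -f1 demo.fa.fai > chroms.txt

# Per-chromosome output names in the same order, one per line.
sed 's/$/.vcf.gz/' chroms.txt > concat_list.txt
cat concat_list.txt
```

The loop can then read names from chroms.txt, and the list can be handed to bcftools concat through its -f/--file-list option, which keeps chr10 after chr2 where a shell glob would not.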
| INFO tag | Description |
|---|---|
| DP | Total read depth |
| AD | Allelic depths |
| MQ | Mapping quality |
| FS | Fisher strand bias |
| SGB | Segregation-based metric |
| FORMAT tag | Description |
|---|---|
| GT | Genotype |
| DP | Read depth per sample |
| AD | Allelic depths per sample |
| ADF | Forward strand allelic depths |
| ADR | Reverse strand allelic depths |
| GQ | Genotype quality |
| PL | Phred-scaled likelihoods |
bcftools mpileup -f reference.fa \
-a FORMAT/DP,FORMAT/AD,FORMAT/SP,INFO/AD \
input.bam | bcftools call -mv -o variants.vcf
Goal: Speed up variant calling for large datasets.
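Approach: Pipe uncompressed BCF (-Ou) between commands to avoid compression and text-conversion overhead. Note that --threads only adds worker threads for compressing -Oz/-Ob output; it does not parallelize pileup or calling.
bcftools mpileup -Ou -f reference.fa input.bam | \
bcftools call -mv -Oz --threads 4 -o variants.vcf.gz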
bcftools mpileup -Ou -f reference.fa input.bam | bcftools call -mv -Ou | \
bcftools filter -Oz -o filtered.vcf.gz
| Task | Command |
|---|---|
| Basic calling | `bcftools mpileup -f ref.fa in.bam \| bcftools call -mv -o out.vcf` |
| With quality filter | `bcftools mpileup -f ref.fa -q 20 -Q 20 in.bam \| bcftools call -mv` |
| Region | `bcftools mpileup -f ref.fa -r chr1:1-1000 in.bam \| bcftools call -mv` |
| Multi-sample | `bcftools mpileup -f ref.fa s1.bam s2.bam \| bcftools call -mv` |
| With annotations | `bcftools mpileup -f ref.fa -a DP,AD in.bam \| bcftools call -mv` |
| Error | Cause | Solution |
|---|---|---|
| no FASTA reference | Missing -f | Add -f reference.fa |
| reference mismatch | Wrong reference | Use same reference as alignment |
| no variants called | Low quality/depth | Lower quality thresholds |
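One way to investigate a suspected reference mismatch is to compare contig names in the BAM header against the FASTA index. A rough sketch, assuming the header has already been dumped to a text file (for example with samtools view -H); bam_header.txt and demo_ref.fa.fai are placeholder files with made-up contents so the snippet runs standalone:

```shell
# Placeholder inputs: a BAM header using "1", "2" and a reference index using "chr1", "chr2".
printf '@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:800\n' > bam_header.txt
printf 'chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\n' > demo_ref.fa.fai

# Contig names from the @SQ header lines (value of the SN: field).
awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' \
  bam_header.txt | sort > bam_contigs.txt

# Contig names from the FASTA index (first column).
cut -f1 demo_ref.fa.fai | sort > ref_contigs.txt

# Contigs present in the BAM but absent from the reference.
comm -23 bam_contigs.txt ref_contigs.txt
```

Any names printed by the last command point to a naming or assembly difference (here, "1" versus "chr1") that should be resolved before calling.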