Design and evaluate vaccine candidates using computational immunology tools. Covers epitope prediction (MHC-I/II binding via IEDB), population coverage analysis, antigen selection, adjuvant matching, and immunogenicity assessment. Integrates IEDB for epitope prediction, UniProt for antigen sequences, PDB/AlphaFold for structural epitopes, BVBRC for pathogen proteomes, and literature for clinical precedent. Use when asked about vaccine design, epitope prediction, immunogenicity, MHC binding, T-cell epitopes, B-cell epitopes, or population coverage for vaccine candidates.
Computational pipeline for designing peptide/subunit vaccine candidates through epitope prediction, population coverage optimization, and immunogenicity assessment.
Vaccine design requires presenting the right epitopes to elicit protective immunity — not just any immune response, but one that is neutralizing, durable, and broadly applicable. For T-cell vaccines, the core tool is MHC binding prediction (IEDB tools): predict peptide-MHC affinity across multiple HLA alleles, then select epitopes with broad coverage of the target population. For antibody vaccines, prioritize surface-exposed conserved regions — a deeply buried or hypervariable region makes a poor antibody target. MHC binding does not equal immunogenicity; many good binders are not immunogenic in vivo due to tolerance, poor processing, or lack of T-cell help. A multi-epitope strategy (combining MHC-I for CD8+ CTL response, MHC-II for CD4+ helper response, and B-cell epitopes for antibody induction) is more robust than any single epitope. Conservation across pathogen strains is critical — an epitope that mutates under immune pressure (like HIV envelope hypervariable regions) is a poor vaccine target.
LOOK UP DON'T GUESS: Do not predict MHC binding or population coverage from memory — use IEDB_predict_mhci_binding and IEDB_predict_mhcii_binding for predictions and for validated experimental data. Do not assume what's on the pathogen surface; retrieve annotated sequences from UniProt or BVBRC.
IEDB_search_epitopesKey principles:
Not this skill: For HLA typing or allele frequency only, use tooluniverse-hla-immunogenomics. For antibody engineering, use tooluniverse-antibody-engineering.
| Tool | Use For |
|---|---|
IEDB_search_epitopes | Search experimentally validated epitopes |
IEDB_get_epitope | Get detailed epitope data (assay results, MHC restriction) |
iedb_search_mhc | Search validated MHC binding assay data |
IEDB_predict_mhci_binding | Predict MHC-I binding (NetMHCpan EL; rank < 0.5% = strong binder) |
IEDB_predict_mhcii_binding | Predict MHC-II binding (NetMHCIIpan EL; CD4+ helper epitopes) |
UniProt_get_entry_by_accession | Get antigen protein sequence |
UniProt_search | Find pathogen protein sequences |
BVBRC_search_genome_features | Search pathogen proteomes |
alphafold_get_prediction | Get/predict antigen 3D structure |
EnsemblVEP_annotate_hgvs | Check epitope conservation across variants |
PubMed_search_articles | Find published vaccine studies |
search_clinical_trials | Find ongoing vaccine clinical trials |
Phase 0: Antigen Selection
Pathogen → essential surface proteins → sequence retrieval
|
Phase 1: T-Cell Epitope Prediction
MHC-I (CD8+ CTL) and MHC-II (CD4+ helper) binding prediction
|
Phase 2: B-Cell Epitope Prediction
Linear and conformational B-cell epitopes for antibody response
|
Phase 3: Population Coverage
HLA allele frequencies → design for target population
|
Phase 4: Conservation Analysis
Cross-strain epitope conservation → broad protection
|
Phase 5: Candidate Assembly & Report
Multi-epitope construct design → immunogenicity assessment
Best antigens for vaccines: Surface-exposed, essential for pathogen function, conserved across strains.
# Find pathogen surface proteins
UniProt_search(query="[organism] AND locations:(location:cell surface) AND reviewed:true")
# Or search BVBRC for annotated pathogen proteomes
BVBRC_search_genome_features(keyword="surface protein", genome_id="[taxon_id]")
Antigen prioritization: prefer surface-exposed (secreted/outer membrane) over cytoplasmic; >95% conserved across strains; essential for pathogen viability; known immunogen in natural infection. Use UniProt subcellular location annotations and PubMed to verify these properties.
MHC-I epitopes (CD8+ cytotoxic T cells — kill infected cells):
# Option A: Search for KNOWN validated epitopes from IEDB
iedb_search_mhc(
mhc_class="I",
qualitative_measure="Positive",
filters={"source_organism_iri": "eq.NCBITaxon:2697049"}, # SARS-CoV-2
select=["linear_sequence", "mhc_restriction", "qualitative_measure"],
limit=50
)
# Option B: PREDICT novel peptide binding (recommended for new proteins)
IEDB_predict_mhci_binding(
sequence="YOUR_PROTEIN_SEQUENCE", # full protein or peptide
allele="HLA-A*02:01", # or H-2-Kd for mouse
method="netmhcpan_el", # EL = eluted ligand (recommended)
length=9 # 8-11 for MHC-I
)
# Returns peptides ranked by percentile_rank:
# < 0.5% = strong binder (include in vaccine)
# 0.5-2% = moderate binder (consider)
# > 2% = weak/non-binder (exclude)
MHC-II epitopes (CD4+ helper T cells — activate B cells and CD8+ T cells):
iedb_search_mhc(
mhc_class="II",
qualitative_measure="Positive",
filters={"source_organism_iri": "eq.NCBITaxon:2697049"},
limit=50
)
Binding affinity interpretation:
| IC50 (nM) | Classification | Vaccine Relevance |
|---|---|---|
| < 50 | Strong binder | Include — high presentation probability |
| 50-500 | Moderate binder | Consider — may contribute to response |
| 500-5000 | Weak binder | Exclude — unlikely to be presented |
| > 5000 | Non-binder | Exclude |
HLA supertype strategy: For broad coverage, predict against HLA supertypes:
B-cell epitopes trigger antibody production. Look for:
# Check for known B-cell epitopes
IEDB_search_epitopes(query="[protein_name]", epitope_type="B cell")
# Get structure for conformational epitope prediction
alphafold_get_prediction(uniprot_id="[accession]")
B-cell epitope criteria: Surface-exposed loops, hydrophilic regions, flexible regions (high B-factor). Combine computational prediction with structural analysis.
# Search for epitopes restricted to common HLA alleles in target population
# NOTE: No HLA frequency tool exists in ToolUniverse. For population coverage:
# 1. Use IEDB Analysis Resource (tools.iedb.org/population) for population coverage calculation
# 2. Use the HLA supertype strategy (see above) to ensure broad coverage
# 3. Search PubMed for published HLA frequency data: PubMed_search_articles(query="HLA allele frequency [population]")
Population coverage targets:
| Coverage Level | Interpretation | Action |
|---|---|---|
| >90% | Excellent — vaccine will work in most individuals | Proceed to development |
| 70-90% | Good — most people covered; some populations underserved | Add more epitopes for uncovered HLA types |
| 50-70% | Moderate — significant gaps | Redesign with broader HLA coverage |
| <50% | Poor — vaccine will miss too many people | Fundamental redesign needed |
Check if epitopes are conserved across pathogen strains/variants:
# Search for protein variants across strains
PubMed_search_articles(query="[pathogen] [protein] sequence variation strains")
# Check specific mutations in epitope regions
EnsemblVEP_annotate_hgvs(hgvs_notation="[variant_in_epitope]")
Conservation interpretation:
Multi-epitope construct design principles:
Report structure: