ToolUniverse Immune Repertoire Analysis | Skills Pool
ToolUniverse Immune Repertoire Analysis
Comprehensive immune repertoire analysis for T-cell and B-cell receptor sequencing data. Analyze TCR/BCR repertoires to assess clonality, diversity, V(D)J gene usage, CDR3 characteristics, convergence, and predict epitope specificity. Integrate with single-cell data for clonotype-phenotype associations. Use for adaptive immune response profiling, cancer immunotherapy research, vaccine response assessment, autoimmune disease studies, or repertoire diversity analysis in immunology research.
mims-harvard · 1,271 stars · 2026/03/29
Category: Bioinformatics
Comprehensive skill for analyzing T-cell receptor (TCR) and B-cell receptor (BCR) repertoire sequencing data to characterize adaptive immune responses, clonal expansion, and antigen specificity.
Domain Reasoning
Repertoire diversity reflects immune history. High clonality (a few clones dominating the repertoire) indicates antigen-driven expansion, as seen in active infection, tumor-infiltrating lymphocytes, or chronic stimulation. Low diversity can point to immunodeficiency or treatment-induced lymphopenia. Always compare observed metrics against healthy donor reference distributions before drawing conclusions; a Shannon entropy of 7 is unremarkable in a healthy adult but alarming post-chemotherapy.
LOOK UP DON'T GUESS
Clonotype frequency thresholds, CDR3 length ranges, and convergence ratios: query IEDB or VDJdb; do not assume values from memory.
Epitope specificities for expanded clones: search iedb_search_tcell_assays and BVBRC_search_epitopes; never infer antigen identity from CDR3 alone.
V gene family usage biases in healthy donors: retrieve published reference data or query ImmPort; do not assume baseline distributions are uniform.
Sequencing depth adequacy: compute rarefaction curves from the actual data; do not guess whether depth is sufficient.
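The rarefaction check in the last item can be sketched as follows. This is a minimal illustration, assuming clonotype read counts are available as a plain list of integers; the function names `expected_richness` and `rarefaction_curve` are hypothetical and not part of ToolUniverse. It uses the analytical expectation for subsampling without replacement instead of repeated random draws.

```python
import math

def _log_comb(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)

def expected_richness(counts, n):
    """Expected number of distinct clonotypes in a subsample of n reads
    drawn without replacement from the observed counts."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    log_denom = _log_comb(total, n)
    richness = 0.0
    for c in counts:
        if total - c < n:
            # The subsample is too large to miss this clonotype entirely.
            richness += 1.0
        else:
            # 1 - P(clonotype absent from the subsample)
            richness += 1.0 - math.exp(_log_comb(total - c, n) - log_denom)
    return richness

def rarefaction_curve(counts, steps=10):
    """Return (depth, expected richness) pairs at evenly spaced depths."""
    total = sum(counts)
    if total == 0:
        return []
    depths = sorted({max(1, round(total * k / steps)) for k in range(1, steps + 1)})
    return [(d, expected_richness(counts, d)) for d in depths]
```

If the last few points of the curve still rise steeply, richness and diversity values computed from the sample should be read as lower bounds.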
Overview
Adaptive immune receptor repertoire sequencing (AIRR-seq) enables comprehensive profiling of T-cell and B-cell populations through high-throughput sequencing of TCR and BCR variable regions. This skill provides an 8-phase workflow for:
Clonotype identification and tracking
Diversity and clonality assessment
V(D)J gene usage analysis
CDR3 sequence characterization
Clonal expansion and convergence detection
Epitope specificity prediction
Integration with single-cell phenotyping
Longitudinal repertoire tracking
Core Workflow
Phase 1: Data Import & Clonotype Definition
Load AIRR-seq data from common formats (MiXCR, ImmunoSEQ, AIRR standard, 10x Genomics VDJ). Standardize columns to: cloneId, count, frequency, cdr3aa, cdr3nt, v_gene, j_gene, chain. Define clonotypes using one of three methods:
Amino acid composition: Weighted by clonotype frequency
Flag unusual length distributions (may indicate PCR bias)
Phase 5: Clonal Expansion Detection
Identify expanded clonotypes above a frequency threshold (default: 95th percentile). Track clonotypes longitudinally across multiple timepoints to measure persistence, mean/max frequency, and fold changes.
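A minimal sketch of this phase, assuming each sample is a dictionary mapping a clonotype identifier to its frequency. The helper names (`percentile`, `expanded_clonotypes`, `track_clonotypes`) and the pseudo-frequency used to avoid division by zero in fold changes are illustrative assumptions, not the skill's own implementation.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def expanded_clonotypes(freqs, q=95.0):
    """Clonotypes whose frequency lies strictly above the q-th percentile."""
    threshold = percentile(list(freqs.values()), q)
    return {clone: f for clone, f in freqs.items() if f > threshold}

def track_clonotypes(timepoints, pseudo=1e-6):
    """Summarize each clonotype across an ordered list of {clonotype: frequency}
    dicts. `pseudo` is an assumed pseudo-frequency for absent clones."""
    summary = {}
    for clone in set().union(*timepoints):
        series = [tp.get(clone, 0.0) for tp in timepoints]
        summary[clone] = {
            # Fraction of timepoints in which the clone was detected
            "persistence": sum(f > 0 for f in series) / len(series),
            "mean_freq": sum(series) / len(series),
            "max_freq": max(series),
            # Last timepoint relative to the first
            "fold_change": (series[-1] + pseudo) / (series[0] + pseudo),
        }
    return summary
```

A percentile threshold always flags some clones, even in an even repertoire, so it is worth pairing with an absolute frequency cutoff when judging biological relevance.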
Phase 6: Convergence & Public Clonotypes
Convergent recombination: The same CDR3 amino acid sequence encoded by different nucleotide sequences (evidence of antigen-driven selection)
Public clonotypes: Shared across multiple samples/individuals (may indicate common antigen responses)
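Both definitions reduce to simple grouping operations. The sketch below is an assumption-laden illustration: it treats the CDR3 amino acid sequence alone as the clonotype key (many analyses also require a matching V gene), and the function names are hypothetical.

```python
from collections import defaultdict

def convergence_ratios(records):
    """records: iterable of (cdr3aa, cdr3nt) pairs from one sample.
    Returns {cdr3aa: number of distinct nucleotide sequences encoding it}."""
    variants = defaultdict(set)
    for cdr3aa, cdr3nt in records:
        variants[cdr3aa].add(cdr3nt)
    return {aa: len(nts) for aa, nts in variants.items()}

def public_clonotypes(samples, min_samples=2):
    """samples: {sample_id: iterable of cdr3aa}.
    Returns {cdr3aa: sorted sample ids} for sequences seen in at least
    min_samples different samples."""
    seen_in = defaultdict(set)
    for sample_id, cdr3s in samples.items():
        for aa in cdr3s:
            seen_in[aa].add(sample_id)
    return {aa: sorted(ids) for aa, ids in seen_in.items() if len(ids) >= min_samples}
```

A ratio of 1 means a single nucleotide sequence was observed for that amino acid sequence; larger values count independent nucleotide variants, which is the quantity the convergence criterion refers to.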
Phase 7: Epitope Prediction & Specificity
Query epitope databases for known TCR-epitope associations:
IEDB (iedb_search_tcell_assays): Search T-cell assay records by sequence or MHC class; use iedb_search_epitopes with sequence_contains for motif search
BVBRC (BVBRC_search_epitopes): Best for organism-based epitope discovery (e.g., taxon_id="2697049" for SARS-CoV-2); returns epitope sequences with T-cell/B-cell assay counts
Clonal expansion > 1% frequency, convergent recombination confirmed, epitope match in IEDB/VDJdb. Example: CDR3 at 5% frequency with 3 nucleotide variants encoding the same amino acid sequence, IEDB hit.
Moderate: Expanded clone (0.1-1%), V(D)J bias significant (chi-sq p < 0.01), partial epitope match. Example: Clone at 0.5% with TRBV20-1 bias, similar CDR3 motif in VDJdb.
Weak: Low-frequency expansion (0.01-0.1%), single timepoint only, no epitope database match. Example: Moderately expanded clone without convergence or known specificity.
Insufficient: Below detection threshold, sequencing depth < 10,000 clonotypes, no replication. Example: Singleton clonotypes that may be PCR/sequencing artifacts.
Interpretation Guidance
Clonality metrics: Shannon diversity measures overall repertoire complexity (higher = more diverse, typical range 5-12 for healthy blood). Gini coefficient ranges from 0 (perfectly even) to 1 (single dominant clone); values > 0.3 suggest clonal expansion. Clonality (1 - Pielou's evenness) > 0.2 indicates moderate clonal dominance; > 0.5 suggests strong oligoclonal expansion (common in active infection or tumor-infiltrating lymphocytes).
V(D)J usage significance: Biased V or J gene usage (chi-square p < 0.01 vs expected uniform distribution) may indicate antigen-driven selection. However, baseline V gene usage is not uniform even in healthy repertoires due to genomic proximity and recombination efficiency. Compare against healthy donor reference distributions rather than uniform expectation when possible.
CDR3 convergence meaning: Convergent recombination (the same CDR3 amino acid sequence encoded by different nucleotide sequences) is strong evidence of antigen-driven selection because independent recombination events converged on the same receptor. Public clonotypes (shared across individuals) further strengthen this inference. A convergence ratio > 2 (distinct nucleotide variants per amino acid sequence) for expanded clones is noteworthy.
Sequencing depth: Rarefaction curves that plateau indicate sufficient depth. If the curve is still rising, richness and diversity estimates are underestimates. Minimum recommended depth: 50,000-100,000 total reads for bulk TCR-seq.
Longitudinal tracking: Persistent clones across timepoints with stable or increasing frequency indicate antigen-driven maintenance. Transient expansions that disappear may reflect acute responses.
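The clonality metrics described above can be computed directly from clonotype counts. This is a minimal sketch under stated assumptions: natural logarithms are used for Shannon entropy (the text does not specify a base, and the typical ranges quoted depend on it), and a single-clonotype sample is assigned clonality 1.0 by convention, since Pielou's evenness is undefined in that case. Function names are illustrative.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of clonotype counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def clonality(counts):
    """1 - Pielou's evenness, where evenness = H / ln(number of clonotypes)."""
    richness = sum(1 for c in counts if c > 0)
    if richness <= 1:
        return 1.0  # assumed convention: one clone means complete dominance
    return 1.0 - shannon_entropy(counts) / math.log(richness)

def gini(counts):
    """Gini coefficient of clonotype counts (0 = perfectly even)."""
    ordered = sorted(counts)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum(rank * c for rank, c in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

With a finite number of clonotypes n, the Gini coefficient tops out at (n - 1) / n, so it approaches 1 only for large repertoires dominated by one clone.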
Synthesis Questions
Does the observed clonal expansion pattern (Gini coefficient, top-clone frequency) match the expected immune context (e.g., post-vaccination expansion, tumor-infiltrating lymphocyte oligoclonality)?
Are convergent CDR3 sequences found across multiple individuals in the cohort, suggesting a public response to a shared antigen?
Do expanded clonotypes show biased V gene usage consistent with known antigen-specific repertoire features (e.g., TRBV20-1 enrichment in CMV-specific responses)?
Is the sequencing depth sufficient (rarefaction plateau reached) to reliably estimate diversity metrics and detect low-frequency expanded clones?
For longitudinal data, do clonal dynamics (expansion, contraction, persistence) correlate with clinical outcomes or treatment response?
References
Dash P, et al. (2017) Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature
Glanville J, et al. (2017) Identifying specificity groups in the T cell receptor repertoire. Nature
Stubbington MJT, et al. (2016) T cell fate and clonality inference from single-cell transcriptomes. Nature Methods
Vander Heiden JA, et al. (2014) pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics
See Also
ANALYSIS_DETAILS.md - Detailed code snippets for all 8 phases
USE_CASES.md - Complete use cases (immunotherapy, vaccine, autoimmune, single-cell integration) and best practices