Humanize murine antibody sequences using CDR grafting and framework optimization to reduce immunogenicity while preserving antigen binding. Predicts optimal human germline frameworks and identifies critical back-mutations for therapeutic antibody development.
Bioinformatics platform for converting murine antibodies into humanized variants by grafting complementarity-determining regions (CDRs) onto human framework templates while preserving antigen-binding affinity and reducing immunogenicity risk.
Key Capabilities:
✅ Use this skill when:
❌ Do NOT use when:
- phage-display-library
- antibody-design-ai
- affinity-maturation-predictor
- fc-engineering-toolkit

Integration:
- antibody-sequencer (VH/VL sequence determination), cdr-grafting-validator (structural assessment)
- protein-struct-viz (3D visualization), immunogenicity-predictor (T-cell epitope analysis)

Parse antibody sequences and identify CDR boundaries:
from scripts.humanizer import AntibodyHumanizer
humanizer = AntibodyHumanizer()
# Analyze antibody sequence
analysis = humanizer.analyze_sequence(
vh_sequence="QVQLQQSGPELVKPGASVKISCKASGYTFTDYYMHWVKQSHGKSLEWIGYINPSTGYTEYNQKFKDKATLTVDKSSSTAYMQLSSLTSEDSAVYYCAR...",
vl_sequence="DIQMTQSPSSLSASVGDRVTITCRASQGISSWLAWYQQKPGKAPKLLIYKASSLESGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQYSSYPYT...",
scheme="chothia" # Options: kabat, chothia, imgt
)
# Output CDR locations
print(analysis.cdr_regions)
# {
# "VH_CDR1": {"start": 26, "end": 32, "seq": "GYTFTDY"},
# "VH_CDR2": {"start": 52, "end": 58, "seq": "INPSTGY"},
# ...
# }
Numbering Schemes:
| Scheme | VH CDR1 | VH CDR2 | VH CDR3 | Best For |
|---|---|---|---|---|
| Chothia | 26-32 | 52-56 | 95-102 | Structural analysis |
| Kabat | 31-35 | 50-65 | 95-102 | Sequence-based work |
| IMGT | 27-38 | 56-65 | 105-117 | Standardized analysis |
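As a rough illustration of how the ranges in the table translate into sequence slices, the sketch below stores the VH ranges per scheme and cuts a region out by 1-based position. It assumes that the linear sequence index equals the numbered position, which only holds where no insertion or deletion precedes the region (true for CDR1 of the example VH, not in general). The names `VH_CDR_RANGES` and `slice_region` are hypothetical and not part of `scripts/`.

```python
# Hypothetical helper, not the tool's API: VH CDR ranges copied from the table above.
VH_CDR_RANGES = {
    "chothia": {"CDR1": (26, 32), "CDR2": (52, 56), "CDR3": (95, 102)},
    "kabat": {"CDR1": (31, 35), "CDR2": (50, 65), "CDR3": (95, 102)},
    "imgt": {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)},
}


def slice_region(sequence: str, scheme: str, region: str) -> str:
    """Return residues start..end (1-based, inclusive) for a VH CDR.

    Assumes linear index == numbered position, i.e. no insertions
    (such as 52A) or deletions before the requested region.
    """
    try:
        start, end = VH_CDR_RANGES[scheme.lower()][region]
    except KeyError as exc:
        raise ValueError(f"unknown scheme/region: {scheme}/{region}") from exc
    if end > len(sequence):
        raise ValueError("sequence too short for requested region")
    return sequence[start - 1:end]


# Start of the example VH sequence used earlier in this document.
vh = "QVQLQQSGPELVKPGASVKISCKASGYTFTDYYMHWVKQSHGKSLEWIGYINPSTGYTEYNQKFKD"
print(slice_region(vh, "chothia", "CDR1"))  # GYTFTDY
print(slice_region(vh, "kabat", "CDR1"))    # DYYMH
```

For CDR2 and CDR3, a proper numbering step (alignment that assigns insertion codes) is needed before slicing; this sketch deliberately does not attempt that.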
Identify optimal human germline templates:
# Match against human germline database
matches = humanizer.find_human_frameworks(
vh_framework=analysis.vh_frameworks,
vl_framework=analysis.vl_frameworks,
top_n=5,
criteria=["homology", "canonical_structure", "vernier_similarity"]
)
# Evaluate each candidate
for match in matches:
    print(f"Template: {match.germline_genes}")
    print(f"Homology: {match.homology:.2%}")
    print(f"Vernier Score: {match.vernier_score:.1f}")
    print(f"Risk Level: {match.immunogenicity_risk}")
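The homology criterion can be pictured as a per-position identity between pre-aligned framework sequences. The following is a minimal sketch under that assumption; `percent_identity`, `rank_templates`, and the toy fragments are invented for illustration and do not reflect how `find_human_frameworks` is implemented.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def rank_templates(murine_fw: str, templates: dict, top_n: int = 5) -> list:
    """Return (name, identity) pairs sorted from most to least similar."""
    scored = [(name, percent_identity(murine_fw, seq)) for name, seq in templates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy 10-residue fragments, invented for illustration only (not real germline data).
murine_fw1 = "QVQLQQSGPE"
toy_templates = {
    "toy-template-A": "QVQLVQSGAE",  # 8 of 10 positions identical
    "toy-template-B": "EVQLVESGGG",  # 5 of 10 positions identical
}
for name, identity in rank_templates(murine_fw1, toy_templates):
    print(f"{name}: {identity:.0%}")
```

A real search would align against a germline database and weigh canonical structure and Vernier similarity as well; identity alone is only the first of the listed criteria.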
Matching Criteria:
Assess immunogenicity risk of candidates:
# Score humanization candidates
scores = humanizer.score_candidates(
murine_antibody=analysis,
human_templates=matches,
scoring_methods=["t20", "h_score", "germline_deviation", "paratope_diversity"]
)
# Rank by overall score
ranked = scores.rank_by_composite_score(
weights={"humanness": 0.4, "binding_retention": 0.4, "developability": 0.2}
)
Scoring Methods:
| Method | Description | Target |
|---|---|---|
| T20 Score | Mean identity to the 20 closest human sequences | >80% human |
| H-Score | Distance from the closest human germline | <15 mutations |
| Paratope Diversity | CDR germline gene diversity | Low diversity |
| Developability | Aggregation/pH stability prediction | High score |
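The weighted ranking used above can be sketched as a plain weighted sum. This assumes each metric has already been normalized to the 0-1 range; `composite_score` and the metric values are hypothetical and only mirror the weights shown in the `rank_by_composite_score` call.

```python
import math

# Weights taken from the example call above.
DEFAULT_WEIGHTS = {"humanness": 0.4, "binding_retention": 0.4, "developability": 0.2}


def composite_score(metrics: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of normalized (0-1) metrics; weights must sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(metrics)
    if missing:
        raise ValueError(f"missing metrics: {sorted(missing)}")
    return sum(weights[key] * metrics[key] for key in weights)


# Invented metric values for two hypothetical candidates.
candidates = {
    "candidate-1": {"humanness": 0.9, "binding_retention": 0.8, "developability": 0.5},
    "candidate-2": {"humanness": 0.7, "binding_retention": 0.9, "developability": 0.9},
}
ranking = sorted(candidates, key=lambda name: composite_score(candidates[name]), reverse=True)
print(ranking)
```

With these numbers candidate-2 scores 0.82 and candidate-1 scores 0.78, so a higher humanness alone does not decide the ranking.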
Identify critical residues to retain from murine framework:
# Predict back-mutations
back_mutations = humanizer.predict_back_mutations(
murine_vh=analysis.vh_sequence,
human_vh=matches[0].human_template,
cdr_regions=analysis.cdr_regions,
rationale_required=True
)
# Output shows position-specific recommendations
for mutation in back_mutations:
    print(f"Position {mutation.position}: {mutation.human_aa} → {mutation.murine_aa}")
    print(f"Rationale: {mutation.reason}")  # e.g., "Vernier region contact"
    print(f"Priority: {mutation.priority}")  # Critical/Important/Optional
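One simple way to think about back-mutation candidates: compare the murine and human frameworks at a set of structurally important positions and report where they differ. The sketch below assumes pre-aligned, equal-length sequences; the `BackMutation` dataclass, `propose_back_mutations`, and the position set are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackMutation:
    position: int  # 1-based position in the aligned framework
    human_aa: str
    murine_aa: str


def propose_back_mutations(murine: str, human: str, positions_of_interest: set) -> list:
    """List positions of interest where the human template differs from the murine parent."""
    if len(murine) != len(human):
        raise ValueError("sequences must be aligned to the same length")
    proposals = []
    for pos in sorted(positions_of_interest):
        if not 1 <= pos <= len(murine):
            raise ValueError(f"position {pos} is outside the sequence")
        if murine[pos - 1] != human[pos - 1]:
            proposals.append(BackMutation(pos, human[pos - 1], murine[pos - 1]))
    return proposals


# Toy fragments and an arbitrary position set (not real Vernier positions).
murine_fw = "QVQLQQSGPE"
human_fw = "QVQLVQSGAE"
for bm in propose_back_mutations(murine_fw, human_fw, {2, 5, 9}):
    print(f"Position {bm.position}: {bm.human_aa} -> {bm.murine_aa}")
```

Position 2 is identical in both fragments and is therefore not reported; positions 5 and 9 are. Assigning a priority to each proposal would need structural information that this sketch does not model.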
Critical Residue Classes:
Scenario: Convert murine anti-tumor antibody to therapeutic candidate.
# Humanize single antibody
python scripts/main.py \
--vh "QVQLQQSGPELVKPGASVKISCKAS..." \
--vl "DIQMTQSPSSLSASVGDRVTITCRAS..." \
--name "Anti-HER2-Murine-1" \
--scheme chothia \
--top-n 3 \
--output humanization_report.json
# Review top candidates
cat humanization_report.json | jq '.candidates[0]'
Workflow:
Scenario: Screen multiple murine clones from hybridoma campaign.
# Process multiple antibodies
antibodies = [
{"name": "Clone-A", "vh": "...", "vl": "..."},
{"name": "Clone-B", "vh": "...", "vl": "..."},
{"name": "Clone-C", "vh": "...", "vl": "..."}
]
results = humanizer.batch_humanize(
antibodies=antibodies,
ranking_criteria="composite_score",
min_humanness=0.85
)
# Rank by developability
ranked = results.rank_by(criteria=["humanness", "binding_retention", "stability"])
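Threshold filtering followed by multi-criteria ranking could look like the sketch below. How `batch_humanize` and `rank_by` actually combine criteria is not documented here; the lexicographic ordering, the dictionary layout, and the metric values are all assumptions made for illustration.

```python
def filter_and_rank(results: list, criteria: list, min_humanness: float = 0.85) -> list:
    """Drop results below the humanness threshold, then sort by the criteria in order.

    Ordering is lexicographic: the first criterion dominates, later ones break ties.
    """
    kept = [r for r in results if r["humanness"] >= min_humanness]
    return sorted(kept, key=lambda r: tuple(r[c] for c in criteria), reverse=True)


# Invented scores for the three clones from the example above.
clone_results = [
    {"name": "Clone-A", "humanness": 0.91, "binding_retention": 0.80, "stability": 0.70},
    {"name": "Clone-B", "humanness": 0.82, "binding_retention": 0.95, "stability": 0.90},
    {"name": "Clone-C", "humanness": 0.91, "binding_retention": 0.88, "stability": 0.60},
]
ranked_clones = filter_and_rank(
    clone_results, criteria=["humanness", "binding_retention", "stability"]
)
print([r["name"] for r in ranked_clones])
```

Clone-B is removed by the 0.85 threshold despite its strong binding score, and the tie on humanness between Clone-A and Clone-C is broken by binding retention.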
Selection Criteria:
Scenario: Compare different humanization strategies for lead candidate.
# Test multiple framework combinations
strategies = [
{"vh": "IGHV1-2*02", "vl": "IGKV1-12*01", "name": "Template-A"},
{"vh": "IGHV3-23*01", "vl": "IGKV3-20*01", "name": "Template-B"},
{"vh": "IGHV4-34*01", "vl": "IGKV1-5*01", "name": "Template-C"}
]
comparison = humanizer.compare_strategies(
murine_antibody=analysis,
strategies=strategies,
metrics=["homology", "back_mutations", "immunogenicity", "paratope_structure"]
)
comparison.generate_report("framework_comparison.pdf")
Comparison Metrics:
Scenario: Assess humanization for patent landscape analysis.
# Generate humanized variants
python scripts/main.py \
--input murine_lead.json \
--generate-variants 10 \
--include-back-mutations \
--output variants_for_ip.json
# Check novelty against patent databases
python scripts/patent_check.py \
--sequences variants_for_ip.json \
--databases [USPTO, EPO, WIPO] \
--output novelty_report.pdf
IP Considerations:
From murine hybridoma to therapeutic candidate:
# Step 1: Sequence analysis and CDR identification
python scripts/main.py \
--vh "$VH_SEQUENCE" \
--vl "$VL_SEQUENCE" \
--scheme chothia \
--output step1_analysis.json
# Step 2: Find best human frameworks
python scripts/main.py \
--input step1_analysis.json \
--find-frameworks \
--top-n 5 \
--output step2_frameworks.json
# Step 3: Score and rank candidates
python scripts/main.py \
--input step2_frameworks.json \
--score-candidates \
--include-immunogenicity \
--output step3_scored.json
# Step 4: Predict back-mutations
python scripts/main.py \
--input step3_scored.json \
--predict-back-mutations \
--rationale \
--output step4_backmutations.json
# Step 5: Generate final humanized sequences
python scripts/main.py \
--input step4_backmutations.json \
--generate-sequences \
--format fasta \
--output humanized_antibody.fasta
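For the final step, FASTA output is simple enough to sketch directly. The function below is a hypothetical formatter, not the writer used by `main.py`; the record names and the 60-character line width are assumptions.

```python
def to_fasta(records: dict, width: int = 60) -> str:
    """Format {name: sequence} pairs as FASTA text with wrapped sequence lines."""
    if width < 1:
        raise ValueError("width must be positive")
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        # Wrap the sequence into fixed-width chunks.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical record names; sequences are short placeholders.
fasta_text = to_fasta({"humanized_VH": "QVQLVQSGAE", "humanized_VL": "DIQMTQSPSS"}, width=6)
print(fasta_text, end="")
```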
Python API:
from scripts.humanizer import AntibodyHumanizer
from scripts.scoring import HumanizationScorer
from scripts.backmutation import BackMutationPredictor
# Initialize pipeline
humanizer = AntibodyHumanizer()
scorer = HumanizationScorer()
bm_predictor = BackMutationPredictor()
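# Step 1: Parse and analyze
# murine_vh and murine_vl are amino-acid strings supplied by the caller
antibody = humanizer.analyze_sequence(
    vh_sequence=murine_vh,
    vl_sequence=murine_vl,
    scheme="chothia"
)
# Step 2: Find human frameworks
candidates = humanizer.find_human_frameworks(
    vh_framework=antibody.vh_frameworks,
    vl_framework=antibody.vl_frameworks,
    top_n=5
)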
# Step 3: Score candidates
for candidate in candidates:
    scores = scorer.calculate_scores(
        murine=antibody,
        humanized=candidate
    )
    candidate.composite_score = scores.weighted_score()
# Step 4: Select best and predict back-mutations
best = max(candidates, key=lambda x: x.composite_score)
back_mutations = bm_predictor.predict(
murine=antibody,
human_template=best
)
# Step 5: Generate final sequence
final_sequence = humanizer.generate_humanized_sequence(
template=best,
back_mutations=back_mutations,
cdrs=antibody.cdr_regions
)
print("Humanized antibody generated:")
print(f"- Humanness: {best.humanness:.1%}")
print(f"- Back-mutations: {len(back_mutations)}")
print(f"- Risk level: {best.immunogenicity_risk}")
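The grafting step at the end of the pipeline can be illustrated on toy strings: start from the human template, overwrite the CDR spans with the murine residues, then apply the selected back-mutations. This is a minimal sketch that assumes both sequences are already aligned to the same length; `graft_cdrs` and its arguments are hypothetical and are not the signature of `generate_humanized_sequence`.

```python
from typing import Optional


def graft_cdrs(template: str, murine: str, cdr_spans: list,
               back_mutations: Optional[dict] = None) -> str:
    """Graft murine CDRs onto a human template and apply framework back-mutations.

    cdr_spans: list of (start, end) pairs, 1-based and inclusive.
    back_mutations: {position: amino_acid}, 1-based positions.
    """
    if len(template) != len(murine):
        raise ValueError("template and murine sequences must be aligned to the same length")
    residues = list(template)
    for start, end in cdr_spans:
        # Copy the murine residues over the template for this CDR span.
        residues[start - 1:end] = murine[start - 1:end]
    for position, amino_acid in (back_mutations or {}).items():
        residues[position - 1] = amino_acid
    return "".join(residues)


# Toy sequences: an all-alanine "template" makes the grafted residues easy to see.
toy_template = "AAAAAAAAAA"
toy_murine = "MNPQRSTVWY"
print(graft_cdrs(toy_template, toy_murine, cdr_spans=[(3, 5)], back_mutations={8: "V"}))
# AAPQRAAVAA
```

Residues 3-5 come from the murine CDR span and position 8 is reverted to the murine residue; everything else stays human.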
Input Quality:
Humanization Assessment:
Output Validation:
Before Experimental Work:
Sequence Issues:
❌ Incomplete sequences → Missing framework regions
❌ Wrong numbering scheme → CDR boundaries incorrect
❌ Non-standard residues → Unusual amino acids
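Two of the sequence issues above (incomplete input and non-standard residues) can be caught with a basic pre-check. The sketch below is hypothetical; in particular the `min_length` default of 100 is an arbitrary assumption and not a threshold taken from the tool.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def validate_sequence(seq: str, min_length: int = 100) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    cleaned = seq.strip().upper()
    if len(cleaned) < min_length:
        problems.append(f"sequence has {len(cleaned)} residues, expected at least {min_length}")
    unexpected = sorted({c for c in cleaned if c not in STANDARD_AA})
    if unexpected:
        problems.append(f"non-standard residues: {', '.join(unexpected)}")
    return problems


print(validate_sequence("QVQLQQSGPEXB", min_length=10))
```

A check like this cannot detect a wrong numbering scheme; that still has to be confirmed against the CDR boundaries reported by the analysis step.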
Design Issues:
❌ Over-humanization → Losing antigen binding
❌ Ignoring back-mutations → Assuming 100% human framework works
❌ Single candidate only → No backup options
Experimental Issues:
❌ Skipping binding validation → Assuming in silico predictions hold experimentally
❌ Ignoring developability → Aggregation or instability
Available in references/ directory:
- imgt_germline_database.md - Human germline gene reference sequences
- cdr_numbering_schemes.md - Kabat, Chothia, IMGT comparison
- humanization_case_studies.md - Successful therapeutic examples
- vernier_positions_guide.md - Critical framework residues
- immunogenicity_assessment.md - T-cell epitope prediction methods
- patent_landscape.md - Humanization IP considerations

Located in scripts/ directory:
- main.py - CLI interface for humanization
- humanizer.py - Core humanization engine
- cdr_parser.py - CDR identification and numbering
- framework_matcher.py - Human germline database search
- scoring.py - Humanization quality assessment
- backmutation.py - Critical residue prediction
- batch_processor.py - Multiple antibody screening
- structure_predictor.py - CDR conformation analysis

| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| --vh | string | - | No | Murine VH sequence (amino acids) |
| --vl | string | - | No | Murine VL sequence (amino acids) |
| --input, -i | string | - | No | Input JSON file path |
| --name, -n | string | "" | No | Antibody name |
| --output, -o | string | - | No | Output file path |
| --format, -f | string | json | No | Output format (json, fasta, csv) |
| --scheme, -s | string | chothia | No | Numbering scheme (kabat, chothia, imgt) |
| --top-n | int | 3 | No | Number of best candidates to return |
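The parameter table maps naturally onto an `argparse` definition. The sketch below shows one way the options and defaults could be declared; it is not the actual contents of `main.py`. The rule that either `--input` or both `--vh` and `--vl` must be given is an assumption, since the table marks every parameter as optional.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options from the parameter table (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Antibody humanization (illustrative sketch)")
    parser.add_argument("--vh", help="Murine VH sequence (amino acids)")
    parser.add_argument("--vl", help="Murine VL sequence (amino acids)")
    parser.add_argument("--input", "-i", help="Input JSON file path")
    parser.add_argument("--name", "-n", default="", help="Antibody name")
    parser.add_argument("--output", "-o", help="Output file path")
    parser.add_argument("--format", "-f", default="json", choices=["json", "fasta", "csv"])
    parser.add_argument("--scheme", "-s", default="chothia", choices=["kabat", "chothia", "imgt"])
    parser.add_argument("--top-n", type=int, default=3, help="Number of best candidates")
    return parser


def parse_cli(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed rule: sequences must come from --input or from both --vh and --vl.
    if not args.input and not (args.vh and args.vl):
        parser.error("provide --input or both --vh and --vl")
    return args


cli_args = parse_cli(["--vh", "QVQ", "--vl", "DIQ", "--scheme", "imgt"])
print(cli_args.scheme, cli_args.top_n, cli_args.format)  # imgt 3 json
```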
# Humanize with direct sequence input
python scripts/main.py --vh "QVQLQQSGPELVKPGASVKMSCKAS..." --vl "DIQMTQSPSSLSASVGDRVTITC..." --name "MyAntibody"
# Use JSON input file
python scripts/main.py --input antibody.json --output results.json
# Use IMGT numbering scheme
python scripts/main.py --vh "SEQUENCE" --vl "SEQUENCE" --scheme imgt
{
"vh_sequence": "QVQLQQSGPELVKPGASVKMSCKAS...",
"vl_sequence": "DIQMTQSPSSLSASVGDRVTITC...",
"name": "MyAntibody",
"scheme": "chothia"
}
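Reading an input file of this shape takes only the standard library. The loader below is a hypothetical sketch: treating `vh_sequence` and `vl_sequence` as required, and `name` and `scheme` as optional with the CLI defaults, is an assumption about how the tool interprets the file.

```python
import json

ALLOWED_SCHEMES = {"kabat", "chothia", "imgt"}


def load_antibody(text: str) -> dict:
    """Parse the JSON input and apply assumed defaults and checks."""
    data = json.loads(text)
    for key in ("vh_sequence", "vl_sequence"):
        if not data.get(key):
            raise ValueError(f"missing required field: {key}")
    data.setdefault("name", "")
    data.setdefault("scheme", "chothia")
    if data["scheme"] not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {data['scheme']}")
    return data


# Placeholder sequences; a real file would carry full variable-domain sequences.
example = '{"vh_sequence": "QVQLQQSG", "vl_sequence": "DIQMTQSP", "name": "MyAntibody"}'
antibody_input = load_antibody(example)
print(antibody_input["name"], antibody_input["scheme"])  # MyAntibody chothia
```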
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python script executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Low |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output may contain proprietary sequences | Medium |
# Python 3.7+
# No external packages required (uses standard library)
🔬 Critical Note: Computational humanization is a design tool, not a substitute for experimental validation. Always express and test humanized candidates for binding affinity, specificity, stability, and immunogenicity before therapeutic development.