Analyze protein mutations by retrieving protein data, explaining mutation effects, predicting protein structure, and visualizing results. Use this skill when the user asks about protein mutations, wants to understand mutation effects, or needs to analyze genetic variants. Triggers on phrases like "analyze mutation", "explain mutation", "what does this mutation do", "protein variant analysis".
Analyze the functional impact of protein mutations using MutaPLM and visualize protein structures.
from open_biomed.tools.tool_registry import TOOLS

tool = TOOLS["protein_uniprot_request"]  # Step 1: retrieve the protein from UniProt
result, message = tool.run(accession="P04637")
protein = result.get("protein")

mutation_tool = TOOLS["mutation_explanation"]  # Step 2: explain the mutation with MutaPLM
mutation_result, _ = mutation_tool.run(
    protein=protein,
    mutation="R248Q",  # Format: OriginalAA + Position + MutantAA
)

folding_tool = TOOLS["protein_folding"]  # Step 3: predict the 3D structure with ESMFold
fold_result, _ = folding_tool.run(protein=protein)
predicted_protein = fold_result.get("protein")

viz_tool = TOOLS["visualize_protein"]  # Step 4: render the predicted structure
viz_result, _ = viz_tool.run(protein=predicted_protein, style="cartoon")
See examples/basic_analysis.py for the complete implementation.
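The four steps can also be chained behind one function that stops at the first failing step. The following is a minimal sketch, not the implementation in examples/basic_analysis.py. The function name `analyze_mutation`, the `tools` parameter, and the assumption that a failed `run()` returns a falsy result alongside its message are all hypothetical. Only the tool names and the `(result, message)` return shape come from the workflow above.

```python
def analyze_mutation(tools, accession, mutation):
    """Run retrieve -> explain -> fold -> visualize, stopping on the first failure.

    `tools` is any mapping from tool name to an object whose run(**kwargs)
    returns a (result, message) pair, for example the TOOLS registry.
    """

    def run(name, **kwargs):
        result, message = tools[name].run(**kwargs)
        # Assumption: a failed step yields a falsy result plus a message.
        if not result:
            raise RuntimeError(f"{name} failed: {message}")
        return result

    protein = run("protein_uniprot_request", accession=accession).get("protein")
    if protein is None:
        raise RuntimeError(f"No protein returned for accession {accession}")

    explanation = run("mutation_explanation", protein=protein, mutation=mutation)
    folded = run("protein_folding", protein=protein).get("protein")
    image = run("visualize_protein", protein=folded, style="cartoon")

    return {
        "protein": protein,
        "explanation": explanation,
        "structure": folded,
        "visualization": image,
    }
```

Passing the registry in as an argument, for example `analyze_mutation(TOOLS, "P04637", "R248Q")`, keeps the sketch easy to exercise with stub tools when the model checkpoints are not installed.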
| Step | Output | Description |
|---|---|---|
| Retrieve Protein | Protein object | Name, sequence from UniProt |
| Explain Mutation | Text | Functional impact from MutaPLM |
| Predict Structure | Protein with 3D coords | Structure from ESMFold |
| Visualize | PNG file | Rendered protein structure |
Single amino acid mutation: OriginalAA + Position + MutantAA
| Valid | Invalid | Why invalid |
|---|---|---|
| R248Q | R248 | Missing mutant AA |
| V600E | 248Q | Missing original AA |
| L858R | ARG248GLN | Use single-letter codes |
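The format rules in the table can be checked before any tool is called. This is a minimal sketch using a regular expression. The helper name `parse_mutation` is hypothetical, and the restriction to the 20 standard uppercase one-letter codes is an assumption rather than a documented requirement of MutaPLM.

```python
import re

# One-letter codes for the 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"([{AMINO_ACIDS}])([1-9]\d*)([{AMINO_ACIDS}])")


def parse_mutation(mutation: str) -> tuple[str, int, str]:
    """Split a string such as 'R248Q' into (original, position, mutant)."""
    match = MUTATION_RE.fullmatch(mutation.strip())
    if match is None:
        raise ValueError(
            f"Invalid mutation {mutation!r}: expected "
            "OriginalAA + Position + MutantAA, e.g. 'R248Q'"
        )
    original, position, mutant = match.groups()
    return original, int(position), mutant
```

With this sketch, `parse_mutation("R248Q")` returns `("R", 248, "Q")`, while `"R248"`, `"248Q"`, and `"ARG248GLN"` each raise `ValueError`.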
Symptom: FileNotFoundError or AttributeError
Solution: Check checkpoints exist:
- ./checkpoints/server/mutaplm.pth
- ./checkpoints/esm2/650m/
- ./checkpoints/biomedgpt-lm/

Fallback: Use web search for mutation literature.
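The checkpoint check can be scripted so that missing files are reported before a tool fails. This is a minimal sketch using only the standard library. The helper name `missing_checkpoints` and its `root` parameter are hypothetical. The three paths are the ones listed above.

```python
from pathlib import Path

# Checkpoint locations listed above, relative to the project root.
REQUIRED_CHECKPOINTS = (
    "checkpoints/server/mutaplm.pth",
    "checkpoints/esm2/650m",
    "checkpoints/biomedgpt-lm",
)


def missing_checkpoints(root: str = ".") -> list[str]:
    """Return the required checkpoint paths that do not exist under `root`."""
    base = Path(root)
    return [path for path in REQUIRED_CHECKPOINTS if not (base / path).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```

An empty list means every path is present. A non-empty list is the cue to install the checkpoints or use the web search fallback.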
position = int(mutation[1:-1])  # digits between the two amino acid letters
if position > len(protein.sequence):
    print(f"Error: Position {position} exceeds sequence length {len(protein.sequence)}")
See references/troubleshooting.md for detailed error handling.
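Beyond the length check, it can help to confirm that the original amino acid in the mutation string matches the residue actually found at that position. This is a minimal sketch operating on a plain sequence string. The function name `check_mutation_against_sequence` is hypothetical, positions are assumed to be 1-based, and the sequence in the usage note is a toy example, not a real protein.

```python
def check_mutation_against_sequence(sequence: str, mutation: str) -> None:
    """Raise ValueError if `mutation` cannot apply to `sequence`.

    Assumes `mutation` already has the form OriginalAA + Position + MutantAA.
    """
    original = mutation[0]
    position = int(mutation[1:-1])

    if not 1 <= position <= len(sequence):
        raise ValueError(
            f"Position {position} is outside the sequence range 1..{len(sequence)}"
        )

    found = sequence[position - 1]  # convert the 1-based position to a 0-based index
    if found != original:
        raise ValueError(
            f"Expected {original} at position {position}, found {found}"
        )
```

For the toy sequence `"MKTAYIAK"`, the mutation `"K2R"` passes silently, `"K9R"` fails the range check, and `"A2R"` fails because position 2 holds `K`.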
| pLDDT Score | Confidence |
|---|---|
| > 90 | High |
| 70-90 | Moderate |
| < 70 | Low (disordered) |
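The thresholds in the table translate directly into a small lookup. This is a minimal sketch. The function name `plddt_confidence` is hypothetical, and it assumes scores on the usual 0 to 100 pLDDT scale, with the boundary values 70 and 90 falling in the moderate band as the table implies.

```python
def plddt_confidence(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence label from the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT score must be between 0 and 100, got {score}")
    if score > 90:
        return "High"
    if score >= 70:
        return "Moderate"
    return "Low (disordered)"
```

For example, `plddt_confidence(95.2)` returns `"High"` and `plddt_confidence(55)` returns `"Low (disordered)"`.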
Input: P04637 R248Q
Step 1: Retrieved TP53 (393 aa)
Step 2: "In lung cancer, mutation R248Q..."
Step 3: Structure predicted (~8s)
Step 4: Visualization saved
Output: Mutation analysis + structure + visualization
Model checkpoints required: see references/troubleshooting.md.

Related tools:
- protein_pdb_request - Get existing PDB structures
- protein_question_answering - Ask about protein function
- export_protein - Save structure to PDB format