Name: Text-Based Molecule Editing
Author: PharMolix

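# Import paths and Molecule.from_smiles are assumptions about OpenBioMed; adjust to your install.
from open_biomed.core.tool_registry import TOOLS
from open_biomed.data import Molecule

# Input molecule: aspirin, the molecule used in the example output
molecule = Molecule.from_smiles("CC(=O)Oc1ccccc1C(=O)O")

# Property tools: drug-likeness, lipophilicity, synthetic accessibility
qed_tool = TOOLS["molecule_qed"]
logp_tool = TOOLS["molecule_logp"]
sa_tool = TOOLS["molecule_sa"]

# Baseline properties of the unedited molecule
qed, _ = qed_tool.run(molecule=molecule)
logp, _ = logp_tool.run(molecule=molecule)
sa, _ = sa_tool.run(molecule=molecule)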

from open_biomed.core.pipeline import InferencePipeline
from open_biomed.data import Text

pipeline = InferencePipeline(
    task="text_based_molecule_editing",
    model="molt5",
    model_ckpt="./checkpoints/server/text_based_molecule_editing_biot5.ckpt",
    device="cuda:0"
)

outputs = pipeline.run(
    molecule=molecule,
    text=Text.from_str("This molecule should be more soluble in water"),
)
edited_molecule = outputs[0][0]

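# Recompute all three properties on the edited molecule
qed_new, _ = qed_tool.run(molecule=edited_molecule)
logp_new, _ = logp_tool.run(molecule=edited_molecule)
sa_new, _ = sa_tool.run(molecule=edited_molecule)

print(f"Original SMILES: {molecule.smiles}")
print(f"Edited SMILES: {edited_molecule.smiles}")
print(f"LogP change: {logp[0]:.2f} → {logp_new[0]:.2f}")
print(f"QED change: {qed[0]:.2f} → {qed_new[0]:.2f}")
print(f"SA change: {sa[0]:.2f} → {sa_new[0]:.2f}")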

| Step | Output | Description |
|---|---|---|
| Step 1 | `Molecule` object | Input molecule with SMILES |
| Step 2 | `float` values | QED (0-1), LogP, SA scores |
| Step 3 | `Molecule` object | Edited molecule with new structure |
| Step 4 | Comparison | Before/after property summary |

import os
ckpt_path = "./checkpoints/server/text_based_molecule_editing_biot5.ckpt"
if not os.path.exists(ckpt_path):
    raise FileNotFoundError(f"Download checkpoint to: {ckpt_path}")

pipeline = InferencePipeline(
    task="text_based_molecule_editing",
    model="molt5",
    model_ckpt="./checkpoints/server/text_based_molecule_editing_biot5.ckpt",
    device="cpu"  # Fallback to CPU
)
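The CPU fallback above can be automated rather than edited by hand. The following is a minimal sketch, not part of OpenBioMed: `build_with_fallback` and `fake_factory` are hypothetical names, and the sketch assumes the failure surfaces as a `RuntimeError` while the pipeline is being built (in practice an out-of-memory error may only appear later, during `run`).

```python
def build_with_fallback(factory, devices=("cuda:0", "cpu")):
    """Call factory(device) for each device in order; return (result, device) for the first success."""
    errors = []
    for device in devices:
        try:
            return factory(device), device
        except RuntimeError as err:
            # Record the failure and move on to the next device.
            errors.append(f"{device}: {err}")
    raise RuntimeError("no device worked: " + "; ".join(errors))


def fake_factory(device):
    """Stand-in for pipeline construction that simulates a GPU failure."""
    if device.startswith("cuda"):
        raise RuntimeError("CUDA out of memory")
    return f"pipeline on {device}"


pipeline_obj, used_device = build_with_fallback(fake_factory)
print(used_device)
```

With the real library, the factory would presumably be a small function that constructs `InferencePipeline` with the same `task`, `model`, and `model_ckpt` arguments and passes the given device through.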

Input: aspirin
Prompt: "This molecule should be more soluble in water"

Original SMILES: CC(=O)Oc1ccccc1C(=O)O
Edited SMILES:   CC(=O)Oc1ccc(C(=O)O)cc1C(=O)O

Property Changes:
  LogP: 1.31 → 1.01 (-0.30, more soluble)
  QED:  0.55 → 0.59 (+0.04, better drug-likeness)
  SA:   1.58 → 1.81 (+0.23, slightly harder to synthesize)
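The before/after lines in the summary above can be generated by a small helper. This is an illustrative sketch: `summarize_changes` is a hypothetical name, and it takes plain floats, so values returned by the property tools would need to be unpacked first (for example `logp[0]`).

```python
def summarize_changes(before, after):
    """Return one 'name: old → new (signed delta)' line per property found in both dicts."""
    lines = []
    for name, old in before.items():
        if name not in after:
            continue  # skip properties that were not recomputed
        new = after[name]
        lines.append(f"{name}: {old:.2f} → {new:.2f} ({new - old:+.2f})")
    return lines


# Values taken from the aspirin example above.
before = {"LogP": 1.31, "QED": 0.55, "SA": 1.58}
after = {"LogP": 1.01, "QED": 0.59, "SA": 1.81}
report = summarize_changes(before, after)
print("\n".join(report))
```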

| LogP value | Solubility | Interpretation |
|---|---|---|
| < 0 | High water solubility | Very hydrophilic |
| 0-2 | Moderate | Good balance for oral drugs |
| 2-5 | Low water solubility | May need formulation help |
| > 5 | Very lipophilic | Poor absorption likely |

| QED value | Quality | Interpretation |
|---|---|---|
| > 0.7 | Excellent | Highly drug-like |
| 0.5-0.7 | Good | Acceptable drug-likeness |
| 0.3-0.5 | Moderate | May need optimization |
| < 0.3 | Poor | Significant liabilities |

| SA score | Difficulty | Interpretation |
|---|---|---|
| 1-3 | Easy | Straightforward synthesis |
| 3-5 | Moderate | Some challenges |
| 5-7 | Difficult | Complex synthesis needed |
| > 7 | Very difficult | Likely impractical |
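The three interpretation tables can be applied programmatically with a threshold lookup. The sketch below is illustrative only: `BANDS` and `interpret` are hypothetical names, and because the tables do not say which band a boundary value belongs to, the sketch assumes a value exactly on a cut-off falls into the band above it.

```python
from bisect import bisect_right

# Cut-offs and labels transcribed from the LogP, QED, and SA tables.
BANDS = {
    "logp": ([0.0, 2.0, 5.0],
             ["High water solubility", "Moderate", "Low water solubility", "Very lipophilic"]),
    "qed": ([0.3, 0.5, 0.7], ["Poor", "Moderate", "Good", "Excellent"]),
    "sa": ([3.0, 5.0, 7.0], ["Easy", "Moderate", "Difficult", "Very difficult"]),
}


def interpret(prop, value):
    """Map a property value to its band label; boundary values go to the upper band (assumption)."""
    cuts, labels = BANDS[prop]
    return labels[bisect_right(cuts, value)]


# Baseline aspirin values from the example.
for prop, value in [("logp", 1.31), ("qed", 0.55), ("sa", 1.58)]:
    print(prop, value, interpret(prop, value))
```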

Text-Based Molecule Editing

When to Use

Workflow

Step 1: Prepare Input Molecule

Step 2: Calculate Baseline Properties (Optional)

Step 3: Run Text-Based Editing

Step 4: Compare Properties

Expected Outputs

Interpretation Guide

LogP (Lipophilicity)

QED (Quantitative Estimate of Drug-likeness)

SA (Synthetic Accessibility)

Error Handling

Model Checkpoint Not Found

Invalid SMILES Output
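A generative editor can emit strings that are not well-formed SMILES. As a cheap first filter, the following sketch checks only that parentheses and brackets balance and that every ring-closure label is opened and closed. `quick_smiles_check` is a hypothetical helper, not part of OpenBioMed, and it is not a full validator: a string that passes can still be chemically invalid, so a cheminformatics toolkit should make the final call.

```python
DIGITS = "0123456789"


def quick_smiles_check(smiles):
    """Structural sanity check: balanced (), balanced [], and paired ring-closure labels."""
    if not smiles:
        return False
    depth = 0           # current parenthesis nesting
    in_bracket = False  # True while inside a [...] atom
    open_rings = set()  # ring-closure numbers currently open
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            # Digits inside brackets are isotopes, H counts, or charges, not ring closures.
            if ch == "]":
                in_bracket = False
            elif ch == "[":
                return False
        elif ch == "[":
            in_bracket = True
        elif ch == "]":
            return False
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch == "%":
            # Two-digit ring closure such as %10
            label = smiles[i + 1:i + 3]
            if len(label) != 2 or any(c not in DIGITS for c in label):
                return False
            open_rings ^= {int(label)}
            i += 2
        elif ch in DIGITS:
            open_rings ^= {int(ch)}  # toggle: first use opens, second closes
        i += 1
    return depth == 0 and not in_bracket and not open_rings


print(quick_smiles_check("CC(=O)Oc1ccc(C(=O)O)cc1C(=O)O"))
```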

CUDA Out of Memory

Example

See Also
