Name: torchdrug
Author: jaechang-hits

torchdrug

TorchDrug is a PyTorch-based machine learning platform for drug discovery. Use it for graph-based molecular representation learning, molecular property prediction (ADMET, activity), retrosynthesis prediction, drug-target interaction (DTI) modeling, and pretraining on large molecular datasets. It provides GNN layers (GraphConv, GAT, MPNN), pretrained models, and benchmark datasets through a unified PyTorch-compatible API.

jaechang-hits · 119 stars · Feb 18, 2026

Occupation
Category: Computational chemistry

Overview

TorchDrug is a comprehensive machine learning framework for drug discovery built on PyTorch. It provides graph-based molecular representations (atoms as nodes, bonds as edges), a library of graph neural network (GNN) architectures, benchmark datasets, and pretrained models for tasks including molecular property prediction, drug-target interaction, retrosynthesis, and generative molecular design. TorchDrug integrates with PyTorch Lightning and standard ML tooling, making it accessible to both computational chemists and ML practitioners.
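
To make the "atoms as nodes, bonds as edges" idea concrete, here is a minimal pure-Python sketch. It does not use TorchDrug: the ethanol atom and bond lists are written out by hand, and `build_adjacency` and `message_pass` are hypothetical helper names chosen for illustration. It shows one sum-aggregation step of the kind GNN layers build on. In TorchDrug itself, a molecular graph is normally created with `data.Molecule.from_smiles`.

```python
# Ethanol (SMILES "CCO"), heavy atoms only. Indices are arbitrary but fixed.
atoms = ["C", "C", "O"]
bonds = [(0, 1, "single"), (1, 2, "single")]


def build_adjacency(num_atoms, bonds):
    """Return an undirected adjacency list: atom index -> neighbor indices."""
    adj = {i: [] for i in range(num_atoms)}
    for u, v, _bond_type in bonds:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def message_pass(features, adj):
    """One sum-aggregation step: each atom adds its neighbors' features to its own."""
    return [
        [own + sum(features[n][k] for n in adj[i]) for k, own in enumerate(features[i])]
        for i in range(len(features))
    ]


adjacency = build_adjacency(len(atoms), bonds)

# One-hot element features over the toy vocabulary ["C", "O"].
vocab = ["C", "O"]
features = [[1.0 if a == s else 0.0 for s in vocab] for a in atoms]

updated = message_pass(features, adjacency)
print(adjacency)
print(updated)
```

After one step, each atom's feature vector counts the elements in its one-hop neighborhood (itself included), which is the information a single GNN layer can see.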

When to Use

Molecular property prediction: Training or fine-tuning GNN models to predict ADMET properties (solubility, toxicity, permeability) or bioactivity (IC50, Ki) from molecular graphs.
Drug-target interaction (DTI) prediction: Building models that predict binding affinity between a compound (SMILES) and a protein (sequence or structure).
Retrosynthesis prediction: Identifying plausible synthetic routes for a target molecule using template-based or template-free models.
Pretraining on large molecular datasets: Leveraging pretrained GNN representations on ChEMBL or ZINC for transfer learning to small datasets.
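
The property-prediction use case can be sketched as follows. This is an illustrative outline under stated assumptions, not a tuned recipe: it assumes `torch` and `torchdrug` are installed, uses the ClinTox benchmark as an example dataset, and picks an 80/10/10 random split. `split_lengths` and `build_solver` are hypothetical helper names. The TorchDrug imports sit inside the function so that the helper module can be imported without the library present.

```python
def split_lengths(n, fractions=(0.8, 0.1, 0.1)):
    """Turn split fractions into integer lengths that sum exactly to n."""
    lengths = [int(f * n) for f in fractions[:-1]]
    lengths.append(n - sum(lengths))  # the last split absorbs rounding error
    return lengths


def build_solver(data_dir="~/molecule-datasets/", gpus=None, batch_size=32):
    """Assemble dataset, GIN encoder, task, and Engine (requires torchdrug)."""
    import torch
    from torchdrug import core, datasets, models, tasks

    dataset = datasets.ClinTox(data_dir)  # downloaded on first use
    train_set, valid_set, test_set = torch.utils.data.random_split(
        dataset, split_lengths(len(dataset))
    )

    model = models.GIN(
        input_dim=dataset.node_feature_dim,
        hidden_dims=[256, 256],
        short_cut=True,
        batch_norm=True,
        concat_hidden=True,
    )
    # ClinTox tasks are binary labels, hence the "bce" criterion.
    task = tasks.PropertyPrediction(
        model, task=dataset.tasks, criterion="bce", metric=("auprc", "auroc")
    )
    optimizer = torch.optim.Adam(task.parameters(), lr=1e-3)
    return core.Engine(
        task, train_set, valid_set, test_set, optimizer,
        gpus=gpus, batch_size=batch_size,
    )


# Example usage (commented out because it downloads data and trains):
# solver = build_solver(gpus=[0])
# solver.train(num_epoch=10)
# solver.evaluate("valid")
```

Passing `gpus=None` keeps everything on the CPU, which is slow but convenient for a first smoke test.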

Key Parameters

| Parameter | Module | Default | Range / Options | Effect |
|---|---|---|---|---|
| `hidden_dims` | GIN/MPNN/GAT | `[256, 256]` | list of int | Width and depth of GNN layers |
| `short_cut` | GIN | `False` | `True`, `False` | Add residual connection between layers |
| `batch_norm` | GIN/MPNN | `False` | `True`, `False` | Apply batch normalization after each layer |
| `concat_hidden` | GIN | `False` | `True`, `False` | Concatenate all layer outputs as final representation |
| `num_mlp_layer` | PropertyPrediction | `1` | `1`–`4` | Depth of MLP prediction head after GNN |
| `criterion` | PropertyPrediction | `"mse"` | `"mse"`, `"bce"`, `"ce"` | Loss function: regression, binary/multi-label classification |
| `batch_size` | Engine | `32` | `8`–`512` | Training batch size |
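
One way to keep these settings in a single place is a small config object that checks the ranges listed above before anything is passed to TorchDrug. The sketch below is a hypothetical helper, not part of the TorchDrug API: `TrainConfig`, `validate`, and `embedding_width` are illustrative names, and the defaults and ranges are copied from the table. `embedding_width` also spells out the effect of `concat_hidden`: concatenating all layer outputs gives an embedding as wide as the sum of the hidden dimensions, otherwise only the last layer's width is used.

```python
from dataclasses import dataclass, field

ALLOWED_CRITERIA = {"mse", "bce", "ce"}


@dataclass
class TrainConfig:
    """Hypothetical container for the parameters in the table above."""
    hidden_dims: list = field(default_factory=lambda: [256, 256])
    short_cut: bool = False
    batch_norm: bool = False
    concat_hidden: bool = False
    num_mlp_layer: int = 1
    criterion: str = "mse"
    batch_size: int = 32

    def validate(self):
        """Raise ValueError if a value falls outside the documented range."""
        if not self.hidden_dims or not all(
            isinstance(d, int) and d > 0 for d in self.hidden_dims
        ):
            raise ValueError("hidden_dims must be a non-empty list of positive ints")
        if not 1 <= self.num_mlp_layer <= 4:
            raise ValueError("num_mlp_layer must be between 1 and 4")
        if self.criterion not in ALLOWED_CRITERIA:
            raise ValueError(f"criterion must be one of {sorted(ALLOWED_CRITERIA)}")
        if not 8 <= self.batch_size <= 512:
            raise ValueError("batch_size must be between 8 and 512")
        return self

    def embedding_width(self):
        """Graph embedding width: all layers concatenated, or only the last."""
        if self.concat_hidden:
            return sum(self.hidden_dims)
        return self.hidden_dims[-1]


config = TrainConfig(concat_hidden=True, criterion="bce").validate()
print(config.embedding_width())
```

The width matters when sizing anything that consumes the embedding, such as the MLP head controlled by `num_mlp_layer`.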

Troubleshooting

| Problem | Cause | Solution |
|---|---|---|
| `ImportError: torchdrug` | Package not installed | `pip install torchdrug` after installing PyTorch |
| `CUDA error: device-side assert` | Label dtype mismatch | Ensure regression labels are `float`, classification labels are `long` |
| Poor test metrics with small dataset | Overfitting | Use pretrained weights, add dropout, or reduce model depth |
| `KeyError: task name` in `dataset.tasks` | Task name mismatch | Print `dataset.tasks` to see exact task names; pass the same list to `PropertyPrediction` |
| `RuntimeError: Expected all tensors on same device` | Mixed CPU/GPU tensors | Use `solver = core.Engine(..., gpus=[0])` to ensure consistent device placement |
| Slow training | CPU-only mode | Install CUDA-compatible PyTorch; set `gpus=[0]` in Engine |
| Missing assay values cause NaN loss | Dataset has missing labels | Set `criterion="bce"`; TorchDrug masks NaN labels during loss computation |
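
The last row relies on label masking. The following pure-Python sketch shows the general technique, a binary cross-entropy averaged only over entries whose label is present. It is an illustration of the idea, not TorchDrug's implementation, and `bce_with_logits` and `masked_bce` are hypothetical names.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for a single logit/label pair."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def masked_bce(logits, targets):
    """Mean BCE over entries whose label is not NaN (missing assay values)."""
    losses = [
        bce_with_logits(x, t)
        for x, t in zip(logits, targets)
        if not math.isnan(t)
    ]
    if not losses:
        return 0.0  # nothing labeled: contribute no loss instead of NaN
    return sum(losses) / len(losses)


nan = float("nan")
loss = masked_bce([0.0, 2.0, -1.0], [1.0, nan, 0.0])
print(loss)
```

Without the mask, a single NaN label would propagate through the mean and turn the whole batch loss into NaN, which is the symptom described in the table.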
