Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or scalable training with Monarch and TorchTitan.
torchforge is Meta's PyTorch-native RL library that separates infrastructure concerns from algorithm concerns. It enables rapid RL research by letting you focus on algorithms while handling distributed training, inference, and weight sync automatically.
Choose torchforge when you need:
Consider alternatives when:
┌─────────────────────────────────────────────────────────┐
│ Application Layer (Your Code) │
│ - Define reward models, loss functions, sampling │
└─────────────────────┬───────────────────────────────────┘
│
┌─────────────────────▼───────────────────────────────────┐
│ Forge API Layer │
│ - Episode, Group dataclasses │
│ - Service interfaces (async/await) │
└─────────────────────┬───────────────────────────────────┘
│
┌─────────────────────▼───────────────────────────────────┐
│ Distributed Services (Monarch) │
│ ├── Trainer (TorchTitan FSDP) │
│ ├── Generator (vLLM inference) │
│ ├── Reference Model (frozen KL baseline) │
│ └── Reward Actors (compute rewards) │
└─────────────────────────────────────────────────────────┘
# Create environment
conda create -n forge python=3.12
conda activate forge
# Install (handles PyTorch nightly + dependencies)
./scripts/install.sh
# Verify
python -c "import torch, forge, vllm; print('OK')"
./scripts/install_rocm.sh
python -m apps.sft.main --config apps/sft/llama3_8b.yaml
python -m apps.grpo.main --config apps/grpo/qwen3_1_7b.yaml
Use this workflow for training reasoning models with group-relative advantages.
# config/grpo_math.yaml