Compile TensorRT-LLM from source on a compute node inside a Docker container. Use this when you are already on a compute node with GPUs visible.
| Scenario | Use This Skill? |
|---|---|
| On a compute node with GPUs visible (nvidia-smi works) | Yes |
| On a SLURM login node (no GPUs) | No — use exec-slurm-compile instead |
- nvidia-smi succeeds (GPUs visible)
- /usr/local/tensorrt exists (TensorRT installation in the container)

Run nvidia-smi to confirm you are on a compute node with GPU access.
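Both prerequisites can be checked in one step before compiling. The following is a minimal sketch, not part of TensorRT-LLM: the function name `check_prereqs` and its optional second argument (a substitute probe command, handy for dry runs on machines without GPUs) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify the prerequisites before compiling.
# Usage: check_prereqs [trt_root] [gpu_probe_cmd]
check_prereqs() {
    trt_root="${1:-/usr/local/tensorrt}"
    gpu_probe="${2:-nvidia-smi}"

    # The GPU probe must be on PATH and exit successfully (GPUs visible).
    if ! command -v "$gpu_probe" >/dev/null 2>&1; then
        echo "error: $gpu_probe not found; probably not a GPU compute node" >&2
        return 1
    fi
    if ! "$gpu_probe" >/dev/null 2>&1; then
        echo "error: $gpu_probe failed; GPUs are not visible" >&2
        return 1
    fi

    # The TensorRT installation directory must exist in the container.
    if [ ! -d "$trt_root" ]; then
        echo "error: $trt_root does not exist" >&2
        return 1
    fi
    echo "ok: GPUs visible and TensorRT found at $trt_root"
}
```

Calling `check_prereqs` with no arguments uses the defaults from this document (nvidia-smi and /usr/local/tensorrt) and returns a non-zero status if either check fails.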
cd to the TensorRT-LLM repository. If the path is not provided by the user, ask for it.
If the user specifies a branch (e.g., "compile ToT", meaning the top of tree on main), check it out and pull:
git checkout main && git pull
Run the build command (incremental by default — omit -c/--clean unless explicitly requested or the incremental build fails):
./scripts/build_wheel.py --trt_root /usr/local/tensorrt --benchmarks -ccache -a "<arch>" -f --nvtx
Replace <arch> with the target GPU architecture (see Architecture Reference below). If not specified by the user, auto-detect from nvidia-smi.
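One way to auto-detect the architecture is to query the compute capability and convert it to the form the -a flag expects. This is a sketch under assumptions: the helper names `cap_to_arch` and `detect_arch` are hypothetical, and it assumes an nvidia-smi recent enough to support the `compute_cap` query field. Only the first GPU is inspected, so a node with mixed GPU types needs a manually chosen value.

```shell
#!/bin/sh
# Hypothetical helper: convert a CUDA compute capability such as "9.0"
# into the "<major><minor>-real" form used by the -a flag.
cap_to_arch() {
    # Drop the dot and any stray spaces, then append the "-real" suffix.
    printf '%s-real\n' "$(printf '%s' "$1" | tr -d '. ')"
}

# Hypothetical helper: read the compute capability of the first GPU.
# Assumes nvidia-smi supports --query-gpu=compute_cap.
detect_arch() {
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
    cap_to_arch "$cap"
}
```

With these helpers defined, the build command could pass `-a "$(detect_arch)"` in place of a hand-written value.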
pip install -e .[devel]
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
| Flag | Description |
|---|---|
| --trt_root /usr/local/tensorrt | TensorRT installation path (standard in NVIDIA containers) |
--benchmarks | Build the C++ benchmarks |
-a "<arch>" | Target GPU architecture(s) |
--nvtx | Enable NVTX markers for profiling |
-ccache | Use ccache for faster recompilation |
-f / --fast_build | Skip some kernels for faster dev compilation. Always use for dev builds. |
-c / --clean | Clean build directory before building. Only when needed (see below). |
--skip_building_wheel | Build in-place without creating a wheel file |
--no-venv | Skip virtual environment creation |
| Value | GPU Family |
|---|---|
| "100-real" | Blackwell (B200, GB200) |
"90-real" | Hopper (H100, H200) |
"89-real" | Ada Lovelace (L40S) |
"80-real" | Ampere (A100) |
"90;100-real" | Multiple architectures |
Default to incremental builds — CMake only recompiles changed files, saving significant time.
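The incremental-first policy can be scripted so that a clean build runs only after an incremental build fails. The sketch below reuses the build command from this document; the function names `build_cmd` and `build_with_fallback` are hypothetical, and it assumes the current directory is the TensorRT-LLM repository.

```shell
#!/bin/sh
# Hypothetical helper: print the build command for a given architecture.
# Pass "clean" as the second argument to append -c.
build_cmd() {
    arch="$1"
    clean_flag=""
    if [ "${2:-}" = "clean" ]; then
        clean_flag=" -c"
    fi
    printf './scripts/build_wheel.py --trt_root /usr/local/tensorrt --benchmarks -ccache -a "%s" -f --nvtx%s\n' \
        "$arch" "$clean_flag"
}

# Hypothetical helper: try the incremental build first and fall back to a
# clean build only if the incremental one exits with an error.
build_with_fallback() {
    eval "$(build_cmd "$1")" || eval "$(build_cmd "$1" clean)"
}
```

For example, `build_with_fallback "90-real"` would run the incremental build for Hopper and retry with -c only on failure. Printing the command with `build_cmd` first is a cheap way to review it before running anything.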
Use a clean build (-c) only when:
CMakeLists.txt, *.cmake)