Deploy and optimize applications on Jetson Orin Nano with TensorRT. Use when setting up Jetson environments, converting models to TensorRT, managing power modes, and containerizing edge AI applications.
"The future of AI is at the edge. Every robot, every camera, every sensor will have AI processing locally." — Dustin Franklin, NVIDIA Jetson AI Developer
This skill orchestrates the full lifecycle of deploying AI applications to NVIDIA Jetson Orin Nano devices. Every decision is constrained by thermal limits, power budgets, and memory ceilings that do not exist in cloud or desktop environments.
Non-Negotiable Constraints:
Use jetson-containers from dusty-nv to build reproducible deployment environments; bare-metal installs create fragile, unreproducible setups.

| Principle | Description | Priority |
|---|---|---|
| Power Mode Awareness | Select and validate power mode before benchmarking or deploying; results are meaningless without a fixed power profile | Critical |
| TensorRT First | Convert all inference models to TensorRT engines before deployment; never ship raw ONNX or PyTorch models to production | Critical |
| JetPack Compatibility | Verify JetPack version, L4T version, and CUDA version before installing any package or building any container | Critical |
| Container Reproducibility | Use jetson-containers for all deployments; pin base images to specific L4T versions; never rely on bare-metal installs | High |
| Thermal Management | Profile thermal behavior under sustained load; set power mode and fan policy before benchmarking; monitor with tegrastats | High |
| Memory Budget Discipline | The Orin Nano has 8GB unified memory shared between CPU and GPU; always account for OS overhead (~1.5GB), display server, and framework footprint | High |
| On-Device Validation | Never trust desktop or cloud benchmarks; always validate latency, throughput, and accuracy on the target Jetson device | High |
| Precision-Accuracy Tradeoff | FP16 is the default for Orin Nano; INT8 requires calibration data and accuracy validation; never assume precision reduction is lossless | Medium |
| Incremental Deployment | Deploy one component at a time; validate each stage before adding the next pipeline element | Medium |
| Telemetry from Day One | Instrument with tegrastats and jtop from the first deployment; do not wait for production to add monitoring | Medium |
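The memory-budget principle above can be turned into a quick pre-deployment gate. A minimal sketch, assuming illustrative overhead figures (OS ~1.5GB, framework ~1GB, per the table) rather than measured values:

```shell
#!/usr/bin/env bash
# Hypothetical memory-budget gate for the 8GB Orin Nano.
# The overhead figures are illustrative assumptions, not measurements.
check_mem_budget() {
  local total_mb=$1 model_mb=$2
  local os_mb=1536 runtime_mb=1024          # assumed OS + framework overhead
  local free_mb=$(( total_mb - os_mb - runtime_mb ))
  if [ "$model_mb" -le "$free_mb" ]; then
    echo "OK: ${model_mb}MiB fits in ${free_mb}MiB headroom"
  else
    echo "FAIL: ${model_mb}MiB exceeds ${free_mb}MiB headroom"
  fi
}

check_mem_budget 8192 2048   # prints: OK: 2048MiB fits in 5632MiB headroom
```

Measure the real footprint of your model and runtime on-device and replace the assumed constants before relying on this check.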
Use search_knowledge (grounded-code-mcp) to ground decisions in authoritative references.
| Query | When to Call |
|---|---|
| `search_knowledge("TensorRT FP16 INT8 quantization Jetson")` | During CONVERT/OPTIMIZE — selecting precision and quantization strategy |
| `search_knowledge("Jetson JetPack CUDA cuDNN compatibility")` | During SETUP — verifying version compatibility before any installation |
| `search_knowledge("Docker container NVIDIA GPU runtime")` | During CONTAINERIZE — configuring the nvidia-docker runtime |
| `search_knowledge("TensorRT ONNX model conversion trtexec")` | During CONVERT — converting ONNX models to TensorRT engines |
| `search_knowledge("Jetson power mode thermal monitoring tegrastats")` | During BENCHMARK — measuring thermal behavior and power draw |
| `search_code_examples("TensorRT Python inference engine")` | Before writing inference code — find TensorRT Python API patterns |
| `search_code_examples("Docker Compose systemd service autostart")` | During DEPLOY — configuring auto-start and restart policies |
Protocol: Search edge_ai and robotics collections for Jetson and TensorRT guidance. Search automation for containerization and fleet deployment context. Always cite the source path from KB results.
┌──────────────────────────────────────────────────────────────────────┐
│                                                                      │
│  ┌───────┐    ┌─────────────┐    ┌─────────┐    ┌──────────┐         │
│  │ SETUP │───>│ CONTAINERIZE│───>│ CONVERT │───>│ OPTIMIZE │<──┐     │
│  └───────┘    └─────────────┘    └─────────┘    └──────────┘   │     │
│                                                       │        │     │
│                                                       v        │     │
│                              ┌────────┐         ┌───────────┐  │     │
│                              │ DEPLOY │<────────│ BENCHMARK │  │     │
│                              └────────┘         └───────────┘  │     │
│                                   │                            │     │
│                                   └─── (iterate if needed) ────┘     │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
Before beginning any deployment workflow, verify:
PRE-FLIGHT VERIFICATION
┌──────────────────────────────────────────────────────────────────┐
│ □ JetPack version confirmed (cat /etc/nv_tegra_release) │
│ □ L4T version matches expected (dpkg -l nvidia-l4t-core) │
│ □ CUDA version confirmed (nvcc --version) │
│ □ TensorRT version confirmed (dpkg -l tensorrt) │
│ □ Available disk space > 10GB (df -h) │
│ □ Docker runtime is nvidia (docker info | grep -i runtime) │
│ □ Power mode is set (sudo nvpmodel -q) │
│ □ Fan mode is set (sudo jetson_clocks --show) │
│ □ Network access for container pulls (if needed) │
│ □ Model files are accessible on device │
└──────────────────────────────────────────────────────────────────┘
If ANY checkbox is unchecked → STOP. Resolve before proceeding.
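The first checkbox can be scripted. The sketch below assumes the usual `/etc/nv_tegra_release` line format (`# R<major> (release), REVISION: <minor>, ...`); verify it against your own device before depending on the parse:

```shell
#!/usr/bin/env bash
# Sketch: extract the L4T version from an /etc/nv_tegra_release line.
# The sample line mirrors the typical file format; treat it as illustrative.
parse_l4t() {
  echo "$1" | sed -n 's/^# R\([0-9]*\) (release), REVISION: \([0-9.]*\),.*/\1.\2/p'
}

sample='# R36 (release), REVISION: 4.0, GCID: 12345678, BOARD: generic'
parse_l4t "$sample"   # prints: 36.4.0
# On device: parse_l4t "$(head -n1 /etc/nv_tegra_release)"
```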
Objective: Confirm the Jetson device is properly configured for deployment.
Actions:
- `cat /etc/nv_tegra_release` to confirm the L4T version
- `sudo nvpmodel -q` to check the current power mode
- `sudo nvpmodel -m <MODE>` to set the target power mode (0=MAXN, 1=15W, 2=7W for Orin Nano)
- `sudo jetson_clocks` to lock clock frequencies for consistent benchmarking
- `sudo pip3 install jetson-stats` to install jtop for monitoring
- `docker run --rm --runtime nvidia --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi` to verify GPU access from inside a container

Exit Criteria: Power mode is set and confirmed, clocks are locked, and the GPU is visible from inside a container.
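The mode IDs above can be wrapped in a tiny helper so deployment scripts log a human-readable power profile. The 0/1/2 mapping follows the Orin Nano list above; other Jetson modules use different tables:

```shell
#!/usr/bin/env bash
# Map Orin Nano nvpmodel mode IDs to names (per the list above).
mode_name() {
  case "$1" in
    0) echo "MAXN" ;;
    1) echo "15W" ;;
    2) echo "7W" ;;
    *) echo "unknown" ;;
  esac
}

mode_name 1   # prints: 15W
# On device: sudo nvpmodel -m 1 && sudo jetson_clocks && echo "mode: $(mode_name 1)"
```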
Objective: Build a reproducible container environment using jetson-containers.
Actions:
- `git clone https://github.com/dusty-nv/jetson-containers`
- Build the image with `jetson-containers build` or plain `docker build`

Exit Criteria:
- Container starts with `--runtime nvidia`
- TensorRT imports inside the container (`python3 -c "import tensorrt; print(tensorrt.__version__)"`)

Objective: Convert the model from its training format to a TensorRT engine.
Actions:
- Validate the ONNX model first: `python3 -c "import onnx; model = onnx.load('model.onnx'); onnx.checker.check_model(model)"`
- Convert with `trtexec` or the TensorRT Python API

Exit Criteria:
- A serialized engine file is produced (`.engine` or `.trt`)

Objective: Tune the TensorRT engine and pipeline for target performance.
Actions:
- `trtexec --loadEngine=model.engine --iterations=100 --avgRuns=50` to measure engine latency
- Tune workspace memory with `--memPoolSize=workspace:1024MiB`

Exit Criteria: Latency and throughput targets are met at the chosen precision, within the memory budget.
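The tuning loop above can be scripted end to end. The trtexec invocations are shown as comments because they need the device, and the `Latency:` summary line is illustrative of trtexec's report format, so confirm it against your TensorRT version's output:

```shell
#!/usr/bin/env bash
# Sketch: extract mean latency from a trtexec benchmark log.
# On device you would first run (assumes model.onnx exists):
#   trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 \
#           --memPoolSize=workspace:1024MiB
#   trtexec --loadEngine=model.engine --iterations=100 --avgRuns=50 > bench.log
mean_latency_ms() {
  sed -n 's/.*mean = \([0-9.]*\) ms.*/\1/p' | head -n1
}

sample='[I] Latency: min = 4.1 ms, max = 6.3 ms, mean = 4.8 ms, median = 4.7 ms'
echo "$sample" | mean_latency_ms   # prints: 4.8
```

Comparing the extracted number across workspace sizes and precisions makes the tuning loop reproducible instead of eyeball-driven.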
Objective: Produce reliable, reproducible performance measurements.
Actions:
- Fix the power profile first: `sudo nvpmodel -m <MODE>` followed by `sudo jetson_clocks`
- Monitor during runs with `tegrastats --interval 1000`

Exit Criteria: Results are reproducible across runs under the fixed power mode, with thermal and memory behavior recorded.
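A small parser over tegrastats output makes benchmark telemetry loggable. The field layout varies across JetPack releases, so the sample line below is illustrative; check it against your device's actual output:

```shell
#!/usr/bin/env bash
# Sketch: pull RAM usage and GPU temperature from a tegrastats line.
parse_tegrastats() {
  local line=$1
  local ram gpu
  ram=$(echo "$line" | sed -n 's/.*RAM \([0-9]*\/[0-9]*\)MB.*/\1/p')
  gpu=$(echo "$line" | sed -n 's/.*GPU@\([0-9.]*\)C.*/\1/p')
  echo "ram=${ram}MB gpu=${gpu}C"
}

sample='RAM 3261/7620MB (lfb 4x4MB) CPU [12%@1510] GR3D_FREQ 0% CPU@46.5C GPU@45C'
parse_tegrastats "$sample"   # prints: ram=3261/7620MB gpu=45C
# On device: tegrastats --interval 1000 | while read -r l; do parse_tegrastats "$l"; done
```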
Objective: Finalize the deployment for production operation.
Actions:

- Configure auto-start and restart policy (e.g., a systemd service or a Docker Compose restart policy)
- Enable telemetry (tegrastats/jtop logging) before going live

Exit Criteria: The application starts on boot, restarts on failure, and monitoring is active.
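One common auto-start pattern is a systemd unit that supervises the container. A minimal sketch: the unit name, container name `jetson-app`, and `ExecStart` command here are hypothetical placeholders, and the right `ExecStart` depends on how the container was created:

```shell
#!/usr/bin/env bash
# Sketch: generate a systemd unit that restarts the app container on failure.
# Unit name, container name, and paths are hypothetical placeholders.
cat > /tmp/jetson-app.service <<'EOF'
[Unit]
Description=Edge AI inference container
After=docker.service
Requires=docker.service

[Service]
Restart=always
RestartSec=5
ExecStart=/usr/bin/docker start -a jetson-app
ExecStop=/usr/bin/docker stop jetson-app

[Install]
WantedBy=multi-user.target
EOF

grep -c '^Restart=always' /tmp/jetson-app.service   # prints: 1
# On device: sudo cp /tmp/jetson-app.service /etc/systemd/system/ &&
#            sudo systemctl daemon-reload && sudo systemctl enable --now jetson-app
```

`Restart=always` with a short `RestartSec` gives crash recovery without waiting for a fleet-management layer.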
Maintain state across conversation turns using this block:
<jetson-deploy-state>