RunPod: Deploy and manage GPU/CPU pods, network volumes, and templates on RunPod. Use when the user asks to launch a GPU server, manage cloud compute, run ML training or inference on remote GPUs, or interact with the RunPod platform in any way. Triggers: runpod, gpu cloud, launch pod, spin up server, deploy gpu, remote training, cloud gpu, inference server.
RunPod — Pod & Infrastructure Management
Updated Mar 18, 2026
Auth
Set the RUNPOD_API_KEY environment variable, or prompt the user for their API key. Every request sends:
Authorization: Bearer <RUNPOD_API_KEY>
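As a minimal sketch, the header above can be assembled in Python (standard library only; `runpod_headers` is a hypothetical helper name, not part of any RunPod SDK):

```python
import os

def runpod_headers(api_key=None):
    """Headers for every RunPod API call (REST and GraphQL alike)."""
    key = api_key or os.environ.get("RUNPOD_API_KEY")
    if not key:
        raise RuntimeError("Set RUNPOD_API_KEY or pass api_key explicitly")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```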
API Surface
Two APIs exist. Use REST for everything unless noted otherwise.
| API | Base URL | Use For |
|---|---|---|
| REST (primary) | https://rest.runpod.io/v1 | All CRUD operations on pods, volumes, templates |
| GraphQL | https://api.runpod.io/graphql | GPU availability queries, runtime metrics, spot instance deployment |
CAUTION: api.runpod.io is GraphQL only. REST calls to api.runpod.io will fail silently. Always use rest.runpod.io for REST.
1. Check GPU Availability
Use GraphQL — REST has no availability filter.
POST https://api.runpod.io/graphql
query {
gpuTypes {
id
displayName
memoryInGb
communityPrice
securePrice
stockStatus # LOW, MEDIUM, HIGH, or null (unavailable)
communityCloud
secureCloud
}
}
Common GPU IDs

| GPU | VRAM | ID | Notes |
|---|---|---|---|
| RTX 4090 | 24 GB | NVIDIA GeForce RTX 4090 | Dev/testing, cheap |
| RTX 4000 Ada | 20 GB | NVIDIA RTX 4000 Ada Generation | Light inference |
| L40S | 48 GB | NVIDIA L40S | Best value for training |
| A40 | 48 GB | NVIDIA A40 | Inference workhorse |
| RTX 6000 Ada | 48 GB | NVIDIA RTX 6000 Ada Generation | Alternative 48 GB |
| A100 80 GB | 80 GB | NVIDIA A100 80GB PCIe | Large models |
| A100 SXM | 80 GB | NVIDIA A100-SXM4-80GB | Higher bandwidth A100 |
| H100 SXM | 80 GB | NVIDIA H100 80GB HBM3 | Fastest training |
| H100 NVL | 94 GB | NVIDIA H100 NVL | Max VRAM H100 |
GPU availability is seconds-level volatile for premium cards (H100, A100, L40S). Always wrap deploys in retry logic with a fallback GPU list.
Suggested Fallback Chains

| Use Case | Try In Order |
|---|---|
| Training (48 GB+) | L40S → A40 → RTX 6000 Ada → A100 80 GB |
| Inference (24 GB+) | RTX 4090 → RTX 4000 Ada → L40S |
| Frontier models | H100 SXM → H100 NVL → A100 SXM → A100 80 GB |
| Budget dev | RTX 4090 → RTX 4000 Ada |
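The retry-with-fallback advice above can be sketched as follows. Assumptions: `deploy_with_fallback` and `FALLBACKS` are hypothetical names, and the deploy callable signals "no instances available" by raising `RuntimeError`; a real implementation would map the API's error response to that signal.

```python
import time

# Fallback chains from the table above (values are RunPod gpuTypeIds).
FALLBACKS = {
    "training-48gb": ["NVIDIA L40S", "NVIDIA A40",
                      "NVIDIA RTX 6000 Ada Generation", "NVIDIA A100 80GB PCIe"],
    "inference-24gb": ["NVIDIA GeForce RTX 4090",
                       "NVIDIA RTX 4000 Ada Generation", "NVIDIA L40S"],
}

def deploy_with_fallback(deploy_fn, gpu_chain, attempts_per_gpu=2, delay_s=0.0):
    """Try each GPU in order, retrying a few times per GPU before
    falling through to the next one in the chain."""
    last_err = None
    for gpu_id in gpu_chain:
        for _ in range(attempts_per_gpu):
            try:
                return deploy_fn(gpu_id)  # should return the created pod object
            except RuntimeError as err:   # assumed "no instances" signal
                last_err = err
                time.sleep(delay_s)
    raise RuntimeError(f"every GPU in the chain was unavailable: {last_err}")
```

Because premium-card stock changes in seconds, keep `attempts_per_gpu` small and the chain long rather than hammering a single GPU type.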
2. Create a Pod
REST (preferred) POST https://rest.runpod.io/v1/pods
Content-Type: application/json
Authorization: Bearer <API_KEY>
{
"name": "my-pod",
"imageName": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
"gpuTypeIds": ["NVIDIA L40S"],
"gpuCount": 1,
"cloudType": "ALL",
"containerDiskInGb": 50,
"volumeInGb": 100,
"volumeMountPath": "/workspace",
"ports": ["8888/http", "22/tcp"],
"env": {
"JUPYTER_PASSWORD": "mypassword",
"HF_TOKEN": "hf_..."
}
}
Returns 201 with full pod object including id.
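A sketch of issuing this call with only the Python standard library. The `build_create_pod_request` helper name is ours; it builds the request object without sending it, so error handling around `urlopen` stays with the caller.

```python
import json
import os
import urllib.request

def build_create_pod_request(payload):
    """Build (but do not send) the POST /v1/pods request; pass the
    result to urllib.request.urlopen() to actually deploy."""
    return urllib.request.Request(
        "https://rest.runpod.io/v1/pods",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```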
Full Field Reference (REST POST /pods)

| Field | Type | Default | Notes |
|---|---|---|---|
| name | string | "my pod" | Max 191 chars |
| imageName | string | required | Container image tag |
| gpuTypeIds | string[] | — | Array. GPU pods only |
| gpuCount | int | 1 | Multi-GPU: 2, 4, 8 |
| gpuTypePriority | string | "availability" | availability or custom (use ordering in array) |
| computeType | string | "GPU" | GPU or CPU |
| cpuFlavorIds | string[] | — | CPU pods only: cpu3c, cpu3g, cpu3m, cpu5c, cpu5g, cpu5m |
| cloudType | string | "SECURE" | SECURE, COMMUNITY, or omit for secure only |
| containerDiskInGb | int | 50 | Ephemeral — wiped on every restart |
| volumeInGb | int | 20 | Persists across restarts, mounted at volumeMountPath |
| volumeMountPath | string | "/workspace" | Where volume mounts |
| networkVolumeId | string | — | Attach a network volume (replaces local volume) |
| ports | string[] | ["8888/http","22/tcp"] | Format: port/protocol |
| env | object | {} | {"KEY": "value"} format |
GraphQL (use for spot instances) Spot instances (podRentInterruptable) are only available via GraphQL.
POST https://api.runpod.io/graphql
mutation {
podRentInterruptable(input: {
name: "spot-training"
imageName: "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04"
gpuTypeId: "NVIDIA L40S"
gpuCount: 1
cloudType: SECURE
bidPerGpu: 0.40
containerDiskInGb: 50
volumeInGb: 100
volumeMountPath: "/workspace"
ports: "8888/http,22/tcp"
env: [{ key: "HF_TOKEN", value: "hf_..." }]
}) {
id
machineId
machine { podHostId }
}
}
CRITICAL DIFFERENCE: GraphQL env is [{key, value}] array. REST env is {"key": "value"} object. GraphQL gpuTypeId is singular string. REST gpuTypeIds is plural array. GraphQL ports is a single comma-separated string. REST ports is an array.
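The shape differences can be translated mechanically; a sketch (`rest_to_graphql_input` is a hypothetical helper covering only the three differences listed above, with every other field passing through unchanged):

```python
def rest_to_graphql_input(rest):
    """Translate a REST create-pod body into the GraphQL
    podRentInterruptable shapes."""
    gql = dict(rest)
    # REST gpuTypeIds is a string array; GraphQL gpuTypeId is one string.
    if "gpuTypeIds" in gql:
        gql["gpuTypeId"] = gql.pop("gpuTypeIds")[0]
    # REST ports is an array; GraphQL ports is one comma-separated string.
    if isinstance(gql.get("ports"), list):
        gql["ports"] = ",".join(gql["ports"])
    # REST env is an object; GraphQL env is a [{key, value}] array.
    if isinstance(gql.get("env"), dict):
        gql["env"] = [{"key": k, "value": v} for k, v in gql["env"].items()]
    return gql
```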
SPOT WARNING: Interruptible pods can be terminated at any time with zero notice. Always checkpoint work to network volumes or external storage.
3. List Pods
REST GET https://rest.runpod.io/v1/pods
Returns array of pod objects with fields: id, name, desiredStatus (RUNNING/EXITED/TERMINATED), costPerHr, gpu, image, ports, publicIp, portMappings, env, volumeInGb, containerDiskInGb, machine, lastStatusChange, interruptible, locked.
Get single pod GET https://rest.runpod.io/v1/pods/{podId}
Runtime metrics (GraphQL only) REST doesn't return live GPU utilization or container metrics. Use GraphQL:
query {
myself {
pods {
id
name
desiredStatus
costPerHr
runtime {
uptimeInSeconds
ports { ip isIpPublic privatePort publicPort type }
gpus { id gpuUtilPercent memoryUtilPercent }
container { cpuPercent memoryPercent }
}
}
}
}
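One practical use of these metrics is flagging pods that are running (and billing) but idle. A sketch operating on the `myself` object returned by the query above; the helper name and the 5% threshold are our assumptions:

```python
def idle_pods(myself, gpu_util_threshold=5.0):
    """Return names of RUNNING pods whose every GPU is below the
    utilization threshold: candidates for stopping to save cost."""
    idle = []
    for pod in myself.get("pods", []):
        if pod.get("desiredStatus") != "RUNNING":
            continue
        runtime = pod.get("runtime") or {}  # runtime is null for stopped pods
        gpus = runtime.get("gpus") or []
        if gpus and all(g["gpuUtilPercent"] < gpu_util_threshold for g in gpus):
            idle.append(pod["name"])
    return idle
```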
4. Pod Lifecycle
Stop (releases GPU, preserves volume) POST https://rest.runpod.io/v1/pods/{podId}/stop
Start / Resume POST https://rest.runpod.io/v1/pods/{podId}/start
Restart (soft restart, keeps GPU) POST https://rest.runpod.io/v1/pods/{podId}/restart
Reset (wipes container disk, keeps volume) POST https://rest.runpod.io/v1/pods/{podId}/reset
Terminate (permanent deletion) DELETE https://rest.runpod.io/v1/pods/{podId}
TERMINATE IS IRREVERSIBLE. All data not on a network volume is permanently destroyed.
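A small dispatch helper makes the stop/terminate distinction harder to get wrong. A sketch (`lifecycle_request` is a hypothetical name; it only builds the HTTP method and URL from the endpoints listed above and sends nothing):

```python
def lifecycle_request(pod_id, action):
    """Map a lifecycle action to an (HTTP method, URL) pair.
    'terminate' is the only DELETE; the rest are POST sub-resources."""
    base = f"https://rest.runpod.io/v1/pods/{pod_id}"
    if action == "terminate":
        return ("DELETE", base)
    if action in ("stop", "start", "restart", "reset"):
        return ("POST", f"{base}/{action}")
    raise ValueError(f"unknown lifecycle action: {action}")
```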
Resume spot instance (GraphQL only) mutation {
podBidResume(input: {
podId: "abc123"
bidPerGpu: 0.40
gpuCount: 1
}) {
id
desiredStatus
}
}
5. Update a Pod PATCH https://rest.runpod.io/v1/pods/{podId}
Content-Type: application/json
{
"env": {"NEW_VAR": "value"},
"ports": ["8888/http", "22/tcp", "5000/http"],
"volumeInGb": 200,
"imageName": "runpod/pytorch:2.8.0-py3.11-cuda12.6-devel-ubuntu22.04"
}
Updatable fields: containerDiskInGb, containerRegistryAuthId, dockerEntrypoint, dockerStartCmd, env, globalNetworking, imageName, locked, name, ports, volumeInGb, volumeMountPath.
Updates may trigger a pod reset. Warn user that running processes will be interrupted.
6. Network Volumes
Network volumes are persistent storage independent of any pod. They survive pod termination and can be attached to any pod in the same datacenter. Critical for ML workflows.
Create POST https://rest.runpod.io/v1/networkvolumes
{
"name": "training-data",
"size": 100,
"dataCenterId": "US-TX-3"
}
List GET https://rest.runpod.io/v1/networkvolumes
Get GET https://rest.runpod.io/v1/networkvolumes/{networkVolumeId}
Update PATCH https://rest.runpod.io/v1/networkvolumes/{networkVolumeId}
Delete DELETE https://rest.runpod.io/v1/networkvolumes/{networkVolumeId}
Attach to pod at creation Set networkVolumeId in the pod create call. This replaces the local volume — the network volume mounts at volumeMountPath (default /workspace).
{
"name": "my-pod",
"imageName": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
"gpuTypeIds": ["NVIDIA L40S"],
"networkVolumeId": "vol_abc123",
"volumeMountPath": "/workspace"
}
Gotchas
Cannot attach/detach after pod creation. Must terminate and recreate.
Size cannot be decreased, only increased. Over 4 TB requires support contact.
Concurrent writes from multiple pods corrupt data. One writer at a time.
Pod and volume must be in the same datacenter. Specify dataCenterIds on pod create to match.
$0.07/GB/month for the first TB, $0.05/GB/month after. Volumes with no payment method risk deletion.
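The tiered pricing above works out as follows; a sketch (whether RunPod counts a TB as 1000 or 1024 GB is our assumption, defaulting to 1000):

```python
def monthly_volume_cost(size_gb, tb_gb=1000):
    """Monthly network-volume cost: $0.07/GB for the first TB,
    $0.05/GB beyond (tb_gb=1000 is an assumed TB definition)."""
    first = min(size_gb, tb_gb)
    rest = max(size_gb - tb_gb, 0)
    return round(first * 0.07 + rest * 0.05, 2)
```

For example, a 100 GB volume comes to $7.00/month under this pricing.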
7. Templates
List GET https://rest.runpod.io/v1/templates
Create POST https://rest.runpod.io/v1/templates
{
"name": "my-training-template",
"imageName": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
"containerDiskInGb": 50,
"volumeInGb": 100,
"volumeMountPath": "/workspace",
"ports": ["8888/http", "22/tcp"],
"env": {"JUPYTER_PASSWORD": "default"},
"dockerStartCmd": []
}
Update PATCH https://rest.runpod.io/v1/templates/{templateId}
Delete DELETE https://rest.runpod.io/v1/templates/{templateId}
Template must not be in use by any pod or serverless endpoint.
8. Connecting to a Pod
Additional fields for POST /pods (continuation of the Full Field Reference in section 2):

| Field | Type | Default | Notes |
|---|---|---|---|
| dockerEntrypoint | string[] | — | Override image ENTRYPOINT |
| dockerStartCmd | string[] | [] | Override image CMD |
| templateId | string | — | UUID from template list |
| interruptible | bool | false | Spot pricing (can be preempted at any time) |
| locked | bool | false | Prevents accidental stop/reset |
| dataCenterIds | string[] | all | e.g. ["US-TX-3","US-KS-2","EU-RO-1"] |
| dataCenterPriority | string | "availability" | availability or custom |
| countryCodes | string[] | — | Filter by country |
| allowedCudaVersions | string[] | — | e.g. ["12.4","12.3"] |
| minRAMPerGPU | int | 8 | GPU pods only |
| minVCPUPerGPU | int | 2 | GPU pods only |
| vcpuCount | int | 2 | CPU pods only |
| supportPublicIp | bool | — | Community Cloud only |
| globalNetworking | bool | false | Low-latency private networking (limited availability) |

8. Connecting to a Pod
HTTP Proxy (easiest, has timeout)