Use when working with Nomad — jobs, allocations, deployments, nomad CLI, or any Nomad-related task
Read state.json for active_company, then load config.yaml for that company. Nomad configuration lives under cloud.nomad and provides:
| Field | Purpose |
|---|---|
| addr | Nomad API address (e.g. https://nomad:4646) |
| token_ref | Reference to the ACL token in secrets store |
| cacert | Path to CA certificate for TLS verification |
NOMAD_ADDR and NOMAD_TOKEN are set automatically when you run hat on <company>. If TLS is required, NOMAD_CACERT is also exported. You do not need to set these manually.
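Before running any Nomad command, it can help to confirm that the environment really was populated. The following is a minimal sketch, not part of the hat tooling: the function name check_nomad_env is hypothetical, and it only checks that the two variables named above are non-empty.

```shell
# Hypothetical helper: report which required Nomad variables are unset.
# Returns 0 when both NOMAD_ADDR and NOMAD_TOKEN are non-empty, 1 otherwise.
check_nomad_env() {
  _missing=0
  for _name in NOMAD_ADDR NOMAD_TOKEN; do
    # Indirect lookup that works in POSIX sh as well as bash.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}
```

If the check fails, re-running hat on <company> is probably the right fix rather than exporting the variables by hand.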
# List all jobs
nomad status
# Inspect a specific job
nomad status <job>
# Show job version history
nomad job history <job>
# Dry-run: plan a job file to see the diff (safe — makes no changes)
nomad job plan <job.nomad.hcl>
# Run a job (ONLY when explicitly instructed)
nomad job run <job.nomad.hcl>
# Stop a job (ONLY when explicitly instructed)
nomad job stop <job>
# Force an immediate launch of a periodic job
nomad job periodic force <job>
# List all allocations for a job
nomad job allocs <job>
# Inspect a specific allocation
nomad alloc status <alloc-id>
# Print stdout logs for an allocation
nomad alloc logs <alloc-id>
# Print stderr logs for an allocation
nomad alloc logs -stderr <alloc-id>
# Follow (tail) stdout logs
nomad alloc logs -f <alloc-id>
# Follow stderr
nomad alloc logs -f -stderr <alloc-id>
# Follow logs for a specific task within an allocation
nomad alloc logs -f <alloc-id> <task-name>
# Execute a command inside a running allocation
nomad alloc exec -task <task-name> <alloc-id> /bin/sh
# Execute a one-off command
nomad alloc exec -task <task-name> <alloc-id> env
# List all client nodes
nomad node status
# Inspect a specific node
nomad node status <node-id>
# Check cluster health (Raft peers)
nomad operator raft list-peers
# Drain a node before maintenance (ONLY when explicitly instructed)
nomad node drain -enable -deadline 10m <node-id>
# Disable drain after maintenance (ONLY when explicitly instructed)
nomad node drain -disable <node-id>
# Check ACL token (verify auth is working)
nomad acl token self
When a job allocation is in the failed or lost state:
List jobs to find the affected job:
nomad status
Get job detail and look for failed allocations:
nomad status <job>
Inspect the failing allocation (use the alloc ID shown in the job status output):
nomad alloc status <alloc-id>
Check the Recent Events section in alloc status output for error messages (OOM kill, port conflict, image pull failure, etc.).
Read stderr logs for the task:
nomad alloc logs -stderr <alloc-id>
If the container exited immediately, also read stdout:
nomad alloc logs <alloc-id>
Cross-reference with the node the alloc ran on:
nomad node status <node-id>
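The first steps of this workflow (finding the failed or lost allocations for a job) can be sketched as a small helper. This is an illustration under stated assumptions: the function name failed_allocs is hypothetical, and it assumes the text output of nomad job allocs has a header row with ID and Status columns separated by two or more spaces. If the layout differs in your Nomad version, adjust the column names.

```shell
# Hypothetical helper: print the IDs of allocations that are failed or lost.
# Assumes `nomad job allocs <job>` prints a header row containing "ID" and
# "Status", with columns separated by at least two spaces.
failed_allocs() {
  nomad job allocs "$1" | awk -F'  +' '
    NR == 1 {
      # Locate the columns by header name instead of hard-coding positions.
      for (i = 1; i <= NF; i++) {
        if ($i == "ID") id = i
        if ($i == "Status") st = i
      }
      next
    }
    id && st && ($st == "failed" || $st == "lost") { print $id }
  '
}
```

Each ID printed can then be passed to nomad alloc status and nomad alloc logs -stderr as described in the steps above.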
When deploying a new version of a job:
Edit the job spec with the new version or image tag.
Plan the job to see what will change (this is always safe):
nomad job plan <job.nomad.hcl>
Review the diff output — confirm the only changes are the expected ones.
When explicitly instructed to apply, run the job:
nomad job run <job.nomad.hcl>
Watch the deployment status:
nomad status <job>
Follow logs during rollout:
nomad alloc logs -f <new-alloc-id>
If a deployment gets stuck, check the deployment detail:
nomad job deployments <job>
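The plan step above can also be checked mechanically. nomad job plan exits with 0 when no allocations would be created or destroyed, 1 when allocations would be created or destroyed, and 255 on error. The sketch below turns that exit code into a one-word verdict; the function name plan_verdict is hypothetical, and the diff itself is still sent to stderr so it can be reviewed by a human.

```shell
# Hypothetical helper: summarize the outcome of `nomad job plan`.
# The plan diff goes to stderr; only the verdict is written to stdout.
plan_verdict() {
  _rc=0
  nomad job plan "$1" >&2 || _rc=$?
  case "$_rc" in
    0) echo "no-changes" ;;           # nothing would be created or destroyed
    1) echo "changes" ;;              # allocations would be created or destroyed
    *) echo "error"; return 1 ;;      # 255 or any other unexpected code
  esac
}
```

A verdict of changes only means the plan differs from the running job. It is not permission to run nomad job run, which still requires an explicit instruction.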
Before taking a node offline for maintenance (only when explicitly instructed):
Identify the node ID:
nomad node status
Inspect the node to see currently running allocations:
nomad node status <node-id>
Enable drain with a deadline (allocations migrate to other nodes):
nomad node drain -enable -deadline 10m <node-id>
Monitor until the node shows ineligible and all allocs have migrated:
nomad node status <node-id>
Perform maintenance on the node.
Re-enable the node after maintenance:
nomad node drain -disable <node-id>
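The monitoring step in this workflow can be sketched as a helper that reads one field from the node status output. This is only a sketch: the function name node_field is hypothetical, and it assumes nomad node status <node-id> prints key = value rows such as Eligibility and Drain near the top of its output.

```shell
# Hypothetical helper: print the value of one "Key = value" row from
# `nomad node status <node-id>`, e.g. Eligibility or Drain.
node_field() {
  nomad node status "$1" | awk -F' *= *' -v key="$2" '
    # Print only the first matching row; later sections may reuse key names.
    !found && $1 == key { print $2; found = 1 }
  '
}

# Example polling loop (read-only), assuming the layout described above:
#   until [ "$(node_field "$node_id" Eligibility)" = "ineligible" ]; do sleep 5; done
```

The helper only reads state. Enabling or disabling the drain itself remains a step that needs an explicit instruction.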
When a developer asks for logs from a running or recently-stopped service:
Find the job:
nomad status <job>
List allocations and find the relevant one (most recent, running status):
nomad job allocs <job>
For real-time log tailing:
nomad alloc logs -f <alloc-id>
For stderr (crash logs, panic output):
nomad alloc logs -f -stderr <alloc-id>
If the job has multiple tasks (e.g. sidecar), specify the task:
nomad alloc logs -f <alloc-id> <task-name>
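When the developer wants a copy of the logs rather than a live tail, both streams can be captured to files. The following is a minimal sketch: the function name save_alloc_logs and the file naming scheme are hypothetical, and it uses only the nomad alloc logs forms shown above (without -f, so each command returns once the current logs are printed).

```shell
# Hypothetical helper: save stdout and stderr logs of an allocation to files.
# Usage: save_alloc_logs <alloc-id> [output-dir] [task-name]
save_alloc_logs() {
  _alloc_id=$1
  _out_dir=${2:-.}
  _task=${3:-}
  # ${_task:+"$_task"} adds the task argument only when one was given.
  nomad alloc logs "$_alloc_id" ${_task:+"$_task"} \
    > "$_out_dir/$_alloc_id.stdout.log" &&
  nomad alloc logs -stderr "$_alloc_id" ${_task:+"$_task"} \
    > "$_out_dir/$_alloc_id.stderr.log"
}
```

For example, save_alloc_logs <alloc-id> /tmp <task-name> would write two files under /tmp for the named task.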
Never run the following without an explicit instruction from the user:
- nomad job run: deploys or updates a running job
- nomad job stop: stops and removes a job
- nomad system gc: forces garbage collection of stopped allocations and jobs
- nomad node drain -enable: drains a client node

Read-only commands (nomad status, nomad alloc status, nomad alloc logs, nomad job plan, nomad operator raft list-peers) are always safe to run.
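These rules can be expressed as a simple classifier that a wrapper script might consult before invoking nomad. This is a sketch only: the function name nomad_command_class and its three labels are hypothetical, and the lists mirror the commands named in this document rather than the full Nomad CLI.

```shell
# Hypothetical helper: classify a nomad subcommand (arguments given without
# the leading "nomad") according to the safety rules in this document.
nomad_command_class() {
  case "$*" in
    "job run"*|"job stop"*|"system gc"*|"node drain"*)
      echo "needs-approval" ;;
    status*|"alloc status"*|"alloc logs"*|"job plan"*|"operator raft list-peers"*)
      echo "read-only" ;;
    *)
      # Anything not listed here is left undecided rather than assumed safe.
      echo "unknown" ;;
  esac
}
```

For instance, nomad_command_class job run web.nomad.hcl prints needs-approval, while nomad_command_class alloc logs -f <alloc-id> prints read-only. Treating unknown the same as needs-approval is the cautious default.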