Capacity planning for backend services and infrastructure — calculate limits, predict saturation, plan scaling, and set SLOs. Covers Little's Law, USE method, goroutine/thread pool sizing, connection pool sizing, Kubernetes resource requests, cgroup CPU quota, and cost-aware scaling decisions. Apply when planning capacity for a service, sizing a deployment, setting resource limits, or deciding when to scale horizontally vs vertically.
For every resource, check three things:
| Resource | Utilisation | Saturation | Errors |
|---|---|---|---|
| CPU | sar -u → %idle | vmstat → r column | perf stat → exceptions |
| Memory | free → used/total | vmstat → si/so (swap) | dmesg → OOM |
| Disk | iostat → %util | iostat → avgqu-sz | dmesg → I/O errors |
| Network | sar -n DEV → bandwidth | ss → retransmits | ip -s link → errors |
Rule: Fix saturated resources first. High utilisation without saturation is fine.
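The saturation check from the table can be sketched in Go: a minimal parser (hypothetical, not a standard tool) that reads one `vmstat` data line and flags CPU saturation when the run-queue (`r`) column exceeds the core count.

```go
// Sketch: flag CPU saturation from a vmstat sample. Per the USE method,
// a run queue persistently above the core count means the CPU is saturated.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuSaturated reports whether the "r" column (first field) of a vmstat
// data line exceeds the number of cores available to this workload.
func cpuSaturated(vmstatLine string, cores int) (bool, error) {
	fields := strings.Fields(vmstatLine)
	if len(fields) == 0 {
		return false, fmt.Errorf("empty vmstat line")
	}
	r, err := strconv.Atoi(fields[0]) // "r": runnable tasks
	if err != nil {
		return false, err
	}
	return r > cores, nil
}

func main() {
	// Sample data line from `vmstat 1`: r=12 runnable tasks on a 4-core box.
	line := "12  0      0 811024  90248 331284    0    0     1     3   29   35  5  2 93  0  0"
	sat, _ := cpuSaturated(line, 4)
	fmt.Println(sat) // true: run queue (12) exceeds core count (4)
}
```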
```go
// Default: GOMAXPROCS = runtime.NumCPU(), which is correct on bare metal.
// In containers, NumCPU() returns the HOST's CPU count, not the cgroup limit: wrong!
import _ "go.uber.org/automaxprocs" // auto-detects the cgroup CPU quota; add to main.go
```
| Workload type | GOMAXPROCS guidance |
|---|---|
| CPU-bound (compression, encoding) | = number of allocated cores |
| I/O-bound (HTTP handlers, gRPC) | Default is fine; goroutines blocked on the netpoller don't occupy a P |
| Mixed | Default (automaxprocs handles containers correctly) |
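What `automaxprocs` does can be approximated by hand. A sketch, assuming cgroup v2, where `/sys/fs/cgroup/cpu.max` holds `<quota> <period>` in microseconds (or `max` for no limit); the `effectiveCPUs` helper is illustrative, not the library's actual code.

```go
// Sketch: derive the effective CPU limit from a cgroup v2 cpu.max line.
// Effective CPUs = quota / period; GOMAXPROCS should match this for
// CPU-bound work instead of runtime.NumCPU().
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// effectiveCPUs returns the CPU limit implied by a cpu.max line,
// or 0 when there is no quota ("max"), meaning: fall back to NumCPU().
func effectiveCPUs(cpuMax string) (float64, error) {
	parts := strings.Fields(cpuMax)
	if len(parts) != 2 || parts[0] == "max" {
		return 0, nil // unlimited
	}
	quota, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return 0, err
	}
	period, err := strconv.ParseFloat(parts[1], 64)
	if err != nil {
		return 0, err
	}
	return quota / period, nil
}

func main() {
	// A pod with limits.cpu = 2 typically gets "200000 100000".
	cpus, _ := effectiveCPUs("200000 100000")
	fmt.Println(cpus) // 2
}
```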
Using Little's Law (L = λW, rearranged for throughput):

```
Max RPS = Concurrency / Avg_Latency_seconds
```

Example:

```
GOMAXPROCS=4, avg handler latency = 20ms (0.02s)
Max RPS = 4 / 0.02 = 200 RPS per pod
With 5 pods: 200 × 5 = 1000 RPS theoretical max
Apply 70% safety margin: safe operating point = 700 RPS
```

When to add more replicas:

```
current_rps > 0.70 × max_rps_per_pod × pod_count
```
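The arithmetic above as a runnable sketch (the `maxRPS` and `replicasNeeded` helper names are illustrative, not from any library):

```go
// Little's Law sizing: L = λW, so λ_max = Concurrency / Avg_Latency.
package main

import (
	"fmt"
	"math"
)

// maxRPS: concurrency divided by average latency in seconds.
func maxRPS(concurrency, avgLatencySeconds float64) float64 {
	return concurrency / avgLatencySeconds
}

// replicasNeeded: pods required to serve targetRPS while keeping each pod
// at or below the given safety margin; partial pods round up.
func replicasNeeded(targetRPS, rpsPerPod, safetyMargin float64) int {
	return int(math.Ceil(targetRPS / (rpsPerPod * safetyMargin)))
}

func main() {
	perPod := maxRPS(4, 0.020) // GOMAXPROCS=4, 20ms handler latency
	fmt.Println(perPod)        // 200

	// Serving 1000 RPS at a 70% operating point needs more than 5 pods.
	fmt.Println(replicasNeeded(1000, perPod, 0.70)) // 8
}
```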
Pool size = (target_rps × avg_db_latency_seconds) × headroom_factor

Example:

```
1000 RPS → DB, avg DB query = 5ms (0.005s)
Active connections = 1000 × 0.005 = 5 concurrent connections
Pool size = 5 × 2 (headroom) = 10 connections
```
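The same calculation as a helper (a sketch; the `poolSize` name and the 2× headroom factor are taken from the worked example above, not a fixed rule):

```go
// Pool sizing via Little's Law applied to the DB hop: connections in use
// = arrival rate × time each query holds a connection, then headroom.
package main

import (
	"fmt"
	"math"
)

// poolSize returns a connection-pool size for the given request rate and
// average query latency, with a 2x headroom factor for bursts.
func poolSize(targetRPS, avgQuerySeconds float64) int {
	active := targetRPS * avgQuerySeconds // concurrent connections in use
	return int(math.Ceil(active)) * 2     // 2x headroom
}

func main() {
	fmt.Println(poolSize(1000, 0.005)) // 10: matches the worked example
}
```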
For Go's database/sql:

```go
db.SetMaxOpenConns(10)                 // the computed pool size
db.SetMaxIdleConns(5)
db.SetConnMaxLifetime(5 * time.Minute) // recycle connections periodically
```
Signs of pool exhaustion:
- `driver: bad connection` or `context deadline exceeded` in DB logs
- `db.Stats().WaitCount` increasing in metrics