Capacity planning for backend services and infrastructure — calculate limits, predict saturation, plan scaling, and set SLOs. Covers Little's Law, USE method, goroutine/thread pool sizing, connection pool sizing, Kubernetes resource requests, cgroup CPU quota, and cost-aware scaling decisions. Apply when planning capacity for a service, sizing a deployment, setting resource limits, or deciding when to scale horizontally vs vertically.
For every resource, check three things:
| Resource | Utilisation | Saturation | Errors |
|---|---|---|---|
| CPU | sar -u → %idle | vmstat → r column | perf stat → exceptions |
| Memory | free → used/total | vmstat → si/so (swap) | dmesg → OOM |
| Disk | iostat → %util | iostat → avgqu-sz | dmesg → I/O errors |
| Network | sar -n DEV → bandwidth | ss → retransmits | ip -s link → errors |
Rule: Fix saturated resources first. High utilisation without saturation is fine.
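The saturation check from the table can be sketched in Go: a minimal parser (hypothetical, not a standard tool) that reads one `vmstat` data line and flags CPU saturation when the run-queue (`r`) column exceeds the core count.

```go
// Sketch: flag CPU saturation from a vmstat sample. Per the USE method,
// a run queue persistently above the core count means the CPU is saturated.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuSaturated reports whether the "r" column (first field) of a vmstat
// data line exceeds the number of cores available to this workload.
func cpuSaturated(vmstatLine string, cores int) (bool, error) {
	fields := strings.Fields(vmstatLine)
	if len(fields) == 0 {
		return false, fmt.Errorf("empty vmstat line")
	}
	r, err := strconv.Atoi(fields[0]) // "r": runnable tasks
	if err != nil {
		return false, err
	}
	return r > cores, nil
}

func main() {
	// Sample data line from `vmstat 1`: r=12 runnable tasks on a 4-core box.
	line := "12  0      0 811024  90248 331284    0    0     1     3   29   35  5  2 93  0  0"
	sat, _ := cpuSaturated(line, 4)
	fmt.Println(sat) // true: run queue (12) exceeds core count (4)
}
```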
```go
// Default: GOMAXPROCS = runtime.NumCPU(), which is correct on bare metal.
// In containers, NumCPU() returns the HOST's CPU count, not the cgroup limit: wrong!
import _ "go.uber.org/automaxprocs" // auto-detects the cgroup CPU quota; add to main.go
```
| Workload type | GOMAXPROCS guidance |
|---|---|
| CPU-bound (compression, encoding) | = number of allocated cores |
| I/O-bound (HTTP handlers, gRPC) | Default is fine; goroutines blocked on the netpoller don't occupy a P |
| Mixed | Default (automaxprocs handles containers correctly) |
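What `automaxprocs` does can be approximated by hand. A sketch, assuming cgroup v2, where `/sys/fs/cgroup/cpu.max` holds `<quota> <period>` in microseconds (or `max` for no limit); the `effectiveCPUs` helper is illustrative, not the library's actual code.

```go
// Sketch: derive the effective CPU limit from a cgroup v2 cpu.max line.
// Effective CPUs = quota / period; GOMAXPROCS should match this for
// CPU-bound work instead of runtime.NumCPU().
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// effectiveCPUs returns the CPU limit implied by a cpu.max line,
// or 0 when there is no quota ("max"), meaning: fall back to NumCPU().
func effectiveCPUs(cpuMax string) (float64, error) {
	parts := strings.Fields(cpuMax)
	if len(parts) != 2 || parts[0] == "max" {
		return 0, nil // unlimited
	}
	quota, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return 0, err
	}
	period, err := strconv.ParseFloat(parts[1], 64)
	if err != nil {
		return 0, err
	}
	return quota / period, nil
}

func main() {
	// A pod with limits.cpu = 2 typically gets "200000 100000".
	cpus, _ := effectiveCPUs("200000 100000")
	fmt.Println(cpus) // 2
}
```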
Using Little's Law (L = λW, rearranged for throughput):

```
Max RPS = Concurrency / Avg_Latency_seconds
```

Example:

```
GOMAXPROCS=4, avg handler latency = 20ms (0.02s)
Max RPS = 4 / 0.02 = 200 RPS per pod
With 5 pods: 200 × 5 = 1000 RPS theoretical max
Apply 70% safety margin: safe operating point = 700 RPS
```

When to add more replicas:

```
current_rps > 0.70 × max_rps_per_pod × pod_count
```
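The arithmetic above as a runnable sketch (the `maxRPS` and `replicasNeeded` helper names are illustrative, not from any library):

```go
// Little's Law sizing: L = λW, so λ_max = Concurrency / Avg_Latency.
package main

import (
	"fmt"
	"math"
)

// maxRPS: concurrency divided by average latency in seconds.
func maxRPS(concurrency, avgLatencySeconds float64) float64 {
	return concurrency / avgLatencySeconds
}

// replicasNeeded: pods required to serve targetRPS while keeping each pod
// at or below the given safety margin; partial pods round up.
func replicasNeeded(targetRPS, rpsPerPod, safetyMargin float64) int {
	return int(math.Ceil(targetRPS / (rpsPerPod * safetyMargin)))
}

func main() {
	perPod := maxRPS(4, 0.020) // GOMAXPROCS=4, 20ms handler latency
	fmt.Println(perPod)        // 200

	// Serving 1000 RPS at a 70% operating point needs more than 5 pods.
	fmt.Println(replicasNeeded(1000, perPod, 0.70)) // 8
}
```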
Pool size = (target_rps × avg_db_latency_seconds) × headroom_factor

Example:

```
1000 RPS → DB, avg DB query = 5ms (0.005s)
Active connections = 1000 × 0.005 = 5 concurrent connections
Pool size = 5 × 2 (headroom) = 10 connections
```
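The same calculation as a helper (a sketch; the `poolSize` name and the 2× headroom factor are taken from the worked example above, not a fixed rule):

```go
// Pool sizing via Little's Law applied to the DB hop: connections in use
// = arrival rate × time each query holds a connection, then headroom.
package main

import (
	"fmt"
	"math"
)

// poolSize returns a connection-pool size for the given request rate and
// average query latency, with a 2x headroom factor for bursts.
func poolSize(targetRPS, avgQuerySeconds float64) int {
	active := targetRPS * avgQuerySeconds // concurrent connections in use
	return int(math.Ceil(active)) * 2     // 2x headroom
}

func main() {
	fmt.Println(poolSize(1000, 0.005)) // 10: matches the worked example
}
```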
For Go's database/sql:

```go
db.SetMaxOpenConns(10)                 // the computed pool size
db.SetMaxIdleConns(5)
db.SetConnMaxLifetime(5 * time.Minute) // recycle connections periodically
```
Signs of pool exhaustion:
- `driver: bad connection` or `context deadline exceeded` in DB logs
- `db.Stats().WaitCount` increasing in metrics