Operate, debug, harden, and scale Kubernetes-based systems with production discipline. Use when the task involves Kubernetes manifests, Helm values, deployments, rollout or rollback planning, probes, requests and limits, autoscaling, scheduling, ingress or service networking, config and secret handling, cluster policy, workload debugging, or reliability improvement for workloads already running on Kubernetes. Do not use for generic Linux administration, broad cloud architecture without cluster-level decisions, or application-code implementation as the primary task.
Keep Kubernetes systems boring in production: predictable rollouts, clear failure signals, sane resource behavior, and secure defaults.
This skill is for cluster and workload operations: manifests, workload runtime behavior, rollout safety, traffic routing, scheduling, scaling, policy, and observability.
Use adjacent skills instead when the main need is:
Provide some combination of:
Check:
Check:
Check:
Check:
A strong deliverable makes it obvious:
Before finishing, read guides/qa-checklist.md, align the response structure with prompt.md, and sanity-check the deliverable against examples/README.md.