Use when working with Kubernetes Metrics Server: Metrics Server management and resource metrics analysis. Covers Metrics Server deployment health, node and pod resource utilization, top consumers, API availability, and metrics accuracy. Use when debugging HPA scaling issues, reviewing resource utilization, troubleshooting metrics unavailability, or analyzing cluster capacity.
Manage and analyze Kubernetes Metrics Server for resource utilization data.
Always check Metrics Server health before querying metrics.
#!/bin/bash
echo "=== Metrics Server Deployment ==="
kubectl get deployment metrics-server -n kube-system -o custom-columns='NAME:.metadata.name,READY:.status.readyReplicas,AVAILABLE:.status.availableReplicas,IMAGE:.spec.template.spec.containers[0].image' 2>/dev/null
echo ""
echo "=== Metrics Server Pods ==="
kubectl get pods -n kube-system -l k8s-app=metrics-server -o custom-columns='NAME:.metadata.name,STATUS:.status.phase,RESTARTS:.status.containerStatuses[0].restartCount,NODE:.spec.nodeName' 2>/dev/null
echo ""
echo "=== Metrics API Availability ==="
kubectl get apiservices v1beta1.metrics.k8s.io -o custom-columns='NAME:.metadata.name,SERVICE:.spec.service.name,AVAILABLE:.status.conditions[?(@.type=="Available")].status' 2>/dev/null
echo ""
echo "=== Metrics Server Args ==="
kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].args}' 2>/dev/null
echo ""
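The health-first rule can be enforced in a script by gating metric queries on the APIService condition checked above. This is a minimal sketch, assuming the `v1beta1.metrics.k8s.io` APIService name used in the health check; `is_available` and `require_metrics_api` are hypothetical helper names, not part of kubectl.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the Available condition status is "True".
is_available() {
  [ "$1" = "True" ]
}

# Hypothetical helper: reads the Available condition of the Metrics API and
# returns non-zero (with a message on stderr) when it cannot serve requests.
require_metrics_api() {
  local status
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)
  if ! is_available "$status"; then
    echo "Metrics API not available (status: ${status:-unknown})" >&2
    return 1
  fi
}
```

A typical call would be `require_metrics_api && kubectl top nodes`, so that an unavailable Metrics API is reported explicitly instead of producing empty output.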
#!/bin/bash
echo "=== Node Resource Usage ==="
kubectl top nodes 2>/dev/null | head -20
echo ""
echo "=== Top CPU Pods (all namespaces) ==="
kubectl top pods --all-namespaces --sort-by=cpu 2>/dev/null | head -15
echo ""
echo "=== Top Memory Pods (all namespaces) ==="
kubectl top pods --all-namespaces --sort-by=memory 2>/dev/null | head -15
echo ""
echo "=== Pods Without Resource Requests ==="
kubectl get pods --all-namespaces -o json 2>/dev/null | jq -r '
.items[] |
. as $pod |
.spec.containers[] |
select((.resources.requests // {}) == {}) |
"\($pod.metadata.namespace)/\($pod.metadata.name)\t\(.name)"
' | head -15
echo ""
echo "=== Resource Requests for Running Pods (compare with kubectl top) ==="
kubectl get pods --all-namespaces -o json 2>/dev/null | jq -r '
.items[] |
select(.status.phase == "Running") |
. as $pod |
.spec.containers[] |
select(.resources.requests.cpu != null) |
"\($pod.metadata.namespace)/\($pod.metadata.name)\t\(.name)\tCPU-req:\(.resources.requests.cpu)\tMem-req:\(.resources.requests.memory // "none")"
' | head -15
echo ""
echo "=== Metrics Server Logs (errors) ==="
kubectl logs deployment/metrics-server -n kube-system --tail=20 2>/dev/null | grep -iE "error|fail|unable" | head -10
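The log scan can be narrowed to certificate problems, which is the case where `--kubelet-insecure-tls` becomes relevant. The sketch below only counts lines mentioning x509 on stdin; `count_x509_errors` and `report_x509` are hypothetical helper names, and the hint text is a suggestion to investigate, not a diagnosis.

```shell
#!/bin/bash
# Hypothetical helper: counts stdin lines that mention x509 (case-insensitive).
# grep -c exits non-zero on zero matches, so "|| true" keeps the exit status clean.
count_x509_errors() {
  grep -ci 'x509' || true
}

# Hypothetical helper: prints the count and, when non-zero, a hint to review
# kubelet certificate verification settings.
report_x509() {
  local n
  n=$(count_x509_errors)
  echo "x509 log lines: $n"
  if [ "$n" -gt 0 ]; then
    echo "hint: review kubelet certificates or the --kubelet-insecure-tls flag"
  fi
}
```

It could be fed from the same log command, for example `kubectl logs deployment/metrics-server -n kube-system --tail=200 2>/dev/null | report_x509`.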
kubectl top for resource usage summaries

--help output.

| Shortcut | Counter | Why |
|---|---|---|
| "I'll skip discovery and check known resources" | Always run Phase 1 discovery first | Resource names change, new resources appear — assumed names cause errors |
| "The user only asked for a quick check" | Follow the full discovery → analysis flow | Quick checks miss critical issues; structured analysis catches silent failures |
| "Default configuration is probably fine" | Audit configuration explicitly | Defaults often leave logging, security, and optimization features disabled |
| "Metrics aren't needed for this" | Always check relevant metrics when available | API/CLI responses show current state; metrics reveal trends and intermittent issues |
| "I don't have access to that" | Try the command and report the actual error | Assumed permission failures prevent useful investigation; actual errors are informative |

--kubelet-insecure-tls may be needed in some environments; check logs for x509 errors

kubectl top shows actual usage; compare against resources.requests for right-sizing

--enable-aggregator-routing
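Comparing `kubectl top` usage against `resources.requests` requires both values in the same unit, since usage is typically shown as millicores and Mi while requests may be written as `0.5` or `1Gi`. The following is a rough sketch of that normalization; the function names are hypothetical, and only the `m` CPU suffix and the `Ki`, `Mi`, `Gi` memory suffixes are handled.

```shell
#!/bin/bash
# Convert a CPU quantity ("250m", "2", "0.5") to millicores.
# Assumption: only plain cores or the "m" suffix appear.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v v="$1" 'BEGIN { printf "%.0f\n", v * 1000 }' ;;
  esac
}

# Convert a memory quantity with a binary suffix (Ki, Mi, Gi) to MiB.
mem_to_mib() {
  case "$1" in
    *Ki) awk -v v="${1%Ki}" 'BEGIN { printf "%.0f\n", v / 1024 }' ;;
    *Mi) echo "${1%Mi}" ;;
    *Gi) awk -v v="${1%Gi}" 'BEGIN { printf "%.0f\n", v * 1024 }' ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# Usage as a rounded percentage of the request; prints "n/a" when the
# request is zero.
usage_percent() {
  awk -v u="$1" -v r="$2" \
    'BEGIN { if (r > 0) printf "%.0f\n", u * 100 / r; else print "n/a" }'
}
```

For example, `usage_percent "$(cpu_to_millicores 125m)" "$(cpu_to_millicores 0.5)"` prints `25`, meaning the container uses a quarter of its CPU request. Percentages far below 100 suggest an oversized request; values near or above 100 suggest the request is too small.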