Check Azure production health — app status, errors, latency, database, dependencies. Use when user says "check prod", "how's prod", "hows prod doing", "is prod up", "prod status", "health check", "any errors?", "how's the app doing?", or "check Azure".
13 checks. One verdict. All read-only. Uses Azure MCP tools.
Azure MCP tools require az login credentials. Before starting, verify:
- Run `az account show` in terminal to confirm authentication and note the active subscription ID
- If not authenticated, run `az login` first

Verdict rules — evaluated top-down, first match wins:
🔴 Critical — ANY of: readiness probe non-200, any 5xx in 24h, DB CPU > 80% peak, fired Sev0/Sev1 alerts in 24h, ContainerCrashing on current revision, LLM dependency failures > 5 in 24h, any init.failed logs in 24h, GitHub API failures > 20 in 24h
⚠️ Warning — ANY of: P95 latency > 500ms, DB CPU 50–80% peak or Memory 70–85% or Storage 70–85%, any failed availability tests in 24h, non-zero unhandled exceptions in 7d, active connections > 80, ReplicaUnhealthy without matching scale events, error rate spike (single day > 2× weekly average) or rising trend (3+ consecutive days increasing), Container App CPU > 80% or Memory > 80%, ERROR-level AppTraces > 10 in 24h, auth failure rate > 50% in 24h
✅ Healthy — none of the above
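The top-down, first-match evaluation can be sketched as a small shell function. The metric names and sample values below are illustrative stand-ins, not outputs of the real checks, and only a subset of the criteria is shown:

```shell
#!/usr/bin/env bash
# Sketch of the verdict rule: tiers are checked top-down, first match wins.
# Inputs are illustrative shell variables, not real check output.
verdict() {
  local ready_status=$1 err5xx=$2 db_cpu_peak=$3 p95_ms=$4 error_traces=$5
  # 🔴 Critical — any one of these decides the whole report
  if [ "$ready_status" -ne 200 ] || [ "$err5xx" -gt 0 ] || [ "$db_cpu_peak" -gt 80 ]; then
    echo "critical"; return
  fi
  # ⚠️ Warning — only reached if nothing critical matched
  if [ "$p95_ms" -gt 500 ] || [ "$error_traces" -gt 10 ]; then
    echo "warning"; return
  fi
  echo "healthy"
}

verdict 200 0 35 180 2   # → healthy
verdict 200 0 35 720 2   # → warning (P95 > 500ms)
verdict 200 3 35 180 2   # → critical (5xx present)
```

Because the critical tier returns early, a slow P95 never masks a 5xx spike — the worst finding always wins.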
Use az account show (terminal) to get the active subscription ID. Use resource group rg-ltc-dev.
Run in terminal:
```bash
az resource list --resource-group rg-ltc-dev --query "[].{name:name, type:type}" -o table
```
Identify from the output:
- CA_NAME — the Container App (type `Microsoft.App/containerApps`)
- LOG_NAME — the Log Analytics workspace (type `Microsoft.OperationalInsights/workspaces`)
- PSQL_NAME — the PostgreSQL flexible server (type `Microsoft.DBforPostgreSQL/flexibleServers`)

Then get container app details:
```bash
az containerapp show --name $CA_NAME --resource-group rg-ltc-dev --query "{fqdn:properties.configuration.ingress.fqdn, provisioningState:properties.provisioningState, latestRevision:properties.latestRevisionName, minReplicas:properties.template.scale.minReplicas, maxReplicas:properties.template.scale.maxReplicas}" -o json
```
Save these discovered values — all subsequent steps reference them as SUBSCRIPTION, RG, LOG_NAME, PSQL_NAME, CA_NAME, FQDN, and LATEST_REVISION.
Run in terminal:
```bash
curl -s --max-time 5 -o /dev/null -w "ready_status=%{http_code} response_time=%{time_total}s\n" "https://$FQDN/ready"
```
Substitute $FQDN with the value from Step 0.
Verdict: 🔴 if non-200. ⚠️ if response_time > 2s.
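The curl `-w` output can be turned into the step verdict with plain shell. The sample line below is illustrative; in practice it comes from the probe call above:

```shell
#!/usr/bin/env bash
# Parse the curl -w output line into a Step 1 verdict.
# The sample line is an illustrative stand-in for real probe output.
line="ready_status=200 response_time=0.312s"

status=${line#ready_status=}; status=${status%% *}   # → "200"
rt=${line#*response_time=};   rt=${rt%s}             # → "0.312"

if [ "$status" -ne 200 ]; then
  echo "🔴 readiness probe returned $status"
elif awk -v t="$rt" 'BEGIN { exit !(t > 2) }'; then
  echo "⚠️ slow readiness response: ${rt}s"
else
  echo "✅ ready ($status in ${rt}s)"
fi
```

awk handles the fractional-seconds comparison, since shell `[ ]` arithmetic is integer-only.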
Steps 2–10 are independent reads — run them all in parallel.
Three MCP tools are used. Call them by setting command and passing args in parameters:
Log queries → mcp_azure_mcp_monitor with command monitor_workspace_log_query
Required parameters: resource-group, workspace, table, query
Optional: subscription, hours, limit
Metrics → mcp_azure_mcp_monitor with command monitor_metrics_query
Required parameters: resource, metric-names, metric-namespace
Optional: resource-group, resource-type, subscription, interval, aggregation
Resource Health → mcp_azure_mcp_resourcehealth with command resourcehealth_availability-status_list
Required parameters: resource-group
Optional: subscription
All log queries below use LOG_NAME as workspace and rg-ltc-dev as resource-group.
Use mcp_azure_mcp_resourcehealth:
- command: `resourcehealth_availability-status_list`
- resource-group: RG
- subscription: SUBSCRIPTION

Quick check for Azure-side platform issues affecting any resource.
Verdict: 🔴 if any resource shows Unavailable. ⚠️ if Degraded.
Use `monitor_workspace_log_query` with table `AppAvailabilityResults`:

```kusto
AppAvailabilityResults
| where TimeGenerated > ago(24h)
| summarize Total=count(), Failed=countif(Success == false), AvgDuration=avg(DurationMs)
```

Verdict: ⚠️ if any Failed > 0. ~288 tests/day expected (3 geo-locations × 5min interval).
Use `monitor_workspace_log_query` with table `AppRequests`:

```kusto
AppRequests
| where TimeGenerated > ago(24h)
| summarize P95=percentile(DurationMs, 95), Total=count(),
    Err4xx=countif(toint(ResultCode) >= 400 and toint(ResultCode) < 500),
    Err5xx=countif(toint(ResultCode) >= 500)
```

Verdict: 🔴 if Err5xx > 0. ⚠️ if P95 > 500ms. 4xx are expected (401, 404).
Use `monitor_workspace_log_query` with table `AppRequests`:

```kusto
AppRequests
| where TimeGenerated > ago(7d)
| summarize Total=count(), Failed=countif(Success == false) by bin(TimeGenerated, 1d)
| extend ErrorRate=round(todouble(Failed)/todouble(Total)*100, 2)
| order by TimeGenerated desc
```

Verdict: ⚠️ if rising trend (3+ consecutive days increasing) or single-day spike > 2× the 7-day average. Stable or falling = healthy.
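The spike and rising-trend rules can be checked mechanically once the daily error rates are in hand. The rates below are made-up sample values, oldest first, not real query output:

```shell
#!/usr/bin/env bash
# Evaluate the ⚠️ trend rule over sample daily error rates (oldest first).
rates="0.4 0.5 0.9 1.1 1.4 1.6 2.1"

trend=$(echo "$rates" | awk '{
  n = NF
  for (i = 1; i <= n; i++) sum += $i
  avg = sum / n
  # Spike: any single day above 2x the period average
  spike = 0
  for (i = 1; i <= n; i++) if ($i > 2 * avg) spike = 1
  # Rising trend: longest run of consecutive day-over-day increases
  run = 0; best = 0
  for (i = 2; i <= n; i++) {
    run = ($i > $(i-1)) ? run + 1 : 0
    if (run > best) best = run
  }
  if (spike || best >= 3) print "warning"; else print "stable"
}')
echo "$trend"
```

With the sample data every day rises, so the run length easily exceeds 3 and the result is `warning` even though no single day doubles the average.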
Two queries — run in parallel:
Query A — Unhandled exceptions (AppExceptions):
Use `monitor_workspace_log_query` with table `AppExceptions`:

```kusto
AppExceptions
| where TimeGenerated > ago(7d)
| summarize Count=count() by ExceptionType, OuterMessage
| order by Count desc
| take 10
```

Query B — Caught errors (AppTraces at ERROR level):

Use `monitor_workspace_log_query` with table `AppTraces`:

```kusto
AppTraces
| where TimeGenerated > ago(24h) and SeverityLevel >= 3
| summarize Count=count() by Message
| order by Count desc
| take 10
```

Verdict: ⚠️ if any recurring exceptions (Query A) or ERROR-level traces > 10 in 24h (Query B).
Covers PostgreSQL, Azure OpenAI (via httpx), GitHub API (via httpx), and any other outbound calls.
Use `monitor_workspace_log_query` with table `AppDependencies`:

```kusto
AppDependencies
| where TimeGenerated > ago(24h)
| summarize Count=count(), FailureCount=countif(Success == false),
    AvgDuration=round(avg(DurationMs), 1), P95Duration=round(percentile(DurationMs, 95), 1)
    by Type, Target
| order by Count desc
| take 15
```

Expected dependency targets after httpx instrumentation:
- psql-ltc-dev-*.postgres.database.azure.com|learntocloud — PostgreSQL (Type: SQL)
- oai-ltc-dev-*.openai.azure.com — Azure OpenAI (Type: HTTP or GenAI)
- api.github.com — GitHub API for verification checks (Type: HTTP)

Verdict: 🔴 if Azure OpenAI failures > 5 or PostgreSQL failures > 0 or GitHub API failures > 20. ⚠️ if any other FailureCount > 0 or LLM P95 > 30s.
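Because each dependency target carries its own failure threshold, the per-target verdict can be sketched as a pattern match. The function name and sample targets/counts are illustrative:

```shell
#!/usr/bin/env bash
# Sketch: map a dependency Target's FailureCount to the step verdict.
# Target patterns follow the expected targets listed above; the counts
# passed in the examples are illustrative.
dep_verdict() {
  local target=$1 failures=$2
  case "$target" in
    *.openai.azure.com)             [ "$failures" -gt 5 ] && { echo critical; return; } ;;
    *.postgres.database.azure.com*) [ "$failures" -gt 0 ] && { echo critical; return; } ;;
    api.github.com)                 [ "$failures" -gt 20 ] && { echo critical; return; } ;;
    *)                              [ "$failures" -gt 0 ] && { echo warning; return; } ;;
  esac
  echo healthy
}

dep_verdict "oai-ltc-dev-x.openai.azure.com" 2                            # → healthy
dep_verdict "psql-ltc-dev-x.postgres.database.azure.com|learntocloud" 1   # → critical
```

Note the asymmetry encoded here: a single PostgreSQL failure is critical, while the LLM and GitHub targets tolerate transient failures up to their thresholds.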
Note: This uses the metrics command, not the log query command.
Use mcp_azure_mcp_monitor with command monitor_metrics_query — run two calls (Average + Maximum):
Call A (Average):
- resource: PSQL_NAME
- resource-group: RG
- subscription: SUBSCRIPTION
- resource-type: Microsoft.DBforPostgreSQL/flexibleServers
- metric-namespace: Microsoft.DBforPostgreSQL/flexibleServers
- metric-names: cpu_percent,memory_percent,storage_percent,active_connections
- interval: PT1H
- aggregation: Average

Call B (Peak): same as Call A but with aggregation: Maximum
Run both calls in parallel.
Verdict thresholds (B_Standard_B2s — 2 vCores, 4 GB):
| Metric | ✅ Healthy | ⚠️ Warning | 🔴 Critical |
|---|---|---|---|
| CPU (peak) | < 50% | 50–80% | > 80% |
| Memory (peak) | < 70% | 70–85% | > 85% |
| Storage (peak) | < 70% | 70–85% | > 85% |
| Connections (peak) | < 80 | 80–100 | > 100 |
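The threshold table above is the same three-band pattern for each metric, so classification reduces to one comparison function. The sample peak values are illustrative:

```shell
#!/usr/bin/env bash
# Classify one peak DB metric against the B_Standard_B2s bands above.
# Usage: classify <value> <warning_floor> <critical_floor>
classify() {
  awk -v v="$1" -v warn="$2" -v crit="$3" 'BEGIN {
    if (v > crit)       print "critical"
    else if (v >= warn) print "warning"
    else                print "healthy"
  }'
}

classify 42 50 80    # CPU peak 42%     → healthy
classify 74 70 85    # Memory peak 74%  → warning
classify 91 70 85    # Storage peak 91% → critical
```

awk is used so fractional peaks (e.g. 79.6%) compare correctly; the band edges follow the table, with the warning band inclusive on both ends.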
Use mcp_azure_mcp_monitor with command monitor_metrics_query:
- resource: CA_NAME
- resource-group: RG
- subscription: SUBSCRIPTION
- resource-type: Microsoft.App/containerApps
- metric-namespace: Microsoft.App/containerApps
- metric-names: UsageNanoCores,WorkingSetBytes,RestartCount
- interval: PT1H
- aggregation: Maximum

Verdict thresholds (0.5 CPU / 1Gi memory allocated):
| Metric | ✅ Healthy | ⚠️ Warning | 🔴 Critical |
|---|---|---|---|
| CPU (UsageNanoCores peak) | < 300M | 300M–400M | > 400M (80% of 500M) |
| Memory (WorkingSetBytes peak) | < 750Mi | 750Mi–860Mi | > 860Mi (80% of 1Gi) |
| RestartCount (total) | 0 | 1–2 | > 2 |
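The raw metrics come back in nanocores and bytes, so it helps to normalize them against the 0.5-vCPU / 1Gi allocation before comparing to the bands above. The sample peaks are illustrative:

```shell
#!/usr/bin/env bash
# Normalize Container App peaks: UsageNanoCores → % of the 0.5-vCPU
# (500M nanocore) allocation, WorkingSetBytes → MiB. Sample values only.
cpu_peak_nc=360000000     # 360M nanocores
mem_peak_bytes=801112064  # ≈764 MiB

summary=$(awk -v nc="$cpu_peak_nc" -v b="$mem_peak_bytes" 'BEGIN {
  cpu_pct = nc / 500000000 * 100   # fraction of the 0.5 vCPU allocation
  mem_mib = b / 1048576            # bytes → MiB (1 MiB = 1048576 bytes)
  printf "cpu=%.0f%% of allocation, mem=%.0fMiB of 1024MiB", cpu_pct, mem_mib
}')
echo "$summary"
```

Here 360M nanocores is 72% of the allocation — inside the healthy band, since the 🔴 line (400M) corresponds to 80% of the 500M allocation.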
Substitute LATEST_REVISION from Step 0 into the KQL query.
Use `monitor_workspace_log_query` with table `ContainerAppSystemLogs_CL`:

```kusto
ContainerAppSystemLogs_CL
| where TimeGenerated > ago(24h) and RevisionName_s == 'LATEST_REVISION_VALUE'
| summarize Count=count() by Reason_s, Type_s
| order by Count desc
```

Replace LATEST_REVISION_VALUE with the actual revision name.
Fallback: If ContainerAppSystemLogs_CL returns no results, try ContainerAppSystemLogs (without _CL) with column names Reason and Type instead of Reason_s and Type_s:
```kusto
ContainerAppSystemLogs
| where TimeGenerated > ago(24h) and RevisionName == 'LATEST_REVISION_VALUE'
| summarize Count=count() by Reason, Type
| order by Count desc
```
Verdict: 🔴 if ContainerCrashing. ⚠️ if ReplicaUnhealthy — a few events alongside SuccessfulRescale is normal scale-in/out; sustained events without scaling suggest health probe failures.
Use `monitor_workspace_log_query` with table `AzureActivity`:

```kusto
AzureActivity
| where TimeGenerated > ago(24h)
| where OperationNameValue has "microsoft.insights/metricalerts" or OperationNameValue has "microsoft.insights/scheduledqueryrules"
| where ActivityStatusValue == "Activated"
| extend AlertName=tostring(split(ResourceId, "/")[-1])
| project TimeGenerated, AlertName, ResourceId, Properties
| order by TimeGenerated desc
```

Known alert names from Terraform (match against AlertName):
- Sev0: alert-ltc-availability-* (app unreachable)
- Sev1: alert-ltc-api-5xx-*, alert-ltc-api-restarts-*, alert-ltc-db-connections-*, alert-ltc-llm-failures-*, alert-ltc-init-failed-*
- Sev2: alert-ltc-api-cpu-*, alert-ltc-api-memory-*, alert-ltc-api-latency-*, alert-ltc-db-storage-*, alert-ltc-db-cpu-*, alert-ltc-api-4xx-*

Verdict: 🔴 if any Sev0/Sev1 alert names appear. ⚠️ if Sev2 alerts fired.
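Since the activity log doesn't expose severity directly, the name-to-severity mapping can be done with a pattern match. The grouping below (availability as Sev0; 5xx/restarts/db-connections/llm/init-failed as Sev1; the rest as Sev2) is an assumption inferred from the verdict rules, not confirmed against the Terraform source:

```shell
#!/usr/bin/env bash
# Sketch: map a fired AlertName to a severity by name pattern.
# The Sev0/Sev1/Sev2 grouping is an assumption, not from Terraform.
alert_sev() {
  case "$1" in
    alert-ltc-availability-*) echo "Sev0" ;;
    alert-ltc-api-5xx-*|alert-ltc-api-restarts-*|alert-ltc-db-connections-*|alert-ltc-llm-failures-*|alert-ltc-init-failed-*) echo "Sev1" ;;
    alert-ltc-api-cpu-*|alert-ltc-api-memory-*|alert-ltc-api-latency-*|alert-ltc-db-storage-*|alert-ltc-db-cpu-*|alert-ltc-api-4xx-*) echo "Sev2" ;;
    *) echo "unknown" ;;
  esac
}

alert_sev "alert-ltc-api-5xx-prod"   # → Sev1
```

Unrecognized names fall through to `unknown` rather than being silently treated as low severity.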
Custom OTel counters for key domain events.
Use `monitor_workspace_log_query` with table `AppMetrics`:

```kusto
AppMetrics
| where TimeGenerated > ago(24h) and Name in ('auth.login', 'submission.daily_limit_exceeded', 'submission.cooldown_active', 'user.deletion', 'step.completed', 'verification.attempt')
| summarize Total=sum(Sum) by Name
| order by Name asc
```

Also check auth success/failure ratio:

```kusto
AppMetrics
| where TimeGenerated > ago(24h) and Name == 'auth.login'
| extend result = tostring(Properties['result'])
| summarize Total=sum(Sum) by result
```

Verdict: ⚠️ if auth failure rate > 50% (possible GitHub OAuth outage) or daily_limit_exceeded > 50 (capacity pressure). Include totals in report for situational awareness.
GenAI-specific metrics from the agent framework.
Use `monitor_workspace_log_query` with table `AppMetrics`:

```kusto
AppMetrics
| where TimeGenerated > ago(24h) and Name in ('gen_ai.client.token.usage', 'gen_ai.client.operation.duration')
| summarize Total=sum(Sum), AvgValue=round(avg(Sum), 2) by Name
```

Verdict: Informational — include token usage and operation duration in the report. ⚠️ if avg operation duration > 60s.
## Production Health Report — {date}
### Overall: ✅ Healthy / ⚠️ Warning / 🔴 Critical
**Verdict reasoning**: {1-2 sentence explanation citing specific check(s)}
| # | Check | Status | Details |
|---|-------|--------|---------|
| 1 | Readiness Probe | ✅/🔴 | {status_code}, {X}s response |
| 2 | Resource Health | ✅/🔴 | {Available/Degraded/Unavailable} |
| 3 | Availability Tests | ✅/⚠️ | {N} total, {N} failed in 24h |
| 4 | Request Health | ✅/🔴 | P95 {X}ms, {N} 4xx, {N} 5xx |
| 5 | Error Rate Trend | ✅/⚠️ | {stable/rising/falling} over 7d |
| 6 | Errors | ✅/⚠️ | {N} exceptions in 7d, {N} error traces in 24h |
| 7 | Dependencies | ✅/⚠️/🔴 | PostgreSQL: {N}/{N}fail, OpenAI: {N}/{N}fail P95 {X}ms, GitHub: {N}/{N}fail |
| 8 | Database | ✅/⚠️/🔴 | CPU {X}%, Mem {X}%, Storage {X}%, Conn {X} |
| 9 | Container App | ✅/⚠️/🔴 | CPU {X}nc, Mem {X}B, Restarts {N} |
| 10 | Container Stability | ✅/⚠️ | Rev: {rev}, {events} |
| 11 | Fired Alerts | ✅/🔴 | {N} in 24h, names: {list} |
| 12 | Business Metrics | ✅/⚠️ | Logins: {N}✓/{N}✗, Steps: {N}, Verifications: {N}, Deletions: {N} |
| 13 | LLM Performance | ✅/⚠️ | Tokens: {N}, Avg duration: {X}s |
### ⚠️ Items to Watch
- {any warnings — omit if none}
### 🔴 Action Required
- {any critical issues — omit if none}
- Tools: mcp_azure_mcp_monitor (log queries via monitor_workspace_log_query, metrics via monitor_metrics_query) and mcp_azure_mcp_resourcehealth (platform health via resourcehealth_availability-status_list). Step 1 uses curl in terminal.
- Tables: AppRequests, AppExceptions, AppDependencies, AppAvailabilityResults (Application Insights workspace-mode tables, not legacy requests/exceptions/dependencies).
- Steps 8–9 query Azure Monitor metrics (command monitor_metrics_query). Steps 3–7, 10–11 query Log Analytics logs (command monitor_workspace_log_query). These are different MCP commands.
- Container App system logs may arrive as ContainerAppSystemLogs_CL (custom log, _s suffix columns) or ContainerAppSystemLogs (standard, no suffix). Step 10 includes a fallback query for both schemas.
- Step 11 maps AlertName → severity using the known Terraform-defined alert names rather than parsing severity from the activity log (which doesn't expose it directly).