Deep-dive Azure architecture review, debugging, and service design. Use for structured investigations of Azure-specific issues, identity or cost audits, and multi-service design reviews. Triggers on: "Azure audit", "Azure design review", "EntraID review", "Managed Identity debug", "review my Azure", "Azure troubleshooting", "AKS deep-dive".
Structured investigation for Azure workloads. Five phases: gather context, diagnose, design, recommend, verify.
$0: subscription context, service scope, or problem description. Required.
**/*.bicep, **/azuredeploy.json, **/main.tf, **/*.parameters.json.
az account show
az group list --query "[].name"
az provider list --query "[?registrationState=='Registered'].namespace" -o tsv
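The context commands above can be captured once and reused during later phases. The following is a minimal sketch, assuming a POSIX shell with the Azure CLI already logged in; the function name `snapshot_context` and the output file names are hypothetical and not part of the original workflow.

```shell
# Sketch: save subscription context to files so later phases can cite it.
# snapshot_context and the file names below are illustrative assumptions.
snapshot_context() (
  out_dir="${1:-azure-context}"
  mkdir -p "$out_dir" || exit 1
  # Active subscription and tenant
  az account show -o json > "$out_dir/account.json" || exit 1
  # Resource group names, one per line
  az group list --query "[].name" -o tsv > "$out_dir/resource-groups.txt" || exit 1
  # Registered resource providers, one namespace per line
  az provider list --query "[?registrationState=='Registered'].namespace" -o tsv \
    > "$out_dir/providers.txt" || exit 1
  echo "$out_dir"
)
```

Calling `snapshot_context ./ctx` prints the directory it wrote to, which gives findings a stable place to point at for evidence.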
Compute / containers:
az vm list -d --query "[].{name:name,rg:resourceGroup,state:powerState}" -o table
az aks list --query "[].{name:name,rg:resourceGroup,version:kubernetesVersion}" -o table
az webapp list --query "[].{name:name,rg:resourceGroup,state:state}" -o table
Identity / RBAC:
az role assignment list --assignee <principal-id> --all
az ad app list --display-name <name>
az identity list --resource-group <rg>
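For a user-assigned managed identity, the role assignment lookup needs the identity's principal ID first. A minimal sketch of chaining the two calls, assuming a POSIX shell; `list_identity_roles` is a hypothetical helper name, and the two-column output is a choice made here for readability.

```shell
# Sketch: resolve a user-assigned identity to its principal ID, then list
# its role assignments as "role<TAB>scope". Helper name is hypothetical.
list_identity_roles() (
  name="$1"
  rg="$2"
  principal_id="$(az identity show --name "$name" --resource-group "$rg" \
    --query principalId -o tsv)" || exit 1
  # An empty result usually means the identity was not found in that group
  [ -n "$principal_id" ] || { echo "no principalId for $name" >&2; exit 1; }
  az role assignment list --assignee "$principal_id" --all \
    --query "[].[roleDefinitionName, scope]" -o tsv
)
```

Usage would look like `list_identity_roles <identity-name> <rg>`; an empty listing is itself a finding when the workload is expected to have access.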
Networking:
az network vnet list
az network nsg rule list --nsg-name <nsg> --resource-group <rg>
az network private-endpoint list
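One common use of the NSG rule listing is to spot inbound allow rules open to any source. The sketch below is one way to filter for them, assuming a POSIX shell with `awk` and `sort`; `open_inbound_rules` is a hypothetical name, and the set of source prefixes treated as "open" (`*`, `Internet`, `0.0.0.0/0`) is an assumption. Rules that use multiple source prefixes are not covered by this filter.

```shell
# Sketch: print "priority<TAB>name<TAB>ports" for inbound Allow rules whose
# single source prefix is wide open. Helper name is hypothetical.
open_inbound_rules() (
  nsg="$1"
  rg="$2"
  az network nsg rule list --nsg-name "$nsg" --resource-group "$rg" \
    --query "[].[priority, name, direction, access, sourceAddressPrefix, destinationPortRange]" \
    -o tsv |
    awk -F'\t' '$3 == "Inbound" && $4 == "Allow" &&
                ($5 == "*" || $5 == "Internet" || $5 == "0.0.0.0/0") {
                  print $1 "\t" $2 "\t" $6
                }' |
    sort -n
)
```

Sorting by priority puts the rules that are evaluated first at the top of the output.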
Serverless / events:
az functionapp list
az servicebus namespace list
az eventgrid topic list
Cost / quotas:
az consumption usage list --start-date <YYYY-MM-DD> --end-date <YYYY-MM-DD>
az vm list-usage --location <region>
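The quota listing is long, so it can help to show only the entries near their limit. A minimal sketch, assuming a POSIX shell with `awk`; `quota_hotspots` and the default 80 percent threshold are illustrative assumptions, not values taken from the workflow above.

```shell
# Sketch: list compute quotas at or above a usage threshold (percent).
# Output columns: quota name, current/limit, percent used.
# quota_hotspots and the default of 80 are hypothetical.
quota_hotspots() (
  region="$1"
  threshold="${2:-80}"
  az vm list-usage --location "$region" \
    --query "[].[localName, currentValue, limit]" -o tsv |
    awk -F'\t' -v t="$threshold" '$3 > 0 && ($2 * 100 / $3) >= t {
      printf "%s\t%d/%d\t%.0f%%\n", $1, $2, $3, $2 * 100 / $3
    }'
)
```

The arithmetic is done in `awk` rather than in the JMESPath query so the filter works whether the CLI emits the counters as numbers or as strings.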
Map symptoms to causes:
| Symptom | Common Causes | Check |
|---|---|---|
| AKS pod AuthN fails | Managed Identity not assigned, missing federated credential | az aks show --query identity + pod ServiceAccount annotations |
| App Service slow | Cold start after idle (Always On disabled), misconfigured scale rules | Plan tier, Always On setting, autoscale settings |
| Storage 403 | Private endpoint with wrong DNS, firewall IP allowlist | az storage account network-rule list |
| Function cold starts | Consumption plan + infrequent traffic | Hosting plan; consider Premium or always-ready instances |
| Cosmos DB throttle (429) | RU/s too low, hot partition | Diagnostic settings, metrics, partition key review |
| EntraID app login fails | Redirect URI mismatch, missing API permission grant | az ad app show + consent status |
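
As an illustration of one row in the table, the "Storage 403" DNS cause can be checked by resolving the blob endpoint from inside the VNet and seeing whether it lands on a private address. This is a sketch under assumptions: it relies on `getent` (available on most Linux hosts, not on macOS), it treats only RFC 1918 ranges as private, and the helper names `is_private_ipv4` and `check_storage_dns` are hypothetical.

```shell
# Sketch: does the storage blob endpoint resolve to a private address?
# Only RFC 1918 ranges count as "private" here; that is an assumption.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

check_storage_dns() (
  host="$1.blob.core.windows.net"
  # First IPv4 address returned by the system resolver
  ip="$(getent ahostsv4 "$host" | awk 'NR == 1 { print $1 }')"
  [ -n "$ip" ] || { echo "$host did not resolve" >&2; exit 1; }
  if is_private_ipv4 "$ip"; then
    echo "$host -> $ip (private)"
  else
    echo "$host -> $ip (public)"
  fi
)
```

A "public" result from a host that is supposed to use the private endpoint points toward a missing or unlinked private DNS zone rather than a firewall rule.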
Cite resource ID or file:line for every finding.
Output findings in priority order:
[CRITICAL] <title>
Resource: <resource ID or file:line>
Issue: <one sentence>
Evidence: <CLI output or code snippet>
Fix: <specific change, with Bicep/ARM/Terraform diff>
Trade-off: <alternative and its downside, if meaningful>
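When findings are produced by a script, a small formatter keeps them in the template's shape. A minimal sketch in POSIX shell; `print_finding` is a hypothetical helper, and the example values in the usage note are invented for illustration only.

```shell
# Sketch: emit one finding in the template's field order.
# Args: severity title resource issue evidence fix [trade-off]
# print_finding is a hypothetical helper name.
print_finding() {
  printf '[%s] %s\n' "$1" "$2"
  printf 'Resource: %s\n' "$3"
  printf 'Issue: %s\n' "$4"
  printf 'Evidence: %s\n' "$5"
  printf 'Fix: %s\n' "$6"
  # Trade-off is optional, matching "if meaningful" in the template
  if [ -n "${7:-}" ]; then
    printf 'Trade-off: %s\n' "$7"
  fi
}
```

For example, `print_finding CRITICAL "Public blob access" "main.bicep:42" "Storage account allows anonymous reads." "allowBlobPublicAccess: true" "Set allowBlobPublicAccess to false."` prints the five mandatory lines and omits the trade-off line.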
references/ where applicable.
After fixes are applied:
- az role assignment list and a live workload token request.
- az network watcher test-connectivity or NSG flow-log review.
- what-if before and after to confirm intended drift only.

Consult references/ for decision guides:
| File | When to use |
|---|---|
| compute.md | AKS, ACI, VMs, App Service |
| serverless.md | Functions, Logic Apps, Service Bus, Event Grid |
| storage.md | Blob, Files, Queues, Cosmos DB, Azure SQL |
| networking.md | VNet, App Gateway, Front Door, Private Endpoints |
| identity.md | EntraID, Managed Identity, RBAC scopes |
| devops.md | Azure Pipelines, ACR, release management |
| iac-patterns.md | Bicep, ARM, Terraform (AzureRM/azapi) patterns |