End-to-end platform deployment orchestration — prerequisites, Terraform, Kubernetes verification, and troubleshooting
Prerequisites: Azure CLI login (az login), GitHub CLI login (gh auth login), prerequisite validation (./scripts/validate-prerequisites.sh), and a configured .tfvars file.
./scripts/validate-prerequisites.sh
# Login
az login
az account set --subscription "$AZURE_SUBSCRIPTION_ID"
# Register providers
for provider in Microsoft.ContainerService Microsoft.ContainerRegistry \
Microsoft.KeyVault Microsoft.Network Microsoft.ManagedIdentity \
Microsoft.Security Microsoft.CognitiveServices Microsoft.Monitor; do
az provider register --namespace "$provider"
done
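Provider registration is asynchronous, so `az provider register` can return before a namespace is usable. The sketch below polls `az provider show` until the state is `Registered`. The helper name `wait_for_provider`, the retry count, and the delay are assumptions for illustration and are not part of the repository scripts.

```shell
# Sketch: poll until a resource provider reports "Registered".
# wait_for_provider is a hypothetical helper; the defaults are assumed values.
wait_for_provider() {
  local namespace="$1"
  local attempts="${2:-30}"   # assumed retry budget
  local delay="${3:-10}"      # assumed seconds between polls
  local state
  local i=1
  while [ "$i" -le "$attempts" ]; do
    state="$(az provider show --namespace "$namespace" --query registrationState -o tsv)"
    if [ "$state" = "Registered" ]; then
      echo "$namespace: Registered"
      return 0
    fi
    # Progress goes to stderr so stdout carries only the final result.
    echo "$namespace: $state (attempt $i/$attempts)" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_provider Microsoft.ContainerService` could run after the registration loop and before the Terraform steps.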
./scripts/setup-terraform-backend.sh \
--customer-name contoso \
--environment dev \
--location brazilsouth
# Copy template and edit
cp terraform/terraform.tfvars.example terraform/environments/dev.tfvars
# Edit with your values
# Set sensitive vars
export TF_VAR_azure_subscription_id="..."
export TF_VAR_azure_tenant_id="..."
export TF_VAR_admin_group_id="..."
export TF_VAR_github_org="..."
export TF_VAR_github_token="..."
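Before running the validation script, a quick local check can confirm that each sensitive variable is exported and non-empty. This is a minimal sketch and not a substitute for `validate-config.sh`. The helper name `require_tf_vars` is hypothetical, and the check cannot tell a real value from a leftover placeholder such as `...`.

```shell
# Sketch: fail fast when any sensitive Terraform variable is unset or empty.
# require_tf_vars is a hypothetical helper; pass variable names without the TF_VAR_ prefix.
require_tf_vars() {
  local missing=0
  local name value
  for name in "$@"; do
    # eval is used for portable indirect lookup; only pass trusted, literal names.
    eval "value=\${TF_VAR_${name}:-}"
    if [ -z "$value" ]; then
      echo "missing: TF_VAR_${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A call covering the variables above would be `require_tf_vars azure_subscription_id azure_tenant_id admin_group_id github_org github_token`.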
# Validate
./scripts/validate-config.sh --environment dev
cd terraform
terraform init
terraform plan -var-file=environments/dev.tfvars -out=deploy.tfplan
terraform apply deploy.tfplan
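In unattended runs it can help to know whether the plan contains any changes before applying it. The sketch below uses Terraform's `-detailed-exitcode` flag, which exits with 0 for no changes, 1 for an error, and 2 when changes are present. The helper name `plan_status` is hypothetical.

```shell
# Sketch: classify the result of `terraform plan -detailed-exitcode`.
# Exit codes: 0 = no changes, 1 = error, 2 = changes present.
# plan_status is a hypothetical helper: $1 is the tfvars file, $2 the plan output file.
plan_status() {
  local rc=0
  # Send the plan text to stderr so stdout carries only the status word.
  terraform plan -detailed-exitcode -var-file="$1" -out="$2" >&2 || rc=$?
  case "$rc" in
    0) echo "no-changes" ;;
    2) echo "changes" ;;
    *) echo "error"; return 1 ;;
  esac
}
```

With the file names used above, `plan_status environments/dev.tfvars deploy.tfplan` prints `changes` when there is something to apply.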
# Get AKS credentials
az aks get-credentials \
--resource-group "$(terraform output -raw resource_group_name)" \
--name "$(terraform output -raw aks_cluster_name)"
# Run validation (from the repository root)
./scripts/validate-deployment.sh --environment dev
# Access ArgoCD
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Visit https://localhost:8080
# Access Grafana
kubectl port-forward svc/prometheus-grafana -n observability 3000:80
# Visit http://localhost:3000
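Logging in to ArgoCD for the first time usually requires the generated admin password. The sketch below reads it from the `argocd-initial-admin-secret` Secret. It assumes an unmodified ArgoCD installation in the `argocd` namespace where that Secret has not been deleted or rotated; this platform may configure authentication differently. The helper name `argocd_initial_password` is hypothetical.

```shell
# Sketch: read the initial ArgoCD admin password.
# Assumes a default ArgoCD install where argocd-initial-admin-secret still exists.
# argocd_initial_password is a hypothetical helper.
argocd_initial_password() {
  kubectl get secret argocd-initial-admin-secret -n argocd \
    -o jsonpath='{.data.password}' | base64 --decode
}
```

The default username in a standard ArgoCD installation is `admin`.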
# Full deployment
./scripts/deploy-full.sh --environment dev
# Dry run (plan only)
./scripts/deploy-full.sh --environment dev --dry-run
# Deploy specific horizon
./scripts/deploy-full.sh --environment dev --horizon h1
# CI/CD mode (no prompts)
./scripts/deploy-full.sh --environment prod --auto-approve
# Resume after failure
./scripts/deploy-full.sh --environment dev --resume
# Destroy
./scripts/deploy-full.sh --environment dev --destroy
| Environment | Mode | Estimated Cost | Features |
|---|---|---|---|
| dev | express | $50-100/month | Minimal: AKS + ACR + ArgoCD + Observability |
| staging | standard | $500-1000/month | Production-like: + Databases + ESO + Defender + AI |
| prod | enterprise | $3000+/month | Full HA: + DR + Purview + Runners + Backstage + Cost Mgmt |

| Mode | Nodes | HA | GPU | Best For |
|---|---|---|---|---|
| express | 3 × D4s | No | No | Development, testing |
| standard | 5 × D4s | Yes | No | Production workloads |
| enterprise | 10 × D8s + workload pool | Yes (3 zones) | Optional | Enterprise, multi-tenant |
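The environment-to-mode mapping from the first table can be encoded in a small lookup for use in wrapper scripts. This is a sketch only: the helper name `mode_for_env` is hypothetical, and the repository may derive the mode from the tfvars file instead.

```shell
# Sketch: map an environment name to its deployment mode, following the table above.
# mode_for_env is a hypothetical helper.
mode_for_env() {
  case "$1" in
    dev) echo "express" ;;
    staging) echo "standard" ;;
    prod) echo "enterprise" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}
```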
# Clear cache and retry
rm -rf terraform/.terraform terraform/.terraform.lock.hcl
terraform -chdir=terraform init -upgrade
# Verify all required vars are set
./scripts/validate-config.sh --environment <env>
# Refresh credentials
az aks get-credentials --resource-group <rg> --name <cluster> --overwrite-existing
kubectl get nodes
kubectl get pods -n argocd
kubectl describe pod -n argocd -l app.kubernetes.io/name=argocd-server
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-server
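When the pods are still starting, it can be more useful to wait for the rollout than to inspect it repeatedly. The sketch below wraps `kubectl rollout status` with a timeout. It assumes the Deployment is named `argocd-server`, matching the Service name used earlier; the helper name `argocd_ready` and the default timeout are hypothetical.

```shell
# Sketch: report whether the ArgoCD server rollout completes within a timeout.
# Assumes a Deployment named argocd-server in the argocd namespace.
# argocd_ready is a hypothetical helper; 120s is an assumed default.
argocd_ready() {
  local timeout="${1:-120s}"
  # kubectl progress goes to stderr so stdout carries only the status word.
  if kubectl rollout status deployment/argocd-server -n argocd --timeout="$timeout" >&2; then
    echo "ready"
  else
    echo "not-ready"
    return 1
  fi
}
```

If `argocd_ready 300s` prints `not-ready`, the `describe` and `logs` commands above are the next place to look.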
# Disable AI Foundry
# Set enable_ai_foundry = false in tfvars
terraform plan -var-file=environments/<env>.tfvars -out=rollback.tfplan
terraform apply rollback.tfplan
./scripts/deploy-full.sh --environment <env> --destroy