Infrastructure change workflow — Terraform plan/apply, Helm diff/upgrade, Kubernetes manifest changes with mandatory approval gates. Applies DevOps role. Safe-by-default with plan review before any mutation.
Safe infrastructure change workflow for Terraform, Helm, and Kubernetes. Every mutation requires explicit user approval after reviewing the plan/diff. Applies Agent(devops-engineer) for all steps.
⚠️ SAFETY: No apply, upgrade, delete, or scale command runs without explicit user APPROVE.
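The safety rule above can be sketched as a small classifier that decides whether a command line must pass the approval gate. This is a minimal illustration, not part of the workflow itself: `needs_approval` is a hypothetical helper, and it assumes the subcommand is the second word of the command (so `kubectl -n <namespace> apply` would need extra handling).

```shell
# Hypothetical helper: returns 0 when the command mutates infrastructure
# and therefore needs an explicit APPROVE, 1 otherwise.
needs_approval() {
  # Dry-run invocations do not mutate anything, so they pass without approval.
  for arg in "$@"; do
    case "$arg" in
      --dry-run|--dry-run=client|--dry-run=server) return 1 ;;
    esac
  done
  # $2 is assumed to be the subcommand, e.g. `terraform apply` or `helm upgrade`.
  case "$2" in
    apply|upgrade|delete|scale) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `needs_approval kubectl apply -f app.yaml` succeeds (approval required), while `needs_approval helm list -n default` does not.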
Read CLAUDE.md (or AGENTS.md) at the project root to identify:
Ask the user (or extract from parent workflow context):
If Risk = HIGH or Environment = production:
⚠️ HIGH-RISK INFRASTRUCTURE CHANGE
Destructive actions require explicit "APPROVE" before execution.
Affected: [list resources/services]
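The high-risk condition above could be checked with something like the following sketch. The function names and the banner layout are illustrative assumptions. Only the condition itself (Risk = HIGH or Environment = production) comes from the workflow.

```shell
# Hypothetical helper: succeeds when the change must show the high-risk banner.
is_high_risk() {
  # $1 = risk level, $2 = environment; both compared case-insensitively.
  risk=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  env_name=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ "$risk" = "HIGH" ] || [ "$env_name" = "production" ]
}

# Hypothetical helper: prints the banner with the affected resources in $1.
print_high_risk_banner() {
  printf '%s\n' "HIGH-RISK INFRASTRUCTURE CHANGE"
  printf '%s\n' 'Destructive actions require explicit "APPROVE" before execution.'
  printf 'Affected: %s\n' "$1"
}
```

A caller might run `is_high_risk "$RISK" "$ENVIRONMENT" && print_high_risk_banner "$AFFECTED"` before any plan is generated.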
Apply Agent(devops-engineer). For production changes, also apply Agent(sre-engineer) for SLO impact assessment. For cloud infrastructure design (landing zones, networking topology, IAM), consult Agent(cloud-architect). For CI/CD pipeline architecture changes, consult Agent(devops-architect).
Before making changes, capture the current state:
// turbo
terraform validate
terraform fmt -check -recursive
Review relevant .tf files and current state:
// turbo
terraform plan -out=tfplan
// turbo
helm list -n <namespace>
helm get values <release-name> -n <namespace>
// turbo
kubectl get deployments,services,ingress -n <namespace>
kubectl get pods -n <namespace> -o wide
Record the current resource counts, versions, replicas, and config values as the baseline.
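One way to keep that baseline is to write the output of the read-only commands above into a directory, so it can be compared after the change. This is a sketch under assumptions: `capture_baseline` and the file names are hypothetical, and the function stops at the first command that fails.

```shell
# Hypothetical helper: snapshot the Helm and Kubernetes state before a change.
# $1 = namespace, $2 = Helm release name, $3 = output directory
capture_baseline() {
  mkdir -p "$3" || return 1
  helm list -n "$1" > "$3/helm-list.txt" || return 1
  helm get values "$2" -n "$1" > "$3/helm-values.yaml" || return 1
  kubectl get deployments,services,ingress -n "$1" > "$3/k8s-resources.txt" || return 1
  kubectl get pods -n "$1" -o wide > "$3/k8s-pods.txt" || return 1
}
```

All four commands only read state, so the snapshot can run without the approval gate.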
Make the infrastructure code changes following Agent(devops-engineer) standards:
For Terraform:
Edit the .tf files as needed
// turbo
terraform fmt -recursive
terraform validate
For Helm:
Edit values.yaml or chart templates
// turbo
helm lint <chart-path>
For raw Kubernetes manifests:
kubectl apply --dry-run=client -f <manifest>
terraform plan -out=tfplan
Present the plan summary:
## Terraform Plan Summary
| Action | Count | Resources |
|---------|-------|-----------|
| Add | X | [list] |
| Change | X | [list] |
| Destroy | X | [list] |
⚠️ DESTROY/REPLACE resources:
- [resource] — [reason]
Data loss risk: [yes/no]
Estimated cost impact: [if applicable]
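The Add/Change/Destroy counts in the table can be pulled from the `Plan:` line that `terraform plan` prints. The sketch below assumes the plan was run with `-no-color` and its text saved or piped in. `plan_counts` and `plan_has_destroy` are hypothetical names. Parsing `terraform show -json tfplan` is likely more robust, because the human-readable line is not a stable interface.

```shell
# Hypothetical helper: reads plan text on stdin and prints
# "add=N change=N destroy=N"; prints nothing if no "Plan:" line is present
# (for example when Terraform reports no changes).
plan_counts() {
  sed -nE 's/.*Plan: ([0-9]+ to import, )?([0-9]+) to add, ([0-9]+) to change, ([0-9]+) to destroy.*/add=\2 change=\3 destroy=\4/p' | tail -n 1
}

# Hypothetical helper: succeeds unless the counts end in "destroy=0".
# Empty or unparseable input counts as risky, so the check fails safe.
plan_has_destroy() {
  case "$1" in
    *"destroy=0") return 1 ;;
    *) return 0 ;;
  esac
}
```

For example, `terraform plan -no-color -out=tfplan | tee plan.txt | plan_counts` prints the counts while keeping the full text for review.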
helm diff upgrade <release> <chart-path> -n <namespace> -f <values-file>
Present the diff summary:
## Helm Diff Summary
| Object | Action | Key Changes |
|--------|--------|-------------|
| Deployment/X | changed | replicas: 1→2, image: v1→v2 |
| Service/Y | added | port 8080 |
| Secret/Z | changed | keys modified |
⚠️ Critical changes:
- [deployment restart expected]
- [secret reference changed]
Downtime risk: [yes/no]
kubectl diff -f <manifest>
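`kubectl diff` reports its result through the exit status: 0 means no differences, 1 means differences were found, and anything greater than 1 means kubectl or the diff tool failed. A small sketch of interpreting it follows. `classify_kubectl_diff` is a hypothetical helper name.

```shell
# Hypothetical helper: map the exit status of `kubectl diff` to a label.
# $1 = exit status
classify_kubectl_diff() {
  case "$1" in
    0) echo "no-changes" ;;
    1) echo "changes" ;;
    *) echo "error" ;;
  esac
}

# Typical use (requires cluster access, so shown as a comment):
#   kubectl diff -f <manifest> > diff.txt; classify_kubectl_diff "$?"
```

Treating status 1 as "changes" rather than as a failure matters in scripts that run with `set -e`, where a non-zero status would otherwise abort the review step.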
⚠️ STOP. Present plan/diff summary and request APPROVE before proceeding to Step 6.
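When the workflow is driven from a script rather than a chat session, the gate could look like this sketch. `require_approval` is a hypothetical helper. It accepts only the exact string `APPROVE` and treats anything else, including end of input, as a refusal.

```shell
# Hypothetical helper: block until the operator types APPROVE.
# $1 = short description of the pending change
require_approval() {
  printf 'Pending change: %s\nType APPROVE to continue: ' "$1" >&2
  IFS= read -r answer || return 1   # EOF or read error means no approval
  [ "$answer" = "APPROVE" ]
}

# Typical use (shown as a comment because it mutates infrastructure):
#   require_approval "terraform apply tfplan" && terraform apply tfplan
```

Matching the exact, case-sensitive word keeps a casual "yes" or "ok" from being read as consent.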
Only after the user explicitly approves:
Terraform:
terraform apply tfplan
Helm:
helm upgrade <release> <chart-path> -n <namespace> -f <values-file>
Kubectl:
kubectl apply -f <manifest>
Rules:
After applying, verify the changes took effect:
// turbo
kubectl get pods -n <namespace> -o wide
kubectl get events -n <namespace> --sort-by='.lastTimestamp' --field-selector type!=Normal
Check:
In production, monitor SLIs for 5–10 minutes after the apply.
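The pod check can be partly automated by filtering the pod listing for anything that is not fully ready. The sketch below assumes the default column order of `kubectl get pods --no-headers` (NAME, READY, STATUS first). `unready_pods` is a hypothetical helper, and it treats `Completed` pods, such as finished Jobs, as healthy.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the name of every pod that is neither fully ready nor Completed.
unready_pods() {
  awk '{
    split($2, r, "/")                       # READY column, e.g. "1/2"
    if ($3 != "Completed" && (r[1] != r[2] || $3 != "Running")) print $1
  }'
}

# Typical use (requires cluster access, so shown as a comment):
#   kubectl get pods -n <namespace> --no-headers | unready_pods
```

Empty output means every pod passed the check. Any printed name is a candidate for `kubectl describe pod` before the change is declared verified.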
Document the rollback before considering the change complete:
Terraform:
# Revert the .tf file changes and re-apply
terraform plan -out=tfplan-rollback
terraform apply tfplan-rollback
Helm:
helm rollback <release> <previous-revision> -n <namespace>
Kubectl:
kubectl rollout undo deployment/<name> -n <namespace>
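To make the rollback easy to record alongside the change, a script could print the matching command for the tool that was used. This is only a sketch. `rollback_command` is a hypothetical helper that echoes the commands listed above without running them, and the Terraform case still assumes the .tf changes were reverted first.

```shell
# Hypothetical helper: print (not run) the rollback command for a tool.
#   terraform                          -> plan and apply the reverted config
#   helm <release> <revision> <ns>     -> helm rollback
#   kubectl <deployment> <ns>          -> kubectl rollout undo
rollback_command() {
  case "$1" in
    terraform) echo "terraform plan -out=tfplan-rollback && terraform apply tfplan-rollback" ;;
    helm)      echo "helm rollback $2 $3 -n $4" ;;
    kubectl)   echo "kubectl rollout undo deployment/$2 -n $3" ;;
    *)         echo "unknown tool: $1" >&2; return 1 ;;
  esac
}
```

Printing instead of executing keeps the rollback itself behind the same approval gate as the original change.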
## Infrastructure Change Summary
- **Change**: [what was changed]
- **Tool**: [Terraform / Helm / kubectl]
- **Environment**: [dev / staging / production]
- **Risk**: [low / medium / high]
- **Plan reviewed**: [yes — summary of add/change/destroy]
- **Applied**: [yes/no — with APPROVE]
- **Verification**: [pass/fail]
- **Rollback plan**: [documented above]
- **Next steps**: [monitoring, follow-up changes]
Agents: Agent(devops-engineer) (primary), Agent(sre-engineer) (review), Agent(cloud-architect) (cloud design review), Agent(devops-architect) (CI/CD pipeline architecture)
Related workflows: /feature-plan (infra work stream), /architecture (cloud architecture design), /deploy-staging, /deploy-production