Migration tooling expertise — Velero backup/restore, Azure Migrate for containers, container registry migration (ECR/GCR/Harbor to ACR), persistent volume data migration, DNS cutover, and validation procedures. Cross-cutting skill used in Analysis (assess data migration needs) and Documentation (migration runbooks). Triggers: "Velero", "Azure Migrate", "registry migration", "data migration", "ACR", "image migration", "backup restore", "PV migration", "DNS cutover", "migration tooling", "migration validation"
Provides migration tooling knowledge for planning and executing the operational aspects of Kubernetes migration — beyond YAML manifest conversion. Covers backup/restore, registry migration, persistent data migration, DNS cutover, and validation.
This skill covers the operational execution of migration — the tasks that happen alongside or after YAML manifest conversion:
| YAML Conversion (other skills) | Migration Tooling (this skill) |
|---|---|
| Manifest transformation | Container image migration |
| API version updates | Persistent volume data migration |
| Annotation/label changes | Backup and disaster recovery |
| Security context adjustments | DNS and traffic cutover |
| | Pre/post-migration validation |
| | Registry migration |
| Source Registry | Target | Migration Method |
|---|---|---|
| Amazon ECR | Azure ACR | `az acr import` (direct, no local pull needed) |
| Google GCR / Artifact Registry | Azure ACR | `az acr import` (direct) |
| Docker Hub | Azure ACR | `az acr import` (direct) |
| Harbor (on-prem) | Azure ACR | Push via `docker tag` + `docker push`, or `skopeo copy` |
| GitLab Container Registry | Azure ACR | `az acr import` or `skopeo copy` |
| Quay.io | Azure ACR | `az acr import` (direct) |
| Self-hosted registry | Azure ACR | `skopeo copy` or `crane copy` (no Docker daemon needed) |
Server-side import — no local Docker daemon, no bandwidth through your machine:
```bash
# Import from the public ECR gallery (no credentials needed; private ECR
# sources require --username/--password with an ECR authorization token)
az acr import \
  --name myacr \
  --source public.ecr.aws/myorg/myapp:v1.2.3 \
  --image myapp:v1.2.3

# Import from GCR
az acr import \
  --name myacr \
  --source gcr.io/my-project/myapp:v1.2.3 \
  --image myapp:v1.2.3

# Import from Docker Hub
az acr import \
  --name myacr \
  --source docker.io/library/nginx:1.25 \
  --image nginx:1.25

# Batch import with tag preservation
for img in app1:v1 app2:v2 app3:v3; do
  az acr import --name myacr --source old-registry.example.com/$img --image $img
done
```
```bash
# Copy a single image without a Docker daemon
skopeo copy \
  docker://harbor.internal.com/project/myapp:v1 \
  docker://myacr.azurecr.io/project/myapp:v1 \
  --src-tls-verify=false \
  --dest-creds "$(az acr credential show -n myacr --query username -o tsv):$(az acr credential show -n myacr --query 'passwords[0].value' -o tsv)"

# Copy all tags for a repository (sync appends the repository name to the
# destination, so the destination is the parent path, not the full repo)
skopeo sync \
  --src docker --dest docker \
  harbor.internal.com/project/myapp \
  myacr.azurecr.io/project

# Copy single image
crane copy old-registry.example.com/myapp:v1 myacr.azurecr.io/myapp:v1

# Copy with platform filtering (linux/amd64 only)
crane copy --platform linux/amd64 old-registry.example.com/myapp:v1 myacr.azurecr.io/myapp:v1
```
After registry migration, update all image references in converted YAML:
| Source Pattern | Target Pattern |
|---|---|
| `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>` | `<acr-name>.azurecr.io/<repo>:<tag>` |
| `gcr.io/<project>/<repo>:<tag>` | `<acr-name>.azurecr.io/<repo>:<tag>` |
| `public.ecr.aws/<alias>/<repo>:<tag>` | `<acr-name>.azurecr.io/<repo>:<tag>` or MCR equivalent |
| `harbor.internal.com/<project>/<repo>:<tag>` | `<acr-name>.azurecr.io/<project>/<repo>:<tag>` |
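The mappings above can be applied mechanically to converted manifests. A minimal sketch using `sed` — the registry hostnames and the `myacr.azurecr.io` login server are placeholders for your own values:

```bash
#!/usr/bin/env bash
# Rewrite source-registry image references to ACR.
# Reads YAML on stdin, writes rewritten YAML on stdout.
ACR="myacr.azurecr.io"   # assumption: your ACR login server

rewrite_images() {
  sed -E \
    -e "s#[0-9]+\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/#${ACR}/#g" \
    -e "s#gcr\.io/[a-z0-9-]+/#${ACR}/#g" \
    -e "s#harbor\.internal\.com/#${ACR}/#g"
}

# Usage: rewrite_images < deployment.yaml > deployment-acr.yaml
```

Note the ECR and GCR rules drop the account/project segment (ACR repositories are flat by default), while the Harbor rule keeps the project path, matching the table above.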
```bash
# Attach ACR to AKS (managed identity-based pull, no imagePullSecret needed)
az aks update --name myaks --resource-group myrg --attach-acr myacr

# Verify
az aks check-acr --name myaks --resource-group myrg --acr myacr.azurecr.io
```
Velero capabilities relevant to migration:

| Capability | Details |
|---|---|
| Cluster backup | Full or filtered backup of K8s resources (namespaces, labels) |
| PV snapshots | CSI snapshots or Restic/Kopia file-level backup for PVs |
| Cross-cluster restore | Restore backups to a different cluster (migration use case) |
| Scheduled backups | Cron-based recurring backups for DR |
| Namespace migration | Backup namespace on source → restore on target with remapping |
Cross-cluster migration workflow:

1. Source cluster: install Velero, configured with Azure Blob storage.
2. Source cluster: back up the namespace(s): `velero backup create app-backup --include-namespaces app --snapshot-volumes`
3. Source cluster: verify the backup completed: `velero backup describe app-backup`
4. Target AKS cluster: install Velero, configured with the same Azure Blob storage.
5. Target AKS cluster: restore the backup: `velero restore create --from-backup app-backup --namespace-mappings old-ns:new-ns`
6. Target AKS cluster: verify the restore: `velero restore describe`
```bash
# Install Velero CLI
# Windows: choco install velero / scoop install velero
# macOS:   brew install velero
# Linux:   download from GitHub releases

# Install Velero on AKS with the Azure plugin
velero install \
  --provider azure \
  --plugins velero/velero-plugin-for-microsoft-azure:v1.10.0 \
  --bucket velero-backups \
  --secret-file ./credentials-velero \
  --backup-location-config \
    resourceGroup=myRG,storageAccount=myStorageAcct,subscriptionId=xxx \
  --snapshot-location-config \
    resourceGroup=myRG,subscriptionId=xxx \
  --use-node-agent   # for file-level PV backup via Kopia

# Backup a namespace (resources + PV snapshots)
velero backup create app-migration \
  --include-namespaces production \
  --snapshot-volumes \
  --wait

# Backup with label selector
velero backup create frontend-backup \
  --selector app=frontend \
  --snapshot-volumes

# Restore to AKS (with namespace remapping)
velero restore create app-restore \
  --from-backup app-migration \
  --namespace-mappings production:production \
  --restore-volumes

# Restore specific resources only
velero restore create db-restore \
  --from-backup app-migration \
  --include-resources persistentvolumeclaims,persistentvolumes \
  --include-namespaces database
```
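Backups and restores are asynchronous, so automation should poll until a terminal phase. A sketch of such a helper — the `VELERO_CMD` indirection is an assumption added so the parsing logic can be exercised without a cluster:

```bash
# Poll a Velero backup until it reaches a terminal phase.
# $1 = backup name; VELERO_CMD can be overridden for testing.
VELERO_CMD="${VELERO_CMD:-velero}"

backup_phase() {
  # `velero backup describe` prints a line like "Phase:  Completed"
  $VELERO_CMD backup describe "$1" | awk '/^Phase:/ {print $2}'
}

wait_for_backup() {
  local name="$1" tries="${2:-30}"
  for _ in $(seq "$tries"); do
    case "$(backup_phase "$name")" in
      Completed)              echo "ok";     return 0 ;;
      Failed|PartiallyFailed) echo "failed"; return 1 ;;
    esac
    sleep 10
  done
  echo "timeout"; return 1
}
```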
Known Velero limitations and workarounds:

| Limitation | Workaround |
|---|---|
| Cross-cloud PV snapshots not supported | Use Restic/Kopia (file-level) backup instead of CSI snapshots |
| StorageClass must exist on target | Pre-create StorageClasses before restore, or use Velero's change-storage-class ConfigMap |
| CRDs must be compatible | Verify CRD versions match between source and target |
| Large PVs slow to back up via file-level | Use Azure Data Box or azcopy for large datasets |
| Cluster-scoped resources need care | Use `--include-cluster-resources=true` selectively |
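The CRD-compatibility caveat can be checked mechanically before restoring: dump `kubectl get crd -o name | sort` from both clusters and compare the lists. A sketch operating on two pre-captured files:

```bash
# Report CRDs present on the source cluster but missing on the target.
# Inputs: two files produced by `kubectl get crd -o name | sort`.
missing_crds() {
  # comm -23 prints lines unique to the first (source) file
  comm -23 "$1" "$2"
}

# Usage:
#   kubectl --context source get crd -o name | sort > source-crds.txt
#   kubectl --context target get crd -o name | sort > target-crds.txt
#   missing_crds source-crds.txt target-crds.txt
```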
Choose a PV data migration strategy by data size:

| Data Size | Strategy | Tool | Downtime |
|---|---|---|---|
| < 100 GB | Velero file-level backup | Velero + Kopia | Minutes |
| 100 GB - 1 TB | Azure Disk snapshot + restore | `az disk`, `az snapshot` | Minutes |
| 1 TB - 10 TB | azcopy between storage accounts | `azcopy` | Hours (can be online) |
| > 10 TB | Azure Data Box | Physical device | Days (offline transfer) |
| Database data | Native dump/restore | `pg_dump`, `mysqldump`, `mongodump` | Application-dependent |
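The size thresholds above can be encoded as a quick triage helper when scripting a migration plan. A sketch; the threshold values are the ones from the table and the strategy labels are illustrative:

```bash
# Suggest a PV data migration strategy for a given size in GB.
pv_strategy() {
  local gb="$1"
  if   [ "$gb" -lt 100 ];   then echo "velero-kopia"    # < 100 GB
  elif [ "$gb" -lt 1024 ];  then echo "disk-snapshot"   # 100 GB - 1 TB
  elif [ "$gb" -lt 10240 ]; then echo "azcopy"          # 1 TB - 10 TB
  else                           echo "data-box"        # > 10 TB
  fi
}
```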
```bash
# Copy Azure Files share to new location
azcopy copy \
  "https://source-storage.file.core.windows.net/share?SAS_TOKEN" \
  "https://target-storage.file.core.windows.net/share?SAS_TOKEN" \
  --recursive

# Copy Azure Blob container
azcopy copy \
  "https://source-storage.blob.core.windows.net/container?SAS_TOKEN" \
  "https://target-storage.blob.core.windows.net/container?SAS_TOKEN" \
  --recursive

# Sync (incremental copy, only changed files)
azcopy sync \
  "https://source.file.core.windows.net/share?SAS" \
  "https://target.file.core.windows.net/share?SAS" \
  --recursive
```
For PVs on source platforms (EBS, GCP PD, vSphere VMDK), copy the data to Azure Files with azcopy or rsync, then mount the share on AKS. For example, stream the data out of a source pod through a jumpbox:

```bash
# Stream /data out of the source pod, extract it on the jumpbox,
# then upload the files to Azure Files
kubectl exec -n app source-pod -- tar czf - /data | \
  ssh jumpbox 'mkdir -p /tmp/pv-data && tar xzf - -C /tmp/pv-data && \
    az storage file upload-batch --destination share --source /tmp/pv-data'
```
```mermaid
flowchart TD
    Pre["Pre-Cutover<br/>Both clusters running<br/>Source handles traffic"]
    DNS["DNS Update<br/>Lower TTL to 60s<br/>24-48h before cutover"]
    Switch["Traffic Switch<br/>Update DNS to AKS<br/>ingress IP / FQDN"]
    Monitor["Monitor<br/>Watch error rates,<br/>latency, 5xx responses"]
    Rollback{"Issues?"}
    Complete["Cutover Complete<br/>Restore normal TTL<br/>Decommission source"]
    Revert["Rollback<br/>Revert DNS to source<br/>Investigate issues"]
    Pre --> DNS --> Switch --> Monitor --> Rollback
    Rollback -- No --> Complete
    Rollback -- Yes --> Revert --> Monitor
```
Cutover timeline:

| Step | Action | Verify |
|---|---|---|
| T-48h | Lower DNS TTL to 60s | `dig domain.com` shows the new TTL |
| T-24h | Smoke test AKS endpoints | All health checks pass |
| T-1h | Final data sync (if stateful) | Data consistency check |
| T-0 | Update DNS records to AKS IPs | `nslookup` resolves to new IPs |
| T+5m | Monitor error rates | < 1% error rate |
| T+30m | Verify all traffic on AKS | Source cluster traffic drops to 0 |
| T+24h | Restore DNS TTL to normal (300-3600s) | Confirmed stable |
| T+7d | Decommission source cluster | Backup retained |
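The T+5m gate needs a concrete error-rate number. A sketch that computes it from access-log status codes; the log format is an assumption (status code in the 9th whitespace-separated field, as in a common nginx combined log):

```bash
# Compute 5xx error rate (percent) from an nginx-style access log on stdin.
# Assumes the HTTP status code is the 9th whitespace-separated field.
error_rate() {
  awk '{ total++; if ($9 ~ /^5/) errors++ }
       END { if (total == 0) print "0.00";
             else printf "%.2f\n", 100 * errors / total }'
}

# Usage sketch (controller name is an assumption):
#   kubectl logs -n ingress-nginx deploy/ingress-nginx-controller | error_rate
```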
```bash
# Update A record to point to AKS ingress
az network dns record-set a update \
  --resource-group dns-rg \
  --zone-name example.com \
  --name app \
  --set aRecords[0].ipv4Address=<AKS_INGRESS_IP>

# Or use CNAME for Azure-managed ingress FQDN
az network dns record-set cname set-record \
  --resource-group dns-rg \
  --zone-name example.com \
  --record-set-name app \
  --cname <AKS_INGRESS_FQDN>
```
Source cluster inventory before migration:

| Check | Command | Purpose |
|---|---|---|
| API versions in use | `kubectl api-resources --verbs=list -o name \| xargs -I{} kubectl get {} -A -o json \| jq '.items[].apiVersion' \| sort \| uniq -c` | Identify deprecated APIs |
| Resource inventory | `kubectl get all -A -o wide` | Full resource listing |
| PV usage | `kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,STATUS:.status.phase` | Storage migration planning |
| Image list | `kubectl get pods -A -o jsonpath='{range .items[*]}{.spec.containers[*].image}{"\n"}{end}' \| sort -u` | Registry migration scope |
| CRD inventory | `kubectl get crd -o name` | CRD compatibility check |
| RBAC audit | `kubectl get clusterrolebindings,rolebindings -A -o wide` | Access control mapping |
| Network policies | `kubectl get networkpolicies -A` | Security posture assessment |
| Resource utilization | `kubectl top nodes && kubectl top pods -A` | Right-sizing baseline |
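The API-version inventory in the first row produces a `count apiVersion` histogram. A small filter can flag versions removed in recent Kubernetes releases; the removal list below is a partial, illustrative sample, so verify against the official deprecation guide:

```bash
# Flag deprecated/removed apiVersions in a "count apiVersion" histogram
# (the output of the inventory command above), read from stdin.
flag_deprecated() {
  # Partial sample of long-removed versions; extend as needed.
  awk '$2 == "extensions/v1beta1" ||
       $2 == "apps/v1beta1" ||
       $2 == "apps/v1beta2" ||
       $2 == "policy/v1beta1" { print $2 }'
}
```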
```bash
# Verify AKS cluster is ready
kubectl get nodes -o wide
kubectl get cs   # component status (deprecated but useful)

# Verify CSI drivers
kubectl get csidriver

# Verify storage classes
kubectl get sc

# Verify ingress controller
kubectl get svc -n ingress-nginx   # or kube-system for AGIC

# Verify ACR connectivity
az aks check-acr --name myaks --resource-group myrg --acr myacr.azurecr.io

# Verify DNS resolution
kubectl run dnstest --image=busybox --rm -it -- nslookup kubernetes.default
```
Post-migration validation checklist:

| Category | Validation | Command / Method |
|---|---|---|
| Pods | All pods Running/Ready | `kubectl get pods -A \| grep -v Running` |
| Services | Endpoints populated | `kubectl get endpoints -A` |
| Ingress | External access works | `curl -I https://app.example.com` |
| Storage | PVCs bound, data intact | `kubectl get pvc -A`, exec into pod to verify data |
| DNS | Internal resolution | `kubectl exec -- nslookup service-name.namespace` |
| Secrets | Mounted correctly | `kubectl exec -- ls /mnt/secrets` |
| RBAC | Access controls work | Test with different service accounts |
| Network policies | Traffic allowed/denied correctly | Deploy test pods, verify connectivity |
| Autoscaling | HPA/VPA/CA responding | Generate load, observe scaling |
| Monitoring | Metrics flowing | Check Container Insights, Grafana dashboards |
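The pods row can be automated. This sketch parses `kubectl get pods -A --no-headers` output, fed on stdin so the logic can be tested without a cluster, and lists pods whose STATUS is neither Running nor Completed:

```bash
# List not-ready pods from `kubectl get pods -A --no-headers` output.
# Columns: NAMESPACE NAME READY STATUS RESTARTS AGE
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 }'
}

# Usage: kubectl get pods -A --no-headers | unhealthy_pods
```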
```bash
# Compare record counts (database workloads)
# Source:
kubectl exec -n db source-db-pod -- psql -c "SELECT COUNT(*) FROM important_table"
# Target:
kubectl exec -n db target-db-pod -- psql -c "SELECT COUNT(*) FROM important_table"

# Compare file checksums (file storage workloads)
# Source:
kubectl exec -n app source-pod -- find /data -type f -exec md5sum {} \; | sort > /tmp/source-checksums.txt
# Target:
kubectl exec -n app target-pod -- find /data -type f -exec md5sum {} \; | sort > /tmp/target-checksums.txt
diff /tmp/source-checksums.txt /tmp/target-checksums.txt
```
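The checksum diff can be wrapped into a pass/fail verdict for a migration report. A sketch comparing the two sorted checksum files produced above:

```bash
# Compare two sorted "md5 path" checksum files; print MATCH or DRIFT
# (DRIFT is followed by the differing paths, one per line).
checksum_verdict() {
  if diff -q "$1" "$2" > /dev/null; then
    echo "MATCH"
  else
    echo "DRIFT"
    # Show only the side marker and the path of each differing line
    diff "$1" "$2" | awk '/^[<>]/ { print $1, $3 }'
  fi
}

# Usage: checksum_verdict /tmp/source-checksums.txt /tmp/target-checksums.txt
```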
Every migration must have a documented rollback procedure.
During source analysis, assess and document:
Include in the analysis report:
Include in the migration report:
- `az acr import` commands executed for each image
- ACR attachment to the cluster (`az aks update --attach-acr`)