Provides systematic debugging approaches for HyperShift hosted-cluster issues. Auto-applies when debugging cluster problems, investigating stuck deletions, or troubleshooting control plane issues.
This skill provides structured debugging workflows for common HyperShift hosted-cluster issues.
This skill automatically applies when:
For provider-specific issues and detailed troubleshooting steps, refer to these subskills:
The main skill below provides provider-agnostic debugging workflows. When you encounter provider-specific issues, consult the relevant subskill for detailed resolution steps.
Namespace conventions used below: the HC namespace is where the HostedCluster resource lives (commonly `default` or `clusters`), and the HCP namespace hosts the control plane (typically `<hc-namespace>-<cluster-name>`, e.g. `clusters-<cluster-name>`).

When a hosted-cluster is stuck in a deleting state, follow this systematic debugging process:
Verify that NodePool deletion is progressing:
# Check NodePool resources in HC namespace
kubectl get nodepool -n <hc-namespace>
# Check CAPI cluster resource status in HCP namespace
kubectl get cluster -n <hcp-namespace> -o yaml
# Check CAPI provider pod logs
kubectl logs -n <hcp-namespace> deployment/capi-provider
# Check CAPI machines status in HCP namespace
kubectl get machines -n <hcp-namespace>
kubectl describe machines -n <hcp-namespace>
# Review HyperShift operator logs for NodePool issues
kubectl logs -n hypershift deployment/operator --tail=100 | grep -i nodepool
kubectl logs -n hypershift deployment/operator --tail=100 | grep -i <cluster-name>
What to look for:
- NodePool resources that remain long after deletion was requested
- CAPI Machines stuck deleting, or provider errors surfaced by `kubectl describe machines`
- Errors in the capi-provider or HyperShift operator logs that reference the NodePool or cluster name
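The NodePool check above can be wrapped in a small polling helper. This is a hypothetical sketch, not part of any HyperShift tooling; it assumes kubectl access to the management cluster:

```shell
# Hypothetical helper: poll until no NodePools remain in the HC namespace,
# or give up after a timeout (default 300s).
wait_for_nodepools_gone() {
  local ns="$1" timeout="${2:-300}" waited=0
  # `-o name` prints one line per NodePool; grep -q . succeeds while any remain
  while kubectl get nodepool -n "$ns" -o name 2>/dev/null | grep -q .; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out: NodePools still present in $ns"
      return 1
    fi
    sleep 10
    waited=$((waited + 10))
  done
  echo "all NodePools deleted in $ns"
}

# Example: wait_for_nodepools_gone clusters 600
```

If the helper times out, move on to inspecting the CAPI Machines and provider logs as shown above.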
Verify that the HCP resource and its pods are being cleaned up:
# Check HCP resource status
kubectl get hostedcontrolplane -n <hcp-namespace> -o yaml
# Check pods in HCP namespace
kubectl get pods -n <hcp-namespace>
# Check for stuck pods
kubectl get pods -n <hcp-namespace> --field-selector=status.phase!=Running
# Review control-plane-operator logs
kubectl logs -n <hcp-namespace> deployment/control-plane-operator --tail=100
What to look for:
- HostedControlPlane conditions reporting deletion or reconciliation errors
- Pods stuck in Terminating or otherwise not Running
- Errors in the control-plane-operator logs
Investigate why the HCP namespace isn't being removed:
# Check namespace status
kubectl get namespace <hcp-namespace> -o yaml
# List all remaining resources in namespace
kubectl api-resources --verbs=list --namespaced -o name | \
xargs -n 1 kubectl get --show-kind --ignore-not-found -n <hcp-namespace>
# Check for resources with finalizers
kubectl get all -n <hcp-namespace> -o json | \
jq '.items[] | select(.metadata.finalizers != null) | {kind: .kind, name: .metadata.name, finalizers: .metadata.finalizers}'
# Review HO logs for namespace cleanup
kubectl logs -n hypershift deployment/operator --tail=100 | grep -i namespace
What to look for:
- The namespace stuck in Terminating, with blocking conditions listed in its status
- Remaining namespaced resources that prevent deletion
- Resources still carrying finalizers
- Namespace cleanup errors in the HyperShift operator logs
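The jq finalizer filter shown above can be exercised offline against a hand-written sample, which is useful for verifying the filter before pointing it at a live cluster (assumes jq is installed; the sample resources below are made up):

```shell
# Made-up resource list: one object with a finalizer, one without
cat <<'EOF' > /tmp/sample-resources.json
{"items": [
  {"kind": "HostedControlPlane", "metadata": {"name": "demo", "finalizers": ["hypershift.openshift.io/finalizer"]}},
  {"kind": "Pod", "metadata": {"name": "etcd-0"}}
]}
EOF

# Same filter as above; only resources that still carry finalizers survive
jq -c '.items[] | select(.metadata.finalizers != null) | {kind: .kind, name: .metadata.name, finalizers: .metadata.finalizers}' /tmp/sample-resources.json
```

Only the HostedControlPlane entry is printed, since the Pod has no finalizers.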
Final check on the HostedCluster resource itself:
# Check HostedCluster status
kubectl get hostedcluster -n <hc-namespace> <cluster-name> -o yaml
# Check HostedCluster finalizers
kubectl get hostedcluster -n <hc-namespace> <cluster-name> -o jsonpath='{.metadata.finalizers}'
# Review HO logs for HostedCluster deletion
kubectl logs -n hypershift deployment/operator --tail=200 | grep -i "hostedcluster.*<cluster-name>"
What to look for:
- Deletion-related conditions in the HostedCluster status
- Finalizers still present on the HostedCluster
- Repeated reconcile errors for this cluster in the operator logs
When investigating cluster deletion issues, pay attention to partially-completed conditions; for example, a status message like `InfrastructureReady: 1 of 2 completed` indicates infrastructure teardown has not finished.

When HyperShift CRDs are accidentally deleted or corrupted (e.g., after running `hypershift destroy infra`), you'll need to reinstall HyperShift.
⚠️ WARNING: HyperShift reinstallation should be a last resort. Only proceed if CRDs are genuinely missing or corrupted and cannot be recovered through other means. Reinstallation will cause downtime and may impact existing hosted-clusters.
Symptom: commands like `kubectl get hostedclusters` fail with: error: the server doesn't have a resource type "hostedclusters"

# Check if critical HyperShift CRDs exist
kubectl get crd hostedclusters.hypershift.openshift.io
kubectl get crd nodepools.hypershift.openshift.io
# Count HyperShift CRDs (should be ~9)
kubectl get crd | grep hypershift | wc -l
# Count CAPI CRDs (should be ~50)
kubectl get crd | grep cluster.x-k8s.io | wc -l
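The two counts above can be folded into a small helper for repeated checks. `check_crd_count` is a hypothetical function, not part of the HyperShift CLI, and the thresholds are the approximate counts noted above:

```shell
# Hypothetical sanity check: warn when the number of CRDs matching a
# pattern falls below an expected floor.
check_crd_count() {
  local pattern="$1" floor="$2" actual
  actual=$(kubectl get crd -o name | grep -c "$pattern")
  if [ "$actual" -lt "$floor" ]; then
    echo "WARN: $actual CRDs match '$pattern' (expected at least $floor)"
    return 1
  fi
  echo "OK: $actual CRDs match '$pattern'"
}

# Example thresholds drawn from the counts above:
# check_crd_count 'hypershift' 9
# check_crd_count 'cluster.x-k8s.io' 50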
🤖 AI Assistant Note: When this scenario is encountered, Claude should guide and suggest the reinstallation steps to the user but NEVER execute the reinstallation commands itself. The user must explicitly run these commands. Provide clear instructions and explanations, but do not use the Bash tool to perform the actual reinstallation.
# You'll need these for reinstallation:
# - OIDC storage provider configuration (provider-specific, see below)
# - Provider credentials (if applicable)
# - Any custom configuration flags used in original installation
Provider-specific parameters:
hypershift install render | kubectl delete -f -
This will:
hypershift install \
[provider-specific-flags] \
--enable-defaulting-webhook true
Add any other flags that were part of your original installation.
Provider-specific installation:
# Check operator is running
kubectl get deploy -n hypershift
kubectl get pods -n hypershift
# Verify CRDs are installed
kubectl get crd | grep hostedcluster
kubectl get crd | grep nodepool
kubectl get crd | grep cluster.x-k8s.io | wc -l
# Test API accessibility
kubectl get hostedclusters -A
# Check operator logs for errors
kubectl logs -n hypershift deployment/operator --tail=50
# Verify controllers are running
kubectl logs -n hypershift deployment/operator --tail=100 | grep "Starting workers"
Expected Results After Reinstallation:
kubectl get hostedclusters -A returns successfully (even if no clusters exist)Important Notes:
# List all finalizers on a resource
kubectl get <resource-type> <name> -n <namespace> -o jsonpath='{.metadata.finalizers}'
# Remove a specific finalizer (use with caution!)
kubectl patch <resource-type> <name> -n <namespace> -p '{"metadata":{"finalizers":null}}' --type=merge
# HyperShift operator logs with context
kubectl logs -n hypershift deployment/operator --tail=500 --timestamps
# Control plane operator logs
kubectl logs -n <hcp-namespace> deployment/control-plane-operator --tail=500 --timestamps
# Follow logs in real-time
kubectl logs -n hypershift deployment/operator -f
# Get events for a specific resource
kubectl describe <resource-type> <name> -n <namespace>
# Get all events in a namespace, sorted by time
kubectl get events -n <namespace> --sort-by='.lastTimestamp'
# Check resource conditions
kubectl get <resource-type> <name> -n <namespace> -o jsonpath='{.status.conditions}' | jq .
# Check specific condition
kubectl get hostedcluster <name> -n <namespace> -o jsonpath='{.status.conditions[?(@.type=="Available")]}'
hypershift-operator/controllers/hostedcluster/control-plane-operator/controllers/api/hypershift/v1beta1/test/e2e/