Cilium CNI operations: kube-proxy replacement, BPF configuration, Gateway API, Ingress Controller, LB-IPAM, L2 Announcements, upgrades, and troubleshooting on K3s/ARM64.
This cluster uses Cilium with full kube-proxy replacement (--disable-kube-proxy on K3s).
Node: Raspberry Pi CM4, ARM64, Ubuntu 24.04.
Helm release: cilium in namespace kube-system.
Chart version pinned in: roles/install-cilium/defaults/main.yml.
kubeProxyReplacement: "true"
k8sServiceHost: "192.168.178.30"
k8sServicePort: "6443"
rollOutCiliumPods: true # hash ConfigMap into pod template → auto rollout
operator.replicas: 1
operator.rollOutPods: true # same for operator
envoy.rollOutPods: true # same for envoy
bpf.masquerade: true # required on ARM64
loadBalancer.acceleration: disabled # no hardware offload on CM4
gatewayAPI.enabled: true
gatewayAPI.externalTrafficPolicy: Cluster # REQUIRED for L2 Announcements (see below)
l2announcements.enabled: true # Cilium handles ARP (MetalLB removed)
l2announcements.leaseDuration: 3s
l2announcements.leaseRenewDeadline: 1s
l2announcements.leaseRetryPeriod: 500ms
k8sClientRateLimit.qps: 32 # sized for L2 announcement leader election
k8sClientRateLimit.burst: 64
ingressController.enabled: true
ingressController.default: true
ingressController.enforceHttps: false # flip to true once cert-manager is up
ingressController.loadbalancerMode: shared
ingressController.service.externalTrafficPolicy: Cluster # REQUIRED for L2 Announcements
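Nested into Helm values-file form, the flat key list above corresponds roughly to the following (a sketch reconstructed from the keys above, not a copy of the pinned roles/install-cilium/defaults/main.yml):

```yaml
kubeProxyReplacement: "true"
k8sServiceHost: "192.168.178.30"
k8sServicePort: "6443"
rollOutCiliumPods: true
operator:
  replicas: 1
  rollOutPods: true
envoy:
  rollOutPods: true
bpf:
  masquerade: true
loadBalancer:
  acceleration: disabled
gatewayAPI:
  enabled: true
  externalTrafficPolicy: Cluster
l2announcements:
  enabled: true
  leaseDuration: 3s
  leaseRenewDeadline: 1s
  leaseRetryPeriod: 500ms
k8sClientRateLimit:
  qps: 32
  burst: 64
ingressController:
  enabled: true
  default: true
  enforceHttps: false
  loadbalancerMode: shared
  service:
    externalTrafficPolicy: Cluster
```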
Always set these three values in the Cilium Helm chart:
rollOutCiliumPods: true
operator.rollOutPods: true
envoy.rollOutPods: true
These inject a hash of cilium-config ConfigMap into pod template annotations.
Any helm upgrade that changes config triggers an automatic rolling restart of
all three components, making wait: true reliable.
Without these flags, helm upgrade updates the ConfigMap but the pods keep
running with stale in-memory config. The result is a silent deadlock with no
obvious error message: the agent waits forever for CRDs that the stale
operator never registers.
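The trigger mechanism itself is simple and can be sketched in shell: the chart renders a checksum of the config into a pod-template annotation (the exact annotation key is not quoted here), so any config change yields a new pod template and a rolling restart. A minimal illustration of why the hash changes:

```shell
# Sketch of the rollout trigger: hash the rendered config the way a Helm
# chart's sha256sum template function does. A changed config yields a
# changed hash, hence a changed pod-template annotation, hence a rollout.
hash_config() {
  sha256sum | awk '{print $1}'
}

old=$(printf 'enable-envoy-config: "false"\n' | hash_config)
new=$(printf 'enable-envoy-config: "true"\n'  | hash_config)

# Different config -> different annotation value -> pods roll.
[ "$old" != "$new" ] && echo "pod template changed"
```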
ciliumenvoyconfigs.cilium.io and ciliumclusterwideenvoyconfigs.cilium.io
are registered by the operator ONLY when enable-envoy-config=true, which
is set automatically when gatewayAPI.enabled=true or
ingressController.enabled=true.
If the operator pod started before those flags were set (stale pod), it will never register the Envoy CRDs. The agent will then hang indefinitely:
Still waiting for Cilium Operator to register CRDs: [ciliumenvoyconfigs.cilium.io ...]
Fix: ensure operator.rollOutPods: true is set so the operator restarts on
every config change. If already stuck, delete the operator pod manually:
kubectl delete pod -n kube-system -l io.cilium/app=operator
# Wait for new pod, then check:
kubectl get crd | grep envoy
If a helm upgrade is interrupted (Ctrl+C, timeout, context canceled), the
release may be left in pending-upgrade state. All subsequent upgrades then
silently hang: no error, no output, they just block indefinitely.
# Diagnose
helm history cilium -n kube-system
# Look for STATUS=pending-upgrade
# Fix: delete the stuck secret (vN = the pending-upgrade revision number)
kubectl delete secret sh.helm.release.v1.cilium.vN -n kube-system
# Verify state is clean (last revision should be "deployed")
helm history cilium -n kube-system
# Then re-run
ansible-playbook playbooks/bootstrap.yml -i inventory/hosts.ini \
--start-at-task "Add Cilium Helm repository"
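The diagnose step can be scripted with a small helper (a sketch assuming the default `helm history` table layout, where the revision number is the first whitespace-separated column):

```shell
# Print the revision number(s) of pending-upgrade entries from
# `helm history` table output (revision is the first field on each row).
pending_rev() {
  awk '/pending-upgrade/ {print $1}'
}

# Usage against the cluster (pipe real output through the helper):
#   rev=$(helm history cilium -n kube-system | pending_rev)
#   [ -n "$rev" ] && \
#     kubectl delete secret "sh.helm.release.v1.cilium.v${rev}" -n kube-system
```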
Cilium 1.19 supports Gateway API v1.4.1 (standard channel).
Prerequisites (all already met in this cluster):
kubeProxyReplacement: true
l7Proxy: true (default)
gatewayAPI.enabled=true is applied
Role order in bootstrap.yml:
install-gateway-api-crds → install-cilium → install-cilium-pools
What gets created automatically:
GatewayClass named cilium (controller: io.cilium/gateway-controller)
LoadBalancer Service per Gateway resource (Cilium LB-IPAM assigns IP from pool)
GatewayClass status meanings:
Unknown → operator/agent not yet running with new config, or no LB-IPAM pool defined
True → fully operational
kubectl get gatewayclass
kubectl get gateway -A
kubectl get httproute -A
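For reference, a minimal Gateway plus HTTPRoute pair bound to the auto-created GatewayClass (names, namespace, hostname, and backend Service are illustrative, not taken from this cluster):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gw          # illustrative name
  namespace: default
spec:
  gatewayClassName: cilium  # the auto-created GatewayClass
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route       # illustrative name
  namespace: default
spec:
  parentRefs:
    - name: example-gw
  hostnames:
    - app.example.internal  # illustrative hostname
  rules:
    - backendRefs:
        - name: example-svc # illustrative backend Service
          port: 80
```

Applying the Gateway is what makes Cilium create the LoadBalancer Service that LB-IPAM then assigns an IP to.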
MetalLB has been removed. Cilium's built-in LB-IPAM handles IP assignment.
LB-IPAM is always compiled in; it activates when the first CiliumLoadBalancerIPPool is created.
The pool lives in the install-cilium-pools role.
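The pool only assigns addresses; a CiliumL2AnnouncementPolicy selects which Services and interfaces get ARP announcements. A minimal sketch (policy name and interface regex are assumptions, not taken from the role):

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-policy   # assumed name
spec:
  loadBalancerIPs: true     # announce LB-IPAM-assigned Service IPs
  interfaces:
    - ^eth[0-9]+            # assumed interface pattern for the CM4 NIC
```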
apiVersion: "cilium.io/v2alpha1"