Describe the bug
Cilium dual-stack upgrades hit a bug that leaves nodes NotReady and causes loss of connectivity in the cluster.
This comes from an upstream Cilium bug where tc filters and BPF programs attached to the interface are not cleaned up when bpf-filter-priority is changed. Currently, single-stack clusters default bpf-filter-priority to 1, but in the upgrade to dual stack we set it to 2 to make room for our health probe BPF program.
After the upgrade from single stack to dual stack, Cilium is reconciled, but the filters from the single-stack state are still attached. This leaves duplicate filters at priorities 1 and 2, which breaks connectivity.
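On an affected node, the duplication can be confirmed with tc. A minimal check, sketched under the assumption that eth0 is the node's primary interface (the interface name is a placeholder and the exact filter names in the output will vary):

# List the BPF filters Cilium attached to the interface; an affected node
# shows cilium filters at both pref 1 and pref 2 on the same hook.
tc filter show dev eth0 ingress
tc filter show dev eth0 egress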
To Reproduce
Steps to reproduce the behavior:
Create a single-stack Cilium cluster
Run az aks update -n <cluster> -g <resource-group> --ip-families ipv4,ipv6 --load-balancer-managed-outbound-ipv6-count <count>
Expected behavior
The upgrade takes place and completes, so new nodes have both IPv4 and IPv6 addresses.
Instead, non-host-network pods may be stuck in Pending or ContainerCreating and nodes fall into a NotReady state (see the checks below).
The NotReady nodes have duplicate Cilium filters at different priorities.
Cluster connectivity is lost.
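A quick way to confirm this failure mode, assuming kubectl access to the cluster (the grep pattern is only an illustrative filter):

# Nodes stuck NotReady after the dual-stack upgrade
kubectl get nodes
# Non-host-network pods stuck in Pending or ContainerCreating
kubectl get pods -A -o wide | grep -E 'Pending|ContainerCreating'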
Environment (please complete the following information):
Cilium v1.14, v1.16 on Kubernetes v1.29, v1.30, v1.31
Additional context
This is a known issue upstream in Cilium as well. A fix is in progress in cilium/cilium#36172.