What happened:
Current major versions of Linux distributions don't support iptables-legacy anymore; nf_tables is used instead (e.g. RHEL 8 or Debian Buster).
Having only nf_tables available leads to a race condition around the order in which kube-proxy and node-local-dns start after a node boots.
Currently node-local-dns supports only iptables-legacy; for nf_tables support an open/blocked issue exists.
kube-proxy supports both iptables modes (see here) and determines during its startup phase which one it should use (see here).
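As a rough illustration of that autodetection (a simplified sketch, not the exact wrapper logic shipped in the kube-proxy image; it assumes both iptables-legacy-save and iptables-nft-save are available, e.g. on Debian Buster or inside a container image that ships both):

```sh
# Simplified sketch of the idea: prefer the backend that already holds more KUBE-* entries.
# The real detection in the kube-proxy image is more involved; this only illustrates why
# pre-existing legacy rules (from node-local-dns) can flip the decision.
legacy_count=$(iptables-legacy-save 2>/dev/null | grep -c 'KUBE-')
nft_count=$(iptables-nft-save 2>/dev/null | grep -c 'KUBE-')
if [ "${legacy_count:-0}" -gt "${nft_count:-0}" ]; then
  echo "would run kube-proxy in iptables-legacy mode"
else
  echo "would run kube-proxy in iptables-nft mode"
fi
```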
This means that if the node-local-dns pod starts first on the node and creates its iptables-legacy rules, kube-proxy finds these legacy rules and starts using the legacy mode too. Unfortunately, kube-proxy uses some chains which the kubelet creates when it starts, e.g. the chain "KUBE-MARK-DROP". Because the OS offers only nf_tables, the kubelet creates these chains with nf_tables and not iptables-legacy.
If kube-proxy then starts in iptables-legacy mode, it tries to write to the kubelet chains and fails, because it cannot find the nf_tables chains:
kubectl -n kube-system logs $(kubectl -n kube-system get pods -o wide | grep proxy | grep cp-0 | awk '{print $1}') -f
W0813 15:17:48.190861 1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
I0813 15:17:48.203720 1 node.go:136] Successfully retrieved node IP: 172.16.10.5
I0813 15:17:48.203751 1 server_others.go:186] Using iptables Proxier.
I0813 15:17:48.205552 1 server.go:583] Version: v1.18.6
I0813 15:17:48.206007 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0813 15:17:48.206845 1 config.go:315] Starting service config controller
I0813 15:17:48.206863 1 shared_informer.go:223] Waiting for caches to sync for service config
I0813 15:17:48.206886 1 config.go:133] Starting endpoints config controller
I0813 15:17:48.206898 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
I0813 15:17:48.307068 1 shared_informer.go:230] Caches are synced for endpoints config
I0813 15:17:48.307209 1 shared_informer.go:230] Caches are synced for service config
E0813 15:17:48.347064 1 proxier.go:1555] Failed to execute iptables-restore: exit status 2 (iptables-restore v1.8.3 (legacy): Couldn't load target `KUBE-MARK-DROP':No such file or directory
Error occurred at line: 84
Try `iptables-restore -h' or 'iptables-restore --help' for more information.
)
I0813 15:17:48.347141 1 proxier.go:825] Sync failed; retrying in 30s
Because of this, kube-proxy enters an endless loop, repeatedly trying to write to the chain KUBE-MARK-DROP, and never creates the iptables rules for ClusterIPs.
On the other hand, if kube-proxy starts before node-local-dns creates its iptables-legacy rules, kube-proxy creates all rules using nf_tables. In this case the chain KUBE-MARK-DROP exists and everything works as expected.
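To check which of the two orderings a node ended up in, one can compare where the kubelet's chain actually lives. A sketch, assuming both backends' save binaries are available on the host (e.g. on Debian Buster, or from a container image that ships both):

```sh
# In the failing ordering, the chain exists only in the nf_tables backend,
# while kube-proxy (stuck in legacy mode) keeps looking for it and fails.
iptables-nft-save -t nat 2>/dev/null | grep 'KUBE-MARK-DROP'      # created by the kubelet via nf_tables
iptables-legacy-save -t nat 2>/dev/null | grep 'KUBE-MARK-DROP'   # empty in the failure case shown above
```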
What is the expected behavior:
Working ClusterIPs after each node start.
How to reproduce the issue:
Reboot nodes to test the race condition.
Alternatively, for a running node, delete the running kube-proxy pod and flush the nf_tables rules on the host (see the sketch below).
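A sketch of the second reproduction path (destructive, test nodes only; it reuses the kubectl pipeline from the log snippet above and assumes the nft CLI is installed on the host):

```sh
# 1. On the host: drop the nf_tables rules, including the kubelet-created KUBE-MARK-DROP chain.
#    (This flushes ALL nf_tables rules on the node - test nodes only.)
nft flush ruleset

# 2. Restart kube-proxy so it re-runs its mode detection while the node-local-dns
#    iptables-legacy rules are still present on the node.
kubectl -n kube-system delete pod \
  $(kubectl -n kube-system get pods -o wide | grep proxy | grep cp-0 | awk '{print $1}')
```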
Anything else we need to know?
The root cause also affects Calico, see here.
I tested running kube-proxy in IPVS mode. Unfortunately, there I hit another blocking issue for Azure (most likely for other public cloud providers as well) regarding Services of type LoadBalancer. Hence, using kube-proxy in IPVS mode is not an option.
Because the root cause is the unpatched node-local-dns pod, I think a good option could be to disable its deployment for now. Maybe introduce node-local-dns as a feature in KubeOne which can be deactivated (e.g. like PodSecurityPolicies)? A rough idea of what that could look like is sketched below.
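For illustration only, such a toggle could mirror the existing podSecurityPolicy feature in the KubeOneCluster manifest; the nodeLocalDNS block below is purely hypothetical and does not exist in KubeOne today:

```yaml
# kubeone.yaml (sketch, v1beta1 layout assumed). The nodeLocalDNS feature is a
# hypothetical proposal; podSecurityPolicy is shown only as the existing pattern to mirror.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
name: demo-cluster
features:
  podSecurityPolicy:
    enable: true        # existing, optional feature
  nodeLocalDNS:         # hypothetical: allow skipping the node-local-dns deployment
    deploy: false
```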
Information about the environment:
KubeOne version: 1.0.0-beta.2
Operating system: RHEL8
Provider you're deploying cluster on: Azure
Operating system you're deploying on: CentOS8