
calico-node fails with KUBE-FIREWALL iptables rule #9752

Open

RonBarkan opened this issue Jan 24, 2025 · 10 comments

@RonBarkan

RonBarkan commented Jan 24, 2025

We created a standalone Kubernetes setup with 2 nodes: a control plane and a worker node, with Calico CNI using VXLANCrossSubnet encapsulation (which AFAIK is the default).

Looks like Kubernetes generates the following iptables rule:

# iptables -L  KUBE-FIREWALL --line-numbers
Chain KUBE-FIREWALL (2 references)
num  target     prot opt source               destination         
1    DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

I have looked at the rule generation, which happens in both kubelet and kube-proxy. It looks like the kube-proxy rule can be disabled, but not the kubelet one.
To disable the kube-proxy rule, the kube-proxy ConfigMap (the data.config.conf field) should be updated to set iptables.localhostNodePorts: false, as sketched below. The rule set by kubelet cannot be disabled, and it looks like it is still there even on the main branch.
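For reference, the relevant fragment of that ConfigMap would look roughly like this (a sketch assuming a kubeadm-style kube-proxy ConfigMap, where the KubeProxyConfiguration lives under the config.conf key):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
iptables:
  # when false, NodePort services are not reachable via 127.0.0.1
  # and kube-proxy does not install its localhost DROP rule
  localhostNodePorts: false

kube-proxy only reads this configuration at startup, so the kube-proxy pods need to be restarted after the change.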

The trouble is that calico-node on the worker node fails when the rule above is present. If we delete the rule, the calico-node pod immediately becomes healthy; when the rule returns, the pod fails again. All other Calico pods are running fine. Note that even if we delete the rule manually, it can make a comeback at some future point.

If I look at the failing calico-node logs, I don't find any ERROR or WARN log lines except for this one, which seems harmless, since it also appears when the pod is healthy:

2025-01-24 11:17:48.766 [WARNING][54] status-reporter/winutils.go 150: Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.

Errors are present in the kubelet log:

E0124 03:19:16.728186   15757 remote_runtime.go:496] "ExecSync cmd from runtime service failed" err="rpc error: 
code = DeadlineExceeded desc = failed to exec in container: timeout 10s exceeded: context deadline exceeded" containerID="6e97549b9b34440016222d70d62d6c4a774be573a477dd4d337772da4f58e9a5" 
cmd=["/bin/calico-node","-felix-ready"]

Describing the calico-node pod shows that the liveness check failed.
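For context, these checks are exec probes on the calico-node container; in an operator-managed install they look roughly like this (a sketch, exact flags depend on the install and on whether BGP is enabled):

readinessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-ready
livenessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-live

Both commands are small clients that query Felix's health endpoint, which listens on localhost:9099 by default.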

Why is this happening and how can I fix it?

Calico version: v3.29.1
Tigera operator: v1.36.2
Kubernetes: v1.29.6
Both nodes run custom Linux kernels. The worker node kernel is based on 5.10.198 (arm64); the control-plane node is Yocto-based, 6.1.82 (amd64).

I noticed the similar #7028, but the errors are not at all the same; in particular, we don't get any errors related to executing Linux binaries.
When the above-mentioned iptables rule is not present, the cluster works as expected AFAIK, including running a demo service.

@caseydavenport
Member

@RonBarkan thanks for raising this! It's not obvious to me what's happening here without a full log. Can you post the full calico/node log so we can take a closer look?

@RonBarkan
Author

Here are two logs from a single node, obtained with kubectl -n calico-system logs <pod>, one of them with --previous:

calico-node-kube-firewall --previous
calico-node-kube-firewall

@RonBarkan
Author

@caseydavenport let me know if you need anything else.

@tomastigera
Contributor

The logs say that

2025-02-03 02:04:53.222 [INFO][69] felix/health.go 206: Health of component changed name="CalculationGraph" newReport="live,ready" oldReport="live,non-ready"
...
2025-02-03 02:04:54.063 [INFO][69] felix/health.go 206: Health of component changed name="InternalDataplaneMainLoop" newReport="live,ready" oldReport="live,non-ready"

So both reporters are healthy, and calico-node should report healthy.

The health probe is AFAIK on localhost:9099 by default. So if kubelet installs a rule that drops all traffic to localhost, that is a problem. That problem is with your k8s installation and not with Calico. You can change the healthHost config option to make calico-node listen on a different address. https://docs.tigera.io/calico/latest/reference/resources/felixconfig#healtHost
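For example, a minimal sketch of such a change on the default FelixConfiguration (apply with calicoctl, or with kubectl if the Calico API server is installed):

apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  # any local address works; 0.0.0.0 makes Felix listen on all interfaces
  healthHost: "0.0.0.0"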

@RonBarkan
Author

  1. This is the default Kubernetes install and the default Calico install. Why does it not work out of the box for the most common case?
  2. We have multiple other pods deployed, including other Calico pods. All are subject to health testing. Why is only one pod affected: calico-node?

@tomastigera reopened this Feb 11, 2025
@tomastigera
Contributor

Calico-node is a host-networked pod. Do you have any other host-networked pods that listen on localhost for their health probe?

Why do you need KUBE-FIREWALL? Calico is a "firewall", and it seems like your k8s firewall is preventing Calico from starting. Having a firewall and then installing another component with the same role, one that the first component collides with and does not allow to start, is not the most common use case. So my suggestion is to turn off the firewall in your k8s install process.

What is your k8s environment? Some public cloud? What is the distro? How did you install k8s?

@RonBarkan
Author

RonBarkan commented Feb 12, 2025

I likely do not have any other host-networked pods. I guess that explains why only calico-node could be impacted, but it does not explain why the readiness probe is sent from a source other than localhost (!127.0.0.0/8, see the original post). The rule should not drop localhost -> localhost requests. Note that the probe is sent by /bin/calico-node -felix-ready, which for some reason uses a non-localhost source socket. This is the core of the issue.
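One way I can confirm the probe traffic is actually hitting that DROP rule is to watch its packet counters while the probe fails:

# iptables -L KUBE-FIREWALL -v -n --line-numbers

If the pkts counter on the DROP rule climbs in step with the probe failures, that rule is what is dropping them.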

To be clear, the KUBE-FIREWALL rule is the out-of-the-box behavior of Kubernetes, as I mentioned in the original post.
Here is how kube-proxy sets up this iptables rule, and kubelet does the same here. It looks like kubelet does this unconditionally when iptables is used (maybe differently if not), and the function in the link hangs off the main kubelet function. If so, the rule is generated by default for everyone and is not something specific to my setup.
Worse, if I manually delete the auto-generated rule, it can come back at any time due to kubelet or kube-proxy. I believe this is done automatically for security reasons. If there's a good way to disable it, I'd be very interested to know how.

We deployed Kubernetes to "bare metal / On-Prem", using the official instructions, with kubeadm init etc.

I don't have visibility into why calico-node -felix-ready uses a source other than localhost, which trips the Kubernetes KUBE-FIREWALL rule.

Could it be that this is a defect in the prober code, or perhaps the default configuration is causing this and I could remedy this with a config change?

@tomastigera
Contributor

Do you have conntrack output from that machine that would show which IP calico uses as a source?
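For example, something like this on the affected node while the probe is failing should show the source address being used (assuming conntrack-tools is installed):

# conntrack -E -p tcp --dport 9099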

Note that kube-proxy emits the rule iff NodePorts on localhost are enabled, here

I just installed a fresh kubeadm cluster (1.30) with Calico v3.29.2, and everything works out of the box, including with the firewall rule present.

Connections to 9099 are all from the localhost IP

[root@tomas-bz-jzwh-kadm-ms /]# conntrack -L | grep 9099
conntrack v1.4.4 (conntrack-tools): 197 flow entries have been shown.
tcp      6 1 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=52416 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=52416 [ASSURED] mark=0 use=1
tcp      6 61 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=55640 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=55640 [ASSURED] mark=0 use=1
tcp      6 31 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=35766 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=35766 [ASSURED] mark=0 use=1
tcp      6 31 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=35756 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=35756 [ASSURED] mark=0 use=1
tcp      6 91 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=54172 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=54172 [ASSURED] mark=0 use=1
tcp      6 91 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=54162 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=54162 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=35914 dport=9099 src=127.0.0.1 dst=127.0.0.1 sport=9099 dport=35914 [ASSURED] mark=0 use=1

> Note that the probe is sent by /bin/calico-node -felix-ready, which for some reason uses a non-localhost source socket. This is the core of the issue.

Calico does not force any source IP; that is selected by the system and depends on your routing. And even though it looks like a rational choice to select 127.0.0.1 as the source IP, that is not mandated by anything, and any local IP could work equally well.

If, for some reason that cannot be fixed on your side, your cluster/node setup requires Calico to enforce the source IP, you could either use the healthHost option and routing to work around the firewall rule, or we can discuss why and how to add an option to enforce it.

@RonBarkan
Author

Thank you for confirming that the KUBE-FIREWALL rule is present for you too, and for showing the flow from source 127.0.0.1 to the calico-node pod.

There was no deliberate setup, and I don't even know how to make calico-node -felix-ready choose one source IP or another. Perhaps it is falling back from 127.0.0.1 and picking another IP for some unknown reason.

I will try getting a conntrack log next week, when recreating the issue.

Also, would using 0.0.0.0 as the healthHost be expected to work in this scenario?

@tomastigera
Contributor

> Also, would using 0.0.0.0 as the healthHost be expected to work in this scenario?

That means that Calico will accept a connection to any of the local IPs; that is, it will accept connections to 127.0.0.1 as well as, say, 192.168.0.1 if that is the address of one of the local interfaces. You can use this option to limit what Calico accepts.
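For example, a minimal way to set it, assuming the FelixConfiguration resource named default is directly editable with kubectl in your install:

kubectl patch felixconfiguration default --type merge -p '{"spec":{"healthHost":"0.0.0.0"}}'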
