
SUSE OS Security Patch and Subsequent RKE2 Calico Issues #9768

Open
serhatduzenkd opened this issue Jan 29, 2025 · 1 comment

Comments

@serhatduzenkd

In one of our environments, we have a cluster running Kubernetes v1.24.13+rke2r1.

The nodes run SUSE Linux Enterprise Server 15 SP5.

Routine OS security packages were updated, and the servers were then restarted, starting with the worker nodes. During the most recent maintenance, after the servers were restarted:

The RKE2 Calico pods went down.
The cluster uses rke2-calico v3.25.0 as the CNI. All pods began receiving access errors and could no longer reach the Kubernetes service IP 10.43.0.1. The Calico pods and other pods started logging the following error:
"""
Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0126 13:44:56.288873 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2025-01-26 13:44:56.289 [INFO][1] main.go 131: Ensuring Calico datastore is initialized
2025-01-26 13:45:26.289 [ERROR][1] client.go 290: Error getting cluster information config ClusterInformation="default" error=Get "https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.43.0.1:443: i/o timeout
2025-01-26 13:45:26.289 [INFO][1] main.go 138: Failed to initialize datastore error=Get "https://10.43.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.43.0.1:443: i/o timeout
"""
SELinux and firewalld were checked on the servers, and no changes had been made to the network settings. The system administrators have confirmed that no blocking changes were made on the network.
If anyone has information on this issue or has encountered a similar situation, I would appreciate your help.
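For reference, the checks described above can be reproduced from a shell on an affected node with something like the following. This is a sketch, not an authoritative procedure: the service IP (10.43.0.1) is from the logs in this issue, and the local API endpoint on port 6443 assumes the default RKE2 listener; both must be run on a cluster node.

```shell
# Confirm the state of the host security layers the reporter checked
getenforce 2>/dev/null || echo "SELinux tools not installed"
systemctl is-active firewalld 2>/dev/null || echo "firewalld not active"

# Try to reach the in-cluster Kubernetes API service IP from the node
# (this is the address the Calico pods time out against)
curl -k --connect-timeout 5 https://10.43.0.1:443/version \
  || echo "service IP 10.43.0.1 unreachable from this node"

# Compare with the API server's real endpoint (default RKE2 port 6443;
# if this works while the service IP does not, the problem is in NAT/routing)
curl -k --connect-timeout 5 https://127.0.0.1:6443/version \
  || echo "local API endpoint unreachable"
```

If the real endpoint answers but the service IP times out, that points at the kube-proxy/CNI translation path rather than at the API server itself.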

@tomastigera
Contributor

Have you investigated connectivity between your nodes? 10.43.0.1:443 is the Kubernetes API server service IP; do you see those connections being properly translated (NATed) to the real IPs of your servers? Use tcpdump to investigate where your connections end up and where they break.
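A minimal way to run the check described above, as root on an affected worker node. This is a hedged sketch: the service IP is taken from this issue, and the `KUBE-SERVICES` chain assumes kube-proxy is running in its default iptables mode.

```shell
# Watch traffic destined for the service IP; healthy traffic should be
# DNATed to a real API-server node IP rather than sent out unchanged
tcpdump -ni any host 10.43.0.1 and port 443

# Inspect the kube-proxy NAT rules that should perform that translation
iptables -t nat -L KUBE-SERVICES -n | grep 10.43.0.1

# Check existing connection-tracking entries toward the service IP
conntrack -L -d 10.43.0.1 2>/dev/null | head
```

If the `KUBE-SERVICES` rule for 10.43.0.1 is missing, or tcpdump shows SYNs leaving untranslated, the NAT path broke during the OS update rather than the network itself.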
