Debian Testing worker nodes cannot reach out to the network #2157
Comments
Hello, @cro! Thank you for creating the issue. Could you post the output of the
Just to be clear,
Thanks for providing the information. iptables v1.8.8 (nf_tables) has some discrepancies with the previous version and doesn't work correctly with Kubernetes. You can find more info here: We are working on a release that brings our own iptables binary for the kubelet and will try to ship it as soon as possible. For now, you can work around the issue by downgrading iptables to v1.8.7.
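As a rough sketch, that downgrade on a Debian worker might look like the commands below; the exact 1.8.7 package revision is an assumption, so check what `apt-cache madison` actually offers first.

```sh
# Confirm which iptables build and backend the worker node is running;
# v1.8.8 with the nf_tables backend is the problematic combination.
iptables --version

# See which iptables package versions Debian can actually install.
apt-cache madison iptables

# Downgrade to a 1.8.7 build and hold it so apt does not upgrade it again.
# NOTE: "1.8.7-1" is a placeholder package revision, not a verified one.
sudo apt-get install --allow-downgrades iptables=1.8.7-1
sudo apt-mark hold iptables
```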
@cro If possible, could you test on the same hosts with the 1.24.5-rc.1+k0s.0 release from yesterday? It contains a fix for this iptables incompatibility issue.
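A minimal sketch of pinning that release candidate in a k0sctl config and re-applying; the cluster name and host addresses are placeholders, and `spec.k0s.version` is the only field that matters here.

```sh
# Pin the release candidate in k0sctl.yaml (name and hosts are placeholders),
# then re-run k0sctl to upgrade the existing cluster in place.
cat > k0sctl.yaml <<'EOF'
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-test            # placeholder cluster name
spec:
  k0s:
    version: 1.24.5-rc.1+k0s.0
  hosts:
    - role: controller
      ssh:
        address: 10.0.0.1   # placeholder controller address
    - role: worker
      ssh:
        address: 10.0.0.11  # placeholder worker address
EOF

k0sctl apply --config k0sctl.yaml
```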
Deployment was successful for the 3 control and 2 worker nodes. I have a different issue now:
Some possibly relevant details:
Upgrading again to the other RC available, v1.25.1-rc.1+k0s.0, did not correct this issue.
Furthermore, my other test cluster running Alpine 3.16 seems to deploy OK, but see this:
Exec and logs failing is usually a symptom of other issues in the setup, so I believe the RCs did fix the initial iptables-related issue. Failures like this usually point to broken connections in the konnectivity-agent services. In this case, as you have multiple controllers (in pure controller mode), it seems the agents cannot establish connections with ALL the controllers. As the docs say, an HA control plane REQUIRES a load balancer with a single address in front of the controllers so the agents can reach all of them. We know this is a PITA requirement to have, but it stems from the architectural decisions in upstream konnectivity on how it establishes HA comms tunnels. The k0s team is working on a solution to lift this requirement for most deployment scenarios, but unfortunately it did not make it into the 1.25 releases yet.
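To illustrate what that looks like in practice, here is a hedged sketch of the relevant k0s config fragment, assuming a load balancer in front of the controllers at the placeholder address 192.168.1.100 (in k0sctl this fragment would sit under `spec.k0s.config`):

```sh
# Write a fragment of the k0s ClusterConfig that points workers and the
# konnectivity agents at the load balancer's single address instead of at
# individual controllers. The IP below is a placeholder.
cat > k0s-api-lb.yaml <<'EOF'
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    externalAddress: 192.168.1.100   # load balancer address (placeholder)
    sans:
      - 192.168.1.100                # include the LB address in the API cert SANs
EOF
```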
I already had a load balancer in place from my previous experiments with k3s, so I repurposed it this morning. As you deduced, this fixed my remaining issues.
Guilty as charged! 😁 In my defense, I did read the docs, but missed the part about the LB being required. Closing this ticket as the RCs fix the original issue. Thanks so much for your help and responsiveness.
Before creating an issue, make sure you've checked the following:
Platform
Version
v1.24.4+k0s.0
Sysinfo
`k0s sysinfo`
What happened?
I deployed k0s to 3 control nodes and 2 worker nodes via k0sctl. The worker nodes lost network connectivity at deployment time.
Steps to reproduce
Expected behavior
k0sctl should deploy functioning worker nodes.
Actual behavior
After deployment, `k0s kubectl get nodes` shows two worker nodes in NotReady status. I cannot ssh to the worker nodes anymore. If I try `iptables --flush` on the worker nodes and reboot them, `get nodes` will show them for a second or two, but they are unable to pull any container images.

Screenshots and logs
No response
Additional context
I note there is an open PR dealing with iptables-nft. I'm not sure if this is the actual problem. I can restore network access by going to the console of the worker nodes and running `iptables --flush`, but of course that's not a workable solution.
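For anyone debugging the same symptom, a small sketch of capturing the broken rule set from the worker console before flushing it, so the state can be attached to a report (file paths are arbitrary):

```sh
# Save the rules that are blocking traffic before wiping them, so the broken
# state can still be inspected or attached to the issue afterwards.
iptables-save > /root/iptables-broken.rules
nft list ruleset > /root/nft-broken.rules   # relevant when the nf_tables backend is in use

# Temporary workaround described above: clear the rules to restore connectivity.
iptables --flush
```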