kube-router does not work with iptables 1.8.8 (nf_tables) on host #1370
So yeah, as pointed out by @BenTheElder in the kube-proxy bug, the problem is this code here that basically does a full iptables-save, edits the output, and then feeds the whole thing back through iptables-restore. The "right" way to do this is to use iptables-restore --noflush with only the chains you own: e.g., if the table contains rules from other components and you want to delete one of your own rules, you write out just your own chains with that rule removed and pass that to iptables-restore --noflush (which means you didn't even need most of the iptables-save output).
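For illustration, here is a minimal sketch of that approach (the chain name KUBE-ROUTER-FORWARD and the rule are hypothetical, not kube-router's actual rules): the restore payload declares and populates only the chains the component owns, and --noflush keeps iptables-restore from rewriting anything else in the table.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

func main() {
	// Build a restore payload that mentions only our own chains.
	// KUBE-ROUTER-FORWARD is a hypothetical chain name used for illustration.
	var payload bytes.Buffer
	payload.WriteString("*filter\n")
	payload.WriteString(":KUBE-ROUTER-FORWARD - [0:0]\n")
	payload.WriteString("-A KUBE-ROUTER-FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT\n")
	payload.WriteString("COMMIT\n")

	// --noflush: the rest of the filter table (other components' chains and
	// rules) is left alone instead of being rewritten from a full
	// iptables-save dump. -w waits for the xtables lock.
	cmd := exec.Command("iptables-restore", "--noflush", "-w")
	cmd.Stdin = &payload
	if out, err := cmd.CombinedOutput(); err != nil {
		fmt.Printf("iptables-restore failed: %v\n%s", err, out)
	}
}
```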
Hey @danwinship and @jnummelin! Thanks for doing a lot of the leg work on this one.

In response to @danwinship's comment, it should be noted that the logic Dan points out really only accounts for a very small use-case, essentially when people invoke kube-router's cleanup functionality. While it is possible that this may introduce a race condition, it seems unlikely, and since we introduced this approach it hasn't been something users have reported problems with.

I would also argue that more control loops that interact with iptables should operate on the basis of state, not events. Meaning, if they miss a single sync, or if somehow that sync is ineffectual, then the next time the loop runs it should attempt to correct this by applying state again (a rough sketch of this pattern follows below). This is what kube-router does, and it has been the only effective way that we've found to work with iptables / ipsets / ipvs consistently when you consider that these resources are system-wide and you can never guarantee that your app is the only app changing them.

Still, I can't deny that it might happen. The current approach is pretty blasé when considering other elements on the system that might try to interact with the filter table. While that hasn't been a problem that users have noticed as of yet, I suppose it could happen, and when it does happen it would probably lead to a bug that is pretty difficult to track down. In response to this, I'll probably create a new issue that tracks upgrading kube-router to restore only its own chains, along the lines Dan describes.

However, I think that most of the above is just treating a symptom and not the root cause. The real issue here is that upstream isn't keeping the iptables-save / iptables-restore output compatible between minor versions. The best paths forward that I see here are:
@danwinship / @jnummelin let me know what you think!
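As a side note on the "state, not events" point above, here is a minimal, hypothetical sketch of such a level-triggered loop (the sync function body and interval are illustrative, not kube-router's actual implementation): every pass re-applies the full desired state, so a missed or clobbered sync is corrected on the next pass.

```go
package main

import (
	"log"
	"time"
)

// syncDesiredState re-applies the complete desired iptables/ipset/ipvs state.
// It must be idempotent: running it twice in a row has the same effect as
// running it once, which is what lets the loop recover from missed syncs or
// from another process having changed the shared resources in the meantime.
func syncDesiredState() error {
	// ... compute desired rules and apply them ...
	return nil
}

func main() {
	const resyncPeriod = 30 * time.Second // illustrative value

	ticker := time.NewTicker(resyncPeriod)
	defer ticker.Stop()

	for {
		// Failures are logged rather than treated as fatal; the next tick
		// reconciles from the desired state again.
		if err := syncDesiredState(); err != nil {
			log.Printf("sync failed, will retry on the next pass: %v", err)
		}
		<-ticker.C
	}
}
```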
What I was trying to point out is that kube-proxy doesn't have to start its update during the minuscule space between the operations; if it starts any time between when kube-router starts its iptables-save and when kube-router starts its iptables-restore, then it will get the lock before kube-router's iptables-restore does.
Fortunately, it seems like
Ah, no, it wouldn't be actually. I didn't talk about inserting new rules and stuff in my example above because I had only seen the cleanup code, not the sync code. But you can pass both additions and deletions to iptables-restore in the same input.
So, write out all the old/modified/new rules for your own chains, and leave everyone else's chains out of the payload entirely.
Fair enough. Although, to be more specific, most of the sync time is taken up by work other than the save and restore calls themselves. Still, this is not an argument for not changing it. I agree that we shouldn't be touching rules that are not ours where we can help it. I was more just trying to point out that this rare race condition is not what is forcing this issue right now and is unlikely to force it in the future.
I'm not sure that this is wholly possible. It's true that it is for 99% of the rules that kube-router works with, where they are our own rules in our own chains. However, we do still have to operate on common chains like INPUT, FORWARD, and OUTPUT, which other components also write to.
I'll admit that this is a better approach than what I was considering, which was trying to manipulate rules in place with individual iptables commands. I still think, though, that at the end of the day it is important that we are able to detect when there is an incompatibility between iptables rules that are written in one version of the application and ones that are written in another version. For now, this was reserved to just a single module.
ah, yeah. In kube-proxy we do a handful of individual iptables calls for the jump rules in the shared built-in chains, and only use iptables-restore for the contents of our own chains.
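To make that split concrete, here is a hedged sketch (the chain and rule names are hypothetical, and this is not kube-proxy's or kube-router's actual code) of handling a shared built-in chain with individual iptables calls: check for the jump rule with -C and append it with -A only if it is missing, while everything inside the component's own chains would go through iptables-restore --noflush as sketched earlier.

```go
package main

import (
	"log"
	"os/exec"
)

// ensureJumpRule makes sure a single jump rule exists in a shared built-in
// chain (e.g. FORWARD) without rewriting the rest of the chain. iptables -C
// exits non-zero when the rule is absent, in which case we append it.
func ensureJumpRule(table, chain string, ruleSpec ...string) error {
	checkArgs := append([]string{"-w", "-t", table, "-C", chain}, ruleSpec...)
	if err := exec.Command("iptables", checkArgs...).Run(); err == nil {
		return nil // rule already present, nothing to do
	}
	appendArgs := append([]string{"-w", "-t", table, "-A", chain}, ruleSpec...)
	return exec.Command("iptables", appendArgs...).Run()
}

func main() {
	// Hypothetical jump from the shared FORWARD chain into our own chain.
	if err := ensureJumpRule("filter", "FORWARD", "-j", "KUBE-ROUTER-FORWARD"); err != nil {
		log.Fatalf("failed to ensure jump rule: %v", err)
	}
}
```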
@jnummelin - Release v1.5.2 has been released and contains an update to the Alpine base image with newer iptables user-space tooling. I've also added new documentation to the requirements section of the user-guide to help advise users about keeping user-space tooling versions in sync. Additionally, we opened #1372 to track changing the iptables-restore logic to restore only kube-router's own chains, as discussed above.

That being said, I think there is still potential for problems at some point in the future with conflicting user-spaces between the kube-router container and other containers / the host's user-space tooling. Unfortunately, I don't believe there is anything further that kube-router can do (besides what we've already done above) to further reduce the likelihood of this occurring. I'll keep monitoring the upstream issue to see if they are able to give us any more information in the future.

Thanks @danwinship / @jnummelin for tracking this down! For now, I'm going to close out this issue.
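On the "keep user-space tooling versions in sync" advice, here is a hedged sketch of one way an operator or init script could surface a mismatch (this assumes the container can reach the host's mount namespace via nsenter, e.g. with hostPID and sufficient privileges, and it is not something kube-router itself ships): compare the iptables version bundled in the container with the one on the host and warn when they differ.

```go
package main

import (
	"log"
	"os/exec"
	"strings"
)

// iptablesVersion runs the given command line and returns the first line of
// its output, e.g. "iptables v1.8.7 (nf_tables)".
func iptablesVersion(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(strings.SplitN(string(out), "\n", 2)[0]), nil
}

func main() {
	// Version of the iptables user-space bundled in this container.
	containerVer, err := iptablesVersion("iptables", "--version")
	if err != nil {
		log.Fatalf("could not get container iptables version: %v", err)
	}

	// Version of the host's iptables, reached through PID 1's mount namespace.
	// Requires hostPID and enough privilege for nsenter; purely illustrative.
	hostVer, err := iptablesVersion("nsenter", "-t", "1", "-m", "--", "iptables", "--version")
	if err != nil {
		log.Fatalf("could not get host iptables version: %v", err)
	}

	if containerVer != hostVer {
		log.Printf("WARNING: iptables user-space mismatch: container has %q, host has %q; "+
			"save/restore output may not round-trip cleanly", containerVer, hostVer)
	}
}
```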
What happened?
Running kubelet on a host with iptables 1.8.8 (nf_tables mode) does not work because the kube-router image uses iptables 1.8.7. kube-proxy's DROP rule for marked packets (-m mark --mark 0x8000/0x8000) ends up being replaced with a DROP rule that has lost the mark match.
This leads to the network no longer working.
The problem is that iptables-save from iptables 1.8.7 does not correctly handle iptables rules created with iptables 1.8.8 (nf_tables).
If I inspect the rules manually on the host (using iptables 1.8.8), the rule is shown with its -m mark --mark 0x8000/0x8000 match intact. If I then use nsenter to enter the kube-router pod and do the same, the -m mark --mark 0x8000/0x8000 match is lost, and the rule would drop all packets, not only the marked ones.
Now, since kube-router's NPC code works by using iptables-save and iptables-restore, the end result is that the rules are broken during the restore and thus all networking on the host goes bust.
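As a quick, hypothetical diagnostic for this specific symptom (not part of kube-router), one can run iptables-save from inside the pod and check whether the mark match described above still appears: if a DROP rule shows up without --mark 0x8000/0x8000, the bundled iptables is mis-reading the host's rules.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	// Dump the filter table with whatever iptables-save is on PATH
	// (inside the kube-router pod this would be the bundled 1.8.7 binary).
	out, err := exec.Command("iptables-save", "-t", "filter").Output()
	if err != nil {
		fmt.Printf("iptables-save failed: %v\n", err)
		return
	}

	for _, line := range strings.Split(string(out), "\n") {
		if !strings.Contains(line, "-j DROP") {
			continue
		}
		// In the failure described in this issue, the DROP rule shows up here
		// without its -m mark --mark 0x8000/0x8000 match.
		if strings.Contains(line, "--mark 0x8000/0x8000") {
			fmt.Printf("mark match intact: %s\n", line)
		} else {
			fmt.Printf("DROP rule without a mark match (check it): %s\n", line)
		}
	}
}
```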
Possible workarounds:
What did you expect to happen?
The network continues to work regardless of the version of iptables installed on the host.
How can we reproduce the behavior you experienced?
Steps to reproduce the behavior:
Screenshots / Architecture Diagrams / Network Topologies
System Information (please complete the following information):
- Kube-Router Version (kube-router --version): 1.5.1
- Kube-Router Parameters: --run-router=true --run-firewall=true --run-service-proxy=false --bgp-graceful-restart=true
- Kubernetes Version (kubectl version): 1.25, 1.24, 1.23; I believe the kube version is quite irrelevant here

Logs, other output, metrics
Please provide logs, other kinds of output, or observed metrics here.
Additional context
At first we thought this was an issue with kube-proxy, and hence we created an issue in the k8s repo too, which has some discussion on this: kubernetes/kubernetes#112477
k8s folks created an issue on netfilter tracker: https://bugzilla.netfilter.org/show_bug.cgi?id=1632
The kube-router Slack channel also has an open discussion on this: https://kubernetes.slack.com/archives/C8DCQGTSB/p1663231184884829