Workaround dns kernel bug #95
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aojea The full list of commands accepted by this bot can be found here. The pull request process is described here
fe7dc9d to fafafae
I tested this locally and it works, thanks @aojea!
Let's see if @danwinship wants to take a look; otherwise I will merge at the end of the day so we can have it in kind.
Just noting a couple typos in comments. Thanks again for all your work on this!
where's the regression test? 🙂
Do you mean in kubernetes/kubernetes? This showed up in this e2e #12
This was fixed in kernel 6.12 (commit 8af79d3edb5f). However, users need a workaround to avoid hitting this bug on existing kernels. Add a feature flag, enabled by default, that implements the workaround: DNS packets are processed on the prerouting hook, after DNAT happens, so that the resolved IPs of the DNS server are available, and they are not processed in the postrouting hook.
Add a workaround for the netfilter conntrack bug https://bugzilla.netfilter.org/show_bug.cgi?id=1766, fixed by torvalds/linux@8af79d3.
Since the race is caused by two packets with the same tuple being DNATed at the same time by different CPUs, it especially impacts DNS resolvers that send A and AAAA requests in parallel (this seems to be the default glibc behavior), and it is very visible to users, who notice DNS latency.
Instead of processing all packets on the POSTROUTING hook, special-case DNS packets and process them in PREROUTING, after DNAT happens.
Service implementations that don't rely on conntrack can disable this behavior by setting the flag to false.
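The hook selection described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the flag name `dnsWorkaround`, the `packet` type, and the rule that DNS means UDP with destination port 53 are all hypothetical.

```go
package main

import "fmt"

// hook names the netfilter hook on which a packet is evaluated.
type hook string

const (
	hookPrerouting  hook = "prerouting"
	hookPostrouting hook = "postrouting"
)

// packet holds only the fields this sketch needs (hypothetical type).
type packet struct {
	Proto   string
	DstPort uint16
}

// isDNS assumes DNS traffic is UDP to destination port 53.
// The real matching rule may differ.
func isDNS(p packet) bool {
	return p.Proto == "udp" && p.DstPort == 53
}

// hookFor returns the hook that should process the packet. With the
// workaround enabled, DNS packets are handled in prerouting (after
// DNAT) and skipped in postrouting; everything else stays in postrouting.
func hookFor(p packet, dnsWorkaround bool) hook {
	if dnsWorkaround && isDNS(p) {
		return hookPrerouting
	}
	return hookPostrouting
}

func main() {
	dns := packet{Proto: "udp", DstPort: 53}
	web := packet{Proto: "tcp", DstPort: 443}
	fmt.Println(hookFor(dns, true), hookFor(web, true), hookFor(dns, false))
}
```

Setting the flag to false sends DNS back through the same path as all other traffic, which matches the opt-out described for Service implementations that do not rely on conntrack.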