Fix DNS latency of 5s when using iptables forward #62764
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: xiaoxubeii. Assign the PR to them by writing … The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing …
/ok-to-test
/assign @danwinship
/unassign danwinship
/assign mrhohn
/retest
@MrHohn for review and approval : )
Would be great if we could have a cluster-wide knob for tweaking this defaultDNSOptions so users can apply this workaround through it. I believe this might help mitigate the packet-dropping bug for pod DNS resolution, but whether there would be any other side effects is unclear to me, and that's why I'm hesitant. Another pod-wide option would be something like below (similar to what you posted on weaveworks/weave#3287 (comment)):
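A minimal sketch of such a pod-wide `dnsConfig` (the pod name and image are illustrative; the `dnsConfig.options` entry is the point):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-workaround-example  # illustrative name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
  dnsConfig:
    options:
      # Tell glibc to retry the second of the parallel A/AAAA queries
      # over a fresh socket instead of waiting out the 5s timeout.
      - name: single-request-reopen
```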
The given options will be merged with the pre-set ones (e.g. …)
I would just like to add here that the …
Here is the workaround we are about to use: weaveworks/weave#3287 (comment)
@MrHohn OK, that's a fair compromise, I will close the PR : )
/close
Since we run various Helm charts, not every pod we run is under our control, so we cannot add a custom dnsConfig specifying single-request-reopen to each of them. A custom kubelet flag to enable this would help, but I think it should be enabled by kubelet by default. From my understanding, single-request-reopen sounds pretty safe, since it only adds a fallback to the existing behavior: retrying over a new socket. If the underlying library doesn't support it, as on Alpine, it will just be ignored. Could this PR be reopened anyway?
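For context, `single-request-reopen` is a glibc resolver option read from `/etc/resolv.conf`. A hedged sketch of what a pod's resolver configuration might look like with the option applied (the nameserver and search entries are illustrative cluster defaults, not values from this PR):

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5 single-request-reopen
```

musl-based images such as Alpine parse this file but silently ignore resolver options they do not implement, which is why the option degrades to a no-op there rather than an error.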
I just posted a little write-up about our journey troubleshooting the issue, and how we worked around it in production: https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/. @steven-sheehy Our workaround does not involve setting dnsConfig, nor does it require any change from the users.
@Quentin-M I tried your workaround and it still occurs. Left a comment on your blog. We may be suffering from the SNAT race condition as well. Regardless, we at least need a cluster-level option to tweak this.
This workaround works for both SNAT and DNAT. Just a few thoughts:
You may have to adjust the latency by a few ms depending on your network conditions.
You would also need to make sure you are applying it to the right network interfaces, depending on your CNI/network configuration (see the sketch below).
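For readers who want to experiment, here is a hedged sketch of this style of workaround: stagger outbound UDP DNS packets with a small randomized delay so the parallel A and AAAA queries stop hitting conntrack at the same instant. This is a generic illustration, not the exact rules from the blog post; the interface name cbr0 and the delay/jitter values are assumptions to adapt to your CNI.

```sh
# Sketch only: "cbr0" and the timing values are placeholders.
# 1. Root prio qdisc so traffic can be classified into bands.
tc qdisc add dev cbr0 root handle 1: prio
# 2. Attach netem to band 1:1: 2ms base delay with 1ms random jitter,
#    which spreads out packets that would otherwise leave back-to-back.
tc qdisc add dev cbr0 parent 1:1 handle 10: netem delay 2ms 1ms
# 3. Steer UDP packets destined for port 53 into the delayed band.
tc filter add dev cbr0 protocol ip parent 1: prio 1 u32 \
  match ip protocol 17 0xff \
  match ip dport 53 0xffff \
  flowid 1:1
```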
What this PR does / why we need it:
Fixes the 5s DNS latency that occurs when iptables forwarding is used.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #62628
Special notes for your reviewer:
Release note: