Improve kubedns QPS to be closer to GCE DNS server performance #28366
Comments
It might just be SkyDNS. Weave also has a DNS server that they claim performs really well. It's very different, and would be pretty much a rewrite again, but maybe we should consider it medium term? |
@thockin Yes, it looks like kubedns/skydns performance is the bottleneck. Even when the queryfile contained only local service records and the dnsmasq cache size was set to 1, QPS didn't increase much; in that case skydns should have been servicing all requests with no forwarding required. @ArtfulCoder and I did some more measurements.
Side-Note: Cluster IP vs Endpoint/Pod IP had no visible impact on the QPS. |
@errordeveloper Is there a Docker image with the Weave DNS server that we can drop in and do a test? It won't resolve service DNS IPs, but we will get an idea of external forwarding performance. Another option is to use the DNS server built into Consul. There is a new kubernetes/kube2consul project starting up. |
What else was sharing that machine? Did you try giving it unlimited cpu/memory? |
Just the standard set of pods scheduled on minions - they shouldn't have been very busy. |
There's a -quiet option |
https://github.com/kubernetes/contrib/blob/master/exec-healthz/exechealthz.go#L43 |
@bprashanth Good tip - let me crank up the CPU limits on the DNS containers. The exechealthz -q option is already there - this is the kubedns daemon going crazy printing every query and data structure, like so:
I0701 19:23:27.115855 1 dns.go:583] Received ReverseRecord Request:1.0.0.10.in-addr.arpa.
I0701 19:23:29.114784 1 dns.go:439] Received DNS Request:kubernetes.default.svc.cluster.local., exact:false
I0701 19:23:29.114833 1 dns.go:539] records:[0xc8202b97a0], retval:[{10.0.0.1 0 10 10 false 30 0 /skydns/local/cluster/svc/default/kubernetes/3234633364383235}], path:[local cluster svc default |
you should be able to bring its log level down with --v |
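For context on the --v suggestion: kube-dns, like most Kubernetes components of this era, uses glog-style leveled logging, so per-query lines can be silenced by lowering verbosity. A minimal sketch follows; the level 4 used here is illustrative, not the level kube-dns actually assigns to its per-query logs.

```go
// Sketch only: glog-style leveled logging, where verbose lines are guarded by
// a level and only emitted when the configured -v / --v value is high enough.
package main

import (
	"flag"

	"github.com/golang/glog"
)

func main() {
	flag.Parse() // glog registers -v, -logtostderr, etc. on the standard flag set

	// Always emitted, regardless of -v:
	glog.Infof("starting up")

	// Only emitted when the process is started with -v=4 or higher; at lower
	// verbosity the formatting work is skipped along with the log write.
	glog.V(4).Infof("received DNS request for %s", "kubernetes.default.svc.cluster.local.")

	glog.Flush()
}
```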
Eliminated logs from kubedns and bumped up CPU to 1000m. Will leave testing this where it is, and if we get Weave or Consul DNS available, we can rerun the tests. |
I think ConsulDNS requires running Consul, which sounds pretty heavyweight. @smarterclayton - have you guys done DNS perf numbers? |
~500 qps sounds really slow. Is every request going to etcd? CoreDNS has much better caching; in the (non-optimized) version I can easily do a lot more. Longer term I think CoreDNS makes much more sense - note the "SkyDNS" middleware. Is more information available on where the time is spent? |
I've been out of the loop, but happy to help. Also: there's a container "Hack Room" on the 13th of July at GopherCon - that would be a good time to sit down and work through things. 10:00am - 5:00pm |
Girish, have you taken a profile? |
@miekg This version of kubedns does not use etcd - it does an in-memory lookup for local services. The dnsperf sample queryfile will be 100% hitting the forwarding-to-cloud-DNS path. Container logs show i/o timeouts forwarding to GCE DNS, but there are no timeouts when dnsperf is configured to hit GCE DNS directly, so this looks like a skydns issue losing track of UDP ports or something, IMO. The GCE DNS server directly is able to serve 12000+ QPS. I will grab profiling data, but I tried a CPU limit increase from 0.1 core to 1.0 core and that did not significantly affect QPS.
|
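To make the split described above concrete: cluster-local names are answered from in-memory state kept in sync with the API server, and everything else takes the upstream-forwarding path that the dnsperf query file exercises. A minimal, hypothetical sketch (not the actual kube-dns code; the map, suffix, and stub are illustrative only):

```go
// Hypothetical sketch of the serve-locally-or-forward decision in a
// kube-dns-like resolver.
package main

import (
	"fmt"
	"strings"
)

// In kube-dns the local records are built from Services/Endpoints watched via
// the API server; here a static map stands in for that state.
var localRecords = map[string]string{
	"kubernetes.default.svc.cluster.local.": "10.0.0.1",
}

func resolve(name string) (string, error) {
	if strings.HasSuffix(name, ".cluster.local.") {
		if ip, ok := localRecords[name]; ok {
			return ip, nil // answered purely in memory: no etcd, no network
		}
		return "", fmt.Errorf("NXDOMAIN for %s", name)
	}
	// External names hit the upstream forwarder - the path where the
	// "i/o timeout" errors in this thread are being reported.
	return forwardUpstream(name)
}

func forwardUpstream(name string) (string, error) {
	return "", fmt.Errorf("upstream forwarding not implemented in this sketch (%s)", name)
}

func main() {
	fmt.Println(resolve("kubernetes.default.svc.cluster.local."))
	fmt.Println(resolve("www.google.com."))
}
```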
Ack.
Ack.
Could it be re-using ports so fast that the client code is using a new port while a reply is still outstanding on the old one? I would almost say "what does CoreDNS do"... but I get that would be a hassle to try. |
Is SkyDNS running inside docker when you do this? And when you directly hit GCE DNS, is that also from inside docker? |
Yes in both cases. |
Have you pulled a profile? Do you know how? |
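On pulling a profile: a minimal sketch of how a CPU profile is typically grabbed from a running Go daemon, assuming the binary exposes the standard net/http/pprof handlers (the :6060 port is illustrative; whether a given kube-dns build serves these endpoints is not something this thread establishes).

```go
// Sketch only: expose the standard pprof handlers alongside the daemon's
// normal work, then fetch a CPU profile with:
//   go tool pprof http://<pod-ip>:6060/debug/pprof/profile?seconds=30
// which produces a 30-second sample like the output pasted further down.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
	// ... the DNS serving goroutines would be started here ...

	// Serve the profiling endpoints; DefaultServeMux is used because nil is
	// passed as the handler.
	log.Fatal(http.ListenAndServe(":6060", nil))
}
```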
169.254.169.254 - that's also the IP used when directly querying GCE DNS? |
Also SkyDNS might be throttled, while direct queries to gcedns might not. I have no way of checking that from here. |
Yes, 169.254.169.254 is the GCE DNS server in both cases. I got pprof data, but I don't understand the "60ms total" part - it was a 30-second profile.
60ms of 60ms total ( 100%)
Showing top 10 nodes out of 46 (cum >= 30ms)
flat flat% sum% cum cum%
10ms 16.67% 16.67% 10ms 16.67% runtime.futex
10ms 16.67% 33.33% 10ms 16.67% runtime.heapBitsSetType
10ms 16.67% 50.00% 20ms 33.33% runtime.mallocgc
10ms 16.67% 66.67% 10ms 16.67% runtime.mapaccess2
10ms 16.67% 83.33% 10ms 16.67% runtime.mapassign1
10ms 16.67% 100% 10ms 16.67% runtime.netpollunblock
0 0% 100% 10ms 16.67% k8s.io/kubernetes/cmd/kube-dns/app.(*KubeDNSServer).Run
0 0% 100% 30ms 50.00% k8s.io/kubernetes/pkg/client/cache.(*ListWatch).List
0 0% 100% 30ms 50.00% k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
0 0% 100% 30ms 50.00% k8s.io/kubernetes/pkg/client/cache.(*Reflector).RunUntil.func1 |
I am not sure how to interpret that - is that all you got? |
We've done DNS perf, mostly to ensure we hit the DNS cache from skydns properly in most cases based on TTL. If we fall through the cache or cache isn't on then we're mostly GC limited due to the creation cost of the skydns records. I've seen upwards of 30-40k qps in core cases. I haven't tested now that we've moved DNS to the nodes though, so will follow up. |
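On the cache/GC point above: when the cache misses, answers are rebuilt and reallocated per query, which is where the GC pressure comes from. A minimal, hypothetical TTL-bounded answer cache (not the actual skydns/kube-dns cache code) that lets a hit serve prebuilt records:

```go
// Hypothetical sketch: cache fully-built answers for their TTL so a hit does
// not reallocate records. Types and names here are illustrative only.
package main

import (
	"sync"
	"time"
)

type cachedAnswer struct {
	records   []string // stand-in for the real DNS resource-record type
	expiresAt time.Time
}

type answerCache struct {
	mu      sync.RWMutex
	entries map[string]cachedAnswer
}

func newAnswerCache() *answerCache {
	return &answerCache{entries: make(map[string]cachedAnswer)}
}

// Get returns the cached records if present and not yet expired.
func (c *answerCache) Get(name string) ([]string, bool) {
	c.mu.RLock()
	e, ok := c.entries[name]
	c.mu.RUnlock()
	if !ok || time.Now().After(e.expiresAt) {
		return nil, false
	}
	return e.records, true
}

// Put stores the records under the query name for the record TTL.
func (c *answerCache) Put(name string, records []string, ttl time.Duration) {
	c.mu.Lock()
	c.entries[name] = cachedAnswer{records: records, expiresAt: time.Now().Add(ttl)}
	c.mu.Unlock()
}

func main() {
	c := newAnswerCache()
	c.Put("kubernetes.default.svc.cluster.local.", []string{"10.0.0.1"}, 30*time.Second)
	if recs, ok := c.Get("kubernetes.default.svc.cluster.local."); ok {
		_ = recs // serve the cached answer without rebuilding the records
	}
}
```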
It seems like there is a new UDP socket created for every DNS forward request.
|
That might be a good optimization in skydns, but I don't see that being the bottleneck. Am I reading it right? Only 23% is accounted for by the top call-tree? |
@thockin Yes (I'm fairly new to pprof output for Go) - apart from the first entry, there are no obvious hotspots. Increasing the CPU allocation also doesn't help performance, so cgroup throttling doesn't seem to be the issue either (and throttling wouldn't show up in the pprof sampling anyway - is that correct?). It would seem, then, that it's the lost queries that are affecting the QPS. |
Lost queries? Where are they getting lost? |
Are you implying there is a better way to do this (in Go)?
Why does SkyDNS even see multiple outstanding queries with the same query ID? |
Me neither, sadly. Is this the go runtime or the "user" code in Go DNS that does that? |
Maybe set up a TCP connection to the upstream and use that, instead of a new UDP socket per query? |
Lost messages - kube-dns logs show "failure to forward request : i/o timeout" errors. |
I actually meant that if we were able to use a single UDP socket to every upstream DNS server, it would save a lot of churn. Possibly we would need to "NAT the query transaction ids" to correctly correlate upstream replies. I will try to force upstream TCP in kube-dns and see if that helps performance. |
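A minimal sketch of the reuse-one-upstream-socket idea, assuming github.com/miekg/dns. A real forwarder would additionally have to multiplex concurrent queries over the connection and match/remap transaction IDs (the "NAT the query transaction ids" point above), handle timeouts, and retry over TCP on truncation; none of that is shown here.

```go
// Sketch only: one long-lived (connected) UDP socket to the upstream resolver,
// instead of dialing a fresh socket per forwarded query. Serial, single
// goroutine; concurrency and ID matching are deliberately omitted.
package main

import (
	"fmt"
	"log"

	"github.com/miekg/dns"
)

func main() {
	conn, err := dns.Dial("udp", "169.254.169.254:53")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	for _, name := range []string{"www.google.com.", "kubernetes.io."} {
		m := new(dns.Msg)
		m.SetQuestion(name, dns.TypeA) // also assigns a fresh transaction ID

		if err := conn.WriteMsg(m); err != nil {
			log.Fatal(err)
		}
		r, err := conn.ReadMsg()
		if err != nil {
			log.Fatal(err)
		}
		// A real forwarder would match r.Id against its table of outstanding
		// queries here instead of assuming strict request/response ordering.
		fmt.Printf("%s -> %d answers (id %d)\n", name, len(r.Answer), r.Id)
	}
}
```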
Interestingly this is something I'm exploring for CoreDNS (kicked off partially...). If kube-dns is still a lot like SkyDNS then that won't help: well, you then have... |
@girishkalele I've seen those timeouts before in DNS code. I've never been able to establish why (or where in the code) this happens. Is it possible to check what GCE DNS is returning? |
So we benchmarked: coredns/coredns#287 (comment) SkyDNS: Requests/Second: 38436.775 Discuss. |
@girishkalele can you share the command used to measure the QPS? |
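The numbers in this thread came from dnsperf driven by a query file; purely to illustrate the idea (this is not the command that was actually used), here is a minimal Go load generator that hammers a resolver from a few workers and reports QPS, assuming github.com/miekg/dns. The server address, worker count, and query name are placeholders.

```go
// Sketch only: fire queries at a resolver for a fixed duration from N workers
// and report queries-per-second, roughly what dnsperf does with a query file.
package main

import (
	"fmt"
	"sync/atomic"
	"time"

	"github.com/miekg/dns"
)

func main() {
	const (
		server   = "10.0.0.10:53" // cluster DNS service IP (adjust as needed)
		workers  = 10
		duration = 30 * time.Second
	)

	var sent, failed int64
	deadline := time.Now().Add(duration)
	done := make(chan struct{}, workers)

	for i := 0; i < workers; i++ {
		go func() {
			c := &dns.Client{Timeout: 2 * time.Second}
			m := new(dns.Msg)
			m.SetQuestion("kubernetes.default.svc.cluster.local.", dns.TypeA)
			for time.Now().Before(deadline) {
				if _, _, err := c.Exchange(m, server); err != nil {
					atomic.AddInt64(&failed, 1)
				}
				atomic.AddInt64(&sent, 1)
			}
			done <- struct{}{}
		}()
	}
	for i := 0; i < workers; i++ {
		<-done
	}
	fmt.Printf("QPS: %.1f (failures: %d)\n", float64(sent)/duration.Seconds(), failed)
}
```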
One thing that may be useful is http://github.com/kubernetes/perf-tests/dns |
Should we keep this open? |
Ran a DNS performance test against the kubedns pod vs the native GCE DNS server.
kubedns performance was only around 10% that of going directly to the GCE DNS at "169.254.169.254".
The single kubedns pod seems to max out at around the same QPS that the GCE DNS server provides to each unique client IP address. Since all 10 pods are running the same query file, we would expect dnsmasq caching to dramatically speed up performance.
Running against the DNS cluster IP 10.0.0.10: ~109 QPS
Running directly against 169.254.169.254: 1256 QPS
Methodology:
2 n1-standard-2 instances as schedulable nodes.
Hairpin_veth mode is on.
1 kubedns pod.
10 DNSPerf client pods running as 10 parallel completions under one "Job" (a sketch of such a Job object follows below).
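A sketch of that Job, expressed with the Kubernetes Go API types; the image name and dnsperf arguments are placeholders, not the ones used for these measurements.

```go
// Sketch only: build a batch/v1 Job with 10 parallel completions and print it.
package main

import (
	"encoding/json"
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	n := int32(10)
	job := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "dnsperf-clients"},
		Spec: batchv1.JobSpec{
			Parallelism: &n, // 10 client pods at once
			Completions: &n, // 10 completions total
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:  "dnsperf",
						Image: "example/dnsperf:latest", // placeholder image
						Args:  []string{"-s", "10.0.0.10", "-d", "/queries/queryfile"},
					}},
				},
			},
		},
	}
	out, _ := json.MarshalIndent(job, "", "  ")
	fmt.Println(string(out))
}
```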