
Improve kubedns QPS to be closer to GCE DNS server performance #28366

Closed
girishkalele opened this issue Jul 1, 2016 · 53 comments
Assignees
Labels
area/dns lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@girishkalele

Ran a DNS performance test against the kubedns pod vs the native GCE DNS server.
kubedns performance was only around 10% that of going directly to the GCE DNS at "169.254.169.254".

The single kubedns pod seems to max out at around the same performance that the GCE DNS provides to each unique IP address. Since all 10 pods are running the same query file, we would expect dnsmasq caching to dramatically speed up performance.

Running against the DNS cluster IP 10.0.0.10 averages around 109 QPS:

  Queries per second:   137.991371
  Queries per second:   112.373670
  Queries per second:   97.577085
  Queries per second:   131.381920
  Queries per second:   35.410735
  Queries per second:   136.044372
  Queries per second:   49.806700
  Queries per second:   128.092776
  Queries per second:   133.084200
  Queries per second:   133.843242

Running directly against 169.254.169.254 averages around 1256 QPS:

  Queries per second:   1233.658841
  Queries per second:   1292.178002
  Queries per second:   1204.961019
  Queries per second:   1278.054183
  Queries per second:   1297.116902
  Queries per second:   1209.742660
  Queries per second:   1250.562897
  Queries per second:   1233.336819
  Queries per second:   1303.879214
  Queries per second:   1255.794775

Methodology:

2 n1-standard-2 instances as schedulable nodes.
Hairpin_veth mode is on.
1 kubedns pod.
10 DNSPerf client pods running as 10 parallel completions under one "Job".

@girishkalele
Author

cc @bprashanth @thockin @ArtfulCoder

@thockin
Member

thockin commented Jul 1, 2016

It might just be SkyDNS. Weave also has a DNS server that they claim performs really well. It's very different, and would be pretty much a rewrite again, but maybe we should consider it medium term?

@errordeveloper

@girishkalele
Author

@thockin Yes, it looks like kubedns/skydns performance is the bottleneck.

Even when the queryfile contained only local service records and the dnsmasq cache size was set to 1, QPS didn't increase much. In that case skydns should have been servicing all requests with no forwarding required.

@ArtfulCoder and I did some more measurements.

  1. Bumping the dnsmasq cache size up to 64K increased kubedns performance to ~220 QPS.
  2. Query file containing only local service A records: ~5000 QPS (everything served from the dnsmasq cache).
  3. Query file containing only local service A records with dnsmasq cache size = 1: ~300 QPS.

Side-Note: Cluster IP vs Endpoint/Pod IP had no visible impact on the QPS.

@girishkalele
Author

@errordeveloper Is there a Docker image with the Weave DNS server that we can drop in and test? It won't resolve service DNS IPs, but we will get an idea of external forwarding performance.

Another option is to use the DNS server built into Consul. There is a new kubernetes/kube2consul project starting up.

https://www.consul.io/docs/agent/dns.html

@bprashanth
Contributor

2 n1-standard-2 instances as schedulable nodes.

What else was sharing that machine? Did you try giving it unlimited CPU/memory?

@girishkalele
Author

@bprashanth

Just the standard set of pods scheduled on minions - they shouldn't have been very busy.
One noticeable thing is that kubedns is logging very verbosely. I am going to build a test container with all that logging removed; I am not sure whether the logging framework logs asynchronously or blocks and flushes the log stream synchronously.

@bprashanth
Contributor

There's a -quiet option.
Make sure you aren't maxing out the 1/10th of a CPU core and however much memory that pod manifest gives you on a populated node.

@bprashanth
Contributor

https://github.com/kubernetes/contrib/blob/master/exec-healthz/exechealthz.go#L43
you should be able to specify that through the "args" field in yaml directly

@girishkalele
Author

@bprashanth Good tip - let me crank up the cpu limits on the DNS containers.

The exechealthz -q option is already there - this is the kubedns daemon going crazy, printing every query and data structure like so:

I0701 19:23:27.115855       1 dns.go:583] Received ReverseRecord Request:1.0.0.10.in-addr.arpa.
I0701 19:23:29.114784       1 dns.go:439] Received DNS Request:kubernetes.default.svc.cluster.local., exact:false
I0701 19:23:29.114833       1 dns.go:539] records:[0xc8202b97a0], retval:[{10.0.0.1 0 10 10  false 30 0  /skydns/local/cluster/svc/default/kubernetes/3234633364383235}], path:[local cluster svc default 

@bprashanth
Contributor

you should be able to bring its log level down with --v

@girishkalele
Author

@bprashanth

Eliminated logs from kubedns and bumped up CPU to 1000m.
Marginally improved QPS from ~ 290 to ~ 341.

Will leave testing this where it is, and if we get Weave or Consul DNS available, we can rerun the tests.

@thockin
Member

thockin commented Jul 1, 2016

I think ConsulDNS requires running Consul, which sounds pretty
heavy-weight. We could also engage with skydns folks on expected perf
(@bketelsen). We could also look at CoreDNS (@miekg) which is much newer
and riskier, but maybe a better fit longer term?

@smarterclayton - have you guys done DNS perf numbers?


@miekg

miekg commented Jul 1, 2016

~500 qps sounds really slow. Is every request going to etcd?

CoreDNS has much better caching; in the (non-optimized) version I can easily do
15K qps, but SkyDNS should be able to get to those numbers as well.

Longer term I think CoreDNS makes much more sense. Note the "SkyDNS" middleware
in CoreDNS is a cleaned-up version of SkyDNS - but yeah, the project itself is
barely 2 months old.

Is more information available on where the time is spent?


@bketelsen
Contributor

I've been out of the loop, but happy to help. Also: there's a container "Hack Room" on the 13th of July at GopherCon - it would be a good time to sit down and work through things.

10:00am - 5:00pm
Room: 4f
Container Technologies
Hack on and discuss the various container technologies written in and powered by Go

@thockin
Member

thockin commented Jul 1, 2016

Girish, have you taken a profile?


@girishkalele
Author

@miekg This version of kubedns does not use etcd - in-memory lookup for local services.

The dnsperf sample queryfile will be 100% hitting the forward-to-cloud-DNS path. Container logs show i/o timeouts forwarding to GCE DNS, but there are no timeouts when dnsperf is configured to hit GCE DNS directly, so this looks like a skydns issue - losing track of UDP ports or something, IMO. Queried directly, the GCE DNS server is able to serve 12000+ QPS.

I will grab profiling data but I tried a CPU limit increase from 0.1 core to 1.0 core and that did not significantly affect QPS.

I0701 23:32:55.190432       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:58768->169.254.169.254:53: i/o timeout"
I0701 23:32:55.537034       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:38153->169.254.169.254:53: i/o timeout"
I0701 23:32:56.103012       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:40873->169.254.169.254:53: i/o timeout"
I0701 23:32:56.273982       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:49714->169.254.169.254:53: i/o timeout"
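
For context on where that error text comes from: in Go, a read deadline that expires before the upstream reply arrives surfaces as exactly this "read udp ...: i/o timeout" error. Below is a minimal sketch of the per-request forwarding pattern being described - not skydns's actual code; github.com/miekg/dns is assumed for message packing and the 2-second timeout is just an illustrative default.

package main

import (
    "fmt"
    "net"
    "time"

    "github.com/miekg/dns"
)

// forwardOnce sends one DNS query to the upstream server over a fresh UDP
// socket and waits for the reply - roughly the per-request forwarding pattern
// discussed above, not skydns's actual code. If the reply does not arrive
// before the deadline, conn.Read returns the "read udp ...: i/o timeout"
// error seen in the kube-dns logs, and the query counts as lost even if the
// upstream did answer.
func forwardOnce(q *dns.Msg, upstream string, timeout time.Duration) (*dns.Msg, error) {
    conn, err := net.Dial("udp", upstream) // new ephemeral source port per call
    if err != nil {
        return nil, err
    }
    defer conn.Close()

    packed, err := q.Pack()
    if err != nil {
        return nil, err
    }
    if _, err := conn.Write(packed); err != nil {
        return nil, err
    }
    conn.SetReadDeadline(time.Now().Add(timeout))

    buf := make([]byte, 4096)
    n, err := conn.Read(buf) // "i/o timeout" surfaces here
    if err != nil {
        return nil, err
    }
    reply := new(dns.Msg)
    return reply, reply.Unpack(buf[:n])
}

func main() {
    q := new(dns.Msg)
    q.SetQuestion("kubernetes.default.svc.cluster.local.", dns.TypeA)
    if _, err := forwardOnce(q, "169.254.169.254:53", 2*time.Second); err != nil {
        fmt.Println("skydns-style forward failed:", err)
    }
}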

@miekg

miekg commented Jul 2, 2016


@miekg This version of kubedns does not use etcd - in-memory lookup for local services.

Ack.

The dnsperf sample queryfile will be 100% hitting the forwarding to cloud DNS
path. Container logs show i/o timeouts forwarding to GCE DNS, but there are no
timeouts when dnsperf is configured to hit GCE DNS directly, so this is a
skydns issue losing track of UDP ports or something IMO. The GCE DNS server
directly is able to serve 12000+ QPS.

Ack.

I will grab profiling data but I tried a CPU limit increase from 0.1 core to
1.0 core and that did not significantly affect QPS.

I0701 23:32:55.190432       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:58768->169.254.169.254:53: i/o timeout"
I0701 23:32:55.537034       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:38153->169.254.169.254:53: i/o timeout"
I0701 23:32:56.103012       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:40873->169.254.169.254:53: i/o timeout"
I0701 23:32:56.273982       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:49714->169.254.169.254:53: i/o timeout"

Could it be re-using ports so fast that the client code is using a new port while
a response is still in flight?

I would almost say, what does CoreDNS do... but I get that would be a hassle to
set up and configure, if at all possible.


@miekg

miekg commented Jul 2, 2016

Is SkyDNS running inside docker when you do this? When you directly hit GCEDNS
are you running in docker as well?

@girishkalele
Author

Yes in both cases.

@thockin
Member

thockin commented Jul 2, 2016

Have you pulled a profile? Do you know how?


@bprashanth bprashanth added the team/cluster and sig/network labels Jul 2, 2016
@bprashanth bprashanth modified the milestones: next-candidate, v1.4 Jul 2, 2016
@miekg

miekg commented Jul 2, 2016

I0701 23:32:56.273982       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:49714->169.254.169.254:53: i/o timeout"

169.254.169.254 that's also the IP used when directly querying GCEDNS?

@miekg

miekg commented Jul 2, 2016

Also, SkyDNS might be throttled while direct queries to GCE DNS might not be. I have no way of checking that from here.
To actually prove or disprove performance figures of SkyDNS we would need to do a standalone test. I'm actively working on testing and performance tuning CoreDNS, so I might give SkyDNS a whirl as well - time permitting.

@girishkalele
Author

@thockin @miekg

Yes, 169.254.169.254 is the GCE DNS server in both cases.

I got pprof data, but I don't understand the 60 ms total part; it was a 30-second profile.
I started the pprof server and used go tool pprof <url> - is that going to work?

60ms of 60ms total (  100%)
Showing top 10 nodes out of 46 (cum >= 30ms)
      flat  flat%   sum%        cum   cum%
      10ms 16.67% 16.67%       10ms 16.67%  runtime.futex
      10ms 16.67% 33.33%       10ms 16.67%  runtime.heapBitsSetType
      10ms 16.67% 50.00%       20ms 33.33%  runtime.mallocgc
      10ms 16.67% 66.67%       10ms 16.67%  runtime.mapaccess2
      10ms 16.67% 83.33%       10ms 16.67%  runtime.mapassign1
      10ms 16.67%   100%       10ms 16.67%  runtime.netpollunblock
         0     0%   100%       10ms 16.67%  k8s.io/kubernetes/cmd/kube-dns/app.(*KubeDNSServer).Run
         0     0%   100%       30ms 50.00%  k8s.io/kubernetes/pkg/client/cache.(*ListWatch).List
         0     0%   100%       30ms 50.00%  k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
         0     0%   100%       30ms 50.00%  k8s.io/kubernetes/pkg/client/cache.(*Reflector).RunUntil.func1
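
One likely explanation for the tiny total: a Go CPU profile only accumulates on-CPU samples, so a 30-second capture of a process that spends nearly all its time blocked on network I/O will legitimately report only a few milliseconds. A minimal sketch of exposing the endpoint and pulling such a profile follows - the :6060 port and the URL are assumptions for illustration, not kube-dns's actual flags.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
    // With this listener in the binary, a 30-second CPU profile can be pulled with:
    //   go tool pprof http://<pod-ip>:6060/debug/pprof/profile?seconds=30
    // Only goroutines actually running on a CPU are sampled; time spent waiting
    // on UDP reads from the upstream resolver never shows up in the totals.
    log.Fatal(http.ListenAndServe("0.0.0.0:6060", nil))
}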

@thockin
Member

thockin commented Jul 2, 2016

I am not sure how to interpret that - is that all you got?


@smarterclayton
Contributor

smarterclayton commented Jul 2, 2016 via email

@girishkalele
Author

It seems like there is a new UDP socket created for every DNS forward request.
A lot of connections to 169.254.169.254 are brought up and torn down, as seen in netstat... we should be able to mux/demux multiple queries based on the DNS query transaction ID and not need a unique socket for every forwarded request, right?

root@kubernetes-minion-group-jkok # docker exec cd79e7468ca2 netstat -uapn
udp        0      0 10.244.1.9:52097        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:43941        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:48098        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:48306        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:40209        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:60743        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:40295        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:36321        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:40520        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:36662        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:57258        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp        0      0 10.244.1.9:49131        169.254.169.254:53      ESTABLISHED 1/kube-dns
udp     1536      0 :::53                   :::*                                1/kube-dns
<<snipped>>
root@kubernetes-minion-group-jkok # docker exec cd79e7468ca2 netstat -uapn
     72     505    6525
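
A rough sketch of that mux/demux idea follows - hypothetical code, not anything skydns does today; github.com/miekg/dns is assumed for packing, and the forwarder/Forward names are made up. One connected UDP socket carries every upstream query, a single read loop matches replies back to waiters by the 16-bit message ID, and each query's ID is rewritten so concurrent clients don't collide.

package main

import (
    "errors"
    "net"
    "sync"
    "time"

    "github.com/miekg/dns"
)

// forwarder shares one upstream UDP socket across all forwarded queries and
// demultiplexes the replies by DNS message ID, instead of opening a new
// socket per request.
type forwarder struct {
    conn    net.Conn
    mu      sync.Mutex
    waiters map[uint16]chan *dns.Msg // keyed by the rewritten query ID
}

func newForwarder(upstream string) (*forwarder, error) {
    conn, err := net.Dial("udp", upstream)
    if err != nil {
        return nil, err
    }
    f := &forwarder{conn: conn, waiters: make(map[uint16]chan *dns.Msg)}
    go f.readLoop()
    return f, nil
}

// readLoop hands each upstream reply to the goroutine that sent the matching query.
func (f *forwarder) readLoop() {
    buf := make([]byte, 65535)
    for {
        n, err := f.conn.Read(buf)
        if err != nil {
            return
        }
        reply := new(dns.Msg)
        if reply.Unpack(buf[:n]) != nil {
            continue
        }
        f.mu.Lock()
        ch := f.waiters[reply.Id]
        delete(f.waiters, reply.Id)
        f.mu.Unlock()
        if ch != nil {
            ch <- reply // buffered channel, never blocks
        }
    }
}

// Forward rewrites ("NATs") the query ID, sends the query on the shared
// socket, and waits for its reply or a timeout.
func (f *forwarder) Forward(q *dns.Msg, timeout time.Duration) (*dns.Msg, error) {
    q.Id = dns.Id()
    ch := make(chan *dns.Msg, 1)
    f.mu.Lock()
    f.waiters[q.Id] = ch
    f.mu.Unlock()

    packed, err := q.Pack()
    if err == nil {
        _, err = f.conn.Write(packed)
    }
    if err != nil {
        f.mu.Lock()
        delete(f.waiters, q.Id)
        f.mu.Unlock()
        return nil, err
    }
    select {
    case reply := <-ch:
        return reply, nil
    case <-time.After(timeout):
        f.mu.Lock()
        delete(f.waiters, q.Id)
        f.mu.Unlock()
        return nil, errors.New("timeout waiting for upstream reply")
    }
}

func main() {
    f, err := newForwarder("169.254.169.254:53")
    if err != nil {
        panic(err)
    }
    q := new(dns.Msg)
    q.SetQuestion("kubernetes.default.svc.cluster.local.", dns.TypeA)
    _, _ = f.Forward(q, 2*time.Second)
}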

@thockin
Member

thockin commented Jul 6, 2016

That might be a good optimization in skydns, but I don't see that being the
killer.

Am I reading it right? Only 23% is accounted for by the top call-tree?
That sounds pretty healthy to me, it means we have a very long tail of
functions each taking a small amount of CPU time. Am I misinterpreting?


@girishkalele
Author

@thockin Yes (I'm fairly new to pprof output for Go) - apart from the first entry, there are no obvious hotspots. Increasing the CPU allocation also doesn't help performance, so cgroup throttling doesn't seem to be the issue - and throttling wouldn't show up in the pprof sampling anyway (is this correct?).

Then it would seem that it's the lost queries that are affecting the QPS.

@thockin
Member

thockin commented Jul 6, 2016

lost queries? Where are they getting lost?


@miekg

miekg commented Jul 6, 2016


It seems like there is a new UDP socket created for every DNS forward request.

Are you implying there is a better way to do this (in Go)?

There are a lot of connections to 169.254.169.254 brought up and torn down seen in netstat...we should be able to mux/demux multiple queries based on DNS query transaction id and not need a unique socket for every forwarded request ?

Why does SkyDNS even see multiple outstanding queries with the same query ID?
There is an option to suppress identical queries going upstream: https://godoc.org/github.com/miekg/dns#Client
(actually using the query ID - might be solving a longstanding TODO in the code
there)
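
A minimal sketch of that option, assuming the Client.SingleInflight field described at the godoc link (present in the library around this time; later releases changed this API). With it set, identical concurrent queries for the same qname/qtype/qclass are coalesced into a single upstream exchange.

package main

import (
    "fmt"
    "time"

    "github.com/miekg/dns"
)

func main() {
    c := &dns.Client{
        Net:            "udp",
        ReadTimeout:    2 * time.Second,
        SingleInflight: true, // coalesce identical in-flight queries (field per the linked godoc)
    }

    m := new(dns.Msg)
    m.SetQuestion("kubernetes.default.svc.cluster.local.", dns.TypeA)

    // 169.254.169.254 is the GCE DNS address used throughout this issue.
    r, rtt, err := c.Exchange(m, "169.254.169.254:53")
    if err != nil {
        fmt.Println("exchange failed:", err)
        return
    }
    fmt.Println("rtt:", rtt, "answers:", len(r.Answer))
}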


@miekg

miekg commented Jul 6, 2016

Me neither, sadly. Is this the go runtime or the "user" code in Go DNS that
needs to be tweaked?
On 6 Jul 2016 1:02 a.m., "Tim Hockin" notifications@github.com wrote:

Man, that is just not what I expect to see. I don't know what it means..so
much time in syscall and futex..

On Tue, Jul 5, 2016 at 4:31 PM, Girish Kalele notifications@github.com
wrote:

And full pprof dumps at
https://gist.github.com/girishkalele/4f36ccf342e87eeba064e63b14ebe0aa



@miekg

miekg commented Jul 6, 2016

Maybe set up a TCP connection to the upstream and use that, instead of using UDP every time?
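
A sketch of what that could look like with github.com/miekg/dns (assumed library): a single persistent TCP connection to the upstream, reused for many queries, with no reconnect or pipelining handling.

package main

import (
    "fmt"
    "time"

    "github.com/miekg/dns"
)

func main() {
    // Dial the upstream once over TCP and reuse the connection for many
    // queries, instead of a fresh UDP socket per forwarded request.
    conn, err := dns.DialTimeout("tcp", "169.254.169.254:53", 2*time.Second)
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    for _, name := range []string{"kubernetes.io.", "example.com."} {
        q := new(dns.Msg)
        q.SetQuestion(name, dns.TypeA)

        conn.SetDeadline(time.Now().Add(2 * time.Second))
        if err := conn.WriteMsg(q); err != nil {
            fmt.Println("write:", err)
            return
        }
        r, err := conn.ReadMsg()
        if err != nil {
            fmt.Println("read:", err)
            return
        }
        fmt.Println(name, "->", len(r.Answer), "answers")
    }
}
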
On 6 Jul 2016 12:30 a.m., "Girish Kalele" notifications@github.com wrote:

@thockin @miekg

Pprof trace when kube-dns was verified active with 1000 millicores (not
the default 100m).

(pprof) top30
4650ms of 6370ms total (73.00%)
Dropped 138 nodes (cum <= 31.85ms)
Showing top 30 nodes out of 195 (cum >= 40ms)
      flat  flat%   sum%        cum   cum%
    1460ms 22.92% 22.92%     1570ms 24.65%  syscall.Syscall
     300ms  4.71% 27.63%      300ms  4.71%  runtime.futex
     300ms  4.71% 32.34%      300ms  4.71%  runtime.usleep
     280ms  4.40% 36.73%      280ms  4.40%  runtime.readvarint
     230ms  3.61% 40.35%      430ms  6.75%  runtime.mallocgc
     200ms  3.14% 43.49%      600ms  9.42%  runtime.pcvalue
     170ms  2.67% 46.15%      170ms  2.67%  runtime.epollwait
     160ms  2.51% 48.67%      160ms  2.51%  runtime.adjustpointers
     130ms  2.04% 50.71%      130ms  2.04%  runtime._ExternalCode
     130ms  2.04% 52.75%      130ms  2.04%  runtime.epollctl
     130ms  2.04% 54.79%      130ms  2.04%  syscall.RawSyscall
     110ms  1.73% 56.51%      390ms  6.12%  runtime.step
     100ms  1.57% 58.08%      110ms  1.73%  runtime.findfunc
     100ms  1.57% 59.65%     1040ms 16.33%  runtime.gentraceback
      90ms  1.41% 61.07%      140ms  2.20%  k8s.io/kubernetes/vendor/github.com/miekg/dns.UnpackDomainName
      90ms  1.41% 62.48%      120ms  1.88%  runtime.scanobject
      80ms  1.26% 63.74%       80ms  1.26%  runtime.heapBitsSetType
      60ms  0.94% 64.68%      160ms  2.51%  k8s.io/kubernetes/vendor/github.com/miekg/dns.packDomainName
      60ms  0.94% 65.62%       60ms  0.94%  runtime.memmove
      60ms  0.94% 66.56%       60ms  0.94%  runtime/internal/atomic.Xchg
      50ms  0.78% 67.35%       50ms  0.78%  runtime.(*mspan).sweep.func1
      40ms  0.63% 67.97%       40ms  0.63%  runtime.(*gcWork).tryGet
      40ms  0.63% 68.60%       90ms  1.41%  runtime.SetFinalizer
      40ms  0.63% 69.23%       50ms  0.78%  runtime.deferreturn
      40ms  0.63% 69.86%      420ms  6.59%  runtime.findrunnable
      40ms  0.63% 70.49%       60ms  0.94%  runtime.greyobject
      40ms  0.63% 71.11%       90ms  1.41%  runtime.heapBitsSweepSpan
      40ms  0.63% 71.74%       40ms  0.63%  runtime.mapaccess2_faststr
      40ms  0.63% 72.37%      250ms  3.92%  runtime.netpoll
      40ms  0.63% 73.00%       40ms  0.63%  runtime/internal/atomic.Store



@girishkalele
Author

@thockin

Lost messages: kube-dns logs show "failure to forward request: i/o timeout" errors.
From direct tests against the GCE DNS server, we saw no lost or dropped packets.
If we believe GCE DNS is responding, then these replies are getting lost somewhere.

I0701 23:32:55.190432       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:58768->169.254.169.254:53: i/o timeout"
I0701 23:32:55.537034       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:38153->169.254.169.254:53: i/o timeout"
I0701 23:32:56.103012       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:40873->169.254.169.254:53: i/o timeout"
I0701 23:32:56.273982       1 logs.go:41] skydns: failure to forward request "read udp 10.244.2.3:49714->169.254.169.254:53: i/o timeout"

@girishkalele
Author

@miekg

I actually meant that if we were able to use a single UDP socket to every upstream DNS server, it would save a lot of churn. Possibly we would need to "NAT" the query transaction IDs to correctly correlate upstream replies.

I will try and force upstream TCP in kube-dns, see if that helps performance.

@miekg

miekg commented Jul 6, 2016


@miekg

I actually meant if we were able to use a single UDP socket to every upstream DNS server, it would save a lot of churn. Possible would need to "NAT the query transaction ids" to correctly correlate upstream replies.

Interestingly, this is something I'm exploring for CoreDNS (kicked off partially
by this discussion): coredns/coredns#184, but for TCP sockets. I didn't completely
realize this can (and maybe should) be done for UDP sockets as well - I always
assumed this is super cheap (at least when compared to TCP).

I will try and force upstream TCP in kube-dns, see if that helps performance.

If kube-dns is still a lot like SkyDNS then that won't help: you would then just
have a lot of TCP socket churn.


@miekg

miekg commented Jul 7, 2016

@girishkalele I've seen those timeouts before in DNS code. I've never been able to establish why (or where in the code) this happens.

Is it possible to check what gcedns is returning?

@girishkalele girishkalele removed this from the v1.4 milestone Jul 15, 2016
@miekg

miekg commented Sep 30, 2016

So we benchmarked: coredns/coredns#287 (comment)

SkyDNS: Requests/Second: 38436.775

Discuss.

@bprashanth bprashanth assigned bowei and unassigned girishkalele Oct 30, 2016
@thockin thockin added sig/network and removed sig/network labels May 16, 2017
@alok87
Contributor

alok87 commented Jun 8, 2017

@girishkalele can you share the command used to measure the QPS?

@bowei
Member

bowei commented Jun 8, 2017

One thing that may be useful is http://github.com/kubernetes/perf-tests/dns

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Dec 26, 2017
@errordeveloper
Member

Should we keep this open?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Feb 19, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
