Routing fails randomly, version 0.10.x #2290

Closed
tairila opened this issue Mar 29, 2017 · 6 comments · Fixed by #2302
Comments

@tairila

tairila commented Mar 29, 2017

Summary

I noticed that Kong routing to an API sometimes fails, seemingly at random. When accessing an application through Kong, the browser shows the error message "An unexpected error occurred". This worked fine earlier with Kong version 0.9.7 and Cassandra 2.x.

[error] 126#0: *8877 [lua] responses.lua:101: before(): failed the initial dns/balancer resolve for 'xxx' with: dns query returned no results, client: xxx.xxx.xxx.xxx, server: kong, request: "GET /yyy HTTP/1.1", host: "xxx:8080"

The API creation command:
curl -X POST localhost:8001/apis/ -d 'name=xxx' -d 'upstream_url=http://xxx:8080' -d 'preserve_host=true' -d 'uris=/yyy' -d 'strip_uri=true'

Steps To Reproduce

Repeat GET request several times for an API.
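
To make the intermittent failure measurable, the repeated request can be scripted and the status codes tallied. The following is only a minimal sketch: the helper names `fetch_status` and `tally_statuses` are made up for illustration, and the proxy address used in the note below is an assumption (only the `/yyy` URI comes from the report).

```python
import collections
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # An HTTP error response still carries a status code worth counting.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: no HTTP status available.
        return None


def tally_statuses(url, attempts=100, fetch=fetch_status):
    """Repeat the GET `attempts` times and count how often each status occurs."""
    counts = collections.Counter()
    for _ in range(attempts):
        counts[fetch(url)] += 1
    return counts
```

Calling `tally_statuses("http://localhost:8000/yyy")` (a hypothetical proxy address) returns a counter of status codes, so a mix of successful and error statuses would confirm that the failure is intermittent rather than constant.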

Additional Details & Logs

Kong version 0.10.0 & 0.10.1
Cassandra 3.0.10

@Tieske
Member

Tieske commented Mar 29, 2017

The message explains exactly what happens: Kong queries the DNS server to resolve the hostname but does not receive a proper answer from that server.

As you can see here, it takes the timeout and attempts settings from the resolv.conf configuration file.

If they are not set, it will be 5 attempts and a timeout of 2 seconds.
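
As a rough illustration of that fallback, here is a minimal sketch that reads `timeout` and `attempts` from the `options` line of a resolv.conf file and falls back to the defaults quoted above. The function name `resolver_settings` is hypothetical, and this is not the dns client's actual code.

```python
DEFAULT_TIMEOUT_S = 2   # default quoted in this thread
DEFAULT_ATTEMPTS = 5    # default quoted in this thread


def resolver_settings(text):
    """Return (timeout_seconds, attempts) from resolv.conf text, with fallbacks."""
    timeout, attempts = DEFAULT_TIMEOUT_S, DEFAULT_ATTEMPTS
    for raw in text.splitlines():
        fields = raw.split()
        # Only "options" lines carry these settings, e.g. "options timeout:1 attempts:2".
        if not fields or fields[0] != "options":
            continue
        for opt in fields[1:]:
            key, _, value = opt.partition(":")
            if not value.isdigit():
                continue
            if key == "timeout":
                timeout = int(value)
            elif key == "attempts":
                attempts = int(value)
    return timeout, attempts
```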

The "failed the initial dns/balancer resolve" message is generated here, whilst the "dns query returned no results" message is generated in the dns lib here, when the nameserver returns a record, but an empty one.

When Kong resolves a name, it tries the record types in the following order: 'last-successful-type', SRV, A, AAAA and finally CNAME.
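
The order above can be sketched as follows. This is an illustration only, not the dns client's implementation: `resolve_in_order` and the `query(name, rtype)` callable are hypothetical names, and skipping a second try of the last successful type is an assumption.

```python
def resolve_in_order(name, query, last_successful=None):
    """Try record types in the order described above.

    `query(name, rtype)` is a hypothetical callable that returns a list of
    records (possibly empty). Returns (rtype, records) for the first type
    that yields any records, or (None, []) if none do.
    """
    order = ["SRV", "A", "AAAA", "CNAME"]
    if last_successful in order:
        # Assumption: the previously successful type moves to the front
        # and is not tried a second time.
        order.remove(last_successful)
        order.insert(0, last_successful)
    for rtype in order:
        records = query(name, rtype)
        if records:
            return rtype, records
    return None, []
```

With such an order, a name that only has an A record costs one empty SRV lookup on the first resolve and goes straight to A afterwards.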

@Tieske
Member

Tieske commented Mar 29, 2017

What do the DNS records look like, in that order?

@tairila
Author

tairila commented Mar 30, 2017

It looks like this:

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.4 <<>> mesos-ui.marathon.slave.mesos
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2190
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mesos-ui.marathon.slave.mesos. IN A

;; ANSWER SECTION:
mesos-ui.marathon.slave.mesos. 60 IN A 10.254.4.45

;; Query time: 0 msec
;; SERVER: 10.254.20.255#53(10.254.20.255)
;; WHEN: Thu Mar 30 10:26:27 EEST 2017
;; MSG SIZE rcvd: 63

I noticed one thing about the resolv.conf file, though: the error occurs when it contains the following nameservers:

nameserver 10.254.20.255
nameserver 10.254.20.175
nameserver 10.254.10.93
; generated by /usr/sbin/dhclient-script
search emea.xxx.net china.xxx.net apac.xxx.net americas.xxx.net
nameserver 10.131.39.252
nameserver 87.254.221.110

In this case only the first 3 are relevant. When I tested routing with only those in the resolv.conf file (everything else removed), it worked fine, with no errors!

@Tieske
Member

Tieske commented Mar 30, 2017

Interesting. I'd expect the resolver to pick the next nameserver on a retry, but maybe it doesn't, and then fails because it keeps trying the same bad nameserver.

What response do you get if you explicitly query those removed servers?

@Tieske
Member

Tieske commented Mar 30, 2017

Actually, I don't think the resolv.conf parser honours the MAXNS limit of 3. See https://linux.die.net/man/5/resolv.conf

That's probably why the bad nameserver was queried where it shouldn't have been.
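
For illustration, here is a minimal sketch of a parser that does apply the limit, keeping only the first three `nameserver` entries in file order. The function name `parse_nameservers` is hypothetical and this is not the code from the fix.

```python
MAXNS = 3  # limit on nameserver entries documented in resolv.conf(5)


def parse_nameservers(text, maxns=MAXNS):
    """Return at most `maxns` nameserver addresses, in file order."""
    servers = []
    for raw in text.splitlines():
        line = raw.strip()
        # resolv.conf comments start with ';' or '#'.
        if not line or line[0] in ";#":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            # Entries beyond the limit are ignored rather than queried.
            if len(servers) < maxns:
                servers.append(fields[1])
    return servers
```

Applied to the resolv.conf posted above, this keeps 10.254.20.255, 10.254.20.175 and 10.254.10.93 and drops the two nameservers listed after the `search` line.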

@Tieske
Member

Tieske commented Mar 30, 2017

Fixed in Kong/lua-resty-dns-client#7.

The Kong dependency needs to be updated once a new version of the dns client is released.
