Bind9 FORMERR on AAAA record lookups when delegating subdomain to Consul #3439
Comments
#1301 may also resolve this: if my understanding is correct, once Consul properly returns records for ns.domain, the SOA in the AAAA response will be valid. |
We made some changes to Consul's SOA and NS responses in 0.9.1; the gist is here: #3353 (comment). However, there is a panic in that code, triggered when you query for the SOA record directly. It is fixed in #3408, which is only on master right now. We should have a 0.9.3 release out soon. I also agree that the SOA fields should be configurable, and I'm going to pick this up after the config refactoring I'm working on. |
We have the same scenario and behaviour as described by @CVTJNII. We are using Bind, and I'm having trouble finding a workaround. Does anyone have any ideas? Consul version: v1.3.0
|
Same issue with Consul 1.6.2 :( @CVTJNII, @magiconair, did you find a workaround? |
Hi @olivierHa, Would you mind providing a bit more detail about your environment, and the exact error you're seeing in the Consul & DNS server logs? I was able to successfully configure BIND to forward queries to Consul using standard DNS delegation as well as using a Thanks. |
The same is happening to me, but only with the
Please let me know how I can help. |
Any progress on this?

We have Consul running on a Linux box with BIND in front. This is used by Windows DNS as a conditional forwarder for, amongst other things, the service.consul domain. So Windows DNS has the conditional forwarder service.consul -> BIND.

Then we have a Linux box running Docker that needs to pull images via proxy.service.consul. What happens is that the Docker host queries for proxy.service.consul. It (nearly instantly) gets an A record, but it also wants AAAA due to v6 stack preference. AAAA doesn't exist, however. It attempts to get this twice before falling back. Unfortunately, by that time Docker has killed the process, as it has been waiting for over 20s for any kind of response.

In this situation proxy.service.consul is forwarded to Windows DNS, which forwards to BIND, which forwards to Consul. BIND sees this response:

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.8 <<>> @a.b.c.d proxy.service.consul aaaa
;; OPT PSEUDOSECTION:
;; AUTHORITY SECTION:
;; Query time: 0 msec

Which logs these lines in the named query logs:

2022-02-15_07:34:43.13423 15-Feb-2022 08:34:43.133 DNS format error from a.b.c.d resolving proxy.service.consul/AAAA for client 127.0.0.1#35439: Name consul (SOA) not subdomain of zone service.consul -- invalid response

Windows DNS gets this response:

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.8 <<>> @localhost proxy.service.consul aaaa
;; OPT PSEUDOSECTION:
;; Query time: 17 msec

Because it receives a SERVFAIL and not an empty NOERROR response (from both BIND -> Consul hosts, which are redundant), Windows DNS at that point falls back to using root hints. That traffic is dropped by the firewall, causing the huge delays, but it probably wouldn't help much if it were allowed, as .consul isn't available externally.

The best solution, in my opinion, would be to get BIND to return a NOERROR instead of SERVFAIL. The easiest way to get that working seems to be to get a NOERROR back with a SOA pointing to an NS that resolves. Forwarding the entire consul. zone to Consul doesn't help, by the way, as it doesn't resolve ns.consul. |
Any progress on this issue? |
Consul version: v0.7.2
Server information:
Description of the Issue (and unexpected/desired result)
We are delegating a subdomain from Bind9 to Consul for service discovery. We have configured the datacenter and domain values properly, and delegation of A records works. However, lookups of AAAA records for valid services fail. The problem is limited to AAAA lookups of valid services: each service has valid A records for IPv4 but no AAAA records, as the backing servers currently do not have IPv6 addresses. Lookups of invalid services pass, as Consul returns NXDOMAIN for them.
In troubleshooting on the Bind9 side I see Bind is reporting FORMERR. Note the following log snippet is sanitized:
From reading about similar issues on the Bind9 user list, I believe this is due to the SOA record being incorrect. See the following post with the comment "This one fails to return the CNAME to content.sjc1.site.voxcdn.net when the query type is AAAA so you get a unrelated SOA record." https://groups.google.com/forum/#!topic/comp.protocols.dns.bind/B-9RPmaJdjQ This makes sense for my error, as I see that the empty NOERROR response for the AAAA lookup returns a SOA record with ns.domain as the authoritative nameserver, which is wrong.
Looking over the Consul docs, I do not see how to configure the SOA record for the delegated domain in Consul. Based on the docs at https://www.consul.io/docs/agent/options.html#dns_config, I am under the impression that ns.domain and postmaster.domain are hardcoded defaults. I see PR #1798 was opened to make this record settable, but the author closed the PR without it being merged.
This is a nuisance problem: while the A record lookup works, Bind passes SERVFAIL to clients that try to look up AAAA records first, because it rejects the response from Consul and so cannot produce a response itself. The clients retry on SERVFAIL until they time out and fall back to the A record, adding about 10s to all API requests to services using Consul DNS in our environment.
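To make the rejection concrete, here is a minimal Python sketch of the kind of check implied by the named log message reported in this thread ("Name consul (SOA) not subdomain of zone service.consul"). It is not BIND's actual code. The function names `labels` and `is_at_or_below` are made up for illustration, and the sketch assumes the resolver simply requires the SOA owner name to sit at or below the zone it forwarded the query for.

```python
def labels(name: str) -> list[str]:
    """Split a DNS name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_at_or_below(name: str, zone: str) -> bool:
    """Return True if `name` equals `zone` or is a subdomain of it.

    Hypothetical helper: a simplified stand-in for the resolver's check,
    comparing whole labels from the right-hand side of the name.
    """
    name_labels = labels(name)
    zone_labels = labels(zone)
    if not zone_labels:
        # The root zone contains every name.
        return True
    if len(zone_labels) > len(name_labels):
        return False
    return name_labels[-len(zone_labels):] == zone_labels


if __name__ == "__main__":
    # SOA owner "consul" is above the delegated zone "service.consul",
    # so a check like this one would reject the response.
    print(is_at_or_below("consul", "service.consul"))
    # A SOA owned by the delegated zone itself would pass.
    print(is_at_or_below("service.consul", "service.consul"))
```

Under that assumption, an empty NOERROR answer whose SOA is owned by `consul` fails the check when the delegated zone is `service.consul`, which matches the FORMERR and the resulting SERVFAIL described above.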