Fix answers for alt domain #11348

Merged
merged 9 commits into from
Oct 29, 2021
50 changes: 35 additions & 15 deletions agent/dns.go
@@ -348,6 +348,20 @@ func serviceIngressDNSName(service, datacenter, domain string, entMeta *structs.
 	return serviceCanonicalDNSName(service, "ingress", datacenter, domain, entMeta)
 }

+// getResponseDomain returns alt-domain if it is configured and request is made with alt-domain,
+// respects DNS case insensitivity
+func (d *DNSServer) getResponseDomain(questionName string) string {
+	labels := dns.SplitDomainName(questionName)
+	domain := d.domain
+	for i := len(labels) - 1; i >= 0; i-- {
+		currentSuffix := strings.Join(labels[i:], ".") + "."
+		if strings.EqualFold(currentSuffix, d.domain) || strings.EqualFold(currentSuffix, d.altDomain) {
+			domain = currentSuffix
+		}
+	}
+	return domain
+}
+
 // handlePtr is used to handle "reverse" DNS queries
 func (d *DNSServer) handlePtr(resp dns.ResponseWriter, req *dns.Msg) {
Contributor

@kbabuadze, @dhiaayachi : something I noticed based on this community member's question about PTR and alt_domain...

When looking for instances of .domain in this file, I noticed two instances (coincidentally related to PTR) which I'm not sure this PR addresses yet:

Contributor Author

Hi @jkirschner-hashicorp ,

This is the only use case that is not handled, and actually one I wanted to ask about.
Since the query looks like this: "2.0.0.127.in-addr.arpa.", we could return either the alt or the default domain. (We cannot decide based on the incoming request, as we do for other query types.)

Probably the correct approach is to return .alt-domain, since anybody who configured an alt-domain would expect it to be returned in PTR responses as well. But I will wait for your suggestions.

Contributor

I agree this is trickier to fix than the other instances: we would need to know which IP is associated with which domain. I think it's OK to keep it as the main domain and document it as a limitation.

Contributor

@jkirschner-hashicorp jkirschner-hashicorp Oct 21, 2021

I agree with @dhiaayachi that we should leave this as-is for now. Since the query itself (in this case) can't specify the domain (it must be <ip>.in-addr.arpa.), we must assume that the primary domain (.domain) should be used, not the alternate (.alt_domain).

> document it as a limitation

@dhiaayachi : were you thinking that we add that as a comment in the source code? I don't think we need to open an issue for it (as I don't see a way that could be supported with PTR queries).
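
To illustrate the limitation discussed above, here is a minimal standalone sketch of the suffix-matching logic from getResponseDomain. It is not the code from this PR: it takes the configured domains as plain parameters, and it splits labels with strings.Split rather than dns.SplitDomainName, so escaped dots are not handled (both are simplifying assumptions). The function name responseDomain is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// responseDomain mirrors the matching loop of getResponseDomain using only the
// standard library. Assumption: labels are separated by plain dots.
func responseDomain(questionName, domain, altDomain string) string {
	labels := strings.Split(strings.TrimSuffix(questionName, "."), ".")
	resp := domain // fall back to the primary domain
	for i := len(labels) - 1; i >= 0; i-- {
		suffix := strings.Join(labels[i:], ".") + "."
		if strings.EqualFold(suffix, domain) || strings.EqualFold(suffix, altDomain) {
			resp = suffix // keep the suffix exactly as it appeared in the question
		}
	}
	return resp
}

func main() {
	questions := []string{
		"consul.service.test-domain.", // matches the alternate domain
		"consul.service.TEST-DOMAIN.", // matched case-insensitively
		"2.0.0.127.in-addr.arpa.",     // PTR name: no suffix matches
	}
	for _, q := range questions {
		fmt.Printf("%s -> %s\n", q, responseDomain(q, "consul.", "test-domain."))
	}
}
```

For the PTR name neither configured domain is a suffix of the question, so the function returns the primary domain, which is the behavior agreed on in this thread.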

Contributor

@jkirschner-hashicorp I was thinking about adding it somewhere in the docs, maybe as part of the altdomain description.

Contributor Author

Pushed an edit to the docs.

Optionally, I can add the following section to dns.mdx:

Alternate Domain

If you are using -alt-domain option, Consul will respond based on the presence of it in your queries.
For exmaple if test-domain is configured as an alternative domain the following query:

$ dig @127.0.0.1 -p 8600  consul.service.test-domain SRV

will return:

;; QUESTION SECTION:
;consul.service.test-domain.	IN	SRV

;; ANSWER SECTION:
consul.service.test-domain. 0	IN	SRV	1 1 8300 machine.node.dc1.test-domain.

;; ADDITIONAL SECTION:
machine.node.dc1.test-domain. 0	IN	A	127.0.0.1
machine.node.dc1.test-domain. 0	IN	TXT	"consul-network-segment="

-> Note: Response to <ip>.in-addr.arpa. will always be returned with your default domain, as there is no way to identify queried domain.
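
The same lookup can also be sketched from Go with the standard library's net.Resolver pointed at the local agent. This is only an illustration: the address 127.0.0.1:8600 and the name consul.service.test-domain. are taken from the dig example above, the helper name newAgentResolver is hypothetical, and the program needs a running agent configured with that alternate domain to return any records.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newAgentResolver returns a resolver that sends every DNS query to agentAddr
// instead of the system resolvers. The helper name is hypothetical.
func newAgentResolver(agentAddr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use Go's built-in resolver so that Dial is honored
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, agentAddr)
		},
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	r := newAgentResolver("127.0.0.1:8600")
	// Empty service and proto arguments make LookupSRV query the name directly.
	_, srvs, err := r.LookupSRV(ctx, "", "", "consul.service.test-domain.")
	if err != nil {
		fmt.Println("lookup failed (is an agent listening on 127.0.0.1:8600?):", err)
		return
	}
	for _, s := range srvs {
		fmt.Printf("%s:%d\n", s.Target, s.Port)
	}
}
```

With the change in this PR, the printed targets would be expected to end in the alternate domain rather than the default one.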

Contributor

I like this idea (particularly the example). I'll need to spend a few minutes thinking about the right place for this to go in the linear flow of dns.mdx. Did you have an idea of where it should go?

Contributor Author

Hi @jkirschner-hashicorp ,
I'm not sure what would be the best place for it.
Maybe before
https://www.consul.io/docs/discovery/dns#caching,
but not as a sub section of
https://www.consul.io/docs/discovery/dns#service-lookups

Contributor

@kbabuadze : I think that's a good place for it. There are a few minor things I think we can improve first though, such as:

  • should have an opening sentence explaining what an alternative domain is / when you might use it with Consul
  • spelling fix: "exmaple"
  • change -alt-domain to alt_domain (to be consistent with how it's described elsewhere on the page)

It will be easiest to discuss those changes further by commenting on a diff in a PR. However, given that everything else is already finished and approved, we have two choices:

  • Merge as is, then work with you separately on a small PR to make this docs improvement, OR
  • Discuss the above change in this PR

I'm inclined to do the first option (merge as is). What do you think?

Contributor Author

@jkirschner-hashicorp I agree, let's go with the first option, and I'll open a small PR for the docs improvement.

 	q := req.Question[0]
@@ -485,14 +499,14 @@ func (d *DNSServer) handleQuery(resp dns.ResponseWriter, req *dns.Msg) {

 	switch req.Question[0].Qtype {
 	case dns.TypeSOA:
-		ns, glue := d.nameservers(cfg, maxRecursionLevelDefault)
+		ns, glue := d.nameservers(req.Question[0].Name, cfg, maxRecursionLevelDefault)
 		m.Answer = append(m.Answer, d.soa(cfg, q.Name))
 		m.Ns = append(m.Ns, ns...)
 		m.Extra = append(m.Extra, glue...)
 		m.SetRcode(req, dns.RcodeSuccess)

 	case dns.TypeNS:
-		ns, glue := d.nameservers(cfg, maxRecursionLevelDefault)
+		ns, glue := d.nameservers(req.Question[0].Name, cfg, maxRecursionLevelDefault)
 		m.Answer = ns
 		m.Extra = glue
 		m.SetRcode(req, dns.RcodeSuccess)
@@ -550,7 +564,7 @@ func (d *DNSServer) addSOA(cfg *dnsConfig, msg *dns.Msg, questionName string) {
 // nameservers returns the names and ip addresses of up to three random servers
 // in the current cluster which serve as authoritative name servers for zone.

-func (d *DNSServer) nameservers(cfg *dnsConfig, maxRecursionLevel int) (ns []dns.RR, extra []dns.RR) {
+func (d *DNSServer) nameservers(questionName string, cfg *dnsConfig, maxRecursionLevel int) (ns []dns.RR, extra []dns.RR) {
 	out, err := d.lookupServiceNodes(cfg, serviceLookup{
 		Datacenter: d.agent.config.Datacenter,
 		Service:    structs.ConsulServiceName,
@@ -578,14 +592,14 @@ func (d *DNSServer) nameservers(cfg *dnsConfig, maxRecursionLevel int) (ns []dns
 			d.logger.Warn("Skipping invalid node for NS records", "node", name)
 			continue
 		}
-
-		fqdn := name + ".node." + dc + "." + d.domain
+		respDomain := d.getResponseDomain(questionName)
+		fqdn := name + ".node." + dc + "." + respDomain
 		fqdn = dns.Fqdn(strings.ToLower(fqdn))

 		// NS record
 		nsrr := &dns.NS{
 			Hdr: dns.RR_Header{
-				Name:   d.domain,
+				Name:   respDomain,
 				Rrtype: dns.TypeNS,
 				Class:  dns.ClassINET,
 				Ttl:    uint32(cfg.NodeTTL / time.Second),
@@ -662,6 +676,9 @@ func (d *DNSServer) dispatch(remoteAddr net.Addr, req, resp *dns.Msg, maxRecursi
 	// have to deref to clone it so we don't modify (start from the agent's defaults)
 	var entMeta = d.defaultEnterpriseMeta

+	// Choose correct response domain
+	respDomain := d.getResponseDomain(req.Question[0].Name)
+
 	// Get the QName without the domain suffix
 	qName := strings.ToLower(dns.Fqdn(req.Question[0].Name))
 	qName = d.trimDomain(qName)
@@ -833,7 +850,7 @@ func (d *DNSServer) dispatch(remoteAddr net.Addr, req, resp *dns.Msg, maxRecursi
 			//check if the query type is A for IPv4 or ANY
 			aRecord := &dns.A{
 				Hdr: dns.RR_Header{
-					Name:   qName + d.domain,
+					Name:   qName + respDomain,
 					Rrtype: dns.TypeA,
 					Class:  dns.ClassINET,
 					Ttl:    uint32(cfg.NodeTTL / time.Second),
@@ -854,7 +871,7 @@ func (d *DNSServer) dispatch(remoteAddr net.Addr, req, resp *dns.Msg, maxRecursi
 			//check if the query type is AAAA for IPv6 or ANY
 			aaaaRecord := &dns.AAAA{
 				Hdr: dns.RR_Header{
-					Name:   qName + d.domain,
+					Name:   qName + respDomain,
 					Rrtype: dns.TypeAAAA,
 					Class:  dns.ClassINET,
 					Ttl:    uint32(cfg.NodeTTL / time.Second),
@@ -1535,13 +1552,14 @@ func findWeight(node structs.CheckServiceNode) int {
 	}
 }

-func (d *DNSServer) encodeIPAsFqdn(dc string, ip net.IP) string {
+func (d *DNSServer) encodeIPAsFqdn(questionName string, dc string, ip net.IP) string {
 	ipv4 := ip.To4()
+	respDomain := d.getResponseDomain(questionName)
 	if ipv4 != nil {
 		ipStr := hex.EncodeToString(ip)
-		return fmt.Sprintf("%s.addr.%s.%s", ipStr[len(ipStr)-(net.IPv4len*2):], dc, d.domain)
+		return fmt.Sprintf("%s.addr.%s.%s", ipStr[len(ipStr)-(net.IPv4len*2):], dc, respDomain)
 	} else {
-		return fmt.Sprintf("%s.addr.%s.%s", hex.EncodeToString(ip), dc, d.domain)
+		return fmt.Sprintf("%s.addr.%s.%s", hex.EncodeToString(ip), dc, respDomain)
 	}
 }

@@ -1623,13 +1641,14 @@ func (d *DNSServer) makeRecordFromNode(node *structs.Node, qType uint16, qName s
 // Otherwise it will return a IN A record
 func (d *DNSServer) makeRecordFromServiceNode(dc string, serviceNode structs.CheckServiceNode, addr net.IP, req *dns.Msg, ttl time.Duration) ([]dns.RR, []dns.RR) {
 	q := req.Question[0]
+	respDomain := d.getResponseDomain(q.Name)

 	ipRecord := makeARecord(q.Qtype, addr, ttl)
 	if ipRecord == nil {
 		return nil, nil
 	}

 	if q.Qtype == dns.TypeSRV {
-		nodeFQDN := fmt.Sprintf("%s.node.%s.%s", serviceNode.Node.Node, dc, d.domain)
+		nodeFQDN := fmt.Sprintf("%s.node.%s.%s", serviceNode.Node.Node, dc, respDomain)
 		answers := []dns.RR{
 			&dns.SRV{
 				Hdr: dns.RR_Header{
@@ -1664,7 +1683,7 @@ func (d *DNSServer) makeRecordFromIP(dc string, addr net.IP, serviceNode structs
 	}

 	if q.Qtype == dns.TypeSRV {
-		ipFQDN := d.encodeIPAsFqdn(dc, addr)
+		ipFQDN := d.encodeIPAsFqdn(q.Name, dc, addr)
 		answers := []dns.RR{
 			&dns.SRV{
 				Hdr: dns.RR_Header{
@@ -1833,11 +1852,12 @@ func (d *DNSServer) serviceSRVRecords(cfg *dnsConfig, dc string, nodes structs.C

 		answers, extra := d.nodeServiceRecords(dc, node, req, ttl, cfg, maxRecursionLevel)

+		respDomain := d.getResponseDomain(req.Question[0].Name)
 		resp.Answer = append(resp.Answer, answers...)
 		resp.Extra = append(resp.Extra, extra...)

 		if cfg.NodeMetaTXT {
-			resp.Extra = append(resp.Extra, d.generateMeta(fmt.Sprintf("%s.node.%s.%s", node.Node.Node, dc, d.domain), node.Node, ttl)...)
+			resp.Extra = append(resp.Extra, d.generateMeta(fmt.Sprintf("%s.node.%s.%s", node.Node.Node, dc, respDomain), node.Node, ttl)...)
 		}
 	}
 }