net: pure Go dns does not handle UDP packets > 512 octets #13561

Closed
@waddles

I have found a bug in the pure Go resolver that causes Dial to fail even though the server returns correct responses. The problem manifested itself as a failure when searching for Docker images in docker 1.9.1 built with go1.4.3:

DEBU[0003] Calling GET /v1.21/images/search
INFO[0003] GET /v1.21/images/search?term=phusion
DEBU[0003] hostDir: /etc/docker/certs.d/docker.io
DEBU[0003] pinging registry endpoint https://index.docker.io/v1/
DEBU[0003] attempting v1 ping for registry endpoint https://index.docker.io/v1/
DEBU[0003] Index server: https://index.docker.io/v1/
ERRO[0003] Handler for GET /v1.21/images/search returned error: Get https://index.docker.io/v1/search?q=phusion: dial tcp: lookup index.docker.io on x.x.x.x:53: no such host
ERRO[0003] HTTP Error                                    err=Get https://index.docker.io/v1/search?q=phusion: dial tcp: lookup index.docker.io on x.x.x.x:53: no such host statusCode=404

Similar issues are #12712 and #12778

I have tested go1.3.0 and it did not exhibit this problem, but at least 1.4.3, 1.5.1 and 1.5.2 are affected.

Tested using the following code:

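package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "os"
)

// fetch GETs the URL given as the only argument and prints the body.
func main() {
    if len(os.Args) != 2 {
        log.Fatalf("usage: %s URL", os.Args[0])
    }
    // http.Get resolves the host name first; this is where the lookup fails.
    resp, err := http.Get(os.Args[1])
    if err != nil {
        log.Fatal(err)
    }
    page, err := ioutil.ReadAll(resp.Body)
    resp.Body.Close()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s", page)
}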

Note: we run an IPv4-only stack. IPv6 is disabled on the client.

Test fails with the pure Go resolver

# /tmp/fetch-1.5.2 https://index.docker.io/v1/search?q=phusion
2015/12/10 12:17:42 Get https://index.docker.io/v1/search?q=phusion: dial tcp: lookup index.docker.io on x.x.x.x:53: no such host
# docker search phusion
Error response from daemon: Get https://index.docker.io/v1/search?q=phusion: dial tcp: lookup index.docker.io on x.x.x.x:53: no such host

DNS packet capture: PureGO_correct_but_fails.txt

Here we see that the AAAA response contains only CNAME RRs and no AAAA RRs. The resolver queries all four name servers, and all four responses contain CNAME and A RRs, which the resolver ignores. It then retries with the search domain (from /etc/resolv.conf) appended and finally concludes that the host cannot be found.
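To make the distinction concrete, here is a minimal sketch of how a single response could be classified. It is not the net package's actual code: the `rr` type, the `classify` function and the outcome names are hypothetical, and `target.example.` and `192.0.2.10` are placeholder values, not the records from the capture. The point is that a NOERROR response carrying only CNAME RRs means "the name exists but has no record of this type", which is different from NXDOMAIN.

```go
package main

import "fmt"

// rr is a simplified, hypothetical view of a DNS resource record.
type rr struct {
	Name string
	Type string // "A", "AAAA", "CNAME", ...
	Data string
}

// outcome describes what one response means for the question that was asked.
type outcome int

const (
	hasAnswer   outcome = iota // at least one record of the requested type
	noData                     // name exists, but not with the requested type
	nameError                  // server answered NXDOMAIN
	serverError                // any other non-zero response code
)

// classify interprets a single response. rcode is the DNS response code
// (0 = NOERROR, 3 = NXDOMAIN) and qtype is the record type that was queried.
func classify(rcode int, qtype string, answers []rr) outcome {
	switch {
	case rcode == 3:
		return nameError
	case rcode != 0:
		return serverError
	}
	for _, r := range answers {
		if r.Type == qtype {
			return hasAnswer
		}
	}
	// NOERROR with only CNAMEs (or an empty answer section) for this type.
	return noData
}

func main() {
	// Placeholder records shaped like the responses described above.
	cname := rr{Name: "index.docker.io.", Type: "CNAME", Data: "target.example."}
	aaaaResp := []rr{cname}
	aResp := []rr{cname, {Name: "target.example.", Type: "A", Data: "192.0.2.10"}}

	fmt.Println("AAAA is noData:", classify(0, "AAAA", aaaaResp) == noData)
	fmt.Println("A has answer:", classify(0, "A", aResp) == hasAnswer)
}
```

Under this reading, the AAAA query in the capture ends in `noData`, so it should not by itself cause the lookup to be reported as "no such host".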

Test succeeds when forced to use the cgo resolver

(By setting LOCALDOMAIN in the environment, as described in the docs)

# LOCALDOMAIN= /tmp/fetch-1.5.2 https://index.docker.io/v1/search?q=phusion
{"num_pages": 11, "num_results": 257, "results": [{"is_automated": true, ...

DNS packet capture: CGO_correct_but_succeeds.txt

Here we see that the same responses to the AAAA and A queries are returned, but this time the resolver accepts the A records and the dial call succeeds.

Summary

The name server returns correct responses for both the A and AAAA queries, but the pure Go resolver ignores the A records when the AAAA response contains no AAAA RRs.
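The behaviour I would expect can be sketched as follows. This is only an illustration under my own assumptions, not the resolver's implementation: `mergeLookups` and `errNoSuchHost` are hypothetical names, and `192.0.2.10` is a placeholder address. The lookup should fail only when neither query produced an address.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchHost is a hypothetical stand-in for the resolver's error.
var errNoSuchHost = errors.New("no such host")

// mergeLookups combines the addresses obtained from the A and AAAA queries.
// An empty result for one address family must not discard the other family.
func mergeLookups(a, aaaa []string) ([]string, error) {
	addrs := make([]string, 0, len(a)+len(aaaa))
	addrs = append(addrs, a...)
	addrs = append(addrs, aaaa...)
	if len(addrs) == 0 {
		// Only report failure when both families came back empty.
		return nil, errNoSuchHost
	}
	return addrs, nil
}

func main() {
	// A query returned one address, AAAA query returned none.
	addrs, err := mergeLookups([]string{"192.0.2.10"}, nil)
	fmt.Println(addrs, err)

	// Neither query returned an address.
	_, err = mergeLookups(nil, nil)
	fmt.Println(err)
}
```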

I would also have expected that, since IPv6 is disabled, querying for AAAA records would be redundant, but the C library does it too. That may be because many hosts have both stacks enabled but only IPv4 routing configured, so an AAAA query may be valid and successful yet still require an A query. Unfortunately, that doubles the DNS server's load for every address resolution.

Recommendations

  • Review Section 3 ("Expected Behavior") of RFC 4074
  • Enable AAAA queries only if the IPv6 stack is enabled
