This repository has been archived by the owner on Mar 24, 2022. It is now read-only.

[support] guardian inside docker cannot access docker-local DNS #18

Closed
RoboPhred opened this issue May 14, 2016 · 18 comments


RoboPhred commented May 14, 2016

Garden containers cannot resolve DNS names when the worker is run inside Docker via docker-compose, which supplies its own DNS server on a loopback address for resolving the names of other containers.

On the UI side, it simply says no versions are available, but the logs show error 500 on check, and hijacking the check container shows that all DNS lookups are being routed to the Docker container's local DNS at 127.0.0.11 and failing with "connection refused".

Is there a way to manually supply the DNS server for Garden?
Even better, is there a way to have Garden use the host's network directly, so that it has access to the additional DNS names managed by Docker?

Additional details:

Problem occurs on both v1.3.0-rc.9 and v1.3.0-rc.35

Relevant bits of the pipeline

resource_types:
- name: svn-resource
  type: docker-image
  source:
    repository: robophred/concourse-svn-resource
    tag: alpha

resources:
- name: src
  type: svn-resource
  source:
    repository: {{repository}}
    trust_server_cert: true
    username: {{username}}
    password: {{password}}

Hijacking the check container, it seems to be set up for docker-image. I manually sent a request to check "svn-resource":

/opt/resource # ./check
{"source":{"repository":"robophred/concourse-svn-resource","tag":"alpha"}}

failed to ping registry: 2 error(s) occurred:

* ping https: Get https://registry-1.docker.io/v2: dial tcp: lookup registry-1.docker.io on 127.0.0.11:53: read udp 127.0.0.1:53962->127.0.0.11:53: read: connection refused
* ping http: Get http://registry-1.docker.io/v2: dial tcp: lookup registry-1.docker.io on 127.0.0.11:53: read udp 127.0.0.1:36771->127.0.0.11:53: read: connection refused

More exploring shows that all DNS resolution is aimed at 127.0.0.11 and failing. However, I can still ping by IP.

/opt/resource # nslookup www.google.com
Server:    127.0.0.11
Address 1: 127.0.0.11

nslookup: can't resolve 'www.google.com'
/opt/resource #
/opt/resource # ping www.google.com
ping: bad address 'www.google.com'
/opt/resource #
/opt/resource # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=52 time=11.523 ms
64 bytes from 8.8.8.8: seq=1 ttl=52 time=12.456 ms

Connecting to the Docker container itself, network access works just fine:

root@cc9867a25143:~/concourse# cat /etc/resolv.conf
search mycompany.com
nameserver 127.0.0.11
options ndots:0
root@cc9867a25143:~/concourse# ping www.google.com
PING www.google.com (216.58.193.196) 56(84) bytes of data.
64 bytes from lax02s23-in-f4.1e100.net (216.58.193.196): icmp_seq=1 ttl=53 time=10.9 ms
64 bytes from lax02s23-in-f4.1e100.net (216.58.193.196): icmp_seq=2 ttl=53 time=10.9 ms

Worker Dockerfile:

FROM ubuntu:14.04

RUN \
  apt-get update && \
  apt-get -y install \
    iptables \
    quota \
    ulogd \
    curl \
  && \
  apt-get clean

RUN \
  mkdir -p /root/concourse && \
  cd /root/concourse && \
  curl -OL https://github.com/concourse/bin/releases/download/v1.3.0-rc.35/concourse_linux_amd64 && \
  chmod +x concourse_linux_amd64

WORKDIR /root/concourse

COPY concourse-worker-exec .
COPY host_key.pub .
COPY worker_key .
RUN chmod +x concourse-worker-exec

ENTRYPOINT /root/concourse/concourse-worker-exec

concourse-worker-exec

#!/bin/sh

set -e

mkdir -p /tmp/concourse-workdir

# Back the container graph and overlay dirs with tmpfs mounts.
mkdir /tmp/concourse-workdir/graph
mount -t tmpfs none /tmp/concourse-workdir/graph

mkdir /tmp/concourse-workdir/overlays
mount -t tmpfs none /tmp/concourse-workdir/overlays

./concourse_linux_amd64 worker \
  --work-dir /tmp/concourse-workdir \
  --tsa-host "${TSA_HOST}" \
  --tsa-public-key host_key.pub \
  --tsa-worker-private-key worker_key

concourse-bot commented May 14, 2016

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

  • #119571925 [support] garden/guardian inside docker cannot access DNS
  • #126454419 reopened: [support] guardian inside docker cannot access docker-local DNS

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

@RoboPhred RoboPhred changed the title [support] garden/guardian inside docker cannot access DNS [support] guardian inside docker cannot access docker-local DNS May 14, 2016

vito commented May 16, 2016

You could try tacking a --garden-dns-server flag onto the concourse worker command. Make sure you're not using a 127.x address, as that will always try to resolve in the container's namespace - is there some other address you can reference? Maybe the host container's IP?
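As a concrete sketch (untested, and the address below is a placeholder rather than a value from this thread), the flag would be appended to the worker invocation from the exec script above:

```shell
# Sketch: worker invocation with an explicit DNS server for Garden containers.
# 172.18.0.1 is a placeholder; it must be an address reachable from inside
# the containers, not a 127.x.x.x loopback address.
./concourse_linux_amd64 worker \
  --work-dir /tmp/concourse-workdir \
  --tsa-host "${TSA_HOST}" \
  --tsa-public-key host_key.pub \
  --tsa-worker-private-key worker_key \
  --garden-dns-server 172.18.0.1
```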


RoboPhred commented May 16, 2016

Thanks, this should be what I need on the Concourse side to get things moving again. I am still having issues on the Docker side; Docker seems to redirect 127.0.0.11 to some mystery socket, and simply pointing --garden-dns-server at the host container's IP doesn't seem to work. Time to brush up on my networking skills...


vito commented Jun 3, 2016

Any word on this @RoboPhred? I'm not sure there's much left to do on our end, now that you at least know the flag that you'd be setting. May close this soon.

@RoboPhred

I have not had time to figure out how to get the docker dns to the worker, but with the flag I should have everything I need once I revisit this issue. Closing for now, will open a new issue if I hit any other snags.


cirocosta commented Jul 17, 2016

Hi! I'm currently facing this issue. To work around it I ended up adding socat to the worker's container, forwarding port 53 to 127.0.0.11:53 (where the embedded DNS lives):

socat UDP4-RECVFROM:53,fork UDP4-SENDTO:127.0.0.11:53 &

but I can't get the worker's runc containers to benefit from that server (the forwarding itself works as expected; see https://github.com/cirocosta/expose-edns). The inner containers are capable of pinging the worker, but I'm not sure if that's just some iptables trick, as I can't resolve any host by simply running nslookup <host_i_want_to_resolve> <worker_ip>.

Any ideas, @vito?

@vito vito added the question label Jul 17, 2016

cirocosta commented Jul 17, 2016

Just to add more info, the use case is:

  • web and worker run on a user-defined network n1
  • worker has a DNS server on 0.0.0.0:53/udp forwarded from 127.0.0.11:53 (Docker's embedded DNS server)
  • worker has --garden-dns-server set to its own address (hostname -i)
  • registry is also on user-defined network n1
  • the worker's docker container is capable of resolving the registry name, as its DNS server on 127.0.0.11:53 is working well
  • inner runc worker containers should be able to resolve registry, since the nameserver in their /etc/resolv.conf is properly set to the worker IP

Thx!


vito commented Jul 17, 2016

What ends up being in /etc/resolv.conf in the runc worker container? As long as the address can resolve to the outer container it should work. But if it falls within a private local network range (127.x.x.x, or an IP conflicting with the Guardian container's network range) it won't work.

@cirocosta

The expected address ends up in resolv.conf: 192.168.48.3 (the user-defined network is 192.168.48.0/20). By Guardian's IP range you mean the IPs that runc containers are being assigned, right? That's actually not the case here (they're going to 10.254.0.13/30, as I understood it) hmmm

Could you help me with understanding these rules?

root@3ca010dd57e0:/# iptables --list
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
w--input   all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
w--forward  all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain w--default (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED

Chain w--forward (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
w--instance-ofgd99so74q  all  --  10.254.0.14          anywhere            [goto] 
DROP       all  --  anywhere             anywhere            

Chain w--input (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere            

Chain w--instance-ofgd99so74q (1 references)
target     prot opt source               destination         
ACCEPT     all  --  10.254.0.12/30       10.254.0.12/30      
w--default  all  --  anywhere             anywhere            [goto] 

Chain w--instance-ofgd99so74q-log (0 references)
target     prot opt source               destination         
LOG        tcp  --  anywhere             anywhere             ctstate INVALID,NEW,UNTRACKED LOG level warning prefix "10cffbf5-f484-4637-573a-c5171"
RETURN     all  --  anywhere             anywhere    

Am I correct that packets from 10.254.0.14 are being dropped? If so, the inner container wouldn't be capable of talking to the worker, right?


vito commented Jul 17, 2016

Yeah, come to think of it, it's probably defaulting to rejecting traffic to the host's network. Try passing the --garden-allow-host-access flag.
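For reference, a minimal sketch of how both flags would combine in the worker invocation from earlier in the thread (untested; it assumes the same exec script and the DNS forwarder bound to the worker's own address, as described above):

```shell
# Sketch: allow Garden containers to reach services running on the worker
# itself (such as a DNS forwarder bound to the worker's address), and point
# their resolver at that address.
./concourse_linux_amd64 worker \
  --work-dir /tmp/concourse-workdir \
  --tsa-host "${TSA_HOST}" \
  --tsa-public-key host_key.pub \
  --tsa-worker-private-key worker_key \
  --garden-dns-server "$(hostname -i)" \
  --garden-allow-host-access
```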


cirocosta commented Jul 17, 2016

Hmmm, still not able to resolve, but the table has now changed to:

root@102aa617e5ff:/# iptables --list
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
w--input   all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
w--forward  all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain w--default (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED

Chain w--forward (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
w--instance-ofi078afcmn  all  --  10.254.0.10          anywhere            [goto] 
DROP       all  --  anywhere             anywhere            

Chain w--input (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere            

Chain w--instance-ofi078afcmn (1 references)
target     prot opt source               destination         
ACCEPT     all  --  10.254.0.8/30        10.254.0.8/30       
w--default  all  --  anywhere             anywhere            [goto] 

Chain w--instance-ofi078afcmn-log (0 references)
target     prot opt source               destination         
LOG        tcp  --  anywhere             anywhere             ctstate INVALID,NEW,UNTRACKED LOG level warning prefix "6fd74f69-5a73-4002-77ab-032d3"
RETURN     all  --  anywhere             anywhere  

Update: I had other containers still around; updated with the correct table.


vito commented Jul 17, 2016

paging dr. @julz

@cirocosta

Actually, the inner container is capable of accessing the worker: a request from the inner container to a Python server running on the worker was served successfully. Strange that DNS lookups still aren't working; maybe it has something to do with UDP. Need to check that.


cirocosta commented Jul 18, 2016

Little update on this: I got it working by explicitly binding to the desired interface in socat 👯:

socat UDP4-RECVFROM:53,fork,bind=$(hostname -i) UDP4-SENDTO:127.0.0.11:53 &

Is there a way of doing that without socat? Since we already have iptables in the worker container, could we do that forwarding with it? Or would Garden blindly remove the rule when updating the table whenever new containers are created?

Thx
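An untested sketch of what a socat-free, iptables-only forwarding could look like (the interface name eth0 is an assumption, and whether Garden rewrites the nat table when creating containers is exactly the open question raised above):

```shell
# Untested sketch: DNAT incoming DNS queries on the worker's interface to
# Docker's embedded DNS at 127.0.0.11. Because the target is a loopback
# address, route_localnet must be enabled on the interface (eth0 assumed).
sysctl -w net.ipv4.conf.eth0.route_localnet=1
iptables -t nat -A PREROUTING -i eth0 -p udp --dport 53 \
  -j DNAT --to-destination 127.0.0.11:53
```

If Garden does flush or rewrite these chains as containers come and go, the rule would need to be re-added, which would put socat back in the running as the simpler option.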

@cirocosta

Any ideas on this? I'd really like to remove socat from this setup.


vito commented Aug 8, 2016

Consolidating this into cloudfoundry/guardian#42 - if you actually want Docker's DNS working, it may be more productive to talk about it there, as it's primarily a question of how to configure Guardian this way. I don't think it's possible today.

Thanks for the info and updates, though - sorry I couldn't be of more help.

@vito vito closed this as completed Aug 8, 2016
gerhard added a commit to thechangelog/infrastructure that referenced this issue Sep 19, 2016
The default Linode DNS server fails to resolve FQDNs. The git check was
failing to clone from GitHub because github.com could not be resolved.

vmware-archive/bin#18

[#125480521]
@ionphractal

Just for info, I found this Gist helpful: https://gist.github.com/colthreepv/6b818cfcf296dc1b5c2cf15eb76a140e
I guess the environment variables did the trick for me.

 concourse-worker:
...
    environment:
      CONCOURSE_TSA_HOST: concourse-web
      CONCOURSE_GARDEN_ADDRESS: concourse-worker
      CONCOURSE_BAGGAGECLAIM_ADDRESS: concourse-worker
      CONCOURSE_GARDEN_FORWARD_ADDRESS: concourse-worker
      CONCOURSE_BAGGAGECLAIM_FORWARD_ADDRESS: concourse-worker
      CONCOURSE_GARDEN_DNS_SERVER: 8.8.8.8
    dns:
      - 8.8.8.8
      - 8.8.4.4

@PatrickWolleb

Setting iptables -I FORWARD -j ACCEPT on the worker instance does the trick.
