Unable to set `extra_hosts` when using consul-connect (bridged networking) #7746

spuder · 2020-04-19T02:38:02Z

Version

Nomad = 0.11.0
CNI Plugins = 0.8.4
Docker = 19.03.7, build 7141c199a2
Docker API = 1.40
OS = Ubuntu 18.04 (4.15.0-91-generic)

Problem

If you attempt to run a job that uses extra_hosts while using bridged networking, you will receive the following error.

      config {
        image = "bash"
        extra_hosts = [
          "foobar.example.com:127.0.0.1"
        ]
      }

failed to create container: API error (400): conflicting options: custom host-to-IP mapping and the network mode

This is a major problem because it means no Consul Connect-enabled job can use custom host entries.

I did find one related issue in Docker where --net=host and --add-host were mutually exclusive before Docker API version 1.12. I'm not sure which Docker API version Nomad is using, but 1.40 is the latest.

curl --unix-socket /var/run/docker.sock http://localhost/version | jq .ApiVersion
"1.40"

Steps to reproduce

Submit the following job

job "bash" {
  datacenters = ["dc1"]
  group "api" {
    network {
      mode = "bridge"
    }
    task "bash" {
      driver = "docker"
      config {
        image = "bash"
        args = ["/bin/sleep", "100000000"]
        extra_hosts = [
          "foobar.example.com:127.0.0.1"
        ]
      }
    }
  }
}

Workarounds

Don't use extra hosts
Bake the host entry into the docker container (Please add --add-host=[], --net options to docker build moby/moby#10324)
Update nomad to docker api 1.12 or newer?

Possibly Related:


Gufran · 2020-04-22T10:44:07Z

I can confirm that this problem is not limited to the extra_hosts attribute; it also affects dns_servers and the other DNS options.

I'm running Nomad v0.10.5 and Docker v18.06.1 with API version 1.38 and minimum version 1.12.

It looks like the problem was fixed in Docker API version v1.12.0, see:

Nomad is developed against Docker version 1.8.2 and 1.9 (Official docs), meaning API version 1.20 and above (See Docker version matrix).

For the time being I am unable to run connect enabled jobs with custom DNS servers because of this problem. The error I get is conflicting options: dns and the network mode.

Gufran · 2020-04-22T10:56:52Z

I tried to run a docker container using the command line and I am able to use --net=bridge with --dns=<ip> on the same machine where Nomad throws an error:

docker run --rm -it --net bridge --dns 1.1.1.1 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 1.1.1.1
options timeout:2 attempts:5

docker run --rm -it --net bridge --dns 8.8.8.8 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 8.8.8.8
options timeout:2 attempts:5

docker run --rm -it --net bridge --dns 8.8.8.8 --dns 1.1.1.1 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 8.8.8.8
nameserver 1.1.1.1
options timeout:2 attempts:5

Gufran · 2020-04-23T06:52:55Z

So I captured the traffic between Nomad and the Docker socket and it turns out that the network mode is container, not bridge.
I don't understand everything yet, but I suspect it has to do with the fact that Nomad uses CNI plugins to set up networking, and there is an intermediate container acting as the network bridge and gateway.

The new information for me is that the network mode specified in the jobspec is used for some other purpose. I tried to run a container with this new configuration e.g. --net container:container-id --dns 1.1.1.1 and it failed with the same error docker: Error response from daemon: conflicting options: dns and the network mode.

Gufran · 2020-04-23T08:21:58Z

Did some more digging. Now I'm certain that this is because of the CNI based network setup.

Here is the call trace of network setup before the allocation is started:

client/allocrunner/allocRunner.Run(): client/allocrunner/alloc_runner.go#L298
client/allocrunner/allocRunner.prerun(): client/allocrunner/alloc_runner_hooks.go#L201
client/allocrunner/networkHook.PreRun(): client/allocrunner/network_hook.go#L76
client/allocrunner/bridgeNetworkConfigurator.Setup(): client/allocrunner/networking_bridge_linux.go#L161

In my opinion the final call to cni.Setup() should also be given the DNS configuration if one is specified in the jobspec. Something like

dnsConfig := cni.DNS{
  Servers: []string{"1.1.1.1"},
  Searches: []string{},
  Options: []string{},
}

b.cni.Setup(ctx,
            alloc.ID,
            spec.Path,
            cni.WithCapabilityPortMap(getPortMapping(alloc)),
            cni.WithCapabilityDNS(dnsConfig))

should do the job just fine.

I can try this change locally in a while, but it'd be great if someone who knows the codebase can verify the correctness of this patch in the meantime.

Gufran · 2020-04-23T08:26:44Z

@nickethier could you offer some insight here please?

nickethier · 2020-04-28T19:18:52Z

Hey all, I missed this one when linking issues, but the DNS part of this issue is merged and will be in the next major release. See: #7661

We're still evaluating the extra_hosts option, as it's not something CNI supports directly. Under bridge mode, the Docker tasks are using network-mode=container:, which I don't think works with the Docker extra_hosts flag. The linked issue is for network_mode=host specifically.

nickethier · 2020-04-29T03:17:33Z

With regard to the extra_hosts option, would using a template block to write out an /etc/hosts file work? It's definitely not an ideal solution, but it might be a workaround in the interim.

spuder · 2020-04-29T03:59:21Z

That's a great idea. I think that may be a viable workaround.

127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters

task "app" {
      driver = "docker"
      config {
        image = "<%= ENV['CI_REGISTRY_IMAGE'] %>:<%= ENV['CI_COMMIT_SHA'] %>"
        volumes = [
          "local/etc/hosts:/etc/hosts",
.....


      template {
        data = <<EOH
127.0.0.1	localhost
127.0.1.1 dev-vault.example.com
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
        EOH
        destination = "local/etc/hosts"
      }

donjon-matter · 2020-06-24T08:21:26Z

Maybe my problem is somewhat similar, so I hope I can post it here.
When running Nomad with Consul Connect, /etc/hosts may look like this:

127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback

But I expected something like:

127.0.0.1	localhost
**172.0.0.2	abcxyzaaa**
::1	localhost ip6-localhost ip6-loopback

The bold line is the hostname entry of the Docker container. My application runs on Java, and many libraries rely on the hostname, which causes errors when they try to resolve abcxyzaaa.

tgross · 2021-01-27T14:15:13Z

Looks like we've identified a workaround for the upstream issue. I'm going to mark this as a docs issue so that we can provide some official guidance in the networking and/or Connect docs for folks.

eihli · 2021-01-29T18:42:30Z

The workaround requires another workaround in the case of wanting to use the special host-gateway string in Docker's add-host. moby/moby#40007

I'm wanting to use extra_hosts = ["host.docker.internal:host-gateway"] but I'm hitting this error. So although templating in /etc/hosts is a workaround, it comes with the additional complexity of getting the address of the host gateway into the container.

Oloremo · 2021-02-04T23:38:03Z

This is an issue for us: we need the app running inside the container to be able to resolve the randomly assigned container hostname. We're running containers in bridge mode, and it seems like nothing really works for that case.

We tried to template the /etc/hosts with {{ env "HOSTNAME" }} but it returns nothing for some reason while other ENV vars work just fine.

Any ideas or workarounds are welcome.

Ilhicas · 2021-02-10T19:13:10Z

Just leaving a comment, as this breaks a lot of Java-based applications that rely on hostname -i resolution, which can't be done here. We are hitting this issue and working around it by combining a template with hostname -I to resolve the address and fix it up in /etc/hosts. This is not a viable or generic solution: it requires a lot of tooling in the running image, plus changing the entrypoint to run this at start time, which is far from ideal.

Legogris · 2021-02-16T07:42:54Z

What is the workaround actually? Templating into /etc doesn't work in docker.

spuder · 2021-02-20T00:05:53Z

The workaround is to create a new etc/hosts file at some arbitrary location like the nomad path '/local/etc/hosts' then doing a volume mount to overwrite '/etc/hosts' with '/local/etc/hosts'

 "local/etc/hosts:/etc/hosts",
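
For anyone landing here, the workaround might look roughly like this when applied to the reproduction job from the top of this issue. This is only a sketch: the job, group, and task names are taken from that reproduction job, the hosts entries are illustrative, and the foobar.example.com line stands in for whatever extra_hosts would otherwise have added.

```hcl
job "bash" {
  datacenters = ["dc1"]

  group "api" {
    network {
      mode = "bridge"
    }

    task "bash" {
      driver = "docker"

      config {
        image = "bash"
        args  = ["/bin/sleep", "100000000"]

        # Bind the rendered file over the container's /etc/hosts instead of
        # setting extra_hosts, which Docker rejects in this network mode.
        volumes = [
          "local/etc/hosts:/etc/hosts"
        ]
      }

      # Render the replacement hosts file into the task's local/ directory.
      template {
        data = <<EOH
127.0.0.1 localhost
127.0.0.1 foobar.example.com
::1 localhost ip6-localhost ip6-loopback
        EOH
        destination = "local/etc/hosts"
      }
    }
  }
}
```

The template destination and the host side of the volume entry need to be the same relative path, since that path is what ties the rendered file to the mount.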

Oloremo · 2021-03-08T14:09:20Z

containerd driver added that: Roblox/nomad-driver-containerd#69

tgross · 2021-06-16T14:35:26Z

> The workaround is to create a new etc/hosts file at some arbitrary location like the nomad path '/local/etc/hosts' then doing a volume mount to overwrite '/etc/hosts' with '/local/etc/hosts'

#10766 will do that for the docker driver, and provides infrastructure for community task drivers to do the same. The exec/java driver has some complications on that (see #10768).

DejfCold · 2021-08-13T02:35:42Z

Hi, @tgross!
I've encountered this on v1.1.3.
What are all the requirements for the extra_hosts to be added? Just that it's group.network.mode=bridge and some task.config.extra_host=["hostname:ip"] is present?

I'm asking because I do have that and it's kinda not working.

To be exact, I have: (To be clear, I'm just asking what are the requirements for it to work, not how exactly should I fix my thing ... although that would be also appreciated :) )

// stuff
    group "freeipa" {
        network {
            mode = "bridge"
        }
        service {
            name = "freeipa"
            port = "443"
            connect {
                sidecar_service {}
            }
        }
        task "freeipa" {
            resources {
                memory = 2000
            }
            driver = "docker"
            config {
                image = "freeipa/freeipa-server:centos-8"
                args = [ "ipa-server-install", "-U", "-r", "DC1.CONSUL", "--no-ntp" ]
                sysctl = {
                    "net.ipv6.conf.all.disable_ipv6" = "0"
                }
                extra_hosts = ["freeipa.ingress.dc1.consul:127.0.0.1"]
            }
            env {
                HOSTNAME = "freeipa.ingress.dc1.consul"
                PASSWORD = "testtest"
            }
        }
    }
// stuff

which results in

[root@63ec9326ae5a /]# cat /etc/hosts
# this file was generated by Nomad
127.0.0.1 localhost
::1 localhost
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

# this entry is the IP address and hostname of the allocation
# shared with tasks in the task group's network
172.26.65.239 63ec9326ae5a
[root@63ec9326ae5a /]#

tgross · 2021-08-14T19:01:50Z

Hi @DejfCold!

> Just that it's group.network.mode=bridge and some task.config.extra_host=["hostname:ip"] is present?

The requirements from driver.go#L963-L972 are:

group.network.mode = "bridge"
task.config.extra_hosts = ["hostname:ip"]
task.config.network_mode is left unset.

Your jobspec there looks ok to me. The tests in mount_unix_test.go look to cover this use case well. So this looks like it might be a bug. While I wrote this feature I'm no longer at HashiCorp as a Nomad maintainer, so I'd recommend opening a new issue describing the problem so that the maintainers will be sure to see it. Thanks!
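
As a rough illustration of the three requirements listed above, a minimal jobspec that should satisfy them might look like the following. It is only a sketch: the job, group, and task names and the host entry are borrowed from the reproduction job at the top of this issue, and it does not include a Connect sidecar.

```hcl
job "bash" {
  datacenters = ["dc1"]

  group "api" {
    # Requirement 1: the group network is in bridge mode.
    network {
      mode = "bridge"
    }

    task "bash" {
      driver = "docker"

      config {
        image = "bash"
        args  = ["/bin/sleep", "100000000"]

        # Requirement 2: extra_hosts entries in "hostname:ip" form.
        extra_hosts = ["foobar.example.com:127.0.0.1"]

        # Requirement 3: network_mode is deliberately not set here.
      }
    }
  }
}
```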

github-actions · 2022-10-17T02:45:21Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

spuder changed the title ~~Bridged networking not compatible with docker extra_hosts~~ Unable to set extra_hosts when using consul-connect (bridged networking) Apr 19, 2020

shoenig added the theme/dependencies and theme/driver/docker labels Apr 28, 2020

spuder mentioned this issue Jul 20, 2020

Nomad fails to run job if using network mode bridge and dns options #8431

Closed

tgross added the theme/docs label Jan 27, 2021

Ilhicas mentioned this issue Feb 11, 2021

hostname not populated in /etc/hosts for Docker tasks with Connect #8900

Closed

tgross self-assigned this Jun 9, 2021

tgross mentioned this issue Jun 16, 2021

docker: generate /etc/hosts file for bridge network mode #10766

Merged

tgross linked a pull request Jun 16, 2021 that will close this issue

docker: generate /etc/hosts file for bridge network mode #10766

Merged

tgross added this to the 1.1.2 milestone Jun 16, 2021

tgross removed the theme/dependencies and theme/docs labels Jun 16, 2021

tgross added the type/enhancement label Jun 16, 2021

tgross closed this as completed in #10766 Jun 16, 2021

DejfCold mentioned this issue Aug 14, 2021

duplicate task.config.extra_hosts to Connect sidecar tasks #11056

Open

github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2022
