Docker cannot pull images #137

johannmayer · 2022-01-20T22:29:46Z

Hi,

I just installed colima on a MacBook Pro with Big Sur 11.6.2

colima version 0.3.2
git commit: 272db4732b90390232ed9bdba955877f46a50552

runtime: docker
arch: x86_64
client: v20.10.12
server: v20.10.11

When I try to pull an image with docker, I get an i/o timeout error. It seems that the colima VM doesn't have an internet connection.

docker pull maven
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.5.3:53: read udp 192.168.5.15:56157->192.168.5.3:53: i/o timeout

Are there any post-install steps to get a connection?
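
When comparing reports of this error, it helps to know which DNS server the lookup was sent to; in these messages it is the address that follows "on". A minimal sketch for pulling it out of an error line (resolver_from_error is a hypothetical helper name, not part of docker or colima):

```shell
# Hypothetical helper: print the resolver address (host:port) that a
# "dial tcp: lookup HOST on ADDR: ..." error message says was queried.
# Prints nothing if the message does not have that shape.
resolver_from_error() {
  printf '%s\n' "$1" | sed -n 's/.*lookup [^ ]* on \([^ ]*\): .*/\1/p'
}
```

Passing the error above as a single quoted argument should print 192.168.5.3:53.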

abiosoft · 2022-01-21T04:52:19Z

Are you behind a VPN connection?

johannmayer · 2022-01-21T06:20:37Z

Yes, I am behind a corporate VPN connection.

spkane · 2022-01-21T17:36:00Z

I am not on a VPN or using docker with colima, but I see a similar issue:

I get a DNS-related error on my first build with nerdctl via containerd after I have started the Alpine VM.
Simply re-running the command fixes things until I restart the VM.

First Try:

$ nerdctl build --namespace k8s.io --platform linux/amd64 -t test/test:local -f ./Dockerfile .
[+] Building 0.2s (4/4) FINISHED
...
error: failed to solve: alpine:latest: failed to do request: Head "https://registry-1.docker.io/v2/library/alpine/manifests/latest": dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:45220->[::1]:53: read: connection refused
FATA[0000] unrecognized image format
FATA[0000] exit status 1

Second Try:

$ nerdctl build --namespace k8s.io --platform linux/amd64 -t test/test:local -f ./Dockerfile .
[+] Building 0.2s (4/4) FINISHED
...
[+] Building 9.7s (7/17)
 => [internal] load build definition from Dockerfile                                  0.1s
 => => transferring dockerfile: 580B                                                  0.1s
 => [internal] load .dockerignore                                                     0.1s
 => => transferring context: 306B                                                     0.1s
 => [internal] load metadata for docker.io/library/alpine:latest                      0.4s
 => [internal] load metadata for docker.io/library/golang:1.17
...

cschmatzler · 2022-01-24T16:51:16Z

I am running into the same error, without any VPN connection.

❯ colima version
colima version 0.3.2
git commit: 272db4732b90390232ed9bdba955877f46a50552

runtime: docker
arch: aarch64
client: v20.10.10
server: v20.10.11

starvsion · 2022-01-24T17:03:31Z

I resolved it by doing colima start --port-interface 127.0.0.1

Correction: colima start --port-interface 127.0.0.1 -s

but it fails after pulling in more data

niroowns · 2022-01-25T19:33:23Z

For those of us behind a VPN, how do I configure docker to use a proxy?

spkane · 2022-01-26T18:03:49Z

This is a good overview of DNS issues in Alpine and might be at the core of some of these DNS issues:

https://support.cloudbees.com/hc/en-us/articles/360040999471-UnknownHostException-caused-by-DNS-Resolution-issue-with-Alpine-Images

Their main fix was to migrate to RedHat's Universal Base Images (UBI) - https://developers.redhat.com/products/rhel/ubi

There is a workaround as well, that I will try when I have a bit of time to test it.

pensatocriminale · 2022-01-27T01:42:51Z

I am seeing this issue now too, after it had been working for me initially, e.g. -

% docker pull lscr.io/linuxserver-labs/daedalos
Using default tag: latest
Error response from daemon: Get "https://ghcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

and testing on multiple networks.

yoedusvany · 2022-02-01T18:37:16Z

Same here
docker pull hello-world
Using default tag: latest
error during connect: Post "http://%2FUsers%2Fxxxxxx%2F.colima%2Fdocker.sock/v1.41/images/create?fromImage=hello-word&tag=latest": EOF

AlexLombry · 2022-02-02T13:27:12Z

Hello, I have this error too: Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.5.3:53: read udp 192.168.5.15:33676->192.168.5.3:53: i/o timeout
Sometimes it's a timeout, sometimes another error.

I installed it on macOS without any VPN whatsoever, so I don't understand the issue. I've also tested several other setups, such as Rancher Desktop, minikube + hyperkit, and podman, and I have this issue only with Colima.

Has anyone found a solution?

For instance, if I run docker run hello-world, it works for almost 30 seconds after colima starts.
Then it crashes and I finally get this error.
After that, the error happens every time.

wolf31o2 · 2022-02-11T00:52:08Z

It's Alpine. The musl DNS resolver is pretty terrible. It behaves differently from glibc in many ways.

abiosoft · 2022-02-13T18:35:53Z

> It's Alpine. The musl DNS resolver is pretty terrible. It behaves differently from glibc in many ways.

I am just realising this

spkane · 2022-02-17T03:17:49Z

There are details about this here:
https://wiki.musl-libc.org/functional-differences-from-glibc.html#Name-Resolver/DNS

pedantic79 · 2022-02-18T02:01:17Z

I've been experiencing random DNS failures too, especially when many queries are made in quick succession. Would a caching DNS server sitting between the qemu DNS and the containers help? I may try to set one up manually to see if it helps the situation.

jandubois · 2022-02-18T05:43:30Z

I'm not convinced the differences between glibc and musl are the root cause here; unless colima does something different, there should be only a single nameserver in /etc/resolv.conf, and it should point to the lima internal host resolver.

I found one bug with this very recently: we disable IPv6 lookups in Lima by default because they often end up not working. The issue, though, was that instead of returning an empty response, we handed the request to the resolver on the host, which might then add some random error for the IPv6 query to our response.

In my specific test case, I got the right DNS information when I looked with nslookup or dig, but curl could not connect. So I guess the musl resolver could share some blame, but the main blame belongs on our own DNS implementation (at least for this particular case).

This should be fixed in the forthcoming lima 0.8.3 release. So I would appreciate it if you could all re-test with that version (once released), and report back if this improved/fixed the situation!

abiosoft · 2022-02-18T05:47:14Z

> I'm not convinced the differences between glibc and musl are the root cause here; unless colima does something different, there should be only a single nameserver in /etc/resolv.conf, and it should point to the lima internal host resolver.

This is the case in Colima as well, and the single nameserver is 192.168.5.3.

> This should be fixed in the forthcoming lima 0.8.3 release. So I would appreciate if you could all re-test with that version (once released), and report back if this improved/fixed the situation!

Looking forward to it. Thanks.
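
To confirm the single-nameserver point on a given machine, something like the following could be run inside the VM (for example after colima ssh). It is only a sketch; list_nameservers is a hypothetical helper name, and the default path is the standard /etc/resolv.conf mentioned above:

```shell
# Hypothetical helper: print every nameserver address configured in a
# resolv.conf-style file, one per line. Defaults to /etc/resolv.conf.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

With the setup described above, the expectation is a single line of output.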

navels · 2022-02-19T07:56:05Z

New colima user here, running into this right off the bat. lima version is 0.8.3, colima 0.3.3. This workaround fixed it for me: #140 (comment)

pedantic79 · 2022-02-21T20:22:10Z

> > I'm not convinced the differences between glibc and musl are the root cause here; unless colima does something different, there should be only a single nameserver in /etc/resolv.conf, and it should point to the lima internal host resolver.
>
> This is the case in Colima as well, and the single nameserver is 192.168.5.3.
>
> > This should be fixed in the forthcoming lima 0.8.3 release. So I would appreciate if you could all re-test with that version (once released), and report back if this improved/fixed the situation!
>
> Looking forward to it. Thanks.

@abiosoft Do we need to wait for a colima release for this? Running colima 0.3.3, and lima 0.8.3.

I experience this error:

Unable to connect to the server: dial tcp: lookup private.hostname.from.internal.company.com on 192.168.5.3:53: read udp 172.17.0.2:34738->192.168.5.3:53: i/o timeout

When I go into the VM:

dnn@overwatch ~ » colima ssh
colima:/Users/dnn$ nslookup private.hostname.from.internal.company.com
;; connection timed out; no servers could be reached

This happens because I'm running a script that is doing the same lookup over and over again very quickly. If I stop for a few minutes and try again, the DNS lookup is okay.
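
A rough way to reproduce the rapid-lookup pattern described above is a loop that repeats the same query and counts failures. This is a sketch only: count_lookup_failures and the LOOKUP_CMD variable are hypothetical names, and nslookup is assumed to be available wherever the loop runs (it is in the VM shown above).

```shell
# Hypothetical helper: run the same DNS lookup N times back to back and
# print how many attempts failed. LOOKUP_CMD is an assumed override hook
# (defaults to nslookup) so another lookup tool can be swapped in.
count_lookup_failures() {
  host=$1
  attempts=$2
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if ! ${LOOKUP_CMD:-nslookup} "$host" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, count_lookup_failures private.hostname.from.internal.company.com 200 inside colima ssh would print the number of failed lookups out of 200.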

abiosoft · 2022-02-21T21:20:36Z

@pedantic79 a lima upgrade should be all that is required.

For troubleshooting purposes, can you kindly try this #140 (comment) and see if the behaviour is different? Note that it requires recreating the VM to see the effect, i.e. colima delete (if it exists) prior to starting.

rahul286 · 2022-02-23T16:51:56Z

I also faced the same issue, but it's resolved by specifying a DNS resolver:

colima start --dns 1.1.1.1

pedantic79 · 2022-02-24T14:24:39Z

@abiosoft Yes that seems to fix things. I ended up using 192.168.5.2, the host, since work runs a dns proxy on my laptop. This way I can resolve private addresses not on the public DNS.

abiosoft · 2022-03-20T14:26:37Z

Can anyone try the latest development version and see if anything changes?

brew install --HEAD colima

navels · 2022-03-21T04:16:20Z

Nope. A reasonable test for me is to download a large-ish (~1.5 GB) image:

docker image rm localstack/localstack
docker pull localstack/localstack:latest

which will get part of the way through and then stall:

Using default tag: latest
latest: Pulling from localstack/localstack
69bf0018a85c: Pull complete
d99d2ad45cad: Pull complete
2f5e7e852b75: Pull complete
9bdba4da0515: Pull complete
6d148a48367a: Pull complete
4f136f6bab8f: Pull complete
abd3b9714a4d: Pull complete
50eebec84093: Pull complete
a7f30185d16d: Pull complete
a0e7ef63792a: Pull complete
6e070eb76685: Pull complete
6fb969c1cc11: Pull complete
6b72ad47a399: Pull complete
5a968b0e80e9: Pull complete
4f4fb700ef54: Pull complete
f7deb66a5a33: Pull complete
318d55565698: Pull complete
565ac449cbaa: Pull complete
973b9108c62f: Pull complete
abe7f386e549: Pull complete
6af74865c5fb: Pull complete
b4ff06af1df8: Pull complete
b93bdfca7413: Pull complete
6e0f2f6fe87b: Pull complete
348542de0a59: Pull complete
338328b1acd7: Pull complete
343ae7575c43: Retrying in 1 second
ecaf8f60df9e: Retrying in 1 second
c01474015845: Retrying in 1 second
31c659c48f0f: Waiting
b146a65269aa: Waiting
b19b566fb94a: Waiting

and subsequent attempts:

Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.5.3:53: read udp 192.168.5.15:56456->192.168.5.3:53: i/o timeout

making me wonder if I am getting throttled or running out of sockets or something.

Using docker desktop this pull is a breeze.

abiosoft · 2022-03-21T05:01:02Z

@navels I'd be interested in knowing if there are any specifics to your network connection, as I am struggling to reproduce this.
I do get Retrying in x secs once in a while, but the retries are successful and it never gets bad enough for the image pull to terminate.

Can you kindly share the output of colima version?

Thanks.

DannyAtDejero · 2022-03-21T13:24:10Z

@abiosoft I'm seeing the same timeout and lookup failure as @navels, only in my case it was triggered by pushing a number of images in quick succession instead of pulling a single large one. I've confirmed that docker pull localstack/localstack:latest often fails with endless retry messages for me as well.

% colima version
colima version HEAD-5e2e413
git commit: 5e2e41310e595553dcdc29ba45827d4030af37bb

Other details that might be helpful:

Restarting the colima VM with colima stop; colima start resolves the issue temporarily, allowing name lookups to complete again until another large push/pull
I noticed this issue against an azure container registry initially, and assumed they were rate limiting. Now that I see it happens with docker.io too, that seems less likely.

Ping output from within the VM used to be very strange with a constantly increasing round trip and DUP packets, but that appears to be fixed in this latest version. 👍

navels · 2022-03-21T15:12:04Z

> colima version
colima version HEAD-5e2e413
git commit: 5e2e41310e595553dcdc29ba45827d4030af37bb

runtime: docker
arch: aarch64
client: v20.10.13
server: v20.10.11

I have this problem at home and at work, on and off VPN. This is on an M1 Mac Pro. Network speeds are about the same at both locations: ~300 Mbps.

Aha . . . I just tried a few different configurations and it seems to happen with more CPUs. With 1-2 CPUs I didn't have any issues. With 3 I do. My normal configuration is 8 CPUs.

Double-checked my docker desktop config: 8 CPUs.

jasoncodes · 2022-03-21T21:49:15Z

I’ve run into these DNS issues too, and I’ve found that changing my DNS to use the gateway of the VDE network works well for me. If you want to see if this workaround will work for you too, try running the following before your test:

colima ssh -- sudo sh -c 'echo nameserver 192.168.106.1 > /etc/resolv.conf'

This temporary patch can be reverted by restarting colima or running the above again with 192.168.5.3. I have the following in ~/.lima/_config/override.yaml to make this change persistent:

useHostResolver: false
dns:
  - 192.168.106.1
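
For anyone who wants to script the persistent variant, here is a minimal sketch that writes the same two settings to a file. write_dns_override is a hypothetical helper name; the keys and the file location are taken from the comment above, and the function overwrites whatever is already at the target path, so back up an existing override first.

```shell
# Hypothetical helper: write a Lima override file that disables the host
# resolver and pins one DNS server. Overwrites the target file if present.
#   $1: DNS server address, e.g. the VDE gateway
#   $2: path of the override file to write
write_dns_override() {
  dns_ip=$1
  target=$2
  mkdir -p "$(dirname "$target")"
  printf 'useHostResolver: false\ndns:\n  - %s\n' "$dns_ip" > "$target"
}
```

Usage would look like write_dns_override 192.168.106.1 "$HOME/.lima/_config/override.yaml", presumably followed by a restart of colima so the VM picks up the change.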

navels · 2022-03-21T23:09:00Z

Yep, yep, there are workarounds, just trying to help @abiosoft troubleshoot.

spkane · 2022-03-21T23:34:23Z

I am also still seeing issues with the use case that I reported in #137 (comment)

The first time I run something like:

nerdctl build --namespace k8s.io --platform linux/amd64 -t test/test:local -f ./Dockerfile .

it fails with:

error: failed to solve: docker.io/golang:1.17: failed to do request: Head "https://registry-1.docker.io/v2/library/golang/manifests/1.17": dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:41097->[::1]:53: read: connection refused

After another one or two tries (so likely after some short amount of time from the first attempt) it works and then continues to work.

abiosoft · 2022-03-24T04:54:14Z

@spkane can you try the latest development version (brew install --HEAD colima) and see if that improves anything?

abiosoft · 2022-03-24T04:57:56Z

@navels you likely weren't running colima with VDE networking enabled, as the fix for M1 devices just got pushed.
Can you try installing again with brew install --HEAD colima and getting rid of /opt/colima with sudo rm -rf /opt/colima?

Does that change anything?

navels · 2022-03-24T05:08:27Z

Unfortunately no change, fails with 3 CPUs.

colima version HEAD-3fc20b2

abiosoft · 2022-03-24T05:25:29Z

@navels are you able to see the IP address in the output of colima ls?

navels · 2022-03-24T05:28:57Z

Yep: 192.168.106.2

ramunasd · 2022-04-15T13:08:39Z

@abiosoft The latest HEAD has a much more stable network on an Apple M1 CPU with 4 cores enabled, although the DNS issue is still present.

colima version HEAD-37a6de0
git commit: 37a6de0ef4fe631c7b34e69697c5234a9cdd5541

runtime: docker
arch: aarch64
client: v20.10.14
server: v20.10.11

cognifloyd · 2022-04-20T23:04:48Z

Does anyone have Cisco AnyConnect installed?

I have an Intel Mac that I just upgraded from Catalina to Monterey.
Since the upgrade, I've been experiencing various network timeouts, but the DNS issues in colima were the most pronounced because they blocked my use of docker pull. Outside of colima, git was often hanging as well, so I didn't think it was uniquely a colima issue and kept looking after I found this issue.

I have Cisco AnyConnect installed which I occasionally use to connect to a VPN. After the Monterey update, "Cisco AnyConnect Socket Filter" showed up and asked for permission to run a new SystemExtension. I allowed it at that point, but I think that was the culprit behind all my network issues.
Here are some other issues people experienced with it: https://apple.stackexchange.com/questions/420773/the-process-com-cisco-anyconnect-macos-acsockext-hogs-mac-cpu-but-cannot-be-kill

This service is suspicious (to me) because its "features" are (based on the docs):

DNS proxy (aka: screw up the DNS by doing MITM crap)
App/Transparent proxy
Content filter

So, I just deleted Cisco AnyConnect Socket Filter (deleted it from the Applications) which removed the SystemExtension.
And, I stopped its annoying "notification" service from pestering me about it on reboot.

$ launchctl blame cisco
# This prints a list of the services. You want the gui/...cisco.anyconnect.notification... one.
$ launchctl disable gui/<number>/application.com.cisco.anyconnect.notification.<number>.<number>
$ launchctl stop gui/<number>/application.com.cisco.anyconnect.notification.<number>.<number>
$ launchctl kill 9 gui/<number>/application.com.cisco.anyconnect.notification.<number>.<number>

After doing all of that (and another reboot), dns works in colima again!

navels · 2023-09-28T01:05:00Z

I stopped using colima a while ago but just tried this again and am not getting the errors, so either fixed in colima or the Mac networking stack (Sonoma on an M1 Pro).

* build: Lock GitHub runners' OS

  This was motivated by our macOS jobs failing [2] because colima is missing. It looks like this is because the latest versions of the macOS runner no longer have colima installed by default [1]. colima is now explicitly installed.

  [1] actions/runner-images#6216
  [2] `/Users/runner/work/_temp/f19ffbff-27a9-4fc7-80b6-97791d2de141.sh: line 9: colima: command not found`

* build: Lock Colima
* build: Move macOS Docker installation to script
* build: Move macOS libomp activation to script
* build: Use latest Colima

  The > 0.6.0 releases actually fix the issue we have linked [1][2][3].

  [1] abiosoft/colima#577
  [2] https://github.com/jesse-c/MLServer/blob/c3acd60995a72141027eff506e4fd330fe824179/hack/install-docker-macos.sh#L18-L20
  [3] > Switch to new user-v2 network. Fixes abiosoft/colima#648, abiosoft/colima#603, abiosoft/colima#577, abiosoft/colima#779, abiosoft/colima#137, abiosoft/colima#740.

huybuidev · 2024-11-14T02:23:09Z

> @abiosoft I'm seeing the same timeout and lookup failure as @navels, only in my case it was triggered by pushing a number of images in quick succession instead of pulling a single large one. I've confirmed that docker pull localstack/localstack:latest often fails with endless retry messages for me as well.
> % colima version
> colima version HEAD-5e2e413
> git commit: 5e2e41310e595553dcdc29ba45827d4030af37bb
> Other details that might be helpful:
>
> Restarting the colima VM with colima stop; colima start resolves the issue temporarily, allowing name lookups to complete again until another large push/pull
>
> I noticed this issue against an azure container registry initially, and assumed they were rate limiting. Now that I see it happens with docker.io too, that seems less likely.
>
> Ping output from within the VM used to be very strange with a constantly increasing round trip and DUP packets, but that appears to be fixed in this latest version. 👍

colima stop; colima start works for me after searching a while for the solution. Thank you very much!!!

mjkonarski-b mentioned this issue Jan 25, 2022

Network in containers breaks under bigger network load #140

Closed

navels mentioned this issue Mar 26, 2022

Unable to pull images from docker.io registry even after successful login (dial tcp: lookup registry-1.docker.io on 10.0.2.3:53: read udp 10.0.2.100:47063->10.0.2.3:53: i/o timeout) containerd/nerdctl#677

Open

dbarrosop mentioned this issue Apr 28, 2022

chore: improve build times using nix container by caching vips build nhost/hasura-storage#67

Closed

ShanePark mentioned this issue Sep 6, 2022

400 ShanePark/markdownBlog#60

Closed

abiosoft added this to the v0.6.0 milestone Nov 12, 2023

abiosoft mentioned this issue Nov 12, 2023

v0.6.0 refactor #848

Merged

abiosoft closed this as completed Nov 12, 2023

jesse-c mentioned this issue May 30, 2024

build: Lock GitHub runners' OS SeldonIO/MLServer#1765

Merged
