Docker Desktop dies and disappears, all docker queries fail #13585

Open
rfay opened this issue Jul 5, 2023 · 6 comments

Comments

@rfay
Contributor

rfay commented Jul 5, 2023

Description

  • Docker Desktop just dies and goes away.
  • Any local docker ps command fails.
  • Restarting it (for example, by launching it from the desktop) typically fails.
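
On an unattended runner, the symptoms listed above can be detected with a small probe before the tests start. A minimal sketch in Python, assuming the `docker` CLI is on PATH; the function name `daemon_is_up` and its injectable `run` parameter are made up for illustration and are not part of DDEV or Docker:

```python
import subprocess


def daemon_is_up(run=subprocess.run, timeout=15):
    """Return True if the docker CLI can reach a daemon, False otherwise.

    `run` is injectable (a hypothetical design choice) so the probe can be
    exercised without a real Docker installation.
    """
    cmd = ["docker", "version", "--format", "{{.Server.Version}}"]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # CLI missing, or the call hung: treat both as "not reachable".
        return False
    # A non-zero exit code means the client could not query the server.
    return result.returncode == 0 and bool(result.stdout.strip())
```

A test harness could call `daemon_is_up()` before each run and record the time of the first `False`, which narrows down when Docker Desktop went away.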

Reproduce

This happens fairly often in DDEV's automated tests. It seems to happen most on the WSL2 tests, where docker is being used inside an Ubuntu WSL2 distro.

Expected behavior

It shouldn't die on us.

docker version

(done on Windows side because docker daemon is dead)

docker version
error during connect: this error may indicate that the docker daemon is not running: Get "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/version": open //./pipe/docker_engine: The system cannot find the file specified.
Client:
 Cloud integration: v1.0.35
 Version:           24.0.2
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        cb74dfc
 Built:             Thu May 25 21:53:15 2023
 OS/Arch:           windows/amd64
 Context:           default

After reboot, from inside wsl2 distro:

Client: Docker Engine - Community
 Cloud integration: v1.0.35
 Version:           24.0.2
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        cb74dfc
 Built:             Thu May 25 21:52:17 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Desktop
 Engine:
  Version:          24.0.2
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       659604f
  Built:            Thu May 25 21:52:17 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

docker info doesn't work when daemon is dead. Will update after reboot.
After reboot, from inside wsl2 distro:

Client: Docker Engine - Community
 Version:    24.0.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.19.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     /usr/local/lib/docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.6
    Path:     /usr/local/lib/docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-scan
  scout: Command line tool for Docker Scout (Docker Inc.)
    Version:  0.16.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-scout

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 28
 Server Version: 24.0.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.15.90.1-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.653GiB
 Name: docker-desktop
 ID: 57b710f3-4c66-475c-a2a6-4195de15f112
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support

Diagnostics ID

714965AA-305C-4811-B395-B1F87D1BCF18/20230705185519

Additional Info

No response

@djs55

djs55 commented Jul 10, 2023

Hi @rfay , thanks for the diagnostic.

It looks like the wsl.exe subprocesses run by Docker terminated at the same time:

[2023-07-05T18:37:10.912369800Z][com.docker.backend.exe.wslengine][E] WSLBootstrap: running wsl-bootstrap: exit status 1
...
[2023-07-05T18:37:10.945303500Z][com.docker.backend.exe.wslengine][E] WSLKeepAlive: running wsl-keepalive in "docker-desktop-data": WSL engine terminated abruptly
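
To pull entries like these out of a larger log, something along the following lines may help. It is only a sketch: it assumes every line follows the `[timestamp][component][level] message` shape seen in the two lines above, and the helper names `parse_line` and `errors` are hypothetical.

```python
import re
from datetime import datetime, timezone

LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\]"  # UTC timestamp
    r"\[([^\]]+)\]"                                         # component
    r"\[([A-Z])\]\s*(.*)$"                                  # level, message
)


def parse_line(line):
    """Split one log line into (utc_time, component, level, message), or None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    base, frac, component, level, message = m.groups()
    # datetime holds microseconds only, so truncate the nanosecond field.
    micros = int(frac[:6].ljust(6, "0"))
    when = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(
        microsecond=micros, tzinfo=timezone.utc
    )
    return when, component, level, message


def errors(lines):
    """Keep only entries logged at level E."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and p[2] == "E"]


sample = [
    '[2023-07-05T18:37:10.912369800Z][com.docker.backend.exe.wslengine][E] WSLBootstrap: running wsl-bootstrap: exit status 1',
    '[2023-07-05T18:37:10.945303500Z][com.docker.backend.exe.wslengine][E] WSLKeepAlive: running wsl-keepalive in "docker-desktop-data": WSL engine terminated abruptly',
]
found = errors(sample)
gap = (found[1][0] - found[0][0]).total_seconds()
print(f"{len(found)} errors, {gap:.3f}s apart")
```

For the two lines quoted here, the errors are about 33 ms apart, which fits the reading that both subprocesses ended together.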

Coincidentally the Hyper-V event logs have a series of messages around this time (I think they're in local time):

VmOperation: Vmbus pause completed,
vmName: 837EBC96-CB78-4061-9B8F-2291539609D7,
vmFriendlyName: 837EBC96-CB78-4061-9B8F-2291539609D7,
nicName: 837EBC96-CB78-4061-9B8F-2291539609D7--B0EED40F-37C8-4FD4-905D-C23CC9464260,
nicFriendlyName: ,
switchName: B95D0C5E-57D4-412B-B571-18A81A16E005,
switchFriendlyName: WSL,
delta (100 ns): 468","0","0",,"4","0","0","0","138330","Microsoft-Windows-Hyper-V-VmSwitch","67dc0d66-3695-47c0-9642-33f76f7bd7ad","Microsoft-Windows-Hyper-V-VmSwitch-Operational","9268","1160","<HOST>","S-1-5-83-1-2206121110-1080150904-2434961307-3607729747","7/5/2023 12:37:11 PM",,,"Microsoft-Windows-Hyper-V-VmSwitch-Operational","System.UInt32[]","System.Diagnostics.Eventing.Reader.EventBookmark","4","0","Guest vDev Operation End","System.Collections.ObjectModel.ReadOnlyCollection`1[System.String]","System.Collections.Generic.List`1[System.Diagnostics.Eventing.Reader.EventProperty]"
"Guest vDev Operation Begin

It's hard to tell for sure, but my guess is the WSL 2 VM terminated (or crashed), causing Docker to fail. I don't know why this would happen though.
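
Whether the two logs really line up depends on the host's time zone. A rough check in Python, under the assumption that the host is on US Mountain Daylight Time (UTC-6); that offset is a guess based on the America/Denver zone named in the dmesg output later in this thread, not something the diagnostics state:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the host clock is on US Mountain Daylight Time (UTC-6).
MDT = timezone(timedelta(hours=-6))

# Hyper-V event time as exported, with no zone information.
hyperv_local = datetime.strptime("7/5/2023 12:37:11 PM", "%m/%d/%Y %I:%M:%S %p")
hyperv_utc = hyperv_local.replace(tzinfo=MDT).astimezone(timezone.utc)

# First Docker Desktop error above, truncated to microseconds.
docker_utc = datetime(2023, 7, 5, 18, 37, 10, 912369, tzinfo=timezone.utc)

gap = abs((hyperv_utc - docker_utc).total_seconds())
print(hyperv_utc.isoformat(), f"gap={gap:.3f}s")
```

Under that assumption the Hyper-V event falls at 18:37:11 UTC, within a second of the Docker Desktop errors. The Hyper-V timestamp only has one-second resolution, so the gap is approximate.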

@rfay
Contributor Author

rfay commented Jul 10, 2023

Thanks for taking a look @djs55 - I do think it would be better for Docker Desktop to stay alive after a WSL2 crash and report the problem.

While this happens fairly often in current Docker Desktop, I note that our similar tests of docker-ce in WSL2 (Ubuntu) also have occasional unexplained WSL2 (or distro) crashes. In both cases I can get some more monitoring and logging going inside those distros.

@rfay
Contributor Author

rfay commented Jul 17, 2023

After a recent crash of this type I did wsl -d docker-desktop and dmesg just to see if there was anything interesting there. I see this apparent response to a SIGABRT (which of course may be a result of something else happening).

[    3.508104] pci 7dee:00:00.0: [1af4:1049] type 00 class 0x010000
[    3.510901] pci 7dee:00:00.0: reg 0x10: [mem 0x9ffe0c000-0x9ffe0cfff 64bit]
[    3.512790] pci 7dee:00:00.0: reg 0x18: [mem 0x9ffe0d000-0x9ffe0dfff 64bit]
[    3.515245] pci 7dee:00:00.0: reg 0x20: [mem 0x9ffe0e000-0x9ffe0efff 64bit]
[    3.516881] WSL (1) WARNING: /usr/share/zoneinfo/America/Denver not found. Is the tzdata package installed?
[    3.523073] pci_bus 7dee:00: busn_res: [bus 00-ff] end is updated to 00
[    3.523844] pci 7dee:00:00.0: BAR 0: assigned [mem 0x9ffe0c000-0x9ffe0cfff 64bit]
[    3.525586] pci 7dee:00:00.0: BAR 2: assigned [mem 0x9ffe0d000-0x9ffe0dfff 64bit]
[    3.527781] pci 7dee:00:00.0: BAR 4: assigned [mem 0x9ffe0e000-0x9ffe0efff 64bit]
[    3.966629] misc dxg: dxgk: dxgkio_query_adapter_info: Ioctl failed: -22
[    3.969912] misc dxg: dxgk: dxgkio_query_adapter_info: Ioctl failed: -22
[    3.971360] misc dxg: dxgk: dxgkio_query_adapter_info: Ioctl failed: -22
[    3.972494] misc dxg: dxgk: dxgkio_query_adapter_info: Ioctl failed: -2
[    4.409692] potentially unexpected fatal signal 6.
[    4.410314] CPU: 0 PID: 161 Comm: Xwayland Not tainted 5.15.90.1-microsoft-standard-WSL2 #1
[    4.410944] RIP: 0033:0x7f0ea8df0e6c
[    4.411237] Code: ff ff 0f 46 ea eb 99 0f 1f 80 00 00 00 00 b8 ba 00 00 00 0f 05 89 c5 e8 32 d5 04 00 44 89 e2 89 ee 89 c7 b8 ea 00 00 00 0f 05 <89> c5 f7 dd 3d 00 f0 ff ff b8 00 00 00 00 0f 47 c5 48 83 ec 80 5b
[    4.412648] RSP: 002b:00007fff85fb2870 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea
[    4.413254] RAX: 0000000000000000 RBX: 00007f0ea8743980 RCX: 00007f0ea8df0e6c
[    4.413862] RDX: 0000000000000006 RSI: 000000000000001a RDI: 000000000000001a
[    4.414477] RBP: 000000000000001a R08: 00007fff85fb2938 R09: 0000000000000000
[    4.415165] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006
[    4.415907] R13: 00007fff85fb4e00 R14: 00007f0ea6a09da0 R15: 0000000000000000
[    4.416565] FS:  00007f0ea8743980 GS:  0000000000000000
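
For scanning longer dmesg captures, a small filter can list which process received which fatal signal. This is a sketch only: it assumes the kernel prints the `CPU: ... PID: ... Comm: ...` line within a few lines of the signal report, as in the excerpt above, and the function name `fatal_signals` is hypothetical.

```python
import re

# Linux signal numbers on x86_64; only a few common fatal ones are listed.
SIGNAL_NAMES = {4: "SIGILL", 6: "SIGABRT", 7: "SIGBUS", 8: "SIGFPE", 9: "SIGKILL", 11: "SIGSEGV"}

SIGNAL_RE = re.compile(r"\[\s*(\d+\.\d+)\] potentially unexpected fatal signal (\d+)")
PROC_RE = re.compile(r"PID: (\d+) Comm: (\S+)")


def fatal_signals(dmesg_text):
    """Return (seconds_since_boot, signal_name, pid, comm) for each report."""
    lines = dmesg_text.splitlines()
    reports = []
    for i, line in enumerate(lines):
        m = SIGNAL_RE.search(line)
        if not m:
            continue
        pid, comm = None, None
        # The "CPU: ... PID: ... Comm: ..." line follows the signal line.
        for follow in lines[i + 1 : i + 4]:
            p = PROC_RE.search(follow)
            if p:
                pid, comm = int(p.group(1)), p.group(2)
                break
        number = int(m.group(2))
        name = SIGNAL_NAMES.get(number, f"signal {number}")
        reports.append((float(m.group(1)), name, pid, comm))
    return reports


sample = """\
[    4.409692] potentially unexpected fatal signal 6.
[    4.410314] CPU: 0 PID: 161 Comm: Xwayland Not tainted 5.15.90.1-microsoft-standard-WSL2 #1
"""
print(fatal_signals(sample))
```

The first field is seconds since that VM booted, so the report in this excerpt dates from about 4.4 s into startup. That is worth keeping in mind when relating it to a crash that happens later.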

@rfay
Contributor Author

rfay commented Aug 11, 2023

I see progress on this in 4.22.0, with a nice message:

image

And as a result of the message, I see wsl -l -v showing all distros stopped. The next thing we want to know is... what causes this. It's certainly not wsl --shutdown on this unattended test runner.
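
The "all distros stopped" check could be automated on the runner so the moment of the shutdown gets logged. A hedged sketch: the sample text below is illustrative, not captured output, and the helper names are hypothetical. It also assumes `wsl.exe` writes UTF-16LE, which has been the usual behavior but is worth verifying on the runner.

```python
import subprocess


def read_wsl_list():
    """Run `wsl -l -v` on Windows and return its text (assumes UTF-16LE output)."""
    raw = subprocess.run(["wsl.exe", "-l", "-v"], capture_output=True).stdout
    return raw.decode("utf-16-le", errors="replace")


def parse_wsl_list(text):
    """Map distro name -> state from `wsl -l -v` text output."""
    states = {}
    for line in text.splitlines()[1:]:           # skip the header row
        fields = line.replace("*", " ").split()  # "*" marks the default distro
        if len(fields) >= 3:
            states[fields[0]] = fields[1]
    return states


def all_stopped(states):
    """True when at least one distro is listed and none is running."""
    return bool(states) and all(s == "Stopped" for s in states.values())


# Illustrative sample only; distro names are taken from this thread.
sample = """\
  NAME                   STATE           VERSION
* Ubuntu                 Stopped         2
  docker-desktop         Stopped         2
  docker-desktop-data    Stopped         2
"""
print(all_stopped(parse_wsl_list(sample)))
```

Polling `all_stopped(parse_wsl_list(read_wsl_list()))` every few seconds and writing a timestamp on the first `True` would give a time to compare against the Windows event logs.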

@rfay
Contributor Author

rfay commented Sep 19, 2023

I'm thinking this may be memory exhaustion. I'll try the new experimental memory-management settings in WSL2 2.0.0: https://devblogs.microsoft.com/commandline/windows-subsystem-for-linux-september-2023-update/
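
One way to test the memory theory is to log available memory inside the distro at intervals and look at the last entries before a crash. A minimal sketch that parses `/proc/meminfo`-style text; the numbers in `sample` are hypothetical, not measurements from the test runner, and the function names are made up for illustration:

```python
def parse_meminfo(text):
    """Return {field: kibibytes} from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def available_fraction(values):
    """Share of RAM the kernel estimates is available without swapping."""
    return values["MemAvailable"] / values["MemTotal"]


# Hypothetical numbers for illustration, not measurements from the runner.
sample = """\
MemTotal:        8024752 kB
MemFree:          210340 kB
MemAvailable:     401236 kB
"""
info = parse_meminfo(sample)
print(f"{available_fraction(info):.1%} available")
```

On the runner, the text would come from `open("/proc/meminfo").read()` inside the WSL2 distro, with each reading appended to a file that survives the VM going down.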

@rfay
Contributor Author

rfay commented Sep 25, 2023
