Accessing remote server via DOCKER_HOST eats all memory #3528

Open
turkerdev opened this issue Apr 2, 2022 · 10 comments

Comments

@turkerdev

Accessing a remote server over SSH (via DOCKER_HOST) and running a command there eats all of the server's memory.
Running the same command on the server itself causes no problem.

For instance,

I have a docker compose file on my local machine. If I run the command below, it eats all the memory and the server shuts down.

DOCKER_HOST=ssh://blabla docker compose up

But if I copy the same compose file to the server and run docker compose up there, it only uses ~50MB of memory.

@thaJeztah
Member

Can you provide more details? Otherwise this may be difficult to look into:

  • can you provide the output of docker version
  • can you provide the output of DOCKER_HOST=ssh://blabla docker info
  • if your local machine is running macOS or Windows and has Docker Desktop installed, does the problem also reproduce if you use DOCKER_HOST=ssh://blabla com.docker.cli compose up (so using com.docker.cli instead of docker)?
  • does the problem reproduce if you call the docker compose component directly (in "standalone" mode)? you can do so by using the compose binary directly (it's likely installed in /usr/local/lib/docker/cli-plugins/, but this path may depend on how you installed it); DOCKER_HOST=ssh://blabla /usr/local/lib/docker/cli-plugins/docker-compose up
  • can you provide the docker compose file you're using? if the compose file depends on private source code or non-public images, are you able to provide a "minimal" docker compose file to reproduce the issue (that doesn't depend on your private source and non-public images)?

@turkerdev
Author

My local machine uses Docker Desktop, but the issue also exists when I run the same command from GitLab CI. And yes, using com.docker.cli also reproduces the issue.

here is a video of the issue

docker version from server

Client:
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.15.14
 Git commit:        f0df350
 Built:             Wed Nov 17 03:05:36 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.14
  Git commit:       b0f5bc3
  Built:            Wed Nov 17 03:06:14 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.6
  GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version:          1.0.0
  GitCommit:        84113eef6fc27af1b01b3181f31bbaf708715301
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

DOCKER_HOST=... docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.1)
  compose: Docker Compose (Docker Inc., v2.3.3)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 28
  Running: 1
  Paused: 0
  Stopped: 27
 Images: 40
 Server Version: 20.10.7
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc version: 84113eef6fc27af1b01b3181f31bbaf708715301
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.102-99.473.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 965.5MiB
 Name: ip-172-31-39-226.eu-central-1.compute.internal
 ID: ROM7:G3CD:UZ5W:OC3Q:347K:BD5Y:RDOY:NU4R:JHIW:L5Q6:BBNW:7XLN
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

docker compose file

services:
  mongo:
    image: mongo
  postgres:
    image: postgres
  redis:
    image: redis
  nginx:
    image: nginx
  node:
    image: node

@turkerdev
Author

Also, I noticed that using docker stack deploy has no issues. It works as it is supposed to.

@afshin-deriv

afshin-deriv commented May 3, 2022

The memory usage occurs because running docker/docker-compose without -d (or even with -d, for a few seconds) makes the server create many sshd processes that consume a big chunk of memory:

Client 
---
$ DOCKER_HOST=ssh://<User-name>@<Server-IP> docker compose up

Server
---
$ pstree -p $(pgrep -f '/usr/sbin/sshd -D')

sshd(246356)─┬─sshd(337077)───sshd(337114)───docker(337115)─┬─{docker}(337116)
             │                                              ├─{docker}(337117)
             │                                              ├─{docker}(337118)
             │                                              ├─{docker}(337119)
             │                                              ├─{docker}(337120)
             │                                              ├─{docker}(337121)
             │                                              ├─{docker}(337122)
             │                                              ├─{docker}(337123)
             │                                              ├─{docker}(337124)
             │                                              └─{docker}(337125)
             ├─sshd(337133)───sshd(337170)───docker(337171)─┬─{docker}(337172)
             │                                              ├─{docker}(337173)
             │                                              ├─{docker}(337174)
             │                                              ├─{docker}(337175)
             │                                              ├─{docker}(337176)
             │                                              ├─{docker}(337177)
             │                                              ├─{docker}(337178)
             │                                              ├─{docker}(337179)
             │                                              ├─{docker}(337180)
             │                                              └─{docker}(337181)
             ├─sshd(337182)───sshd(337219)───docker(337220)─┬─{docker}(337221)
.
.
.
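
To put a number on a tree like the one above, a small helper can count the docker processes whose parent is an sshd process. This is a minimal sketch and was not posted in this thread: the function name count_docker_under_sshd is hypothetical, and it assumes the pid, ppid and comm columns as printed by procps ps.

```shell
# Count processes named "docker" whose direct parent is named "sshd".
# Reads "PID PPID COMMAND" rows on stdin (hypothetical helper).
# Usage on the server: ps -eo pid,ppid,comm | count_docker_under_sshd
count_docker_under_sshd() {
  awk '
    { comm[$1] = $3; parent[$1] = $2 }   # remember name and parent per PID
    END {
      n = 0
      for (pid in comm)
        if (comm[pid] == "docker" && (parent[pid] in comm) && comm[parent[pid]] == "sshd")
          n++
      print n
    }'
}
```

Running it once before and once during docker compose up should show whether the count grows with the number of attached containers.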

@thaJeztah
Member

Hm.. right, yes, so it would be attaching to each container in the compose stack to stream the output; I can imagine that causing more overhead, especially with ssh here. Wondering if we can make it reuse connections or something along those lines.

/cc @AkihiroSuda @ndeloof perhaps you have ideas?
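
One form of connection reuse that already exists on the client side is OpenSSH multiplexing, which Docker's documentation suggests for ssh:// hosts. The sketch below appends such a block to an ssh config file. The helper name write_ssh_mux_config and the host alias are hypothetical, and nobody in this thread has confirmed that this changes the memory usage reported here.

```shell
# Append an OpenSSH connection-multiplexing block for one host.
# $1: Host alias or pattern (hypothetical), $2: ssh config file to append to.
# Usage: write_ssh_mux_config my-docker-server ~/.ssh/config
write_ssh_mux_config() {
  cat >> "$2" <<EOF
Host $1
  ControlMaster auto
  ControlPath ~/.ssh/control-%C
  ControlPersist yes
EOF
}
```

With multiplexing, new sessions to the same host share one TCP connection, so sshd should not need a separate connection for each attach. Each session would still start its own remote docker helper process, so this is at best a partial mitigation.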

@AkihiroSuda
Collaborator

Maybe we should re-revert this (with some fix)?

@afshin-deriv

afshin-deriv commented May 5, 2022

I will work on this

@afshin-deriv

afshin-deriv commented May 8, 2022

I don’t think this issue is related to the CLI, nor that it would be solved by #2303.


  1. Killing the extra ssh processes on the Docker server doesn’t reduce memory usage:

Client

export DOCKER_HOST=ssh://<User-name>@<Server-IP>

cat > docker-compose.yaml <<EOF
 services:
   mongo:
     image: mongo
   postgres:
     image: postgres
   redis:
     image: redis
   nginx:
     image: nginx
   node:
     image: node
EOF

docker-compose up

Server

sudo pstree -p $(pgrep -f '/usr/sbin/sshd -D')
 sshd(5156)─┬─sshd(825648)───sshd(825707)───bash(825708)───sudo(941496)───sudo(941497)───pstree(941498)
           ├─sshd(936369)───sshd(936406)───docker(936407)─┬─{docker}(936408)
           │                                              ├─{docker}(936409)
           │                                              ├─{docker}(936410)
           │                                              ├─{docker}(936411)
           │                                              ├─{docker}(936412)
           │                                              ├─{docker}(936413)
           │                                              ├─{docker}(936414)
           │                                              ├─{docker}(936415)
           │                                              ├─{docker}(936416)
           │                                              └─{docker}(936417)
           ├─sshd(938070)───sshd(938147)───docker(938260)─┬─{docker}(938262)
           │                                              ├─{docker}(938263)
           │                                              ├─{docker}(938264)
           │                                              ├─{docker}(938265)


sudo kill -9 938070 938309 ... <last ssh processID> ## every docker ssh connection from the second one onward

  2. Running the same kind of commands over plain ssh uses less memory than Docker does. The two loops below consume roughly the same amount of RAM on the server:
$ for i in `seq 10`;
> do ssh -nttf  <user-name>@<docker-server-ip> "docker run -it busybox top" 2>&1 &
> done


$ for i in `seq 60`;
> do ssh -nttf  <user-name>@<docker-server-ip> "top" 2>&1 &
> done
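
For comparisons like these, the server's available memory can be sampled before and after each loop. This is a minimal sketch, assuming a Linux server that exposes MemAvailable in /proc/meminfo. The helper name mem_available_mib is hypothetical, and the thread does not say how the figures were actually measured.

```shell
# Print available memory in MiB from a meminfo-style file.
# $1: file to read; defaults to /proc/meminfo on Linux (hypothetical helper).
# Usage on the server: before=$(mem_available_mib); <run loop>; after=$(mem_available_mib)
mem_available_mib() {
  awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

The difference between the two samples gives a rough cost per batch of connections, which makes the 10-versus-60 comparison above easier to repeat.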

@nullableVoidPtr

nullableVoidPtr commented May 24, 2022

I can speak to this; the way docker works over SSH remote appears to be:

  • Client machine executes docker-cli with ssh://
  • Client docker-cli uses the client machine's ssh binary to connect to ->
  • The remote machine sshd server, which then receives an exec (SSH protocol request) from the client to:
    • execute the undocumented command docker system dial-stdio as the SSH user
    • which then turns stdin and stdout into basically a stream transport for the dockerd REST API.

In summary:
client docker-cli <-stdio-> ssh <-tcp-> sshd <-stdio-> remote docker-cli <-unix/npipe-> dockerd
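
The chain above can be exercised by hand, which makes the per-connection cost easier to observe. The sketch below sends one Engine API ping request through ssh and docker system dial-stdio. The function names and the user@server destination are hypothetical, and the exact behaviour of the hidden dial-stdio command may differ between Docker versions.

```shell
# Build a minimal HTTP request for the Docker Engine API ping endpoint.
docker_ping_request() {
  printf 'GET /_ping HTTP/1.1\r\nHost: docker\r\nConnection: close\r\n\r\n'
}

# Send the request through the same chain the CLI uses:
# local stdio -> ssh -> sshd -> "docker system dial-stdio" -> dockerd socket.
# $1: hypothetical SSH destination such as user@server.
# Usage: docker_ping_over_ssh user@server
docker_ping_over_ssh() {
  docker_ping_request | ssh -T "$1" docker system dial-stdio
}
```

Each call should create one sshd session and one dial-stdio process on the server for the lifetime of the request, which is the same unit that multiplies when compose attaches to several containers.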

While I myself am not too familiar with compose's internals, I'd think that a docker compose up command with many images may create multiple SSH connections, which appear as forks of the remote sshd process.

I'm currently workshopping a somewhat better solution here. I haven't made a PR yet, pending further testing, potential cross-platform issues, and error handling, as well as the implementation on the Docker CLI side here.
The high-level overview of the changes I plan to make (so far) is:

  • Sidestep dial-stdio by serving the REST API directly on a separate listener with SSH acting as an encrypted transport (courtesy of golang.org/x/crypto/ssh)
  • Have an SSH dialler native to Docker CLI which doesn't rely on an external ssh client binary
  • Multiplex concurrent connections (if needed?[1]) to the same remote host using SSH session channels (of which there can be multiple under the single TCP/SSH connection)
  • Optional use of SSH user key and host key certificates to provide mutual authentication, a la TLS.

Hopefully with this architecture, there's less memory overhead as there would hypothetically be just the one process, dockerd, which handles concurrent connections from Docker CLI clients.

[1] I'm not too certain if this is actually needed, but it is a nice feature. I've already pushed code on my fork to take an accepted ssh.Conn, and pass it to a goroutine which continuously demultiplexes session channel requests into a net.Conn interface for the apiserver to Accept and run with.

@TheSilkky

TheSilkky commented Jul 22, 2023

I'm using a remote SSH Docker context on macOS running Docker Desktop to deploy stacks to my server. Here's the output of docker info on my local system:

Client:
 Version:    24.0.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.5
    Path:     /Users/ellie/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.18.1
    Path:     /Users/ellie/.docker/cli-plugins/docker-compose
  deployx: Deploy a new stack or update an existing stack (aaraney)
    Version:  0.0.1
    Path:     /Users/ellie/.docker/cli-plugins/docker-deployx
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/ellie/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.19
    Path:     /Users/ellie/.docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.4
    Path:     /Users/ellie/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/ellie/.docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /Users/ellie/.docker/cli-plugins/docker-scan
  scout: Command line tool for Docker Scout (Docker Inc.)
    Version:  v0.12.0
    Path:     /Users/ellie/.docker/cli-plugins/docker-scout

Server:
 Containers: 2
  Running: 0
  Paused: 0
  Stopped: 2
 Images: 27
 Server Version: 24.0.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.49-linuxkit-pr
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.668GiB
 Name: docker-desktop
 ID: 7c813daa-98e6-446a-9a03-0b4ec69bf2e1
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

I left my computer on overnight, and when I checked my server's metrics I noticed sshd was using almost 6 GB of memory. There were hundreds of these ssh sessions and docker system dial-stdio processes running on my server:

root       11881  0.0  0.1  25484  9472 ?        Ss   04:27   0:00 sshd: ellie [priv]
ellie      11887  0.0  0.0  25624  6412 ?        S    04:27   0:00 sshd: ellie@notty
ellie      11889  0.0  0.2 1180192 22836 ?       Ssl  04:27   0:00 docker system dial-stdio

Does anyone have some insight on this? My system is just constantly creating these sessions for no reason, when I'm not even using the Docker context. There's also a fairly recent forum post about this: Docker Continuously Making Unnecessary SSH Connections to Remote Servers

EDIT: Exiting Docker Desktop closes all of the ssh sessions and exits all the dial-stdio processes on the remote server. However, if you leave Docker running, it just continuously creates those sessions, eventually leading to a situation where it will use all of the server's memory.
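
To track how much memory these leftover sessions hold over time, the ps listing shown earlier can be totalled. This is a minimal sketch, assuming ps aux column order (RSS in KiB in the sixth column). The function name sum_ssh_docker_rss_kib is hypothetical, and it counts every sshd session on the server, including interactive ones.

```shell
# Sum resident memory of sshd session processes and dial-stdio helpers.
# Reads "ps aux" rows on stdin (hypothetical helper).
# The [s] and [d] brackets keep this awk process from matching itself in ps output.
# Usage on the server: ps aux | sum_ssh_docker_rss_kib
sum_ssh_docker_rss_kib() {
  awk '
    /[s]shd: / || /[d]ocker system dial-stdio/ { total += $6; count++ }
    END { printf "%d processes, %d KiB RSS\n", count, total }'
}
```

Logging this from cron or a watch loop would show whether the session count keeps climbing while Docker Desktop is idle.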
