Executing a simple command like ls from host to the container is very slow #1194

Closed
2 of 3 tasks
blaisewang opened this issue Jan 26, 2021 · 4 comments

Comments

@blaisewang

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

The health status monitor always reports the container's service status as abnormal, although the actual service is not affected. The Docker logs show that the health check command times out, which is why the container is marked abnormal. I then found that even a simple command like docker exec -it {container name} ls takes several seconds to complete.

The output of strace -r {docker command} and strace -c {docker command} suggests that the root cause of the slowness may lie in the Docker daemon.

stracer.log

This problem does not always occur: command execution time is normal in a low-load development environment.
The load in the production environment is not very high either, at ~20% average CPU load and ~40% memory usage.

I tried to reproduce this problem in the development environment by sending a large number of HTTP requests to our Nginx service from another machine. Even under that load, only 5% of the docker exec commands were slower than normal.
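
To put a number on how often the slowdown occurs, a small timing loop can be run on the host. This is only a minimal sketch and not part of the original measurements: the helper names (time_command, slow_fraction), the 1-second threshold, and the 100 repetitions are assumptions, and mario is the container name used later in this thread.

```python
import statistics
import subprocess
import time


def time_command(cmd):
    """Run cmd once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def slow_fraction(durations, threshold):
    """Return the share of durations strictly above threshold (seconds)."""
    if not durations:
        return 0.0
    return sum(1 for d in durations if d > threshold) / len(durations)


if __name__ == "__main__":
    # Assumption: a container named "mario" is running. The -it flags are
    # omitted because this script does not run under a TTY.
    cmd = ["docker", "exec", "mario", "ls"]
    durations = [time_command(cmd) for _ in range(100)]
    print(f"median: {statistics.median(durations):.3f}s")
    print(f"max:    {max(durations):.3f}s")
    print(f"slow (>1s): {slow_fraction(durations, 1.0):.0%}")
```

Running the same loop with the command executed from a shell inside the container would give a baseline to compare against.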

Here's the goroutine stack trace of the Docker daemon, taken when a docker exec command became unresponsive.

goroutine-stacks-2021-01-22T181928+0800.log
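
A dump like this is easier to skim after tallying the goroutines by state. The following is a rough sketch that assumes the file uses the standard Go stack dump format, where each goroutine starts with a header line such as goroutine 42 [IO wait, 5 minutes]:. The function name tally_states is made up for illustration.

```python
import re
import sys
from collections import Counter

# Header line of each goroutine in a Go stack dump, for example:
#   goroutine 42 [IO wait, 5 minutes]:
# The state is the text after "[" up to the first comma or "]".
HEADER = re.compile(r"^goroutine \d+ \[([^\],]+)")


def tally_states(lines):
    """Count goroutines per state (running, IO wait, select, ...)."""
    states = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            states[match.group(1)] += 1
    return states


if __name__ == "__main__":
    # Usage: python tally.py <path to goroutine stack dump>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as dump:
        for state, count in tally_states(dump).most_common():
            print(f"{count:5d}  {state}")
```

An unusually large number of goroutines in one blocked state (for example semacquire or chan receive) would be a hint about where to look first, although the tally alone does not identify the cause.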

Expected behavior

A command run from the host via docker exec should take about the same time as the same command run from the container's terminal.

Actual behavior

A command run from the host via docker exec is much slower than the same command run from the container's terminal.

Steps to reproduce the behavior

I am not sure how to reproduce this problem reliably with a minimal example. I can only provide the strace and stack trace results, and I can provide more details if anyone needs further information.

Output of docker version:

#docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:20:43 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:28:38 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

#docker info
Containers: 6
 Running: 6
 Paused: 0
 Stopped: 0
Images: 15
Server Version: 18.06.1-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-327.ali2012.alios7.x86_64
Operating System: Alibaba Group Enterprise Linux Server 7.2 (Paladin)
OSType: linux
Architecture: x86_64
CPUs: 80
Total Memory: 377.1GiB
Name: e21g04007.cloud.g04.amtest75
ID: L4BA:I66E:R2QU:NIOM:M3NI:AIEK:C635:YFEM:ZQIX:EBZO:DZHD:FQDD
Docker Root Dir: /home/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 67
 Goroutines: 80
 System Time: 2021-01-26T16:45:31.865996997+08:00
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.)

Physical.

@blaisewang
Author

blaisewang commented Jan 26, 2021

In #466, the slowness appears to have been caused by a problem introduced in libseccomp 2.4.0 (seccomp/libseccomp#153). So I built dockerd 18.06.3 against the latest libseccomp, version 2.5.1 (Dockerfile), and replaced the binary in the dev environment, but it didn't help.

@blaisewang
Author

blaisewang commented Jan 26, 2021

Here's the strace output for docker daemon with docker command docker exec -it mario echo {3} which takes roughly 8.9 seconds.

dockerd.log

And here's the strace output for docker-containerd-shim of the mario container at the same time.

shim.log

What is interesting in these logs is that mario's docker-containerd-shim took roughly 5 seconds from receiving the command to execute, echo {3}, to writing the output (search for 1611729050.188456 and 1611729055.136259 in shim.log). The Docker daemon received the result from mario's docker-containerd-shim at 1611729055.136523 but didn't send it back until 1611729058.971479, roughly 3.8 seconds later.
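
The two delays can be computed exactly from the timestamps quoted above, and the same approach can be used to scan a whole log for its largest pauses. This is a sketch under assumptions: it presumes the logs carry epoch timestamps with microseconds (as strace -ttt prints them), and the helper names elapsed and largest_gaps are invented for illustration. Decimal is used because a 64-bit float cannot hold these values at full microsecond precision.

```python
import re
import sys
from decimal import Decimal

# Epoch timestamp with microseconds, e.g. 1611729050.188456.
STAMP = re.compile(r"\b(\d{10}\.\d{6})\b")


def elapsed(start, end):
    """Exact difference in seconds between two timestamp strings."""
    return Decimal(end) - Decimal(start)


def largest_gaps(lines, top=3):
    """Return the `top` largest gaps between consecutive timestamped
    lines as (gap_seconds, line_before_gap) tuples, largest first."""
    gaps = []
    prev_stamp, prev_line = None, None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # skip lines without a timestamp
        stamp = Decimal(match.group(1))
        if prev_stamp is not None:
            gaps.append((stamp - prev_stamp, prev_line.rstrip()))
        prev_stamp, prev_line = stamp, line
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


if __name__ == "__main__":
    # Timestamps quoted in the comment above.
    print("shim, command received -> output written:",
          elapsed("1611729050.188456", "1611729055.136259"))
    print("dockerd, result received -> result sent:",
          elapsed("1611729055.136523", "1611729058.971479"))
    # Optional: python gaps.py shim.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
            for gap, line in largest_gaps(log):
                print(f"{gap}s after: {line[:100]}")
```

The two differences come to 4.947803 s in the shim and 3.834956 s in the daemon, which together account for about 8.78 s of the roughly 8.9 s total. Note that if the trace follows several threads, consecutive lines may belong to different threads, so a large gap marks a quiet period in the whole process rather than a single slow call.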

@f-squirrel

I experience the same issue with 20.10.12 on Ubuntu 20.04.
Is there any workaround?

@blaisewang
Author

As this issue only appears in the production environment and I cannot consistently reproduce it in the dev environment, I haven't found a workaround.
