Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Processes are running slow in docker container on ubuntu 18.04 #738

Open
2 of 3 tasks
Aayush04 opened this issue Jul 25, 2019 · 36 comments
Open
2 of 3 tasks

Processes are running slow in docker container on ubuntu 18.04 #738

Aayush04 opened this issue Jul 25, 2019 · 36 comments

Comments

@Aayush04
Copy link

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

Processes are taking time when running them in a docker container on ubuntu 18 machine. But the same process with the same docker version is running fine on ubuntu 16 machine.

I have a node application listening on some port. Accepting get requests on the path "/" and "/docker" which simply runs a command "whoami" in the host machine and in docker container respectively and returns the result. The same node application with the same docker container is running on both the machines (ubuntu16 and ubuntu18).

Now, First I tried sending 20 concurrent get request with path "/" to both the machines. And both the machines executed the command in avg of
35-40ms. With this, I concluded that the spawning process without docker has the same behaviour.

Now, Second I tried sending 20 concurrent get request with path "/docker" to both the machines. Here, ubuntu16 machine took Max 4.3 seconds and Avg 3 seconds. But ubuntu18 machine taking Max 10seconds and Avg 9 seconds.

I tried the above test multiple times. and concluded when running processes concurrently inside docker the time taken to execute is almost double in ubuntu18 machine as compare to ubuntu16.

Output of docker version:

Client:
 Version:           18.09.8
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        0dd43dd87f
 Built:             Wed Jul 17 17:40:56 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.8
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       0dd43dd
  Built:            Wed Jul 17 17:07:25 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info which are common to both machine:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 18.09.8
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.296GiB
Name: myhostname
ID: LLLO:OMTS:PNNM:T3MP:AD2F:UMDG:IIZK:OGBO:3ZLL:YDBX:ONAO:AY5G
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 27
 Goroutines: 42
 System Time: 2019-07-25T15:25:54.991694211+05:30
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Output of docker info which specific to ubuntu16:

Kernel Version: 4.4.0-112-generic
Operating System: Ubuntu 16.04.3 LTS
Total Memory: 7.303GiB
ID: FOFI:RW7N:RZSP:HHKH:BKS3:LMWL:TC2J:W7V2:222Y:Q2AU:XMU3:KLU7

Output of docker info which specific to ubuntu18:

Kernel Version: 4.15.0-1040-aws
Operating System: Ubuntu 18.04.2 LTS
Total Memory: 7.296GiB
ID: LLLO:OMTS:PNNM:T3MP:AD2F:UMDG:IIZK:OGBO:3ZLL:YDBX:ONAO:AY5G

Additional environment details (AWS, VirtualBox, physical, etc.)

Machine1: Giving better performance.
node    v10.16.0
docker  18.09.8
ubuntu  16.04.3 LTS, xenial
AWS instance type c4.xlarge

Machine2: Giving poor performance.
node    v10.16.0
docker  18.09.8
ubuntu  18.04.2 LTS, bionic
AWS instance type c4.xlarge

docker image

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
aa0a31cd86a8        ubuntu:18.04        "bash"              2 days ago          Up 25 hours                             ubuntu

The command I am running in docker

docker exec ubuntu whoami

ubuntu18 machine Data:

1. Data of time taken in execution

2019-07-25 14:07:32.559 INFO  uid: 715a6af0-aeb7-11e9-a5a9-2fffd4e800d1 time: 1008 result: {"success":true,"data":"root"}
2019-07-25 14:07:32.941 INFO  uid: 7178c860-aeb7-11e9-a5a9-2fffd4e800d1 time: 1191 result: {"success":true,"data":"root"}
2019-07-25 14:07:40.363 INFO  uid: 71767e70-aeb7-11e9-a5a9-2fffd4e800d1 time: 8628 result: {"success":true,"data":"root"}
.
.
.
2019-07-25 14:07:41.970 INFO  uid: 718af0d0-aeb7-11e9-a5a9-2fffd4e800d1 time: 10101 result: {"success":true,"data":"root"}

2. logs of command strace -t docker exec ubuntu whoami
Screen Shot 2019-07-25 at 3 14 42 PM

3. result of perf top --sort comm,dso
Screen Shot 2019-07-25 at 3 19 47 PM

So, I need help in debugging what is wrong with docker on the ubuntu18 machine. Or if there is any limitation with docker on ubuntu18 or maybe some machine or docker configuration issue.

@mbdas
Copy link

mbdas commented Aug 19, 2019

Aayush did you find anything? We are starting to see something similar on latest Ubuntu 16 update with some 4.15 kernel and updated packages (could be same as ubuntu 18)

@crosbymichael
Copy link

Seems to be something in the docker engine causing this, not runc or containerd.

Test with Docker:

(for i in `seq 1 10`; do (time sudo docker exec test true &); done)
michael|~/development/gocode/src/github.com/containerd/containerd >
real    0m0.241s
user    0m0.031s
sys     0m0.008s

real    0m0.435s
user    0m0.026s
sys     0m0.012s

========> LONG PAUSE HERE BEFORE REST OF OUTPUT

real    0m1.887s
user    0m0.034s
sys     0m0.011s

real    0m1.907s
user    0m0.030s
sys     0m0.013s

real    0m1.923s
user    0m0.030s
sys     0m0.014s

real    0m1.940s
user    0m0.031s
sys     0m0.008s

real    0m1.961s
user    0m0.026s
sys     0m0.015s

real    0m1.987s
user    0m0.029s
sys     0m0.011s

real    0m2.003s
user    0m0.032s
sys     0m0.009s

real    0m2.025s
user    0m0.030s
sys     0m0.006s

With Containerd:

(for i in `seq 1 10`; do (time sudo ctr t  exec --exec-id $i sh true & ); done)
michael|~/development/gocode/src/github.com/containerd/containerd >
real    0m0.329s
user    0m0.015s
sys     0m0.019s

real    0m0.327s
user    0m0.011s
sys     0m0.018s

real    0m0.336s
user    0m0.011s
sys     0m0.017s

real    0m0.337s
user    0m0.012s
sys     0m0.018s

real    0m0.350s
user    0m0.019s
sys     0m0.011s

real    0m0.354s
user    0m0.010s
sys     0m0.017s

real    0m0.357s
user    0m0.014s
sys     0m0.014s

real    0m0.365s
user    0m0.012s
sys     0m0.015s

real    0m0.377s
user    0m0.018s
sys     0m0.015s

real    0m0.372s
user    0m0.015s
sys     0m0.012s

@Aayush04
Copy link
Author

Aayush04 commented Aug 20, 2019

Hello @mbdas, I found that when I installed docker via binary. That is if I install docker from the static binaries as mentioned in this link I started getting better performance as one in ubuntu 16.

But when I am installing the docker through apt-get command as mentioned here, then I am getting poor performance. I tried this test with docker 18 and 19 both but found the same behaviour.

Still, I did not get any clue for whats is the difference between installation via apt-get vs installation via static binaries.

@mbdas
Copy link

mbdas commented Aug 22, 2019

An engineer in the team found the issue. libseccomp2 was causing this. Once we downgraded the lib, it was back to normal. docker needs to look at what changed in this lib.

@severfire
Copy link

I do have similar issue. Ubuntu 19.04. Docker version 19.03.2.
WordPress container runs slow. Found out that disabling WordFence helped.
Yet still funny, cause on production it works good.
Damn strange :-)

@alcaamado
Copy link

@mbdas What is the version you have installed?

@mbdas
Copy link

mbdas commented Nov 14, 2019

2.2.3-3ubuntu3 . We have pinned this version for now.

@pfalcon
Copy link

pfalcon commented Apr 2, 2020

I now see a case when a simple "docker build" with a Dockerfile containing "apt-get install" for a dozen packages runs for 20 minutes before hanging completely. Just reducing that to:

FROM ubuntu:18.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update
RUN apt-get -y install shared-mime-info

Leads to:

Setting up libicu60:amd64 (60.2-3ubuntu3.1) ...
Setting up libglib2.0-0:amd64 (2.56.4-0ubuntu0.18.04.6) ...
No schema files found: doing nothing.
Setting up libxml2:amd64 (2.9.4+dfsg1-6.1ubuntu1.3) ...
Setting up libglib2.0-data (2.56.4-0ubuntu0.18.04.6) ...
Setting up shared-mime-info (1.9-2) ...
^C
real	4m39.560s
user	0m0.136s
sys	0m0.065s

opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 26, 2020
* Update docs/submodules/releng from branch 'master'
  to 9fc7ba4d38afdd7816145b771e579b518506cc2a
  - Try bypassing libseccomp performance issues
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I17502628ccaa7868f0605c649f0549d9fd6f9dd2
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 26, 2020
* Update docs/submodules/releng from branch 'master'
  to 9fc7ba4d38afdd7816145b771e579b518506cc2a
  - Try bypassing libseccomp performance issues
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I17502628ccaa7868f0605c649f0549d9fd6f9dd2
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/releng that referenced this issue Apr 26, 2020
docker/for-linux#738
https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998

Change-Id: I17502628ccaa7868f0605c649f0549d9fd6f9dd2
Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 26, 2020
* Update docs/submodules/releng from branch 'master'
  to 9fc7ba4d38afdd7816145b771e579b518506cc2a
  - Try bypassing libseccomp performance issues
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I17502628ccaa7868f0605c649f0549d9fd6f9dd2
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 27, 2020
* Update docs/submodules/releng from branch 'master'
  to b7322ba6abc373d8e50ef29a7c2fe25a5d736042
  - Try bypassing libseccomp performance issues (OVN)
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I892fba7d96b7fc957b7368009ab56b3e0a2ee1a8
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/releng that referenced this issue Apr 27, 2020
docker/for-linux#738
https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998

Change-Id: I892fba7d96b7fc957b7368009ab56b3e0a2ee1a8
Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 27, 2020
* Update docs/submodules/releng from branch 'master'
  to b7322ba6abc373d8e50ef29a7c2fe25a5d736042
  - Try bypassing libseccomp performance issues (OVN)
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I892fba7d96b7fc957b7368009ab56b3e0a2ee1a8
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
opnfv-github pushed a commit to opnfv/opnfvdocs that referenced this issue Apr 27, 2020
* Update docs/submodules/releng from branch 'master'
  to b7322ba6abc373d8e50ef29a7c2fe25a5d736042
  - Try bypassing libseccomp performance issues (OVN)
    
    docker/for-linux#738
    https://forums.docker.com/t/performance-issues-after-dist-upgrade-on-ubuntu-16-04lts/76998
    
    Change-Id: I892fba7d96b7fc957b7368009ab56b3e0a2ee1a8
    Signed-off-by: Cédric Ollivier <cedric.ollivier@orange.com>
@fabmars
Copy link

fabmars commented Apr 29, 2020

Same problem with Ubuntu 18 LTS (docker 19.03.8 bionic) and also after it's been updated to 20 LTS (docker 19.03.8 eoan).
Simply launching mysql:8 takes ~40 seconds whereas the machine is a monster and totally not loaded.

This only specifics about that server is it's got a mirror raid with mechanical disks.
I didn't have this issue on the same machine one year ago.

I tried to tweak the scheduler, without success.

After I found this issue and this thread, I tried with seccomp:unconfined to no avail. Not sure what to do now.

@hicder
Copy link

hicder commented Jun 7, 2020

I'm seeing the same thing in PopOS 20.04 as well - building C++ project with ninja doesn't utilize all the cores like it did when running in Ubuntu 18.04. My current docker version is 19.03.11, build 42e35e61f3

@pfalcon
Copy link

pfalcon commented Jun 7, 2020

I now see a case when a simple "docker build" with a Dockerfile containing "apt-get install" for a dozen packages runs for 20 minutes before hanging completely.

For reference, this slowdown in my case vanished in the same non-deterministic way as they appeared. My take-away is that Docker performance re: image building can just be non-deterministic, and not every time it's due to something was done on your side (or rather, it's not the case that you can just do something specific on your side to make it go away).

@hicder
Copy link

hicder commented Jun 7, 2020

fwiw, I don't think it's libseccomp.so. The ubuntu host that runs 18.04 has libseccomp.so.2.4.1, but it runs fine. My PopOS 20.04 has libseccomp.so.2.4.3, but is very slow.

@alexphelps
Copy link

alexphelps commented Aug 4, 2020

Has anyone found any solutions? I'm running PopOS 20.04 and docker containers run 30% slower than they should. When running a python app test suite in docker my machine is 30% slower than my colleague with same CPU running Windows/Ubuntu 18.04 in WSL. I could accept a small difference, actually I'd expect PopOS to be faster, but 30% difference doesnt quite make sense.

@fabmars
Copy link

fabmars commented Aug 4, 2020

I gave up and changed servers

@timglabisch
Copy link

timglabisch commented Oct 16, 2020

i can confirm that running docker on PopOS 20.04 is around 30% slower than it should. @alexphelps did you found a solution?

@cpuguy83
Copy link
Collaborator

cpuguy83 commented Oct 16, 2020

I am fairly certain this is related to libseccomp.
Can you try downgrading to libseccomp 2.3?

@cpuguy83
Copy link
Collaborator

Relevant seccomp issue: seccomp/libseccomp#153

@cpuguy83
Copy link
Collaborator

You could also try running with --security-opt seccomp:unconfined

@alexgit2k
Copy link

alexgit2k commented Mar 4, 2021

Can confirm the massive loss of performance:

# time docker run -it --rm perl /usr/local/bin/perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){if($i++>50000000){printf"%.16f\n",$,*4;exit;}}'
3.1415926735902504

real    0m14,809s
user    0m0,019s
sys     0m0,009s

With --security-opt seccomp:unconfined:

# time docker run -it --rm --security-opt seccomp:unconfined perl /usr/local/bin/perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){if($i++>50000000){printf"%.16f\n",$,*4;exit;}}'
3.1415926735902504

real    0m5,746s
user    0m0,015s
sys     0m0,015s

On the Host itself:

# time perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){if($i++>50000000){printf"%.16f\n",$,*4;exit;}}'
3.1415926735902504

real    0m5,327s
user    0m5,322s
sys     0m0,001s

@lmeyerov
Copy link

Seeing the same ~3X speedup with seccomp:unconfined..

@mbdas
Copy link

mbdas commented Mar 24, 2021

As noted before in the thread, only way out is downgrade libseccomp to a pretty old version and make it work and ensure it remains pinned and no OS update replaces it. I am curious if containerd (now that K8s directly uses it) or podman (from redhat end) which supports seccomp, whether they use a different lib to implement seccomp profile and hence no issue in those setups. Clearly no one in docker community is interested to resolve this.

@cpuguy83
Copy link
Collaborator

@mbdas As I understand it libseccomp 2.5 addresses these performance issues.

@lmeyerov
Copy link

lmeyerov commented Mar 24, 2021

I'd be happy to try upgrading to confirm the fix. Any guidance on how? (apt-get? fresh docker install at some version? basic googling did not make this obvious)

mrchapp pushed a commit to mrchapp/ci-job-configs that referenced this issue Apr 20, 2021
… 15min.

Some apparent docker problems, a-la
docker/for-linux#738

Change-Id: I214f694d35dbef1205f578d816864e30113d6fd3
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
@maxime-michel
Copy link

I use libseccomp2 (2.5.1-1ubuntu1~18.04.1) and see the same ratio when comparing the two commands suggested above.

@ediezh
Copy link

ediezh commented Apr 28, 2021

I just updated our CI server to ubuntu 20.04.2 and have noticed the same problem. A lot of our CI tests start to fail due to timeout. I checked the libseccomps version is 2.5.1-1ubuntu1~20.04.1.

@edwardcrompton
Copy link

I'm also able to reproduce the problem illustrated by the two commands above. Here's my setup:
libseccomp2 (2.5.1-1ubuntu1~20.04.1 amd64)
Ubuntu 20.04.2 LTS
Docker version 20.10.6, build 370c289

@cpuguy83
Copy link
Collaborator

Please make sure you are using runc version rc92 and not rc93 (latest).

@maxime-michel
Copy link

maxime-michel commented Apr 30, 2021

I'm not sure how I'd be able to do that considering rc92 is not an available version in apt for me, but it looks like rc94 is coming out very soon with the following, so thanks for the hint.

@edwardcrompton
Copy link

edwardcrompton commented May 19, 2021

Having upgraded to the latest version of runc, I'm still seeing a big discrepancy in performance between the two commands @alexgit2k gave us. The ratio of the time taken for the two commands is smaller but the time taken to run without the --security-opt seccomp:unconfined is still more than double the time it takes to run with it.

# runc --version
runc version 1.0.0-rc94
spec: 1.0.2-dev
go: go1.14.15
libseccomp: 2.5.1
# time docker run -it --rm perl /usr/local/bin/perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){if($i++>50000000){printf"%.16f\n",$,*4;exit;}}'
3.1415926735902504
docker run -it --rm perl /usr/local/bin/perl -e   0.02s user 0.02s system 0% cpu 18.732 total
# time docker run -it --rm --security-opt seccomp:unconfined perl /usr/local/bin/perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){if($i++>50000000){printf"%.16f\n",$,*4;exit;}}'
3.1415926735902504
docker run -it --rm --security-opt seccomp:unconfined perl /usr/local/bin/per  0.03s user 0.01s system 0% cpu 7.732 total

@vaskozl
Copy link

vaskozl commented Jun 24, 2021

This isn't limited to docker and I'm seeing it with the lastest containerd (1.4.6 and 1.5.2) and libseccomp (2.5.1):

moby/moby#41389

I've done some rough testing on a few different systems and it appears there is a 15%-50% hit in performance prior to kernel 5.11.

Upgrading your kernel may help if that is an option for you.

@vaskozl
Copy link

vaskozl commented Jul 6, 2021

A soluytion other than running with --security-opt seccomp:unconfined, seems to be using the spectre_v2_user=off spec_store_bypass_disable=off or otherwise blanket mitigations=off linux kernel option.

http://mamememo.blogspot.com/2020/05/cpu-intensive-rubypython-code-runs.html

My guess is that running with spectre_v2_user=off spec_store_bypass_disable=off is more secure than --security-opt seccomp:unconfined where maximum performance is desired.

@nadabyte
Copy link

I get similar issue. My setup:

Ubuntu 22.04.3 LTS
libseccomp2:amd64 2.5.3-2ubuntu2
Docker version 25.0.1, build 29cf629

@nikhedonia
Copy link

I was facing a similar issue but managed to narrow down the issue in my system.
I'm using Ubuntu 23.04 and Docker version 24.0.5, build 24.0.5-0ubuntu1.

Docker was unusuably slow in build, starting and stoping containers.
The issue started when I added a data-dir in /etc/docker/deamon.conf to move docker out of my root partition.

After reverting this change everthing became several magnitudes faster.

Is anyone aware why the data directory might trigger this bug?

@cpuguy83
Copy link
Collaborator

Yeah my guess is you were running containers under qemu because they were for the wrong platform.

@nikhedonia
Copy link

@cpuguy83 unsure if you are referring to me, but if so how would the data-dir change the target platform?

I did just set the data-dir as described here:
https://docs.docker.com/config/daemon/#daemon-data-directory

I don't have QEmu installed on my system (or is docker pulling an image for it in the background too?)

@cpuguy83
Copy link
Collaborator

If you run a container with a --platform it will happily go and try to execute it.
If you then later run again without --platform it will still use the image from when you did have --platform.

People often register qemu with binfmt_misc for non-native platforms (this is usually done with docker run --privileged multiarch/binfmt or other similar images).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests