dind-rootless: undeterministic behavior for "long" running processes (cgroup deleted: unknown) #184

@jappenzesr

Issue Description
It seems that "long"-running processes end most of the time with "docker: Error response from daemon: cgroups: cgroup deleted: unknown.". I found the issue while trying to run an nginx container and managed to boil it down to repeated runs of the sleep command "docker run -it alpine sleep n":

/ # docker run alpine sleep 0.00001
/ # docker run alpine sleep 0.0001
/ # docker run alpine sleep 0.001
docker: Error response from daemon: cgroups: cgroup deleted: unknown.
ERRO[0000] error waiting for container: context canceled
/ # docker run alpine sleep 0.001
docker: Error response from daemon: cgroups: cgroup deleted: unknown.
ERRO[0000] error waiting for container: context canceled
/ # docker run alpine sleep 0.001
docker: Error response from daemon: cgroups: cgroup deleted: unknown.
ERRO[0000] error waiting for container: context canceled
/ # docker run alpine sleep 0.001
/ # 

Note that the last run of the sleep command with the same argument executed successfully.
The same error message is observed when trying to run nginx as a daemon, or ubuntu in interactive mode.
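Because the failure is intermittent, a failure rate over many runs is easier to compare than individual attempts. The following is a minimal sketch of such a loop, not part of the original report: `count_failures` is a hypothetical helper name, and the sketch assumes that `docker run` exits with a non-zero status when the daemon returns the "cgroup deleted" error.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly and report how many runs failed.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; only the exit status is counted here.
        # Assumption: a daemon error makes the command exit non-zero.
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs failed"
}

# Example, to be run in the dind-client shell described under Setup:
#   count_failures 20 docker run --rm alpine sleep 0.001
```

Running the helper with different sleep durations (for example 0.00001, 0.0001, and 0.001, as in the transcript above) would show whether the failure rate depends on how long the container lives.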

Context
We are running a large-scale build platform (based on Atlassian Bamboo) with containerized build agents running on top of Kubernetes. Docker build pipelines could easily be enabled by running a dind sidecar container in the same pod as the main agent. However, the dind sidecar container has to run in a privileged security context, which leads to security issues (as an example). In this context, rootless dind would be a significant improvement.

Setup
Container A running 19.03.1-dind-rootless as dind-server.

docker run --privileged --name dind-server -d \
    --network some-network --network-alias docker \
    -e DOCKER_TLS_CERTDIR=/certs \
    -v certs:/certs \
    docker:19.03.1-dind-rootless --experimental

Container B running docker:19.03 (client version 19.03.1) as dind-client.

docker run --name dind-client --network some-network \
    -e DOCKER_TLS_CERTDIR=/certs \
    -v certs:/certs:ro \
    -it docker:19.03 /bin/sh

Host: Ubuntu 16.04.6, Docker CE 18.09.6

Commands are executed in dind-client.

/ # docker version
Client: Docker Engine - Community
 Version:           19.03.1
 API version:       1.40
 Go version:        go1.12.5
 Git commit:        74b1e89e8a
 Built:             Thu Jul 25 21:17:37 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.1
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.5
  Git commit:       74b1e89e8a
  Built:            Thu Jul 25 21:27:55 2019
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          v1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
/ #

Server Log
log.txt
