
configuring cgroup: write /sys/fs/cgroup/cgroup.subtree_control: device or resource busy (runsc inside docker?) #8111

Closed
prattmic opened this issue Oct 21, 2022 · 4 comments

@prattmic
Member

prattmic commented Oct 21, 2022

Description

On Go's CI instances, runsc fails with:

running container: creating container: cannot set up cgroup for root: configuring cgroup: write /sys/fs/cgroup/cgroup.subtree_control: device or resource busy

The same failure occurs with runsc -network=none do /bin/ls and runsc -rootless -network=none do /bin/ls (rootless mode is what we actually want to use).

Rootless mode could likely be made to work by adding EBUSY to https://cs.opensource.google/gvisor/gvisor/+/master:runsc/container/container.go;l=1368;drc=7bb273341ee026d49e144b987b22ef561448bee3. However, it is not clear to me if this is something that can/should be fixed more thoroughly.

I believe we get EBUSY because there are already tasks in the cgroup:

$ cat /sys/fs/cgroup/cgroup.procs 
1
18
28
34
16964

My read of cgroup_subtree_control_write -> cgroup_vet_subtree_control_enable in the Linux kernel is that the cgroup must be empty to enable subtrees.

Steps to reproduce

This is easy to reproduce on Go's CI system, but that is a pain to use (you need to upload a pending Go CL and run TryBots).

These CI instances are GCE instances with Container-Optimized OS running a Debian bullseye container. I'm not certain exactly how this container is run (e.g., is docker actually used?), but our instance config can be seen here.

It is possible that this would reproduce inside docker on any cgroup v2 system, but I'm not sure.

runsc version

8e6a0b9

docker version (if using docker)

No response

uname

Linux buildlet-linux-amd64-bullseye-rnb7df8bc 5.15.65+ #1 SMP Sat Oct 15 09:12:54 UTC 2022 x86_64 GNU/Linux

kubectl (if using Kubernetes)

No response

repo state (if built from source)

release-20221017.0-4784-g8e6a0b996

runsc debug logs (if available)

I1021 20:28:18.496544   17002 main.go:216] ***************************
I1021 20:28:18.496612   17002 main.go:217] Args: [./runsc.exe -debug -debug-log=/tmp/runsc -network=none do /bin/ls]
I1021 20:28:18.496637   17002 main.go:218] Version VERSION_MISSING
I1021 20:28:18.496664   17002 main.go:219] GOOS: linux
I1021 20:28:18.496681   17002 main.go:220] GOARCH: amd64
I1021 20:28:18.496701   17002 main.go:221] PID: 17002
I1021 20:28:18.496722   17002 main.go:222] UID: 0, GID: 0
I1021 20:28:18.496745   17002 main.go:223] Configuration:
I1021 20:28:18.496769   17002 main.go:224]              RootDir: /var/run/runsc
I1021 20:28:18.496791   17002 main.go:225]              Platform: ptrace
I1021 20:28:18.496809   17002 main.go:226]              FileAccess: exclusive, overlay: false
I1021 20:28:18.496828   17002 main.go:227]              Network: none, logging: false
I1021 20:28:18.496847   17002 main.go:228]              Strace: false, max size: 1024, syscalls: 
I1021 20:28:18.496865   17002 main.go:229]              LISAFS: true
I1021 20:28:18.496882   17002 main.go:230]              Debug: true
I1021 20:28:18.496899   17002 main.go:231]              Systemd: false
I1021 20:28:18.496922   17002 main.go:232] ***************************
D1021 20:28:18.500014   17002 specutils.go:75] Spec:
{
  "ociVersion": "",
  "process": {
    "user": {
      "uid": 0,
      "gid": 0
    },
    "args": [
      "/bin/ls"
    ],
    "env": [
      "SHELL=/bin/bash",
      "PWD=/workdir/gvisor",
      "LOGNAME=root",
      "MOTD_SHOWN=pam",
      "HOME=/root",
      "SSH_CONNECTION=::1 51668 ::1 2200",
      "TERM=tmux-256color",
      "GO_TEST_TIMEOUT_SCALE=5",
      "USER=root",
      "GO_TEST_SHORT=0",
      "SHLVL=2",
      "GO_BUILDER_NAME=linux-amd64-longtest",
      "SSH_CLIENT=::1 51668 2200",
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/workdir/go/bin",
      "MAIL=/var/mail/root",
      "SSH_TTY=/dev/pts/0",
      "OLDPWD=/workdir",
      "GOPATH=/workdir/gopath",
      "_=./runsc.exe"
    ],
    "cwd": "/workdir/gvisor"
  },
  "root": {
    "path": "/"
  },
  "hostname": "buildlet-linux-amd64-bullseye-rnb7df8bc",
  "linux": {
    "namespaces": [
      {
        "type": "network"
      }
    ]
  }
}
I1021 20:28:18.500219   17002 do.go:416] Changing configuration RootDir to "/tmp/runsc-do3875684630"
D1021 20:28:18.500346   17002 container.go:180] Create container, cid: runsc-049061, rootDir: "/tmp/runsc-do3875684630"
D1021 20:28:18.500468   17002 container.go:238] Creating new sandbox for container, cid: runsc-049061
D1021 20:28:18.500528   17002 cgroup.go:410] New cgroup for pid: self, *cgroup.cgroupV2: &{Mountpoint:/sys/fs/cgroup Path:/runsc-049061 Controllers:[cpuset cpu io memory hugetlb pids rdma] Own:[]}
D1021 20:28:18.500579   17002 cgroup_v2.go:129] Installing cgroup path "/sys/fs/cgroup/runsc-049061"
D1021 20:28:18.500618   17002 cgroup_v2.go:174] Deleting cgroup "/sys/fs/cgroup/runsc-049061"
D1021 20:28:18.500663   17002 container.go:699] Destroy container, cid: runsc-049061
W1021 20:28:18.500725   17002 util.go:64] FATAL ERROR: creating container: cannot set up cgroup for root: configuring cgroup: write /sys/fs/cgroup/cgroup.subtree_control: device or resource busy
W1021 20:28:18.500950   17002 main.go:274] Failure to execute command, err: 1
@prattmic prattmic added the type: bug label Oct 21, 2022
@prattmic
Member Author

cc @avagin

@prattmic

This comment was marked as resolved.

@prattmic
Member Author

Ah, I forgot to build runsc with CGO_ENABLED=0. Disregard #8111 (comment); it runs fine in rootless mode with the EBUSY case added to container.go.

prattmic added a commit to prattmic/gvisor that referenced this issue Oct 21, 2022
When runsc is running inside of an existing container,
writing to /sys/fs/cgroup/cgroup.subtree_control fails with EBUSY
because the cgroup is not empty.

It is likely a more general bug that we fail here, but in rootless mode
cgroups aren't required anyway, so we can work around the issue by
simply ignoring it in rootless mode.

For google#8111.
@manninglucas
Contributor

Thanks for reporting. This looks like it could be related to #7999 and some other cgroups-related crashes that have been popping up.

@manninglucas manninglucas closed this as not planned Feb 10, 2023