"cgroup is not set: internal libpod error" when adding container to existing pod while rootless #10800

Closed
ahwayakchih opened this issue Jun 28, 2021 · 30 comments · Fixed by #12828
Labels: kind/bug, locked - please file new issue/PR

@ahwayakchih

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

With podman v3.2.x, when running rootless on Alpine Linux (no systemd, cgroups switched to v2/"unified"), adding a container to an existing pod that was stopped and started at least once fails with errors like this:

Error: pod ec578f75616cc98c4c449e5f6590e6bc8e9309ab1032d3def6d42a381ad17527 cgroup is not set: internal libpod error

Steps to reproduce the issue:

  1. Create pod with some container.

  2. Start the pod.

  3. Stop the pod.

  4. Try adding a container (AFAIK it does not matter whether it's "temporary", i.e., with --rm, or not) to the pod:

podman run --pod testpod --rm docker.io/alpine sh -c "date"
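
The four steps above can be sketched as one shell function. This is only a sketch: the pod name testpod and the final command come from the report, while the function name and the sleep 600 placeholder container are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of the reproduction steps. "reproduce_issue" and the "sleep 600"
# placeholder container are illustrative assumptions, not part of the report.
reproduce_issue() {
    pod="${1:-testpod}"
    # 1. Create a pod with some container.
    podman pod create --name "$pod" || return 1
    podman create --pod "$pod" docker.io/alpine sleep 600 || return 1
    # 2. Start the pod, then 3. stop it again.
    podman pod start "$pod" || return 1
    podman pod stop "$pod" || return 1
    # 4. Try adding a container to the pod; this is the step that reports the error.
    podman run --pod "$pod" --rm docker.io/alpine sh -c "date"
}
```

Calling reproduce_issue with no argument uses the pod name testpod.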

Describe the results you received:

Error information about cgroup:

Error: pod ec578f75616cc98c4c449e5f6590e6bc8e9309ab1032d3def6d42a381ad17527 cgroup is not set: internal libpod error

Describe the results you expected:

No error, output returned and exit with status 0.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version 3.2.2

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: b388b959974dee50d451f88949b3499c3ca6ca42'
  cpus: 1
  distribution:
    distribution: alpine
    version: 3.14.0
  eventLogger: file
  hostname: localhost
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 10000
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 10000
  kernel: 5.10.43-0-virt
  linkmode: dynamic
  memFree: 3857145856
  memTotal: 4134678528
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 38271d1c8d9641a2cdc70acfa3dcb6996d124b3d
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /tmp/podman-run-1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.1.10
      commit: baa2bc5ff12fe6db646c1f4f3f966526c0eba5a0
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 1342173184
  swapTotal: 1342173184
  uptime: 32m 42.89s
registries:
  search:
  - docker.io
store:
  configFile: /home/me/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fuse-overlayfs: version 1.6
        fusermount3 version: 3.10.4
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /home/me/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 9
  runRoot: /tmp/podman-run-1000/containers
  volumePath: /home/me/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.2
  Built: 1624674245
  BuiltTime: Sat Jun 26 04:24:05 2021
  GitCommit: 94b97c166e51039997c5fd0658793af2cff0cb06
  GoVersion: go1.16.5
  OsArch: linux/amd64
  Version: 3.2.2

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Podman is run on Alpine Linux within QEMU (KVM). There is no systemd there, only OpenRC. Cgroups were switched from v1 to v2 (v2 only, not a "mix of both" type of thing).
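
One way to confirm a v2-only setup like this is to look for cgroup.controllers at the root of the cgroup mount, which only a unified hierarchy exposes there. A minimal sketch, assuming the usual /sys/fs/cgroup mount point (the function name is illustrative):

```shell
#!/bin/sh
# Print "v2" when the given mount point is a unified (v2-only) cgroup
# hierarchy, and "v1-or-hybrid" otherwise. The function name is illustrative.
cgroup_version() {
    root="${1:-/sys/fs/cgroup}"
    # On a unified hierarchy the root of the mount exposes cgroup.controllers.
    if [ -f "$root/cgroup.controllers" ]; then
        echo v2
    else
        echo v1-or-hybrid
    fi
}
```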

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 28, 2021
@mheon
Member

mheon commented Jun 28, 2021

I'll take this one. Vague suspicion it is fixed in main with the patches from @cdoern to change cgroup parent for pods to the infra container's cgroup.

@mheon mheon self-assigned this Jun 28, 2021
@cdoern
Contributor

cdoern commented Jun 28, 2021

@mheon, I can't seem to recreate this when running locally on my machine. Output is as follows:

[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod create
f84ec9a4e1774b2797ecdcfc8642c4a3f5425703247fa9496751d829358692c6
[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod stop f84ec9a4e1774b2797ecdcfc8642c4a3f5425703247fa9496751d829358692c6
f84ec9a4e1774b2797ecdcfc8642c4a3f5425703247fa9496751d829358692c6
[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman run --pod f84ec9a4e1774b2797ecdcfc8642c4a3f5425703247fa9496751d829358692c6 --rm alpine sh -c "date"
Mon Jun 28 20:23:51 UTC 2021
[charliedoern@fedora podman]$ 

@mheon
Member

mheon commented Jun 28, 2021

You'll need to use --cgroup-manager=cgroupfs on all your commands since you're on Fedora, which will default to the systemd cgroup manager.

@cdoern
Contributor

cdoern commented Jun 28, 2021

> You'll need to use --cgroup-manager=cgroupfs on all your commands since you're on Fedora, which will default to the systemd cgroup manager.

[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod create --cgroup-manager=cgroupfs
9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod stop 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
[charliedoern@fedora podman]$ ~/Documents/podman/bin/podman run --pod 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173 --cgroup-manager=cgroupfs --rm alpine sh -c "date"
Mon Jun 28 20:51:53 UTC 2021

This is running on my branch with the pod cgroup changes. It seems like it might be more of a mismatch between cgroups: the pod cgroup is set but, as mentioned on IRC, the other containers in the pod are in a different cgroup? Not sure.

@ahwayakchih
Author

OK, I've retried from the start, and it looks like a system reboot changes something, not just pod create, start and stop.
Everything is run on Alpine Linux:

localhost:~$ uname -a
Linux localhost 5.10.43-0-virt #1-Alpine SMP Fri, 11 Jun 2021 07:41:12 +0000 x86_64 Linux
localhost:~$ cat /etc/os-release 
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.14.0
PRETTY_NAME="Alpine Linux v3.14"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"

Right after create, and also after stop and start, it seems to work ok:

localhost:~$ podman ps -a
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
localhost:~$ podman pod create -n "testing"
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman create --pod testing --name testing_one docker.io/alpine /bin/sh -c 'while true; do date; sleep 1; done'
54c281c8ca82bac43dda95b707cfd4c93186bc25486dba1b12577df2b9e33fbe
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
lalala
localhost:~$ podman pod start testing
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
lalala
localhost:~$ podman pod stop testing
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
lalala
localhost:~$ podman ps -a
CONTAINER ID  IMAGE                            COMMAND               CREATED             STATUS                      PORTS       NAMES
e4f82accd3cc  k8s.gcr.io/pause:3.5                                   About a minute ago  Up 4 seconds ago                        5de0103847ff-infra
54c281c8ca82  docker.io/library/alpine:latest  /bin/sh -c while ...  52 seconds ago      Exited (137) 7 seconds ago              testing_one
localhost:~$ podman pod stop testing
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman ps -a
CONTAINER ID  IMAGE                            COMMAND               CREATED             STATUS                       PORTS       NAMES
e4f82accd3cc  k8s.gcr.io/pause:3.5                                   About a minute ago  Exited (0) 2 seconds ago                 5de0103847ff-infra
54c281c8ca82  docker.io/library/alpine:latest  /bin/sh -c while ...  About a minute ago  Exited (137) 22 seconds ago              testing_one
localhost:~$ podman pod start testing
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
lalala

It stops working after reboot:

localhost:~$ podman ps -a
CONTAINER ID  IMAGE                            COMMAND               CREATED        STATUS      PORTS       NAMES
e4f82accd3cc  k8s.gcr.io/pause:3.5                                   2 minutes ago  Created                 5de0103847ff-infra
54c281c8ca82  docker.io/library/alpine:latest  /bin/sh -c while ...  2 minutes ago  Created                 testing_one
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
Error: pod 5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a cgroup is not set: internal libpod error
localhost:~$ podman pod start testing
5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a
localhost:~$ podman ps -a
CONTAINER ID  IMAGE                            COMMAND               CREATED        STATUS            PORTS       NAMES
e4f82accd3cc  k8s.gcr.io/pause:3.5                                   3 minutes ago  Up 2 seconds ago              5de0103847ff-infra
54c281c8ca82  docker.io/library/alpine:latest  /bin/sh -c while ...  2 minutes ago  Up 2 seconds ago              testing_one
localhost:~$ podman run --rm --pod testing docker.io/alpine echo lalala
Error: pod 5de0103847ff96e38d3afecd5dcd120e45f39e8296cb2e6b7181eb114747c27a cgroup is not set: internal libpod error

I don't know if it matters or not, but I'm logging into the Alpine system (which is running inside QEMU) through SSH.

Podman does not seem to create a subgroup inside its cgroup:

localhost:~$ ls -la /sys/fs/cgroup/podman
total 0
drwxr-xr-x    2 root     root             0 Jun 29 17:53 .
dr-xr-xr-x    4 root     root             0 Jun 29 17:53 ..
-r--r--r--    1 root     root             0 Jun 29 17:55 cgroup.controllers
-r--r--r--    1 root     root             0 Jun 29 17:55 cgroup.events
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.freeze
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.max.depth
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.max.descendants
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.procs
-r--r--r--    1 root     root             0 Jun 29 17:55 cgroup.stat
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.subtree_control
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.threads
-rw-r--r--    1 root     root             0 Jun 29 17:55 cgroup.type
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpu.max
-r--r--r--    1 root     root             0 Jun 29 17:55 cpu.stat
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpu.weight
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpu.weight.nice
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpuset.cpus
-r--r--r--    1 root     root             0 Jun 29 17:55 cpuset.cpus.effective
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpuset.cpus.partition
-rw-r--r--    1 root     root             0 Jun 29 17:55 cpuset.mems
-r--r--r--    1 root     root             0 Jun 29 17:55 cpuset.mems.effective
-r--r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.current
-r--r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.events
-r--r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.events.local
-rw-r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.max
-r--r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.rsvd.current
-rw-r--r--    1 root     root             0 Jun 29 17:55 hugetlb.2MB.rsvd.max
-rw-r--r--    1 root     root             0 Jun 29 17:55 io.latency
-rw-r--r--    1 root     root             0 Jun 29 17:55 io.max
-r--r--r--    1 root     root             0 Jun 29 17:55 io.stat
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.current
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.events
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.events.local
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.high
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.low
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.max
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.min
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.oom.group
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.stat
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.swap.current
-r--r--r--    1 root     root             0 Jun 29 17:55 memory.swap.events
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.swap.high
-rw-r--r--    1 root     root             0 Jun 29 17:55 memory.swap.max
-r--r--r--    1 root     root             0 Jun 29 17:55 pids.current
-r--r--r--    1 root     root             0 Jun 29 17:55 pids.events
-rw-r--r--    1 root     root             0 Jun 29 17:55 pids.max

There is also an sshd group, but it contains exactly the same structure as podman, with no subgroups.

localhost:~$ ls -la /sys/fs/cgroup/
total 0
dr-xr-xr-x    4 root     root             0 Jun 29 17:53 .
drwxr-xr-x    7 root     root             0 Jun 29 17:53 ..
-r--r--r--    1 root     root             0 Jun 29 17:53 cgroup.controllers
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.max.depth
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.max.descendants
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.procs
-r--r--r--    1 root     root             0 Jun 29 17:53 cgroup.stat
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.subtree_control
-rw-r--r--    1 root     root             0 Jun 29 17:53 cgroup.threads
-r--r--r--    1 root     root             0 Jun 29 17:53 cpu.stat
-r--r--r--    1 root     root             0 Jun 29 17:53 cpuset.cpus.effective
-r--r--r--    1 root     root             0 Jun 29 17:53 cpuset.mems.effective
-r--r--r--    1 root     root             0 Jun 29 17:53 io.stat
-r--r--r--    1 root     root             0 Jun 29 17:53 memory.stat
drwxr-xr-x    2 root     root             0 Jun 29 17:53 podman
drwxr-xr-x    2 root     root             0 Jun 29 17:53 sshd
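
To check for subgroups without reading through the full listings, the child cgroups of a directory can be listed directly, since each child cgroup appears as a subdirectory. A small sketch; the function name is illustrative and the default path is the one from the listing above:

```shell
#!/bin/sh
# Print the names of the child cgroups (subdirectories) of a cgroup directory.
# Prints nothing when there are none. The function name is illustrative.
list_child_cgroups() {
    dir="${1:-/sys/fs/cgroup/podman}"
    for entry in "$dir"/*/; do
        # With no subdirectories the glob stays unexpanded, so test each match.
        [ -d "$entry" ] || continue
        basename "$entry"
    done
    return 0
}
```

Empty output for /sys/fs/cgroup/podman would match what the listings show here.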

BTW removing the pod also shows some error, but does in fact remove it:

localhost:~$ podman pod rm testing
WARN[0000] Error updating pod 8a79fe35b778aef06b29ee7c0cac48e2085e28517d0493ab8330fe3c769945a8 conmon cgroup PID limit: open /sys/fs/cgroup/conmon/pids.max: no such file or directory 
8a79fe35b778aef06b29ee7c0cac48e2085e28517d0493ab8330fe3c769945a8

Let me know if I can provide some more info to help fix this problem :).

@ahwayakchih
Author

@cdoern @mheon is there any other info I could provide?

@mheon
Member

mheon commented Jul 6, 2021

No, I think we're set. I'm presently working on related changes to pod cgroups. I will tackle this once I am done with those.

@github-actions

github-actions bot commented Aug 6, 2021

A friendly reminder that this issue had no activity for 30 days.

@cdoern
Contributor

cdoern commented Aug 6, 2021

@mheon @giuseppe any progress on the pod cgroup fix?

@jyennaco

+1 on this issue, our containers running rootless podman on Red Hat 8 stopped working once we updated from 3.0.2 to 3.2.3. Same error message as in the original post. The workaround mentioned above, adding --cgroup-manager=cgroupfs to podman pod create, seems to be working for now. Keeping an eye out for a fix. Thanks!

@CardboardAgent

CardboardAgent commented Sep 6, 2021

Hello, I am using cgroupManager: cgroupfs and cgroupVersion: v1 and receive the same message Error: pod ec3b24b899065e35067becb5d432a4ec91bf7a1d27a29d419e05857cb32876bd cgroup is not set: internal libpod error while:

  • trying to start a container using podman-compose -f docker-compose.yaml up
  • trying to start the service created by podman generate systemd --new --file

after a machine restart.
Output of podman version:

Version:      3.2.3
API Version:  3.2.3
Go Version:   go1.15.2
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.3
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 4
  distribution:
    distribution: ubuntu
    version: "20.04"
  eventLogger: journald
  hostname: klett19
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 33
      size: 1
    - container_id: 1
      host_id: 500000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 33
      size: 1
    - container_id: 1
      host_id: 500000
      size: 65536
  kernel: 5.4.0-81-generic
  linkmode: dynamic
  memFree: 2418159616
  memTotal: 4071444480
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1.5-925d-dirty
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/33/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.1.8
      commit: unknown
      libslirp: 4.3.1-git
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 3221221376
  swapTotal: 3221221376
  uptime: 80h 34m 33.44s (Approximately 3.33 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/www-data/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'
      Version: |-
        fusermount3 version: 3.9.0
        fuse-overlayfs: version 1.5
        FUSE library version 3.9.0
        using FUSE kernel interface version 7.31
  graphRoot: /home/www-data/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 4
  runRoot: /run/user/33/containers
  volumePath: /home/www-data/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.3
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 3.2.3

@stellarpower
Contributor

Would it be possible to have a bit more info on what this error means/boils down to? I've seen it crop up before, and just had it appear now, though I think for an unrelated issue. It's hard to diagnose, as I don't really know what the internal issue is without an intimate knowledge of cgroups. (@CardboardAgent I also had an issue after a machine restart; my deployment is complex and done via Ansible, and it seems that removing the pod and starting again solves the issue. For whatever reason that specific container had been removed and needed recreating, but I could restart others with no problem. I'm on Focal and Podman 3.3.0.)

@giuseppe
Member

giuseppe commented Sep 13, 2021

> [charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod create --cgroup-manager=cgroupfs
> 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
> [charliedoern@fedora podman]$ ~/Documents/podman/bin/podman pod stop 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
> 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173
> [charliedoern@fedora podman]$ ~/Documents/podman/bin/podman run --pod 9e7e1c4a5955e47bf6fe93b1aaffff4b5938cc96a7dab00f8dec246c0216c173 --cgroup-manager=cgroupfs --rm alpine sh -c "date"
> Mon Jun 28 20:51:53 UTC 2021

The --cgroup-manager flag must be passed to the podman command before the run sub-command.

Even with that though, I am not able to reproduce locally.
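
The flag placement described above can be captured in a small wrapper so that every command gets the option in the right position. A sketch only; the wrapper name is an illustrative assumption:

```shell
#!/bin/sh
# Run any podman sub-command with the cgroupfs cgroup manager.
# The wrapper name "podman_cgroupfs" is illustrative.
podman_cgroupfs() {
    # Global options go between "podman" and the sub-command.
    podman --cgroup-manager=cgroupfs "$@"
}
```

For example, podman_cgroupfs run --pod testing --rm docker.io/alpine echo lalala expands to podman --cgroup-manager=cgroupfs run --pod testing --rm docker.io/alpine echo lalala.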

I think we can drop the check for an empty cgroup, not sure why we have it:

Can anyone who is experiencing the issue try the following patch?

$ git diff
diff --git a/libpod/runtime_ctr.go b/libpod/runtime_ctr.go
index 7d3891f6e..99fcc9053 100644
--- a/libpod/runtime_ctr.go
+++ b/libpod/runtime_ctr.go
@@ -343,9 +343,6 @@ func (r *Runtime) setupContainer(ctx context.Context, ctr *Container) (_ *Contai
                                        if err != nil {
                                                return nil, errors.Wrapf(err, "error retrieving pod %s cgroup", pod.ID())
                                        }
-                                       if podCgroup == "" {
-                                               return nil, errors.Wrapf(define.ErrInternal, "pod %s cgroup is not set", pod.ID())
-                                       }
                                        canUseCgroup := !rootless.IsRootless() || isRootlessCgroupSet(podCgroup)
                                        if canUseCgroup {
                                                ctr.config.CgroupParent = podCgroup
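
The check removed by this patch fires when the cgroup path recorded for the pod is an empty string. For anyone who wants to see whether a pod is in that state before trying the patch, here is a minimal sketch. It assumes that podman pod inspect accepts --format and exposes the path under a field named CgroupPath; neither is confirmed in this thread, and the function name is illustrative.

```shell
#!/bin/sh
# Return 0 when the pod reports a non-empty cgroup path, 1 when it is empty,
# and 2 when the inspect command itself fails.
# Assumption: the Go-template field is named CgroupPath.
pod_cgroup_is_set() {
    path=$(podman pod inspect --format '{{.CgroupPath}}' "$1") || return 2
    [ -n "$path" ]
}
```

Usage would look like: pod_cgroup_is_set testing || echo "pod cgroup is empty".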

@giuseppe
Member

Has anyone had a chance to try the patch above?

@rhatdan
Member

rhatdan commented Oct 29, 2021

I am assuming this is fixed, Closing. Reopen if I am mistaken.

@rhatdan rhatdan closed this as completed Oct 29, 2021
@mheon
Member

mheon commented Oct 29, 2021

It's not fixed.

@mheon mheon reopened this Oct 29, 2021
@rhatdan
Member

rhatdan commented Dec 18, 2021

@giuseppe What is the state of this one?

@giuseppe
Member

I am not able to reproduce locally, so I cannot verify my patch.

I can open a PR with my patch, as I am not sure why we have that check in place anyway.

@umohnani8
Member

I was able to reproduce this with podman v3.3.1 on RHEL 8.5, and the patch suggested by @giuseppe above fixes the problem!
@giuseppe is opening a PR, which should be backported to 3.2 and 3.3 once merged.

giuseppe added a commit to giuseppe/libpod that referenced this issue Jan 12, 2022
rootless containers do not use cgroups on cgroupv1 or if using
cgroupfs, so improve the check to account for such configuration.

Closes: containers#10800
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028243

[NO NEW TESTS NEEDED] it requires rebooting and the rundir on a non
tmpfs file system.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
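
The rule stated in this commit message can be paraphrased as a small predicate. This is a shell paraphrase of the message only, not the Go code that was merged, and the function name is illustrative:

```shell
#!/bin/sh
# Paraphrase of the commit message: rootless containers use cgroups only when
# the host is on cgroup v2 and the cgroup manager is not cgroupfs.
# In every other rootless configuration an empty pod cgroup is expected.
rootless_uses_cgroups() {
    version="$1"   # "v1" or "v2"
    manager="$2"   # "cgroupfs" or "systemd"
    [ "$version" = "v2" ] && [ "$manager" != "cgroupfs" ]
}
```

Both configurations reported in this thread (v2 with cgroupfs, v1 with cgroupfs) fall on the "does not use cgroups" side of this predicate.
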
@cdoern
Contributor

cdoern commented Jan 13, 2022

@giuseppe @mheon does this mean I can start working on the resource limit pod creation flags again?

@mheon
Member

mheon commented Jan 14, 2022

No, we still don't have support for setting resource limits, this just resolves a bug related to cgroups.

We should really talk about this during planning and get it prioritized, though.

@wswind

wswind commented Jan 26, 2022

I hit the same issue with Fedora on WSL2 without systemd: after restarting WSL, podman run --pod oldpod reproduces this problem. Everything is fine with the root user.

@giuseppe
Member

> I hit the same issue with Fedora on WSL2 without systemd: after restarting WSL, podman run --pod oldpod reproduces this problem. Everything is fine with the root user.

The fix hasn't made it into a release yet.

@m8ram

m8ram commented Mar 16, 2022

Ran into this problem as well, presumably after applying a number of updates that included podman.
After removing and recreating the pod with the --cgroup-manager=cgroupfs option the containers start correctly again.
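
The remove-and-recreate workaround can be sketched as follows. The function name is illustrative, and the flag is placed before the sub-command as advised earlier in this thread. Note that removing the pod also removes its containers, so they have to be recreated afterwards.

```shell
#!/bin/sh
# Remove a pod and create it again with the cgroupfs cgroup manager.
# The function name is illustrative.
recreate_pod_cgroupfs() {
    name="$1"
    # Force-remove the pod together with its containers; a failure here
    # (for example, the pod does not exist) is deliberately ignored.
    podman pod rm -f "$name" || true
    podman --cgroup-manager=cgroupfs pod create --name "$name"
}
```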

@karta0807913
Contributor

karta0807913 commented Mar 25, 2022

Hi everyone, I have the same problem.

I already set the --cgroup-manager=cgroupfs option.

It works sometimes, but I get the same problem now.

$ podman pod inspect alb
{
     "Id": "21847f246e2322bd5603d1776530497e8d5eba12413f85681867468fc3262dbb",
     "Name": "alb",
     "Created": "2022-03-25T11:12:05.966783305+08:00",
     "CreateCommand": [
          "podman",
          "pod",
          "create",
          "--cgroup-manager=cgroupfs",
          "-p",
          "3107:3306",
          "-p",
          "8080:8080",
          "--name",
          "alb"
     ],
     "State": "Degraded",
     "Hostname": "",
     "CreateCgroup": true,
     "CgroupParent": "/libpod_parent",
     "CreateInfra": true,
     "InfraContainerID": "40485cedc90045999f894413ecfa03005b651a5ed2475124be719d7a6689a4bf",
     "InfraConfig": {
          "PortBindings": {
               "3306/tcp": [
                    {
                         "HostIp": "",
                         "HostPort": "3107"
                    }
               ],
               "8080/tcp": [
                    {
                         "HostIp": "",
                         "HostPort": "8080"
                    }
               ]
          },
          "HostNetwork": true,
          "StaticIP": "",
          "StaticMAC": "",
          "NoManageResolvConf": false,
          "DNSServer": null,
          "DNSSearch": null,
          "DNSOption": null,
          "NoManageHosts": false,
          "HostAdd": null,
          "Networks": null,
          "NetworkOptions": null,
          "pid_ns": "private",
          "userns": "host"
     },
     "SharedNamespaces": [
          "uts",
          "ipc",
          "net"
     ],
     "NumContainers": 10,
     "Containers": [...],
}
$ podman run -d --name alb-nginx --cap-add CAP_NET_RAW --pod alb nginx
Error: pod 21847f246e2322bd5603d1776530497e8d5eba12413f85681867468fc3262dbb cgroup is not set: internal libpod error

umohnani8 pushed a commit to umohnani8/libpod that referenced this issue Apr 21, 2022
rootless containers do not use cgroups on cgroupv1 or if using
cgroupfs, so improve the check to account for such configuration.

Closes: containers#10800
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028243

[NO NEW TESTS NEEDED] it requires rebooting and the rundir on a non
tmpfs file system.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
@dosmanak

Hello, I am getting this error on podman 3.4.7.
I deleted a container from the pod and tried to recreate it.

$ rpm -q podman
podman-3.4.7-1.fc35.x86_64

$ podman pod inspect tomcat_db

{
     "Id": "1372ec70da4080a1ca2df4a28ffba380e9f0c0547793995401139870eefe764f",
     "Name": "tomcat_db",
     "Created": "2022-07-08T14:34:33.625028233+02:00",
     "CreateCommand": [
          "podman",
          "pod",
          "create",
          "-p",
          "8080",
          "--name",
          "tomcat_db"
     ],
     "State": "Running",
     "Hostname": "",
     "CreateCgroup": true,
     "CgroupParent": "/libpod_parent",
     "CreateInfra": true,
     "InfraContainerID": "c08364533b75022f2f1b224a549524d16417aacdc20bfcd7d2ecff81a262e1be",
     "InfraConfig": {
          "PortBindings": {
               "8080/tcp": [
                    {
                         "HostIp": "",
                         "HostPort": "43621"
                    }
               ]
          },
          "HostNetwork": true,
          "StaticIP": "",
          "StaticMAC": "",
          "NoManageResolvConf": false,
          "DNSServer": null,
          "DNSSearch": null,
          "DNSOption": null,
          "NoManageHosts": false,
          "HostAdd": null,
          "Networks": null,
          "NetworkOptions": null,
          "pid_ns": "private",
          "userns": "host"
     },
     "SharedNamespaces": [
          "ipc",
          "net",
          "uts"
     ],
     "NumContainers": 3,
     "Containers": [
          {
               "Id": "c01d5ffa803effd927791cd36a4559e80f809376550f28cf1babc2d6d39d9070",
               "Name": "db",
               "State": "running"
          },
          {
               "Id": "c08364533b75022f2f1b224a549524d16417aacdc20bfcd7d2ecff81a262e1be",
               "Name": "1372ec70da40-infra",
               "State": "running"
          },
          {
               "Id": "c8d9193a815c93b583673e60f873ba4f7d1492d5405de52b422e6d7c343d618c",
               "Name": "redis",
               "State": "running"
          }
     ]
}

$ podman run --pod tomcat_db --name tomcat -d tomcat

Error: pod 1372ec70da4080a1ca2df4a28ffba380e9f0c0547793995401139870eefe764f cgroup is not set: internal libpod error

I restarted the podman service and even the computer, which did not help.

@mheon
Member

mheon commented Jul 14, 2022

This has been fixed upstream. Can you try a more recent Podman?

@dosmanak

> This has been fixed upstream. Can you try a more recent Podman?

In version 3.4.7, which I have, it has already been refactored somehow:
https://github.com/containers/podman/blob/v3.4.7/libpod/container_internal_linux.go#L2501

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023