
Kubelet cgroup settings not optimal #205

Closed · jnummelin opened this issue Sep 22, 2020 · 2 comments · Fixed by #209
jnummelin (Member) commented on Sep 22, 2020:

Kubelet dumps the following warning:

time="2020-09-22 17:50:43" level=info msg="W0922 17:50:43.193471     101 server.go:609] failed to get the kubelet's cgroup: cpu and memory cgroup hierarchy not unified.  cpu: /system.slice/containerd.service/user.slice/user-0.slice/session-5275.scope, memory: /system.slice/containerd.service.  Kubelet system container metrics may be missing." component=kubelet

Also related:

time="2020-09-22 17:51:55" level=info msg="W0922 17:51:55.702776     101 pod_container_manager_linux.go:200] failed to delete cgroup paths for [kubepods burstable podb8d84777-73d7-4d42-ad37-7ff3e14aff99] : unable to destroy cgroup paths for cgroup [kubepods burstable podb8d84777-73d7-4d42-ad37-7ff3e14aff99] : Failed to remove paths: map[blkio:/sys/fs/cgroup/blkio/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 cpu:/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 cpuacct:/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 cpuset:/sys/fs/cgroup/cpuset/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 devices:/sys/fs/cgroup/devices/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 freezer:/sys/fs/cgroup/freezer/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 hugetlb:/sys/fs/cgroup/hugetlb/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 memory:/sys/fs/cgroup/memory/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 net_cls:/sys/fs/cgroup/net_cls,net_prio/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 net_prio:/sys/fs/cgroup/net_cls,net_prio/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 perf_event:/sys/fs/cgroup/perf_event/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 pids:/sys/fs/cgroup/pids/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99 systemd:/sys/fs/cgroup/systemd/kubepods/burstable/podb8d84777-73d7-4d42-ad37-7ff3e14aff99]" component=kubelet

We should probably detect whether we're running under systemd and configure the kubelet's cgroup settings accordingly.
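A minimal sketch of what that could look like (the flags are standard kubelet flags; the cgroup paths are taken from the warning above, and detecting systemd via /run/systemd/system is an assumption about how we'd wire it up, not the actual implementation):

```sh
# Sketch only: pass explicit cgroup settings to kubelet when the host runs systemd.
# The presence of /run/systemd/system is the usual indicator of a systemd-booted host.
EXTRA_ARGS=""
if [ -d /run/systemd/system ]; then
  EXTRA_ARGS="--cgroup-driver=systemd \
    --kubelet-cgroups=/system.slice/containerd.service \
    --runtime-cgroups=/system.slice/containerd.service"
fi

/var/lib/mke/bin/kubelet \
  --root-dir=/var/lib/mke/kubelet \
  --config=/var/lib/mke/kubelet-config.yaml \
  $EXTRA_ARGS
```

The exact driver and paths would need to be validated against how containerd is actually launched; the point is only that kubelet should be told explicitly which cgroups it and the runtime live in instead of guessing.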

jnummelin added the bug (Something isn't working) and area/worker labels on Sep 22, 2020
jnummelin added this to the 0.4.0 milestone on Sep 22, 2020
jnummelin (Member, Author) commented:

The failure to remove cgroups comes from the fact that footloose does a forced bind mount of the host cgroup fs into the containers in read-only mode. 🤦 Getting rid of that (with a custom-built footloose) does get rid of the error, but does not seem to help with the flakiness of the sig-network testing on footloose.
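For reference, a quick way to check for that read-only bind mount from inside a footloose container (plain findmnt, nothing footloose-specific):

```sh
# List the cgroup mounts with their options; a "ro" option on the cgroup
# hierarchies would confirm the host cgroup fs is bind-mounted read-only.
findmnt -R /sys/fs/cgroup -o TARGET,FSTYPE,OPTIONS
```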

trawler self-assigned this on Sep 25, 2020
trawler (Contributor) commented on Sep 25, 2020:

The first problem is caused by a mismatch between the cgroups assigned for memory and cpu:

root@node1:~# systemd-cgls cpu
Controller cpu; Control group /system.slice/containerd.service:
├─user.slice
│ └─user-0.slice
│   ├─session-1992.scope
│   │ ├─ 77 sshd: root@pts/3
│   │ ├─ 93 -bash
│   │ ├─650 systemd-cgls cpu
│   │ └─651 pager
│   ├─user@0.service
│   │ ├─79 /lib/systemd/systemd --user
│   │ └─80 (sd-pam)
│   └─session-2002.scope
│     ├─122 mke worker CmFwaVZlcnNpb246IHYxCmNsd
│     ├─146 /var/lib/mke/bin/containerd --root=/var/lib/mke/containerd --state=/run/mke/containerd --address=/run/mke/containerd.sock --config=/etc/mke/containerd.toml
│     ├─148 HERE ===>>>> /var/lib/mke/bin/kubelet --root-dir=/var/lib/mke/kubelet --volume-plugin-dir=/usr/libexec/mke/kubelet-plugins/volume/exec --container-runtime=remote --container-runtime-endpoint=unix:///run/mke/containerd.sock --config=/var/lib/mke/kubelet-config.yaml etc etc 
│     ├─242 /var/lib/mke/bin/containerd-shim-runc-v2 -namespace k8s.io -id 97fd3f289a5314d2ddbafc6ed292ad24726e0ff43878bad77266c22c55e117f6 -address /run/mke/containerd.sock
│     └─300 /var/lib/mke/bin/containerd-shim-runc-v2 -namespace k8s.io -id 8a43f5ea6eafefa9531412a03b3267a98b5d7c1ecc5985a771ef5aa8f8456a80 -address /run/mke/containerd.sock
└─system.slice
root@node1:~# systemd-cgls memory
Controller memory; Control group /system.slice/containerd.service:
.
.
.
├─  79 /lib/systemd/systemd --user
├─  80 (sd-pam)
├─  93 -bash
├─ 122 mke worker CmFwaVZlcnNpb246IHYxCmNsdX....
├─ 146 /var/lib/mke/bin/containerd --root=/var/lib/mke/containerd --state=/run/mke/containerd --address=/run/mke/containerd.sock --config=/etc/mke/containerd.toml
├─ 148 HERE ===>>>> /var/lib/mke/bin/kubelet --root-dir=/var/lib/mke/kubelet --volume-plugin-dir=/usr/libexec/mke/kubelet-plugins/volume/exec --container-runtime=remote --container-runtime-endpoint=unix:///run/mke/containerd.sock --config=/var/lib/mke/kubelet-config.yaml --bootstrap-kubeconfig=/var/lib/mke/kubelet-bootstrap.conf --kubeconfig=/var/lib/mke/kubelet.conf
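
The same mismatch can also be seen by reading kubelet's own cgroup membership directly (PID 148 from the listing above):

```sh
# Compare the cpu and memory hierarchies for the kubelet process (PID 148 above).
# If both controllers pointed at the same path, the "cgroup hierarchy not unified"
# warning would not be emitted.
grep -E 'cpu|memory' /proc/148/cgroup
```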

jnummelin pushed a commit that referenced this issue Sep 28, 2020
fixes #205

Signed-off-by: Karen Almog <kalmog@mirantis.com>