Issue joining cgroups cpuset with kernel scheduler task "random" distribution #3922
Open · Tracked by #4114
kolyshkin added commits to kolyshkin/runtime-spec that referenced this issue (May 18 – Jun 11, 2024):
This allows to set initial and final CPU affinity for a process being run in a container, which is needed to solve the issue described in [1]. [1] opencontainers/runc#3922 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is going to be implemented via opencontainers/runtime-spec#1253.
Moving to 1.3.0 since it's a spec issue, and we agreed to move it in the 1.2.0 mega-thread.
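For context, a rough sketch of what this could look like in the container's config.json, assuming the initial/final CPU affinity fields proposed in opencontainers/runtime-spec#1253 (field names, placement, and the CPU values below are illustrative and may differ from whatever the spec finally adopts):

```json
{
  "process": {
    "args": ["sh"],
    "execCPUAffinity": {
      "initial": "2",
      "final": "2-3,5"
    }
  }
}
```

As I understand the proposal, `initial` constrains where the joining process may run before it is moved into the container's cgroup, and `final` is applied afterwards, so it never ends up on a core fully occupied by a SCHED_FIFO thread.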
Description
A customer reported an issue to us when attempting to join a running container inside Kubernetes (`kubectl exec ...`). The container runs a real-time application that takes advantage of the cores allocated to this container: the application uses the first CPU core of the allocated range for a slow thread (SCHED_OTHER policy) responsible for spawning RT threads (running under the SCHED_FIFO policy), each running on its own core.

They have configured Kubernetes to ensure that it allocates CPU cores within a specific range (all marked as isolated CPUs); they are using the Kubernetes CPU manager with the static policy and have excluded all housekeeping CPUs from being allocated to a pod/container. Their machine is configured like this:
The customer had used this configuration successfully until RHEL 8.4, but with the introduction of this patch in 8.4, a random CPU assignment/scheduling occurs when a process (`runc` in this context) enters a cgroup cpuset. Before that patch, `runc` was always scheduled on the first CPU core of the cgroup cpuset, which worked fine because the first core was used by the slow thread running under the SCHED_OTHER policy. Since the introduction of the kernel patch, `runc` can be randomly scheduled on a core that is fully taken by an RT thread running under the SCHED_FIFO policy; with `kernel.sched_rt_runtime_us=-1` there is no room left for `runc` to execute, and the process gets stuck. When this occurs, some other processes have been observed to become unresponsive as well; so far, `systemd` (PID 1) was also seen stuck in a kernel call to `proc_cgroup_show`.

This is a corner-case issue, but it is serious enough to lock up a system.
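To illustrate the starvation mechanism outside of the container stack, here is a minimal sketch (not the attached reproducer; the CPU number and RT priority are arbitrary): a SCHED_FIFO busy loop pinned to a core, with RT throttling disabled as in the customer's setup, prevents any SCHED_OTHER task pinned to the same core from ever running.

```sh
# Disable RT throttling, as in the customer's setup.
sudo sysctl -w kernel.sched_rt_runtime_us=-1

# SCHED_FIFO busy loop pinned to CPU 3 (arbitrary core, priority 50 for illustration).
sudo taskset -c 3 chrt -f 50 sh -c 'while :; do :; done' &

# A SCHED_OTHER task pinned to the same core never gets CPU time while the loop runs;
# this is effectively what happens to runc when the kernel places it on such a core.
sudo taskset -c 3 sh -c 'echo "this line is never printed"'
```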
Steps to reproduce the issue
Please find attached an archive with a reproducer based on vagrant/libvirt.

1. Decompress the archive and run `vagrant up && vagrant halt && vagrant up`.
2. Open a vagrant VM terminal with `vagrant ssh` and execute:
3. In another vagrant VM terminal, run `./reproducer.sh exec sh`. The command should get stuck, and so should the system: you shouldn't be able to open another vagrant terminal with `vagrant ssh` until the command in the first terminal is interrupted.
4. If you retry by running `./reproducer.sh run 2-3,5` in the first terminal but `./reproducer.sh exec-patch sh` in the second terminal, the system now operates correctly (a PR with this patch is in progress).

cpuset-issue-runc-repro.tar.gz
Describe the results you received and expected
The system gets stuck instead of operating correctly.
What version of runc are you using?
runc 1.0.2 (but doesn't really matter here)
Host OS information
RHEL 8.X
Host kernel information
RHEL 8.X kernels