Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined (and potentially incorrect) behavior when pids limit is set to 0 #4014

Open
haircommander opened this issue Sep 12, 2023 · 5 comments · May be fixed by #4015
Open

Undefined (and potentially incorrect) behavior when pids limit is set to 0 #4014

haircommander opened this issue Sep 12, 2023 · 5 comments · May be fixed by #4015
Labels

Comments

@haircommander
Copy link
Contributor

Description

CRI-O sets its internal pids limit to 0, attempting to "unset" it. I can't find concrete language in the current spec, but seeing opencontainers/runtime-spec#234 implies this should set the pids.max to "max".

However, since the pids limit has been supported in runc, it ignores when PidsLimit is set to 0, which causes values to populate that are unexpected, and potentially are incorrect.

For reference, crun sets the value to max when PidsLimit is set to 0

Steps to reproduce the issue

  1. create a container with pids limit 0
  2. check its pids.max

Describe the results you received and expected

the value seems to not be deterministic, and is not max

What version of runc are you using?

1.1.7, but poking through the code it seems to always have been the case

Host OS information

No response

Host kernel information

No response

kolyshkin added a commit to kolyshkin/runc that referenced this issue Sep 22, 2023
For opencontainers#4014.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor

OK, this only happens when systemd cgroup manager is used; when TasksMax is not set explicitly, the value is derived from the parent.

@kolyshkin
Copy link
Contributor

OK, I think the spec is not really written well in this part.

The current doc (since opencontainers/runtime-spec@488f174) says is "default is no limit". Here "default" means the default Go value (of 0), and "no limit" can be interpreted as "do not set the limit" (i.e. ignore) or "set the limit to unlimited".

Older version of doc (since opencontainers/runtime-spec#234), which @haircommander refers to, says "A value <= 0 indicates no limit".

There were some other changes in the runtime spec, in particular moving from int64 to *int64 and back.

Overall, this resulted in spec being vague, as a result, runc actually treats 0 as "do not change anything" and crun treats 0 as "no limit".

I think what we should do is

  • change the spec to be more clear;
  • fix runtimes according to the spec.

@cyphar
Copy link
Member

cyphar commented Jan 12, 2024

0 in runc.spec should be a limit of 0, which the kernel treats like a limit of 1 in most cases. Unset in config.json should mean leave. I looked at this very closely recently, I think the actual spec text is not very unclear but there are issues that make the spec unclear and we need to fix it

IMHO libcontainer API changes are not a big deal. We do not technically support that anyway.

I have a spec PR, I will push when I get back from teaching students.

@kolyshkin
Copy link
Contributor

I have a spec PR, I will push when I get back from teaching students.

@cyphar please (I just came across a new bug in this area, and we really need to fix this in spec and all runtimes).

kolyshkin added a commit to kolyshkin/podman that referenced this issue Feb 26, 2025
Since commit c25cc72 ("Allow a value of -1 to set unlimited pids
limit") podman converts the pids-limit value of -1 to 0 for OCI spec.

Unfortunately, different runtimes (crun and runc) treat pids.limit=0
differently, and the runtime-spec definition is somewhat vague
(see [1]).

Long term fix belongs to runtime-spec and then runtimes should follow
it.

Short term fix is do not convert -1 to 0 (as all runtimes treat -1 as
unlimited).

Fixes: RHEL-80973

[1]: opencontainers/runc#4014 (comment)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin added a commit to kolyshkin/podman that referenced this issue Feb 27, 2025
Since commit c25cc72 ("Allow a value of -1 to set unlimited pids
limit") podman converts the pids-limit value of -1 to 0 for OCI spec.

Unfortunately, different runtimes (crun and runc) treat pids.limit=0
differently, and the runtime-spec definition is somewhat vague
(see [1]).

Long term fix belongs to runtime-spec and then runtimes should follow
it.

Short term fix is do not convert -1 to 0 (as all runtimes treat -1 as
unlimited).

[NO TESTS NEEDED] -- this is covered by test added in commit 553e53d.

Fixes: https://issues.redhat.com/browse/RHEL-80973

[1]: opencontainers/runc#4014 (comment)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin added a commit to kolyshkin/podman that referenced this issue Feb 27, 2025
Since commit c25cc72 ("Allow a value of -1 to set unlimited pids
limit") podman converts the pids-limit value of -1 to 0 for OCI spec.

Unfortunately, different runtimes (crun and runc) treat pids.limit=0
differently, and the runtime-spec definition is somewhat vague
(see [1]).

Long term fix belongs to runtime-spec and then runtimes should follow
it.

Short term fix is do not convert -1 to 0 (as all runtimes treat -1 as
unlimited).

[NO NEW TESTS NEEDED] -- this is covered by test added in commit 553e53d.

Fixes: https://issues.redhat.com/browse/RHEL-80973

[1]: opencontainers/runc#4014 (comment)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@cyphar
Copy link
Member

cyphar commented Feb 27, 2025

Sorry, it completely fell of my radar. opencontainers/runtime-spec#1279 is the spec PR.

I should clarify what I wrote above -- nothing in the original spec wording has justified the interpretation that 0 is supposed to be treated as a magic value (either in the way cri-o has treated it with 0 being set to mean "max", or in the way runc has treated it as meaning "do nothing").

To be fair, I didn't help at all with this confusion because the Go bindings I added made a few mistakes (partially due to my own uncertainty on what the right behaviour was at the time). The runc interpretation that 0 is a no-op value comes from the fact that the Pids field was int64 rather than *int64 and so 0 is the no-op value if unspecified. It should've been an *int64 at the start -- though I think this was at the time when the Go community was finally figuring out that pointers are necessary for almost all JSON structures and so it probably was mixed up in that whole mess. And my comment on the declaration of Pids didn't help the confusion either (implying that 0 was an invalid value), though this was because I had sent a patch to change the kernel API to reject 0 but that wasn't accepted. (The nspawn folks even listed this as one of the confusing points of the spec.)

It also doesn't help that when I added support back in opencontainers/runtime-spec#64, we didn't have spec-level text for each cgroup (that was only added in opencontainers/runtime-spec#199) and so you could make a decent argument about which interpretation was more authoritative at the time. And I suspect that I also added runc's incorrect behaviour at the time as well.

So, to avoid any further confusion -- pids.limit should be treated like every other cgroup limit in the runtime-spec. -1 means "no limit" and any other value should be passed to the kernel directly (at least any >=0 value, I think the handling of <-1 values is kinda mixed between runtimes). The no-op behaviour should only happen if the limit wasn't provided.

openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/podman that referenced this issue Mar 3, 2025
Since commit c25cc72 ("Allow a value of -1 to set unlimited pids
limit") podman converts the pids-limit value of -1 to 0 for OCI spec.

Unfortunately, different runtimes (crun and runc) treat pids.limit=0
differently, and the runtime-spec definition is somewhat vague
(see [1]).

Long term fix belongs to runtime-spec and then runtimes should follow
it.

Short term fix is do not convert -1 to 0 (as all runtimes treat -1 as
unlimited).

[NO NEW TESTS NEEDED] -- this is covered by test added in commit 553e53d.

Fixes: https://issues.redhat.com/browse/RHEL-80973

[1]: opencontainers/runc#4014 (comment)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
TomSweeneyRedHat pushed a commit to TomSweeneyRedHat/podman that referenced this issue Mar 6, 2025
Since commit c25cc72 ("Allow a value of -1 to set unlimited pids
limit") podman converts the pids-limit value of -1 to 0 for OCI spec.

Unfortunately, different runtimes (crun and runc) treat pids.limit=0
differently, and the runtime-spec definition is somewhat vague
(see [1]).

Long term fix belongs to runtime-spec and then runtimes should follow
it.

Short term fix is do not convert -1 to 0 (as all runtimes treat -1 as
unlimited).

[NO NEW TESTS NEEDED] -- this is covered by test added in commit 553e53d.

Fixes: https://issues.redhat.com/browse/RHEL-80973

[1]: opencontainers/runc#4014 (comment)

Fixes: https://issues.redhat.com/browse/RHEL-82424,
https://issues.redhat.com/browse/RHEL-82425
In the RHEL 9.6 and 10.0 ZeroDay streams

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants