-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhancing SCC to Gate Runtime Classes #562
Conversation
- Changing behavior of the default runtime class. | ||
- Adding additional SCCs for various customzied behavior using AllowedRuntimeClasses. | ||
|
||
## Proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to describe how the ordering of SCCs will be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I don't really understand what you mean by this. Do you have a pointer to a similar description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/openshift/apiserver-library-go/blob/master/pkg/securitycontextconstraints/util/sort/bypriority.go is the order @deads2k is talking about, defined through points https://github.com/openshift/apiserver-library-go/blob/6f1013f42f98d74a6b368d670215493f8425194f/pkg/securitycontextconstraints/util/sort/byrestrictions.go#L107. It determines which SCC is selected if multiple are matching SCCs are available. In general, "the most restricted SCC" should be chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you want some subset order. If one is not the subset of the other, maybe set size matters, and if that is equal you could take something arbitrary but deterministic like alphabetically comparing the alphabetically first element in each set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I have added a section and some ideas. A subset order makes sense, possibly weighing each value similar to how RunAsUser values are weighted https://github.com/openshift/apiserver-library-go/blob/6f1013f42f98d74a6b368d670215493f8425194f/pkg/securitycontextconstraints/util/sort/byrestrictions.go#L129..L134
Trying to think about what runtimeclasses do, it seem that they fall into a few categories
For cases of 2, the benefit to adding to SCC is clear and results in distinct SCCs created for each runtime class that can be granted to users. For cases of 3, the inclusion in SCC seems unnecessary and potentially confusing since it's not necessarily honoring the pod security policy and it would be seemingly simpler to control access to the runtimeclass as a whole. For cases of 1, we can end up with significant fanout as we introduce duplicate SCCs to try to restrict access while keeping the fundamental SCC mutations consistent for different runtime classes. Cases 1 and 3 appear to be well suited towards a switch that allows or denies access based on user and/or service account like SCC today, but only making a decision based on runtimeclass. After that decision is made, SCC could be made to apply. This would give control but avoid a proliferation of SCCs. I could also imagine cases of 3 wanting to skip SCC altogether and simply write their own custom admission to force values to match what the runtimeclass "means". I think builds would fall into this category since they are well controlled and have a limit set of shapes. |
All of the examples we've added are just that. I am compelled by your position that SCCs are not the right tool to force builder pods to use a certain runtime class, but it could be useful to prevent users from having access to the capabilities builder pods have. Though, I don't want discussion about that to take away from the other reasons we should add this to the SCC structure. If you feel it does, I will drop the case.
The way I see the access to low-latency capabilities is similar to how AnyUid or AdditionalCapabilities works: vanilla k8s + cri-o has a set of capabilities it gives all users (in this case, access to a runtime class that can configure CPU scheduling), and the Openshift multi-tenant model requires that capability is restricted from some set of users. I think your notion that existing SCCs will be duplicated comes my poor representation of the way we'd enable the low-latency runtime class. Is there anyway I can make it clearer?
why would we add an additional admission controller for something that has the same knowledge as SCC (whether a user/service account is allowed to do something) but uses it in a slightly different way. This "switch" you've described is exactly what I thought SCC was. Are there other examples in openshift of additional admission controllers gating user access in a similar way SCC does?
yeah I can see this, though ultimately we'll need this enhancement to prevent generic users from accessing this runtime class, regardless of whether we use SCC to force the builder pods to use the runtime class. All in all, I am hearing we need this enhancement for at least the kata use case. If you'd like me to drop the other cases, that's fine with me. As mentioned before, the scope of this PR is not to decide how exactly this feature will be used, but rather adding the feature itself |
cfeaa47
to
a70ed66
Compare
997e42c
to
986fbeb
Compare
986fbeb
to
fbbf640
Compare
comments addressed @sttts @deads2k @zanetworker |
LGTM :) |
- `openshift-sandboxed-containers` are certainly the most restrictive, as it runs the process in an kernel separated container, protecting the host from a class of kernel | ||
vulnerabilities the default runtime is vulnerable to. | ||
|
||
Thus, a strategy similar to how [RunAsUser][run-as-user] could be implemented, where every new possible entry in AllowedRuntimeClasses must be given a point value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should or even can assign values to known runtime classes. Admission should not know about runtime classes, i.e. the meaning of the strings, at all. Think about version skew cases between apiserver and kubelet and runtime.
Also I don't think the actual point values of individual classes matter at all. Note that the user is specifying the runtime class explicitly in the pod. Hence, we choose SCCs that contain that in RequiredRuntimeClasses
. Why should the other runtime classes in the set influence the choice?
I would go this way:
points([..., "*", ...]) = 4
- otherwise:
points(X) = max(len(X), 3)
It orders SCCs with up to 3 items by length. I don't think it is of value to consider bigger sets. That will never happen in practice.
Note that we also have to think about precedence of these points to the existing ones. If you look into https://github.com/openshift/apiserver-library-go/blob/master/pkg/securitycontextconstraints/util/sort/byrestrictions.go and compare with the other point values, where do you see runtime classes?
- if you have the choice between: a) privileged +
["foo"]
b) unprivileged,["*"]
– for a pod that does not require privileged, but wants runtime class"foo"
? - if you have the choice between: a) hostPort +
["foo"]
b) no host port,["*"]
– for a pod that does not request host ports, but wants runtime class"foo"
? - if you have the choice between: a) hostNetwork +
["foo"]
b) pod network,["*"]
– for a pod that does not require host networking, but wants runtime class"foo"
? - if you have the choice between: a) hostPath mount +
["foo"]
b) no hostPath mounts,["*"]
– for a pod that does not request a hostPath, but wants runtime class"foo"
? - if you have the choice between: a) runAsAnyUserPoints +
["foo"]
b) runAsUserPoints,["*"]
– for a pod that does not specify the user, but wants runtime class"foo"
? - if you have the choice between: a) all capabilities +
["foo"]
b) no extra capabilities,["*"]
– for a pod that does not ask for additional capabilities, but wants runtime class"foo"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I could use some help here. I have added some open questions about how to weight them relatively when we don't know each field. I am still not convinced we can work around needing some Enum type, and to handle apiserver drift, to be able to accurately describe the relative weight among each of the runtime classes.
I am also pretty sure running a container in a VM, or in a user namespace is more secure than dropping all capabilities, regardless of what that capability is (see https://man7.org/linux/man-pages/man7/user_namespaces.7.html). How do we handle the fact that capMinPoints is 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle the fact that capMinPoints is 0?
Why do we have to? We might have to shift all cap value up with some offset if you want to put runtime classes "before".
I am also pretty sure running a container in a VM, or in a user namespace is more secure than dropping all capabilities, regardless of what that capability is
So everything other than default ""
is more secure, independent from capabilities? This suggests we have to give non-default runtime classes more than capMaxPoints = 9999
.
What about the other 5 questions in my comment above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About "VM+any capability is more secure than non-VM+no-capability", to be very clear (and as a corollay: "we need enums"):
I don't think this order matters, and hence the corollary doesn't matter either: every pod either has empty runtime class (= default) or it has an explicit runtime class string specified. If that is true, every admission request will either
- consider all SCCs with default class (among the required ones),
- or it will consider all SCCs with some other runtime class.
In either case, the other listed runtime class in those considered SCCs should not influence the order (because they don't matter for the specified runtime class).
Or in other words: the runtime class is not chosen by admission, but it only filters the set of applicable SCCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle the fact that capMinPoints is 0?
Why do we have to? We might have to shift all cap value up with some offset if you want to put runtime classes "before".
Ah I didn't know we were so flexible with moving the other options points arounds, sounds good.
I am also pretty sure running a container in a VM, or in a user namespace is more secure than dropping all capabilities, regardless of what that capability is
So everything other than default
""
is more secure, independent from capabilities? This suggests we have to give non-default runtime classes more thancapMaxPoints = 9999
.
I think it's closer that: running a container in a vm or user namespace makes capMaxPoints for that pod really equal 0. Like, it takes away all of the security risk of capabilities. Same goes for runAsAnyUser. These runtime classes nullify the risk.
What about the other 5 questions in my comment above?
The other options (privileged, hostPort, hostPath, hostNetwork) + either VM or user ns is riskier than without those security options and "". Neither do anything to mitigate these security capabilities.
My general issue with this approach of ascribing value to runtime classes in general is that we are basically hard coding a value for a very flexible field.
For instance, runtime classes could be the way we introduce crun into openshift, allowing users to opt into using crun when it becomes a tech preview. Technically, the runtime class that would use crun is no safer than runc (they are basically a 1:1 mapping).
Do you have insight on how to future proof the field?
@@ -105,31 +106,33 @@ However, if admins happened to have added them, they will also have to allow the | |||
#### Point Values: | |||
Openshift SCCs need to be ordered by relative point value of "how restrictive" they are (a system described [here][byrestrictions]. | |||
This section describes how points will be assigned for `RequiredRuntimeClasses`. | |||
In general, a smaller set of RequiredRuntimeClasses is more restricted. For instance, a list that is a strict subset of another is more restrictive, and thus should have | |||
In general, a smaller set of RequiredRuntimeClasses is more restricted. For instance, a list that is shorter than another is more restrictive, and thus should have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with #562 (comment) I question this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe make clearer what "restrictive" means. Correct: it matches less pods. Incorrect: when chosen it is more secure.
|
||
`RequiredRuntimeClasses` relative weight to the other fields partially depends on the runtime classes that Openshift opts to support. | ||
To start, we will consider the case where Openshift supports `""`, `openshift-builder` and `openshift-sandboxed-containers` (notes on supporting | ||
`openshift-low-latency` in the [open questions](#open-questions) section). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong, compare #562 (comment): we should not consider concrete runtime classes here and don't have to.
e19a629
to
7854153
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This enhancement describes the approach for gating runtime classes with SCC Signed-off-by: Peter Hunt <pehunt@redhat.com>
…eClasses Signed-off-by: Peter Hunt <pehunt@redhat.com>
7854153
to
4893af9
Compare
@haircommander: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
/unassign |
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This enhancement describes the approach for gating runtime classes with SCC
Signed-off-by: Peter Hunt pehunt@redhat.com