-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] PodRestriction proposal #1950
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tallclair The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/sig scheduling |
**Bind to Pod labels**: Pod labels don't have any authorization of usage without something like | ||
PodRestriction, so binding to labels isn't sufficient no it's own. | ||
|
||
**Bind to Namespaces**: The approach I'm recommending. Cluster level objects use a namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much more intuitive. I am not sure how well it works at scale. One case I could think of is having a lot of namespaces all having different labels based on which selection could happen resulting in delay of policy binding. This could be avoided by labeling all the namespaces with predefined value though.
|
||
#### Optional Extensions | ||
|
||
- Namespaced PodRestrictions - PodRestriction objects can be created in a namespace, so that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate on how to avoid conflicts between restriction objects created in namespace vs cluster-scoped ones? or are the restriction objects which are defined in cluster-scoped cannot be touched in namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we implemented namespaced restrictions, they would be a strict subset of the cluster restrictions. E.g. the admission controller first validates the pod against the cluster-scoped restrictions (and rejects it immediately if it's invalid). Only if it's valid under the cluster restrictions is it then checked against the namespaced restrictions.
#### Monolith vs. Modular | ||
|
||
Our current approach to policies has been "modular", i.e. a different controller and API for each | ||
"area" of features. This has led to some inconsistencies and gaps in our policies, and makes adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have gone past that stage of "area" of features as well. While I am not completely against monolithic approach after your point, I think, having a delegation model to area would helpful and might be a good step. For example, podrestriction controller would delegate to scheduling admission controller which can add restrictions specific to scheduling. But a careful grouping of controllers to areas is very much needed and some controllers may be in 2 or more groups.
There are 2 problems with mutating Pod fields: 1) it becomes hard to compose policies (what happens | ||
when the defaults conflict?) and 2) you need a policy ordering mechanism for when a Pod matches | ||
multiple policies. Instead, PodRestriction opts out of mutating defaults entirely, and leaves that | ||
for a different component to solve ([PodPreset?](../service-catalog/pod-preset.md)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if mutation is needed via webhooks, won't they be allowed as well? If yes, it may be too restrictive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this. I'm suggesting that the PodRestriction doesn't do any mutations, but it has nothing to do with whether other controllers / policies make mutating changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks for the clarification. After reading this first line the paragraph I got confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm not sure what the future of PodPreset is.
AutomountServiceAccountToken *BoolRestriction | ||
SecurityContext *PodSecurityContextRestriction | ||
|
||
// TODO: Scheduling restrictions are deeply nested structures and require |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think grouping them might be helpful further. I could talk from scheduler perspective, how about having a schedulerrestriction which has affinity, schedulerName, tolerations etc. This is similar to composition of restrictions.
### Existing Policies | ||
|
||
There are multiple policies that constrain pod objects today. PodRestriction would supersede these | ||
policies, and they would eventually be deprecated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The admission controllers mentioned here for the most part all perform mutation/defautling so how do you envision deprecating them in favor of this admission controller if it isn't supposed to mutate/default fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, another mechanism for defaulting / mutating would need to be in place before they could be deprecated. E.g. extend PodPreset to default those fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like it may get complex for PSP's since that admission controller has very specific logic around choosing a PSP to apply to a pod.
cc @liggitt
@erictune brought up a great point in another thread: Unifying all policies into a single resource means that you can't subdivide authorization on that resource. For instance, I can't have one admin who's responsible for setting the security constraints, and another responsible for setting the resource constraints. A better approach might be to stick with the "modular" approach we have now, but define a consistent set of design principles that each policy should adhere to. In other words, all policies should bind, compose, and restrict in the same (or very similar) ways. If we want to stick with the modular approach, I think we need to survey the scope of pod features and define a clearly delineated list of what policies we want, and what features fall into which policy. For instance, where should label restrictions go? SchedulingPolicy? LabelingPolicy? MetadataPolicy? |
|
||
In other words, the Pod is first checked against Accept restrictions, and is accepted if any | ||
match. Then, it is checked against the Drop restrictions, and rejected if any don't match. The | ||
default behavior at the end is to accept the pod. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative approach - use a firewall analogy:
- Fields in the policy are just pattern-matching, there's no concept of per-field allow vs deny
- Policy action is
allow
ordeny
. If the field matchers in a policy match the pod, the action is taken. - Policy has a priority specifying the order. Conflicts are resolved as deny (fail closed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tallclair - I'd prefer having a policy action which would be allow, deny or default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does a default action mean? Do you mean setting field defaults?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI i've added a #1937(comment) that sums up state of policy-related standards.
and easier to add exceptions for namespaces, but can lead to a combinatorial explosion of different | ||
policies that can become unmanageable. | ||
|
||
**Proposal**: Why not both? The default behavior is intersection, but you can punch a hole through |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tallclair - what if a Policy specifies intersection and another one union, and both are applicable, how do we resolve this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was proposing that the "union" (accept) policies are checked first. Alternatively, the "firewall" approach I mentioned uses priority and favors deny.
[#48041](https://github.com/kubernetes/kubernetes/issues/48041), | ||
[#61185](https://github.com/kubernetes/kubernetes/issues/61185)). | ||
|
||
There have been other proposals for policies restricting fields like labels, and annotations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Namespaced PodRestrictions - PodRestriction objects can be created in a namespace, so that | ||
namespace admins can place additional restrictions within their namespace. | ||
|
||
- Additional Bindings - The NamespaceSelector is not a top-level field so that additional binding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be at the top level. It just needs to be optional.
#### Justification | ||
|
||
From a security perspective, it should be sufficient to apply the restriction to Pod requests, so | ||
why validate other objects? Fail-fast. By surfacing an error before the controller object is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this can work in the case that some other admission controller mutates pod fields.
|
||
New CRDs that hold a PodTemplate and create Pods should be able to be registered against the | ||
PodRestriction controller. The registration should include a path (auto-discovery?) to extract the | ||
PodTemplate from the CRD. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erictune, who is thinking about Operators that might not explicitly expose pod templates
new restrictions needlessly complex. I propose we take a "monolithic" approach, with a single policy | ||
(PodRestriction) for all Pod fields. | ||
|
||
A monolithic restriction policy (i.e. single resource type) also allows the user to express more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful to add disadvantages of a monolithic policy, such as:
- harder to evolve (we still don't have fine-grain field versioning)
- no single SIG responsible
- increases the likelihood of multiple PodRestriction objects applying to the same pods
|
||
- Pods can only run as privileged if they run with a specific AntiAffinity | ||
|
||
- Pods labeled as staging must not run on production nodes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cross product of coordinated policies could be minimized by making up a common classification scheme.
(Without suggesting the classification mechanism at this point.)
Examples:
- Pods of class Foo can use volume type Bar and Pods of class Foo must have reduced privileges
- Pods of class Privileged can run privileged containers (and permissions of equivalent power) and Pods of class Privileged must run with a specific AntiAffinity
- Pods of class Staging must be labeled as env=staging and Pods of class Staging cannot tolerate the Production taint
There are 2 problems with mutating Pod fields: 1) it becomes hard to compose policies (what happens | ||
when the defaults conflict?) and 2) you need a policy ordering mechanism for when a Pod matches | ||
multiple policies. Instead, PodRestriction opts out of mutating defaults entirely, and leaves that | ||
for a different component to solve ([PodPreset?](../service-catalog/pod-preset.md)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm not sure what the future of PodPreset is.
predictable, and easier to manage (i.e. I want to restrict Pod field X, I know how to do that in | ||
the PodRestriction). | ||
|
||
2. Restrictions are self-contained. There isn't complex interdependent logic here. If you want more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this pretty much rules out subsuming LimitRange.
vs. Modular](#monolith-vs-modular)). We don't need to build it all at once, but I think over | ||
time it will eventually grow into the full version. | ||
|
||
4. Fields are unrestricted by default. This is important due to the size of the API. If I want to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to think about API evolution: what should happen when new fields are added?
|
||
const ( | ||
BindingModeAccept BindingMode = "Accept" | ||
BindingModeDrop BindingMode = "Drop" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the term "Drop" confusing. It's Deny, the opposite of Accept? (As opposed to, e.g., elide.)
type CapabilitiesRestriction struct { | ||
PointerRestriction | ||
Add *StringListRestriction | ||
Drop *StringListRestriction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, does this mean deny or elide?
type StringRestriction struct { | ||
PointerRestriction | ||
|
||
Whitelist []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean include/exclude, or accept/deny?
What's the rationale for per-field accept/deny rather than top-level accept/deny?
See updates on kubernetes/kubernetes#60001 (comment) My recommendation is that we focus on the SchedulingPolicy proposal for now (possibly applying some of the ideas from this proposal), and close this specific proposal. |
This is a proposal for a new policy object (and associated admission controller) that adds basic restrictions to most fields on a Pod. This is motivated by a desire to unify the various policies we have now (which all behave in slightly different ways), and to unblock expanding the set of restricted fields. Furthermore, it aims to solve some of the key usability problems with PodSecurityPolicies, such as the inability to compose policies and the difficulty in binding policies.
Help Wanted: I don't actually have time to own the implementation of this proposal, so I'm looking for community members who are interested in realizing this. If we decide to move forward with this approach, my hope is that we can consolidate some of the efforts around other new restriction policies.
Question: Should this be rolled up into the overall policy proposal or something like OPA? Alternatively, should this use a path-based OPA-style API?
Fix for kubernetes/kubernetes#60001
/sig auth