KEP for promoting seccomp to GA #1148

Merged 10 commits on May 6, 2020


tallclair
Member

This is a proposal to upgrade the seccomp annotation on pods & pod security policies to a field, and
mark the feature as GA. This proposal aims to do the bare minimum to clean up the feature, without
blocking future enhancements.

/sig-node
/sig-auth

/priority important-longterm

/assign @liggitt @dchen1107 @derekwaynecarr
/cc @jessfraz

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jul 17, 2019
@k8s-ci-robot k8s-ci-robot requested a review from jessfraz July 17, 2019 23:57
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 17, 2019
@dchen1107
Member

/lgtm but leaving it to @liggitt and @derekwaynecarr for the final approval.

nit: Per our offline discussion, can you make the downgrade workflow more explicit here?

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 23, 2019

- Declare seccomp GA
- Fully document and formally spec the feature support
- Migrate the annotations to standard API fields
Member

@liggitt liggitt Jul 27, 2019

suggest using language like "add equivalent API fields" rather than "migrate" (which sounds like we will move data, which we know doesn't work well)

edit: if the migration only occurs in pod objects where the fields are immutable after creation, this can work... the description below should be explicit that no data movement is done in pod template objects embedded in other types

edit 2: you were already explicit about pod templates... nevermind

Member Author

If the pod fields are immutable, is it possible to still set them on update as part of the storage flow? The criteria would be:

  1. The seccomp annotations are set, and do not change as part of the update.
  2. The seccomp fields are not set.

I guess the first criterion is problematic, since the seccomp fields could be changed in the update, and then a subsequent update could leave them be, so maybe that doesn't matter.
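The two criteria above can be sketched as a predicate. This is only an illustration: the `Pod` and `SeccompProfile` types are simplified, hypothetical stand-ins for the real API types, `shouldBackfill` is an invented name, and the pod-level alpha annotation key is assumed to be `seccomp.security.alpha.kubernetes.io/pod`. It does not address the concern about a field changing in one update and being left alone in the next.

```go
package main

import "fmt"

// Assumed pod-level alpha annotation key.
const podAnnotationKey = "seccomp.security.alpha.kubernetes.io/pod"

// SeccompProfile is a simplified, hypothetical stand-in for the proposed field.
type SeccompProfile struct {
	Type             string  // e.g. "Unconfined", "RuntimeDefault", "Localhost"
	LocalhostProfile *string // only meaningful when Type is "Localhost"
}

// Pod is a simplified, hypothetical stand-in for the real Pod type.
type Pod struct {
	Annotations    map[string]string
	SeccompProfile *SeccompProfile // nil means the field is not set
}

// shouldBackfill reports whether an update meets both criteria:
//  1. the seccomp annotation is set and unchanged by the update, and
//  2. the seccomp field is not set.
func shouldBackfill(oldPod, newPod *Pod) bool {
	oldVal, oldOK := oldPod.Annotations[podAnnotationKey]
	newVal, newOK := newPod.Annotations[podAnnotationKey]
	annotationUnchanged := oldOK && newOK && oldVal == newVal
	fieldUnset := newPod.SeccompProfile == nil
	return annotationUnchanged && fieldUnset
}

func main() {
	old := &Pod{Annotations: map[string]string{podAnnotationKey: "runtime/default"}}
	updated := &Pod{Annotations: map[string]string{podAnnotationKey: "runtime/default"}}
	fmt.Println("backfill field from annotation:", shouldBackfill(old, updated))
}
```

Keeping the check as a pure function of the old and new objects makes it easy to test each criterion in isolation.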


type SeccompOptions struct {
// The seccomp profile to run with.
SeccompProfile
Member

was this intended to be embedded rather than something like Profile SeccompProfile? what other options do you anticipate?

Member Author

Yes, I modeled it off the Volume type.

The only option that comes to mind is the fallback strategy. Currently, if the seccomp options can't be enforced the node silently ignores them. In contrast, when AppArmor can't be enforced the pod is held in a Blocked state.

From a security perspective, silently failing could leave a potential security hole. From a portability perspective, it's nice to be able to choose the best practices options with a best-effort approach. Hence it might make sense to give the user the choice.

Member

what happens for nodes/runtimes that don't enable seccomp is worth documenting, at least

Member

maybe just securityContext.seccompProfile instead of embedding/nesting (can prefix other seccomp fields with seccomp if they are needed in the future)

}

// Only one profile source may be set.
// +union
Member

cc @apelisse ... don't we want to add a discriminator here? what's the current approach to doing that?

Member Author

Are discriminators required if the structure is immutable? I guess we'd still need it for pod templates...

Member Author

@tallclair tallclair Jul 30, 2019

Does the discriminator field need to be a 1:1 mapping with union fields? If not, the API could look like this:

type SeccompProfile struct {
  // +unionDiscriminator
  // +optional
  Type *SeccompProfileType
  // +optional
  LocalhostProfile *string
}

type SeccompProfileType string

const (
  SeccompProfileUnconfined SeccompProfileType = "Unconfined"
  SeccompProfileRuntimeDefault SeccompProfileType = "RuntimeDefault"
  SeccompProfileLocalhost SeccompProfileType = "Localhost"
)

In other words, it seems redundant to have the type indicate a boolean value that must be set to true. In those cases, the discriminator should be sufficient to indicate the option.

Member

> Does the discriminator field need to be a 1:1 mapping with union fields?

No, a union type without an associated member field is permitted
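For illustration, a validation rule for a union shaped like the one proposed in this thread might look like the sketch below. The types mirror the snippet in the comment above; `validateSeccompProfile` is a hypothetical helper, and treating an unset `Type` as valid only when no member field is set is an assumption, not something the KEP specifies.

```go
package main

import (
	"errors"
	"fmt"
)

type SeccompProfileType string

const (
	SeccompProfileUnconfined     SeccompProfileType = "Unconfined"
	SeccompProfileRuntimeDefault SeccompProfileType = "RuntimeDefault"
	SeccompProfileLocalhost      SeccompProfileType = "Localhost"
)

// SeccompProfile mirrors the union proposed in the thread: Type is the
// discriminator, and only the Localhost option has a member field.
type SeccompProfile struct {
	Type             *SeccompProfileType
	LocalhostProfile *string
}

// validateSeccompProfile is a hypothetical check that the discriminator
// and the single member field agree with each other.
func validateSeccompProfile(p SeccompProfile) error {
	switch {
	case p.Type == nil:
		// Assumption: an unset discriminator is only valid with no member set.
		if p.LocalhostProfile != nil {
			return errors.New("localhostProfile is set but type is not")
		}
		return nil
	case *p.Type == SeccompProfileLocalhost:
		if p.LocalhostProfile == nil || *p.LocalhostProfile == "" {
			return errors.New("localhostProfile is required when type is Localhost")
		}
		return nil
	case *p.Type == SeccompProfileUnconfined, *p.Type == SeccompProfileRuntimeDefault:
		// These options carry no data, so the discriminator alone selects them.
		if p.LocalhostProfile != nil {
			return fmt.Errorf("localhostProfile must not be set when type is %s", *p.Type)
		}
		return nil
	default:
		return fmt.Errorf("unknown seccomp profile type %q", *p.Type)
	}
}

func main() {
	t := SeccompProfileLocalhost
	// Localhost without a path is rejected by the sketch above.
	fmt.Println(validateSeccompProfile(SeccompProfile{Type: &t}))
}
```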

keps/sig-node/20190717-seccomp-ga.md (outdated)
type SeccompProfileSet struct {
// Whether the unconfined profile is included in this set.
// +optional
Unconfined *bool
Member

is false different from null?

Member Author

Probably not. Maybe this should just be required? Is it permissible to have an optional non-pointer bool if false & nil are equivalent? I suppose a reason to have nil is to allow for defaulting in mutating admission.

keps/sig-node/20190717-seccomp-ga.md (outdated)
// Load a profile defined in a static file on the node.
// The profile must be preconfigured on the node to work.
// +optional
LocalhostProfile *string
Member

do the current annotations contain sufficient detail to know whether they should convert to a runtime or localhost profile?

Member Author

@tallclair tallclair Jul 30, 2019

Yes, currently the legal values for the annotations are:

unconfined        # the real default
docker/default    # equivalent to runtime/default
runtime/default
localhost/<path>  # where path is relative to the node's configured seccomp root
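
A minimal sketch of how those four annotation values could map onto a typed profile. The `profileFromAnnotation` helper and the simplified non-pointer `Type` field are assumptions made for illustration; the KEP itself defines the actual conversion rules, including how `docker/default` is validated against the field.

```go
package main

import (
	"fmt"
	"strings"
)

type SeccompProfileType string

const (
	SeccompProfileUnconfined     SeccompProfileType = "Unconfined"
	SeccompProfileRuntimeDefault SeccompProfileType = "RuntimeDefault"
	SeccompProfileLocalhost      SeccompProfileType = "Localhost"
)

// SeccompProfile is a simplified stand-in for the proposed API type.
type SeccompProfile struct {
	Type             SeccompProfileType
	LocalhostProfile *string
}

// profileFromAnnotation is a hypothetical helper that converts a legacy
// annotation value into the typed representation.
func profileFromAnnotation(value string) (*SeccompProfile, error) {
	switch {
	case value == "unconfined":
		return &SeccompProfile{Type: SeccompProfileUnconfined}, nil
	case value == "runtime/default", value == "docker/default":
		// docker/default is treated as equivalent to runtime/default.
		return &SeccompProfile{Type: SeccompProfileRuntimeDefault}, nil
	case strings.HasPrefix(value, "localhost/"):
		path := strings.TrimPrefix(value, "localhost/")
		if path == "" {
			return nil, fmt.Errorf("localhost profile path is empty")
		}
		return &SeccompProfile{Type: SeccompProfileLocalhost, LocalhostProfile: &path}, nil
	default:
		return nil, fmt.Errorf("unsupported seccomp annotation value %q", value)
	}
}

func main() {
	for _, v := range []string{"unconfined", "docker/default", "localhost/profiles/audit.json", "bogus"} {
		p, err := profileFromAnnotation(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(v, "->", p.Type)
	}
}
```

Because two annotation values collapse into `RuntimeDefault`, the mapping is not reversible without a special case, which is the concern raised in the next comment.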

Member

spec out how annotation -> field conversion would work in the docker/default case (and how we'd validate annotation and field matched in that case... probably a special case)

Member

would we use API defaulting to set RuntimeProfile to "default" if type is set to "Runtime"? if so, would we clear RuntimeProfile on update if Type was changed to something other than "Runtime"?

@tallclair
Member Author

Note to reviewers: this PR is for 1.17 - please prioritize reviewing 1.16 PRs over this.

@tallclair tallclair added this to the v1.17 milestone Jul 30, 2019
// The allowed localhostProfiles. Values may end in '*' to include all
// localhostProfiles with a prefix.
// +optional
LocalhostProfiles []string
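
The trailing-`*` rule described in the field comment above could be checked roughly as follows. This is a sketch: `localhostProfileAllowed` is a hypothetical helper, and it assumes a `*` is only meaningful as the final character of an allowed value.

```go
package main

import (
	"fmt"
	"strings"
)

// localhostProfileAllowed reports whether profile matches any allowed entry.
// An entry ending in '*' matches every profile that starts with the text
// before the '*'; any other entry must match exactly.
func localhostProfileAllowed(allowed []string, profile string) bool {
	for _, pattern := range allowed {
		if strings.HasSuffix(pattern, "*") {
			if strings.HasPrefix(profile, strings.TrimSuffix(pattern, "*")) {
				return true
			}
			continue
		}
		if pattern == profile {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"audit/*", "strict.json"}
	for _, p := range []string{"audit/web.json", "strict.json", "other/web.json"} {
		fmt.Println(p, "allowed:", localhostProfileAllowed(allowed, p))
	}
}
```

Under this reading, a bare `*` entry allows every localhost profile, since its prefix is empty.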
Member

Can we use Config object for storing profiles across the nodes? Can be another KEP though.

Member Author

I definitely agree this would be useful, but implementing it is a big project that I don't want to block GA on. See non-goals:

  • Providing mechanisms for loading profiles from outside the static seccomp node directory

All API skew is resolved in the API server. New Kubelets will only use the seccomp values specified
in the fields, and ignore the annotations.

#### Pod Creation
Member

What is the expected behavior if the CRI runtime doesn't support seccomp? Should the pod be created without seccomp (as with the current alpha annotation), or should it result in an error?

Member Author

I think we need to maintain the backwards compatible behavior of ignoring it (I'll add a line to be explicit about this). Going forward, I left space in the API for a FailureStrategy which could dictate how to handle that situation (e.g. some applications may depend on seccomp application to run securely, while others might accept a best-effort best-practices approach).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No application should depend on a policy; a policy can be applied as a non privileged operation by any code, so if it requires it then it can just apply it.

harder to make some of the enhancements listed under [Non-Goals](#non-goals). Since the current
behavior is unguarded, I think we already need to treat the behavior as GA (which is why it's been
so hard to change the default profile), so I do not think these changes will actually increase the
friction.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my biggest concern, but I think the arguments you've laid out are correct.
This is a necessary step to get to the good stuff :)

Thanks so much Tim!

@mrunalp
Contributor

mrunalp commented Aug 6, 2019

This looks fine for promoting what we have right now. However, I think that we would want to follow up with owning/defining seccomp policies by k8s better for workload portability as well as handling of failure scenarios.

@tallclair
Member Author

Sorry it took so long to iterate on this. It is not currently a top priority for me, and may not make the v1.18 cutoff if it needs much more work.

There are 3 open issues:

  • how to handle warnings, I proposed an annotation in the updated KEP
  • how to handle runtime/default
  • how to handle localhost profiles

@palnabarun
Member

> Sorry it took so long to iterate on this. It is not currently a top priority for me, and may not make the v1.18 cutoff if it needs much more work.

Removing the 1.18 milestone in that case.

@palnabarun
Member

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.18 milestone Mar 6, 2020
@tallclair
Member Author

@pjbgf - Any chance you'd be interested in picking up this proposal to get it across the finish line? I think the most complicated of the open questions is what to do with localhost profiles. There's a bit of a chicken-and-egg dilemma there:

  • Don't want to promote localhost profiles to GA and immediately deprecate
  • Don't want to deprecate localhost profiles before there's a replacement
  • Don't want to build the replacement until seccomp goes to GA

@pjbgf
Member

pjbgf commented Apr 9, 2020

@tallclair I am indeed quite keen on pushing this forward. But will only have availability again from the end of the month. Will get back to you then.

@pjbgf
Member

pjbgf commented Apr 30, 2020

>   • Don't want to promote localhost profiles to GA and immediately deprecate
>   • Don't want to deprecate localhost profiles before there's a replacement
>   • Don't want to build the replacement until seccomp goes to GA

@tallclair I had a catch-up today (w/ @saschagrunert, @evrardjp) about this. We will present the current plan of action at the next sig-node meeting, which in summary is:

  1. Keep this KEP as is, to reduce the number of changes required.
  2. Drop the two KEPs: Built-in profiles, complain mode, enable seccomp by default and using ConfigMap to store profiles.
  3. The end-to-end seccomp experience will be evolved out-of-tree with the seccomp-operator. Features developed there would potentially be pushed in-tree as and when it makes sense.

@tallclair
Member Author

/retest

@k8s-ci-robot k8s-ci-robot added sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 5, 2020
@tallclair
Member Author

@derekwaynecarr @liggitt - Can we merge this as-is (provisional, with unresolved sections), so that I can hand off to @pjbgf to resolve the remaining issues?

@tallclair tallclair added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label May 5, 2020
@liggitt
Member

liggitt commented May 5, 2020

if the open discussions are captured in unresolved sections with links to the thread in this PR and/or the people involved (e.g. https://github.com/kubernetes/enhancements/blame/master/keps/NNNN-kep-template/README.md#L38-L40) that seems ok

@tallclair
Member Author

Updated the unresolved section format to match the official standard. I don't think there is more context to add on these.

@derekwaynecarr
Member

I am fine merging this as-is, and iterating on the unresolved sections per discussion in sig-node on 5/4.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 6, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2020
@k8s-ci-robot k8s-ci-robot merged commit 2c681f9 into kubernetes:master May 6, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone May 6, 2020
@pjbgf pjbgf mentioned this pull request May 7, 2020
gnufied pushed a commit to gnufied/enhancements that referenced this pull request May 18, 2020
* KEP for promoting seccomp to GA

* Add section about PSP application

* Address feedback

* Update seccomp GA API

* Fill in approvers/reviewers

* Mark KEP as implementable

* Iterate on KEP review

* revert to provisional status

* Regenerate TOC

* Update unresolved section format
chelseychen pushed a commit to chelseychen/enhancements that referenced this pull request May 26, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.