CLOUDP-347194 - enable Pod Security Admission at restricted level #473
base: master
Conversation
MCK 1.5.0 Release Notes
New Features
Bug Fixes
SecurityContext: &corev1.SecurityContext{
    ReadOnlyRootFilesystem:   ptr.To(true),
    AllowPrivilegeEscalation: ptr.To(false),
    Capabilities: &corev1.Capabilities{
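For reference, a minimal sketch of how such a container securityContext comes together with the corev1 types. The Drop: ALL entry is an assumption based on the restricted PSS requirements, since the diff above is truncated before the Capabilities contents:

// Sketch only. Assumes imports:
//   corev1 "k8s.io/api/core/v1"
//   "k8s.io/utils/ptr"
func restrictedContainerSecurityContext() *corev1.SecurityContext {
	return &corev1.SecurityContext{
		ReadOnlyRootFilesystem:   ptr.To(true),
		AllowPrivilegeEscalation: ptr.To(false),
		Capabilities: &corev1.Capabilities{
			// Assumed from the restricted PSS profile, not copied from the diff.
			Drop: []corev1.Capability{"ALL"},
		},
	}
}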
Is there any potential that adding this default will break a customer's workload (or rather prevent the operator from deploying, or the workload StatefulSet from restarting) and require some manual intervention? Just thinking about our semver guarantees.
I think this is our own deployment, which we manage. If the customer wants a managedSecurityContext they are allowed to set one, but otherwise we should be able to modify the one we provide.
These are our defaults. The only problem I see is that some customers may now be required to set capabilities explicitly.
Maybe this should be considered a security fix? In that case we should be able to overwrite our defaults if they are not secure, even if this forces customers to explicitly specify custom capabilities. What do you think?
Still, do we force customers (who don't care about it) to do any manual fix when upgrading? If yes, we need to bump the major version.
I don't understand what we would break here. We are changing our default SecurityContext for the operator and the other pods we create. If a customer wants a dedicated SecurityContext or PodSecurityContext they need to set the MANAGED_SECURITY_CONTEXT env var, and our defaults are then skipped entirely. If they don't set MANAGED_SECURITY_CONTEXT, every change they make to the SecurityContext manually will be overwritten by our defaults.
Code that handles securityContext settings:
mongodb-kubernetes/mongodb-community-operator/pkg/kube/podtemplatespec/podspec_template.go
Lines 312 to 322 in 917723b
func WithDefaultSecurityContextsModifications() (Modification, container.Modification) {
	managedSecurityContext := envvar.ReadBool(ManagedSecurityContextEnv) // nolint:forbidigo
	configureContainerSecurityContext := container.NOOP()
	configurePodSpecSecurityContext := NOOP()
	if !managedSecurityContext {
		configurePodSpecSecurityContext = WithSecurityContext(DefaultPodSecurityContext())
		configureContainerSecurityContext = container.WithSecurityContext(container.DefaultSecurityContext())
	}
	return configurePodSpecSecurityContext, configureContainerSecurityContext
}
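As a hypothetical usage sketch (container name and image are placeholders, not from this PR): a customer opting out of our defaults would set the env var on the operator container, roughly like this:

// Sketch only. Assumes import: corev1 "k8s.io/api/core/v1".
func operatorContainerWithManagedSecurityContext() corev1.Container {
	return corev1.Container{
		Name:  "mongodb-kubernetes-operator", // placeholder name
		Image: "example.com/operator:latest", // placeholder image
		Env: []corev1.EnvVar{
			// With this set, WithDefaultSecurityContextsModifications returns
			// NOOP modifications and leaves customer-managed contexts untouched.
			{Name: "MANAGED_SECURITY_CONTEXT", Value: "true"},
		},
	}
}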
We have discussed with @lsierant that the change to the stricter Capabilities is only applied to the db/om containers, not the whole Pod, so it will not affect other containers in the Pod, e.g. security or Istio sidecars that the customer may have. The only change at the Pod level is adding seccompProfile: type: RuntimeDefault. We can do two things with it (see the sketch below):
- move the seccompProfile: type: RuntimeDefault to the container level and don't specify it at the pod level. Our containers would have secure seccomp settings, but any sidecar the customer adds would not have a seccomp profile applied
- keep it as is and secure the entire Pod
@mircea-cosbuc looking for guidance here on how to proceed.
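A minimal sketch of the two options with the corev1 types (illustrative only; function and variable names are mine):

// Sketch only. Assumes import: corev1 "k8s.io/api/core/v1".
func seccompOptions() (*corev1.SecurityContext, *corev1.PodSecurityContext) {
	runtimeDefault := &corev1.SeccompProfile{Type: corev1.SeccompProfileTypeRuntimeDefault}
	// Option 1: container level only. Our db/om containers get the profile,
	// but a customer sidecar in the same Pod would be left without one.
	containerSC := &corev1.SecurityContext{SeccompProfile: runtimeDefault}
	// Option 2: pod level. Every container in the Pod, including sidecars,
	// inherits RuntimeDefault unless it explicitly overrides it.
	podSC := &corev1.PodSecurityContext{SeccompProfile: runtimeDefault}
	return containerSC, podSC
}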
I think it's best to set it at the pod level. Based on @lsierant's point, this needs clarity on what customers might need to change on upgrade (if anything), outlining those scenarios and deciding whether it's a breaking change.
I've checked the consequences of using seccompProfile: type: RuntimeDefault: it defaults to the profile of the container runtime in use. Containerd and Docker, for example, have very similar default seccomp profiles -> https://docs.docker.com/engine/security/seccomp/#significant-syscalls-blocked-by-the-default-profile
Based on what I have found in the official Kubernetes docs:
These profiles may differ between runtimes like CRI-O or containerd. They also differ for its used hardware architectures. But generally speaking, those default profiles allow a common amount of syscalls while blocking the more dangerous ones, which are unlikely or unsafe to be used in a containerized application.
Additionally, on Red Hat OpenShift Container Platform, RuntimeDefault is often enforced by default via Security Context Constraints (SCCs).
To summarise, it is unlikely that users of our Operator require more syscall permissions in MongoDB workloads than what RuntimeDefault seccomp allows. Nevertheless, I should add a comment in the changelog on how to override the securityContext defaults by using managedSecurityContext.
@lsierant @mircea-cosbuc let me know if that justifies approving the PR. I have already edited the changelog.
Summary
Based on the HELP-81729 ticket, I investigated whether our workloads align with the restricted Pod Security Standards level. Unfortunately we cannot easily test enforcement of the rules and guarantee meeting the restricted profile, mainly because of how our e2e tests are set up. For example, we are using Istio, which adds istio-init containers to provide service mesh network capabilities, and Istio containers do not follow the restricted profile. Our tests pod also does not follow the PSS requirements. There are also other issues we faced when testing the enforcement; this requires more time allocation and we cannot promise timelines and priorities.
Because of this I have enabled the warn mode for the restricted security level instead of enforce. For one complex test, e2e_om_ops_manager_backup_sharded_cluster, I have enforced the restricted level, and only in single cluster, so that we can monitor our PSS alignment. More about levels and modes can be found here; a sketch of the namespace labels involved follows below.
Proof of Work
Passing CI for all tests that run with the warn mode, plus a passing enforcement run for the e2e_om_ops_manager_backup_sharded_cluster test.
Example warning from the e2e_om_ops_manager_backup_tls test:
Checklist
- Apply the skip-changelog label if a changelog entry is not needed