allow CSI helpers in the SELinux policy #3779

Merged
bcressey merged 9 commits into bottlerocket-os:develop from s3-csi-selinux
Feb 21, 2024

Conversation

bcressey
Contributor

Issue number:
#3684

Description of changes:
There's a fair amount of refactoring here, but the essential change involves adding a new /opt/csi mount with a special label - csi_exec_t - where privileged containers can write binaries that systemd is allowed to execute. This is intended for the special case of FUSE mounts, where the mounted filesystem needs to survive a restart or upgrade of the CSI driver daemonset.

Ideally these binaries would either be statically linked or else wrapped by a runc invocation to minimize host dependencies, but this isn't enforced. Asking systemd to run a unit requires the break-glass super_t label, so it can be assumed that the caller knows the risks and asserts that the helper it asks the host to run is correct.

In terms of policy refactoring, some of the type attribute identifiers have been renamed for clarity, and new ones have been added so that rules can be applied to the set rather than one-by-one to individual types.

At the OS level, the cni_exec_t label is now applied to all of /opt/cni rather than just /opt/cni/bin. This is done for symmetry with the new /opt/csi mount, and is expected to be safe because /opt/cni is unconditionally removed on each boot.

The only part of this change that's specific to the mountpoint S3 CSI driver is the compat symlink from /opt/mountpoint-s3-csi to redirect into /opt/csi so those files receive the correct label. This is similar to the compat symlink added for the secrets store CSI provider.
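
As a rough illustration of why a compat symlink is enough to get the right label, the Python sketch below uses a temporary directory in place of the real /opt tree. Files written through the old path physically land in the new directory, which is where labeling applies. The function names, the `mount-helper` file name, and the temporary paths are hypothetical, and this is not how the OS itself creates the link.

```python
import os
import tempfile


def install_compat_symlink(legacy_path: str, target_dir: str) -> None:
    """Point legacy_path at target_dir so writes to the old path land in the new one."""
    os.makedirs(target_dir, exist_ok=True)
    # Replace a stale link if present; a real directory at legacy_path is left
    # alone, and os.symlink will raise FileExistsError in that case.
    if os.path.islink(legacy_path):
        os.unlink(legacy_path)
    os.symlink(target_dir, legacy_path)


def demo() -> str:
    """Write a file through the compat path and return where it really lives."""
    root = tempfile.mkdtemp()
    csi_dir = os.path.join(root, "opt", "csi")               # stands in for /opt/csi
    legacy = os.path.join(root, "opt", "mountpoint-s3-csi")  # stands in for /opt/mountpoint-s3-csi
    install_compat_symlink(legacy, csi_dir)

    # The driver keeps using its original path (hypothetical layout below).
    os.makedirs(os.path.join(legacy, "bin"), exist_ok=True)
    helper = os.path.join(legacy, "bin", "mount-helper")
    with open(helper, "w") as f:
        f.write("#!/bin/sh\n")

    # Resolve symlinks to see the physical location, relative to the fake root.
    return os.path.relpath(os.path.realpath(helper), os.path.realpath(root))
```

Calling `demo()` returns a path under `opt/csi`, even though the file was written via the `mountpoint-s3-csi` name.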

Testing done:
Deployed the mountpoint S3 CSI driver to my cluster and made these edits to the s3-csi-node daemonset:

# add to s3-plugin container to allow it to interact with systemd
securityContext:
    seLinuxOptions:
        type: super_t

# add to install-mp initContainer to allow it to write to `/opt/csi`
securityContext:
    privileged: true

I also ran through my SELinux-related test suite and verified that it passed.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

This ensures that the `runc` process will receive the correct label
if it's started as a systemd unit instead of being invoked by some
other service.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Rather than specifying the transitions for each container executable
object type, group them into sets and specify the rules just once for
each set.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
The distinction between "protected" (i.e. "write-restricted") and
"restricted" (i.e. "read-restricted") was unclear and the attribute
names did not imply that one was related to the other.

Clarify this by renaming the attributes and defining the subset
relationship between them.

Another distinction in the policy is between local files that can be
mutated by both confined and unconfined system processes, and local
files that can only be mutated by unconfined system processes. The
first kind of object is matched by rules to specific subjects; the
second kind is instead defined by the absence of rules that would
allow confined subjects to mutate them.

Add the "sensitive" attribute to collect these types and to clarify
the policy objective: these are files that can't be mutated by
containers and also can't be mutated by confined system processes.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
CSI drivers that mount filesystems with FUSE need to ensure that the
mounting process survives a container restart; otherwise, they cannot
be updated without triggering a filesystem failure in any pod which
uses the mount. One workaround for this is to have the host run the
mounting process on behalf of the container, so that the lifecycles
of the driver and the mount are no longer the same.

In some ways this is similar to CNI, where containers can provide
plugins that the host runs while setting up new network namespaces.
It's also different in that CSI mount helpers must run before the
container is created, rather than during creation. CSI mount helpers
may also need access to credentials or other secrets to perform the
mount, so the processes must be treated as privileged rather than
unprivileged containers.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Ideally, CSI drivers that want the host to run a helper process on
their behalf would arrange to run that process inside a container, to
avoid any dependencies on host software beyond the systemd interface.
However, this isn't strictly required, and treating the process as a
container fulfills the policy objective.

Allow systemd to execute such processes directly, without requiring
them to be wrapped by a `runc` invocation.

Note, however, that interacting with systemd via its D-Bus API to
create a unit and arrange for it to run requires a high level of
privilege. There are no plans to relax this restriction.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
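
To make "create a unit and arrange for it to run" more concrete, here is a minimal Python sketch that builds a `systemd-run` command line for a transient service. It swaps in the `systemd-run` CLI for the direct D-Bus call a driver would more likely make. The unit name, helper path, and arguments are hypothetical, and the caller would still need the privilege described above for systemd to accept the request.

```python
import shlex
from typing import List


def transient_unit_command(unit_name: str, helper_path: str,
                           helper_args: List[str]) -> List[str]:
    """Build a systemd-run command that runs helper_path as a transient service."""
    return [
        "systemd-run",
        f"--unit={unit_name}",  # give the transient unit a predictable name
        "--collect",            # unload the unit after it exits, even on failure
        helper_path,            # first non-option argument is the command to run
        *helper_args,
    ]


def render(cmd: List[str]) -> str:
    """Return a shell-quoted string for logging or manual inspection."""
    return shlex.join(cmd)
```

A caller could pass the resulting list to `subprocess.run(cmd, check=True)`. Because the mount process then belongs to a systemd unit rather than to the container, it is no longer tied to the container's lifecycle.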
These directories will be used for overlayfs state, and unexpected
modifications could disrupt the system.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
This is done for consistency with the new /opt/csi mount, where the
helpers may need to store non-executable files as well as binaries.

Note that /opt/cni has always been cleaned up on every boot, so this
will not remove any files that weren't previously removed.

The main change is that files outside of /opt/cni/bin will now be
labeled with "cni_exec_t" instead of "local_t". These types are
largely equivalent in the current policy, in terms of file-related
permissions, so the change should be safe.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
This sets up /opt/csi as the designated location for CSI helpers that
the host system is permitted to execute.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Ensure that files written by the S3 CSI driver are written to a path
with the correct SELinux label to allow systemd to execute them.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
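
One way to spot-check that a helper ended up with the expected label is to read its `security.selinux` extended attribute. The Python sketch below assumes a Linux host with SELinux labels on the filesystem. The helper path in the usage note and the `system_u:object_r` prefix in the comments are illustrative assumptions, not values taken from this change.

```python
import os
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    user: str
    role: str
    type: str
    level: str


def parse_context(raw: str) -> SELinuxContext:
    """Split 'user:role:type[:level]' into fields; the level may contain colons."""
    # The kernel usually returns the xattr value with a trailing NUL byte.
    parts = raw.rstrip("\x00").split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {raw!r}")
    level = parts[3] if len(parts) > 3 else ""
    return SELinuxContext(parts[0], parts[1], parts[2], level)


def file_type_label(path: str) -> str:
    """Return the SELinux type of path from its security.selinux xattr (Linux only)."""
    raw = os.getxattr(path, "security.selinux").decode()
    return parse_context(raw).type


def has_csi_exec_label(path: str) -> bool:
    """True if path carries the csi_exec_t type named in this change."""
    return file_type_label(path) == "csi_exec_t"
```

For example, `has_csi_exec_label("/opt/csi/bin/mount-helper")` (a hypothetical path) should be true on a host where the new mount is in place, since files under /opt/csi are expected to receive that label.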
@bcressey merged commit 1f96e7b into bottlerocket-os:develop Feb 21, 2024
50 checks passed
@bcressey deleted the s3-csi-selinux branch February 21, 2024 21:30
@vyaghras mentioned this pull request Feb 21, 2024
9 tasks