[1.0] cgroupv2: ebpf: ignore inaccessible existing programs #3087

Merged on Jul 15, 2021 (2 commits)

@kolyshkin kolyshkin commented Jul 14, 2021

This is a backport of PR #3055 to the release-1.0 branch. Draft until that one is merged. Original description follows.

This is necessary in order for runc to be able to configure device
cgroups with --systemd-cgroup on distributions that have very strict
SELinux policies such as openSUSE MicroOS.

The core issue here is that systemd is adding its own BPF policy that
has an SELinux label such that runc cannot interact with it. In order to
work around this, we can just ignore the policy -- in theory this
behaviour is not correct but given that the most obvious case
(--systemd-cgroup) will still handle updates correctly, this logic is
reasonable.

Fixes: d0f2c25 ("cgroup2: devices: replace all existing filters when attaching")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


Changelog Entry

 * cgroupv2: bpf: Ignore inaccessible existing programs in case of
   permission error when handling replacement of existing bpf cgroup
   programs. This fixes a regression in 1.0.0, where some SELinux
   policies would block runc from being able to run entirely. #3055

@kolyshkin kolyshkin added this to the 1.0.1 milestone Jul 14, 2021
@kolyshkin kolyshkin mentioned this pull request Jul 14, 2021
@AkihiroSuda AkihiroSuda marked this pull request as ready for review July 14, 2021 02:31
AkihiroSuda previously approved these changes Jul 14, 2021
@cyphar cyphar left a comment

LGTM.

@cyphar cyphar left a comment

Ah wait, this PR was made against the wrong branch -- should be against release-1.0.

@cyphar cyphar closed this Jul 14, 2021
@cyphar cyphar reopened this Jul 14, 2021
@AkihiroSuda AkihiroSuda marked this pull request as draft July 14, 2021 06:26
cyphar commented Jul 14, 2021

(Good thing I do merges manually -- git mm screamed at me when I tried to merge this.)

@kolyshkin kolyshkin changed the base branch from master to release-1.0 July 14, 2021 07:10
@kolyshkin kolyshkin dismissed AkihiroSuda’s stale review July 14, 2021 07:10

The base branch was changed.

@kolyshkin kolyshkin commented

> Ah wait, this PR was made against the wrong branch -- should be against release-1.0.

My bad 🤦🏻

Fixed.

We need to update the eBPF library so that we can get the raw syscall
errors from bpf(2) syscalls using errors.Is.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry picked from commit fe518a0)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is necessary in order for runc to be able to configure device
cgroups with --systemd-cgroup on distributions that have very strict
SELinux policies such as openSUSE MicroOS[1].

The core issue here is that systemd is adding its own BPF policy that
has an SELinux label such that runc cannot interact with it. In order to
work around this, we can just ignore the policy -- in theory this
behaviour is not correct but given that the most obvious case
(--systemd-cgroup) will still handle updates correctly, this logic is
reasonable.

[1]: https://bugzilla.suse.com/show_bug.cgi?id=1182428

Fixes: d0f2c25 ("cgroup2: devices: replace all existing filters when attaching")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry picked from commit 57e3c54)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin kolyshkin marked this pull request as ready for review July 14, 2021 07:20

@cyphar cyphar left a comment

LGTM.

breakings added a commit to breakings/packages that referenced this pull request Aug 8, 2021
This is the first stable release in the 1.0 branch, fixing a few medium
and high priority issues with runc 1.0.0, including a few that affect
Kubernetes' usage of libcontainer.

Bugfixes:

- Fixed occasional runc exec/run failure ("interrupted system call") on an
  Azure volume. ([#3074](opencontainers/runc#3074))
- Fixed "unable to find groups ... token too long" error with /etc/group
  containing lines longer than 64K characters. ([#3079](opencontainers/runc#3079))
- cgroup/systemd/v1: fix leaving cgroup frozen after Set if a parent cgroup is
  frozen. This is a regression in 1.0.0, not affecting runc itself but some
  of libcontainer users (e.g. Kubernetes). ([#3085](opencontainers/runc#3085))
- cgroupv2: bpf: Ignore inaccessible existing programs in case of
  permission error when handling replacement of existing bpf cgroup
  programs. This fixes a regression in 1.0.0, where some SELinux
  policies would block runc from being able to run entirely. ([#3087](opencontainers/runc#3087))
- cgroup/systemd/v2: don't freeze cgroup on Set. ([#3092](opencontainers/runc#3092))
- cgroup/systemd/v1: avoid unnecessary freeze on Set. ([#3093](opencontainers/runc#3093))