Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.1] fix intelrdt #3978

Merged
merged 11 commits into from
Aug 10, 2023
Merged

[1.1] fix intelrdt #3978

merged 11 commits into from
Aug 10, 2023

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Aug 9, 2023

All runc-1.1.x releases (including 1.1.8) silently fail to set ClosID (as reported in #3550), despite the fix in runc-1.1.1 (#3406, a backport of #3382).

This happens because the backport was done without noticing that the changes in #3354 (in particular, dbd990d which removes extra check for if !intelrdt.IsCATEnabled() && !intelrdt.IsMBAEnabled()) were NOT backported.

So, if CAT and MBA features are not available, but we still have resctrl, we do nothing about it.

A minimal fix would be something like this:

diff --git a/libcontainer/factory_linux.go b/libcontainer/factory_linux.go
index a1fa7de2..8da5349b 100644
--- a/libcontainer/factory_linux.go
+++ b/libcontainer/factory_linux.go
@@ -52,12 +52,8 @@ func InitArgs(args ...string) func(*LinuxFactory) error {
 // containers that use the Intel RDT "resource control" filesystem to
 // create and manage Intel RDT resources (e.g., L3 cache, memory bandwidth).
 func IntelRdtFs(l *LinuxFactory) error {
-       if !intelrdt.IsCATEnabled() && !intelrdt.IsMBAEnabled() {
-               l.NewIntelRdtManager = nil
-       } else {
-               l.NewIntelRdtManager = func(config *configs.Config, id string, path string) intelrdt.Manager {
-                       return intelrdt.NewManager(config, id, path)
-               }
+       l.NewIntelRdtManager = func(config *configs.Config, id string, path string) intelrdt.Manager {
+               return intelrdt.NewManager(config, id, path)
        }
        return nil
 }

but I'm not quite sure what are the implications of always having an instance of intelrdt manager.

Instead, let do a limited backport of fixes from the main branch. This PR backports the following main PRs:

Fixes: #3550

@kolyshkin kolyshkin added this to the 1.1.9 milestone Aug 10, 2023
}
rootOnce.Do(func() {
// If resctrl is available, kernel creates this directory.
if unix.Access("/sys/fs/resctrl", unix.F_OK) != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In master, we are doing a statfs and check for RDTGROUP_SUPER_MAGIC. Do we want to keep it consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need porting the whole #3485, which will double the size of this PR. I am on the verge -- I like #3485 (all of it, but especially the removal of /proc/self/mountinfo parsing which has been my pet peeve for the last few years), but I am not sure it is required to fix the issue we're talking about.

So, I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I decided to port it anyway, since the cherry-picks are clean and it makes the further maintenance (if there will be a need for it -- I hope not!) easier.

@mrunalp
Copy link
Contributor

mrunalp commented Aug 10, 2023

I left one comment. Looks fine to me.

Comment on lines -55 to -56
if !intelrdt.IsCATEnabled() && !intelrdt.IsMBAEnabled() {
l.NewIntelRdtManager = nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Removal of this check is THE fix!

@kolyshkin
Copy link
Contributor Author

Rebased on top of #3976 to check against updated CI.

kolyshkin and others added 11 commits August 10, 2023 17:09
In case resctrl filesystem can not be found in /proc/self/mountinfo
(which is pretty common on non-server or non-x86 hardware), subsequent
calls to Root() will result in parsing it again and again.

Use sync.Once to avoid it. Make unit tests call it so that Root() won't
actually parse mountinfo in tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 02e961b)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test was written back in the day when findIntelRdtMountpointDir
was using its own mountinfo parser. Commit f1c1fdf changed that,
and thus this test is actually testing moby/sys/mountinfo parser, which
does not make much sense.

Remove the test, and drop the io.Reader argument since we no longer need
to parse a custom file.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 6c6b14e)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In a (quite common) case RDT is not supported by the kernel/hardware,
it does not make sense to parse /proc/cpuinfo and /proc/self/mountinfo,
and yet the current code does it (on every runc exec, for example).

Fortunately, there is a quick way to check whether RDT is available --
if so, kernel creates /sys/fs/resctrl directory. Check its existence,
and skip all the other initialization if it's not present.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit edeb3b3)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For the Nth time I wanted to replace parsing mountinfo with
statfs and the check for superblock magic, but it is not possible
since some code relies of mount options check which can only
be obtained via mountinfo.

Add a note about it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 5e201e7)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
TestGetContainerStats test a function that is smaller than the test
itself, and only calls a couple of other functions (which are
represented by mocks). It does not make sense to have it.

mockIntelRdtManager is only needed for TestGetContainerStats
and TestGetContainerState, which basically tests that Path
is called. Also, it does not make much sense to have it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 8593285)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
[This is a manual port of commit dbd990d to release-1.1
branch. Original description follows.]

Remove intelrtd.Manager interface, since we only have a single
implementation, and do not expect another one.

Rename intelRdtManager to Manager, and modify its users accordingly.

Remove NewIntelRdtManager from factory.

Remove IntelRdtfs. Instead, make intelrdt.NewManager return nil if the
feature is not available.

Remove TestFactoryNewIntelRdt as it is now identical to TestFactoryNew.

Add internal function newManager to be used for tests (to make sure
some testing is done even when the feature is not available in
kernel/hardware).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function is unused, and removing it simplifies the changes which
follow this commit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 13674f4)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reading /proc/cpuinfo is a surprisingly expensive operation. Since
kernel version 4.12 [1], opening /proc/cpuinfo on an x86 system can
block for around 20 milliseconds while the kernel samples the current
CPU frequency. There is a very recent patch [2] which gets rid of the
delay, but has yet to make it into the mainline kenel. Regardless,
kernels for which opening /proc/cpuinfo takes 20ms will continue to be
run in production for years to come. libcontainer only opens
/proc/cpuinfo to read the processor feature flags so all the delays to
get an accurate snapshot of the CPU frequency are just wasted time.

If we wanted to, we could interrogate the CPU features directly from
userspace using the `CPUID` instruction. However, Intel and AMD CPUs
have flags in different positions for their analogous sub-features and
there are CPU quirks [3] which would need to be accounted for. Some
Haswell server CPUs support RDT/CAT but are missing the `CPUID` flags
advertising their support; the kernel checks for support on that
processor family by probing the the hardware using privileged
RDMSR/WRMSR instructions [4]. This sort of probing could not be
implemented in userspace so it would not be possible to check for RDT
feature support in userspace without false negatives on some hardware
configurations.

It looks like libcontainer reads the CPU feature flags as a kind of
optimization so that it can skip checking whether the kernel supports an
RDT sub-feature if the hardware support is missing. As the kernel only
exposes subtrees in the `resctrl` filesystem for RDT sub-features with
hardware and kernel support, checking the CPU feature flags is redundant
from a correctness point of view. Remove the /proc/cpuinfo check as it
is an optimization which actually hurts performance.

[1]: https://unix.stackexchange.com/a/526679
[2]: https://lore.kernel.org/all/20220415161206.875029458@linutronix.de/
[3]: https://github.com/torvalds/linux/blob/7cf6a8a17f5b134b7e783c2d45c53298faef82a7/arch/x86/kernel/cpu/resctrl/core.c#L834-L851
[4]: https://github.com/torvalds/linux/blob/a6b450573b912316ad36262bfc70e7c3870c56d1/arch/x86/kernel/cpu/resctrl/core.c#L111-L153

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 9f10748)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The intelrdt package only needs to parse mountinfo to find the mount
point of the resctrl filesystem. Users are generally going to mount the
resctrl filesystem to the pre-created /sys/fs/resctrl directory, so
there is a common case where mountinfo parsing is not required. Optimize
for the common case with a fast path which checks both for the existence
of the /sys/fs/resctrl directory and whether the resctrl filesystem was
mounted to that path using a single statfs syscall.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit c156bde)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The OCI runtime spec mandates "[i]f intelRdt is not set, the runtime
MUST NOT manipulate any resctrl pseudo-filesystems." Attempting to
delete files counts as manipulating, so stop doing that when the
container's RDT configuration is nil.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 56daf36)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Unless the container's runtime config has intelRdt configuration set,
any checks for whether Intel RDT is supported or the resctrl filesystem
is mounted are a waste of time as, per the OCI Runtime Spec, "the
runtime MUST NOT manipulate any resctrl pseudo-filesystems." And in the
likely case where Intel RDT is supported by both the hardware and
kernel but the resctrl filesystem is not mounted, these checks can get
expensive as the intelrdt package needs to parse mountinfo to check
whether the filesystem has been mounted to a non-standard path.
Optimize for the common case of containers with no intelRdt
configuration by only performing the checks when the container has opted
in.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit ea0bd78)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants