Joining cgroups blindly causes performance problems #861
Comments
mmm the performance degradation is known but I didn't know that it was that much for blkio 😵
@dqminh Unfortunately I discovered this after having discussions with our kernel team on internal mailing lists. I'll ask them if they can link to upstream discussion though. As for not joining unspecified cgroups, I'll work on this once I figure out why Ubuntu isn't affected by this as badly.
This could potentially also be fixed by the lazy cgroup handling (by only attaching to cgroups that we are using, and then attaching later if a user tries to update the limits).
@cyphar Can you add instructions on how to replicate this issue reliably (what is the environment of the host, kernel version, etc.)? I've tried both on bare metal with kernel 4.4 and on an Ubuntu Xenial VM on DigitalOcean, but was unable to replicate it.
Hello, same problem here.
I too am running into this issue. Has anyone found a solution, even if it is a hack or temporary? I haven't found a way to get runc to work without cgroups... |
That's old... It was fixed in newer versions of Docker; in my case the problem was solved by changing a systemd config file. Read more about it here:
@ipeoshir This is a separate issue. I found the rlimit issue while trying to figure out this one. Those limits are non-cgroup limits.
@cyphar It seems that I cannot reproduce this issue in my env:
So as you can see, there is no performance degradation in my test. I am using CoreOS and CentOS 7.2, with kernel versions 4.7.3-coreos-r3 and 3.10 (CentOS 7.2) respectively. So it seems this is not a general performance issue? Or at least it does not apply to CoreOS and CentOS 7.2? And do you know if there is a kernel ticket tracking this issue?
@qianzhangxa We had an internal bug about this issue. I will have to go check again what the exact reproducer was, and double check that it still occurs. But it definitely was happening on a stock kernel last time I tried it. What blkio scheduler are you using? I believe it only shows up if you're using cfq.
Yes, I am using cfq. And I see in the blkio cgroup there is a blkio.weight file.
I assume you'd need to use a cfq scheduler for the blkio controls to actually take effect.
@cyphar I did some experiments, and I think this performance issue happens only when the IO scheduler for the disk is set to cfq. And to use the cgroup blkio control functionalities, we have to set the IO scheduler to cfq.
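For reference, a minimal sketch of checking and switching the scheduler (`sdX` is a placeholder for the disk in question; on newer multi-queue kernels `cfq` may not be listed at all):

```bash
# Show the available I/O schedulers for the disk; the active one is in brackets.
cat /sys/block/sdX/queue/scheduler
# e.g. "noop deadline [cfq]"

# Switch this disk to cfq (needs root) so the blkio proportional-weight
# controls actually take effect.
echo cfq | sudo tee /sys/block/sdX/queue/scheduler
```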
The performance issue is only this obvious with ext4 and data=ordered. In the original bug report it looks like it's a core CFQ problem (and some discussions with kernel devs have confirmed this). The main reason for me opening this issue is that it's not necessary for us to join cgroups if we're not going to set any limits (and we can always move the process on `runc update`).
@cyphar Got it, thanks! BTW, for that core CFQ problem, do you know if there is any ticket in the kernel community tracking it? I'd like to get more details.
I don't think there is one. Most of the discussion was on some internal mailing lists with our kernel devs, and I think the conversation stalled.
@cyphar I found this performance issue may not be related to the blkio cgroup, because even when a process does not join any blkio cgroup, the performance issue happens as well.
So I think the behavior is:
1. The process does not join any blkio cgroup: the performance issue happens.
2. The process joins a non-root blkio cgroup: the performance issue happens.
3. The process joins the root blkio cgroup: the performance issue does not happen.
What is confusing me is (3): I am not sure why joining the root blkio cgroup is the only case where the performance issue does not happen. Any ideas?
@qianzhangxa Your options (1) and (3) are the same, unless you didn't mount the blkio cgroup at all until the second step. "Not joining" in this context means staying in the root -- all processes start in the root cgroup (once a hierarchy is mounted). However, the reason why (3) has no performance impact is that blkio weighting between two cgroups works by adding latency to competing cgroups (to avoid CFQ incorrectly giving more weight to a cgroup). This logic doesn't apply in the root cgroup because no cgroups compete with the root.
@cyphar The blkio cgroup was automatically mounted when the OS booted.
The above test was done on Ubuntu 17.04 and the test in my last post was done on Ubuntu 16.04. I also thought that "not joining" in this context means staying in the root -- all processes start in the root cgroup -- so I expected (1) and (3) to give similar results, but as you can see in the above test, (1) has a significant performance issue compared with (3), which is really confusing me.
@qianzhangxa There are no "not-joining" processes; every process is in some cgroup once a subsystem is enabled and mounted. And it's also not always true that a process which hasn't explicitly joined anything ends up in the root cgroup: in your case the shell process was already placed in a sub-cgroup by systemd after the OS booted, so check which cgroup your shell is actually in.
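A quick way to check this (a sketch; the `user.slice` path is only an example of what systemd typically creates):

```bash
# Show which cgroup (per hierarchy) the current shell belongs to.
cat /proc/self/cgroup
# e.g. "4:blkio:/user.slice" rather than "4:blkio:/"

# Move the shell into the root of the blkio hierarchy (needs root), so it no
# longer competes with any sibling blkio cgroup.
echo $$ | sudo tee /sys/fs/cgroup/blkio/cgroup.procs
```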
Yes @hqhq! I see the process is initially in a sub-cgroup created by systemd, not in the root.
Moving isn't a valid strategy for all controllers; read the "Memory ownership" section in the documentation: https://www.kernel.org/doc/Documentation/cgroup-v2.txt "A memory area is charged to the cgroup which instantiated it and stays charged to the cgroup until the area is released."
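A hedged illustration of that point (cgroup v2, run as root; the `old`/`new` cgroup names and the 100M size are made up for this example):

```bash
cd /sys/fs/cgroup
echo +memory > cgroup.subtree_control    # ensure memory accounting is enabled for children
mkdir -p old new

echo $$ > old/cgroup.procs               # put this shell into "old"
head -c 100M /dev/urandom > /tmp/blob    # the page cache for this file is charged to "old"

echo $$ > new/cgroup.procs               # migrate the shell to "new"
cat old/memory.current new/memory.current
# The existing charge largely stays with "old": migrating the process does not
# move memory it already instantiated.
```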
It turns out that joining cgroups that we don't use has a non-zero performance degradation. The most obvious case is with `blkio`, which can cause operations to become 10 times slower. The following test assumes you have some block device `/dev/sdX` that is formatted as `ext4` (this used a spinning hard drive, but you could also use a flash USB); see the reproducer sketch below.

This is already a known issue upstream (in the kernel), but it is a general problem (most cgroup controllers have special cases for their root cgroup to maximise performance -- but few have optimisations to reduce the hierarchy based on which cgroups have limits set).
Unfortunately, the naive solution (not joining cgroups if we don't intend to use them in `config.json`) causes obvious issues with `runc update` (and commands that make assumptions about which cgroups we've joined). So we'd have to write quite a bit of code to create new cgroups and join container processes to them if the user requests a limit that wasn't required before. We could do it with the freezer cgroup and some enumeration.

This is (somewhat) related to the lazy cgroup work that we should do as a part of #774.
The performance issue described in moby/moby#21485 occurs because of this.
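A rough reproducer sketch along the lines described above (not the original commands from this issue; it assumes you run as root, `/dev/sdX` is an `ext4` filesystem mounted with `data=ordered` at `/mnt`, the `cfq` scheduler is active, and the blkio cgroup is mounted at `/sys/fs/cgroup/blkio`):

```bash
#!/bin/bash
# Baseline: run synchronous 4k writes from the root blkio cgroup.
echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
time dd if=/dev/zero of=/mnt/testfile bs=4k count=1000 oflag=dsync

# Join a freshly created, limit-less blkio sub-cgroup and repeat the same workload.
mkdir -p /sys/fs/cgroup/blkio/testgroup
echo $$ > /sys/fs/cgroup/blkio/testgroup/cgroup.procs
time dd if=/dev/zero of=/mnt/testfile bs=4k count=1000 oflag=dsync

# On an affected kernel the second run is dramatically slower, even though no
# blkio limits were ever set on "testgroup".
```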