Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cgroups: check for cpuset.cpus before set the value #254

Closed
wants to merge 1 commit into from

Conversation

HuKeping
Copy link
Contributor

@HuKeping HuKeping commented Sep 8, 2015

Assume we put some CPUs offline and then bring them online again, since
there is not a mechanism that to help docker to refresh the state of CPUs,
all the following containers will never use those off-and-on CPUs again.

So we should always check whether there has already a change of CPUs before
we set it to the container.

Signed-off-by: Hu Keping hukeping@huawei.com

@HuKeping
Copy link
Contributor Author

HuKeping commented Sep 8, 2015

Here comes the example:

Assume I have 4 CPUs in my host, at the very beginning everything is OK for

 docker run -ti --name hkp_ubuntu  --cpuset-cpus=0-3 ubuntu bash

Then I put CPU1 off and then put it on

echo 0 > /sys/devices/system/cpu/cpu1/online

echo 1 > /sys/devices/system/cpu/cpu1/online

Let's check for the cpuset.cpus of cgroup docker

cat /sys/fs/cgroup/cpuset/docker/cpuset.cpus
0,2-3

And then run that docker CLI again, error happen

Error response from daemon: Cannot start container 0abdc40d8a96d66318340b863cff56177e4958540839440fb25edb8a57430c51: [8] System error: write /sys/fs/cgroup/cpuset/docker/0abdc40d8a96d66318340b863cff56177e4958540839440fb25edb8a57430c51/cpuset.cpus: invalid argument

This is because when we shut down CPU1, the cgroup system update all the sub-cgroup system about the cpuset.cpus. But when the CPU1 comes back it didn't do the same thing.

Assume we put some CPUs offline and then bring them online again, since
there is not a mechanism that to help docker to refresh the state of CPUs,
all the following containers will never use those off-and-on CPUs again.

So we should always check whether there has already a change of CPUs before
we set it to the container.

Signed-off-by: Hu Keping <hukeping@huawei.com>
@hqhq
Copy link
Contributor

hqhq commented Sep 9, 2015

I don't think it's a bug.

For cgroup, it's reasonable that cgroup updates cpuset.cpus of all sub-cgroups when some CPUs are offline, because they are not available. And it's also reasonable that cgroup didn't do the same thing when CPUs come back, because cgroup don't know if these cpuset.cpus are assigned by user or changed by cgroup (except root cgroup, it should always enable all cpus). So I think it's intended. Ping @lizf-os (kernel cgroup and cpuset maintainer), can you confirm that?

So the same reason, a Docker container can't expend cpuset.cpus for it's parent, because Docker don't know if it's assigned by user or changed by CPU offline.

@HuKeping
Copy link
Contributor Author

HuKeping commented Sep 9, 2015

Isn't it wired that all the CPUs are available but docker --cpuset.cpus fail?

@hqhq
Copy link
Contributor

hqhq commented Sep 9, 2015

It is wired in your scenario, but think about another scenario, if I set docker cgroup can only use cpus 0-1, then a container with cpus 0-4 can still start and change the cpus config for it's parent, just because root cgroup is configed as 0-4. Isn't this even more wired?

I think the solution would be on kernel side, let kernel record every movement for cpuset.cpus changes, if they are changed by cgroup itself when CPUs are shutdown, and never changed before CPUs are up again, then cgroup should change these cpuset.cpus back. Not sure if that's worth doing.

@HuKeping
Copy link
Contributor Author

HuKeping commented Sep 9, 2015

The docker cgroup are all set when start docker daemon and the daemon will use all the available CPUs. So there is not such a scenario that set docker cgroup can only use cpus 0-1.

Besides, even if you set the docker cgroup to only use cpus 0-1 by some script like

sudo echo 0-1 > /sys/fs/cgroup/cpuset/docker/cpuset.cpus

What if we turn CPU1 off and then on? All the follow containers will never have a chance to use CPU1 again.

@lizf-os
Copy link
Contributor

lizf-os commented Sep 9, 2015

This is a long-standing issue, but it has been fixed, but only if you use unified hierarchy, which hasn't been supported by docker or any distro.

@HuKeping
Copy link
Contributor Author

HuKeping commented Sep 9, 2015

Thanks @lizf-os but I am more care about the capability of Docker discover the real available CPUs

@hqhq
Copy link
Contributor

hqhq commented Sep 9, 2015

Besides, even if you set the docker cgroup to only use cpus 0-1 by some script like

Yes, this is the usage, and I know a lot of people use in this way (before Docker support this usage by itself).

What if we turn CPU1 off and then on? All the follow containers will never have a chance to use CPU1 again.

That'll be a problem, but there is nothing you can do on Docker side, your PR would break things as I said before.

As @lizf-os said, it's fixed in kernel but only for unified hierarchy, don't know why this can't be fixed for mutil hierarchy?

@lizf-os
Copy link
Contributor

lizf-os commented Sep 9, 2015

Because we don't want people stick with legacy hierarchies forever, so all new developments have been done for unified hierarchy only.

@crosbymichael
Copy link
Member

Ya, i'm not sure if this is a problem we can solve at this level.

@mrunalp
Copy link
Contributor

mrunalp commented Sep 18, 2015

Yep, this should be solved at a higher level.

@ddingel ddingel mentioned this pull request Oct 18, 2016
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants