Add CPU hotplug support #1119
Comments
I personally don't really like not joining cgroups, because it means that doing a My main concern with this is that it doesn't really solve the problem, it more works around it. Now, the workaround isn't too bad, but I do wonder if just fixing #654 wouldn't be a better solution overall (as it would also mean that we get unified hierarchy support for free). |
@cyphar thank you for your feedback and @michael-holzheu for internal discussion about it. There are two points which might make our change more favorable:
|
@ddingel Sorry for the late reply. Honestly, I don't much like your proposal: it's too aggressive and might cause backward compatibility issues. With your proposal, updating containers will cause cgroup migration, which means CPU/memory migration, and that makes things complicated. For this issue, I'd prefer another approach: adding a new flag that can force an update of all parent configurations. I don't know what the other maintainers think, though. |
@hqhq Thank you for your feedback. The proposal was not meant to be aggressive in any way, sorry for that. So @michael-holzheu and I took the time to understand how your proposal would work with respect to this issue, and I would like to know whether we have the same understanding and what the other maintainers think about it. The docker daemon would be started with an optional
This would allow newly started containers to scale to all CPUs. Additionally the docker daemon could also enable this cpu via containerd -> runc for every container which:
This second step might be a scaling problem if the number of containers or the number of hotplug events is big. On the other hand we would get:
Maybe this could be an opt-out like |
@ddingel This looks like a better approach, but there is still a major concern, the same one I had in #254: say you have CPUs 0-3 on your host, when cpu 1 got removed, Besides that, all this monitoring and refresh work should be done by Docker, and I don't know how they would like it. For runc, I think the only compromise we can make is mostly what #254 tried to do, but with an option like |
@hqhq I think I am missing your point, here is how I think this would be played:
In my understanding, there is no need to refresh or forward a change from online to offline. A change from offline to online would trigger the Docker daemon to refresh. So we have the following states:
State within the Docker daemon (process as well as on disk like So for runc I would think it is more like a In case that
But from what I understand, it will not be fixed for v1. For v2 I am still searching for how and whether that will be fixed. @lizf-os, maybe you could point me to some documentation/code, so that I could see whether the proposed solution for v1 is applicable to v2?
Fair enough I am fine with |
@ddingel Fair enough, that looks practicable, thanks for explaining. I'm fine with adding a force update for cpuset in runc, whether or not Docker takes your idea. Feel free to send a PR :) . |
Hi! Previously, we did not consider the case where a CPU event occurs and no container is present. In this case, runc update cannot be used and we may lose some CPU events. My proposal is to develop a plugin for docker that can be enabled by adding the flag --authorization-plugin. It creates a process that listens for CPU events and updates the file /sys/fs/cgroup/cpuset/docker/cpuset.cpus and every container using the command runc update. Currently, I have tried to enable the plugin. My solution avoids adding an additional flag and takes the previous problem into account. Do you find this solution acceptable, or do you see some drawbacks? Any help or feedback will be particularly appreciated :) |
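To make the listener idea above more concrete, here is a minimal sketch of the decision step only: given the online CPU list before and after an event, decide whether anything needs to be forwarded and build the runc update invocations. It assumes the online list comes from somewhere like /sys/devices/system/cpu/online in the kernel's list format (e.g. "0-3,5"). The function names and the container IDs are hypothetical, and the sketch neither writes the parent cpuset.cpus file nor executes anything.

```python
from typing import Iterable, List, Set


def parse_cpu_list(text: str) -> Set[int]:
    """Parse the kernel CPU list format, e.g. "0-3,5" -> {0, 1, 2, 3, 5}."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means no CPUs
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus: Set[int]) -> str:
    """Format a set of CPU numbers back into list format, merging runs."""
    ordered = sorted(cpus)
    ranges: List[str] = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next CPU number is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def update_commands(before: str, after: str,
                    container_ids: Iterable[str]) -> List[List[str]]:
    """Build one `runc update` argv per container if CPUs came online.

    Assumption of this sketch: only offline -> online transitions are
    forwarded, and the caller passes only containers that should follow
    the host's online CPUs.
    """
    old, new = parse_cpu_list(before), parse_cpu_list(after)
    if not (new - old):
        return []  # nothing came online, so nothing to refresh
    cpus = format_cpu_list(new)
    return [["runc", "update", "--cpuset-cpus", cpus, cid]
            for cid in container_ids]
```

For example, update_commands("0-1", "0-3", ["c1"]) yields a single command that sets the cpuset of c1 to 0-3, while a transition from "0-3" to "0,2-3" yields no commands.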
Hi all! Create a file
|
When a CPU comes online, the docker daemon updates the cgroups of its containers. It resolves the issues: - moby#27453 - opencontainers/runc#1119 The extension can be used with the flags --cgroup-parent and --cpuset-cpus Signed-off-by: Alice Frosi <alice@linux.vnet.ibm.com>
Description
Allow new and already started containers to use hotplugged CPUs. For details please take a look at this Docker issue.
In the discussion of the mentioned issue it came up that this should be targeted at runc, and that this rejected PR might help us.
Steps to reproduce the issue:
Take a look at the original Docker issue.
Describe the results you expected:
Already running and newly started containers should be able to use additional online CPUs.
Initial proposal
Containers which are not started with a --cpuset-cpus or --cpuset-mems option are called unrestricted in the following; the others are called restricted.
For each unrestricted container we skip the current creation of a cpuset cgroup sub-directory under the Docker cgroup directory. Then all tasks within that container will be automatically associated with the root cpuset cgroup. Because the root cpuset cgroup is managed by the kernel, its cpuset.cpus will always contain all online CPUs. Therefore unrestricted containers will use all online CPUs.
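The distinction can be illustrated with a small model. This is only a sketch of the proposal's intent, not runc code: the function names are made up for illustration, the online list is assumed to be in the kernel's list format (as found in /sys/devices/system/cpu/online), and a container restricted only by --cpuset-mems is deliberately left out.

```python
from typing import Optional, Set


def parse_cpu_list(text: str) -> Set[int]:
    """Parse the kernel CPU list format, e.g. "0-3,5" -> {0, 1, 2, 3, 5}."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means no CPUs
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def is_unrestricted(cpuset_cpus: Optional[str],
                    cpuset_mems: Optional[str]) -> bool:
    """Unrestricted means neither --cpuset-cpus nor --cpuset-mems was given."""
    return not cpuset_cpus and not cpuset_mems


def usable_cpus(cpuset_cpus: Optional[str], cpuset_mems: Optional[str],
                online: str) -> Set[int]:
    """Simplified model of which CPUs a container could run on."""
    online_cpus = parse_cpu_list(online)
    if is_unrestricted(cpuset_cpus, cpuset_mems):
        # Tasks stay in the root cpuset, which tracks all online CPUs.
        return online_cpus
    if cpuset_cpus:
        # Restricted: the requested CPUs, limited to those currently online.
        return parse_cpu_list(cpuset_cpus) & online_cpus
    # Restricted only by --cpuset-mems: not modelled in this sketch.
    raise NotImplementedError("mems-only restriction is outside this sketch")
```

In this model, an unrestricted container sees {0, 1, 2, 3} with "0-3" online and picks up CPUs 4-7 as soon as the online list becomes "0-7", while a container started with --cpuset-cpus 0-1 stays on {0, 1}.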
Additional comments
After taking a look at the rejected PR, we came to the conclusion that our proposed solution has two significant advantages:
It might be worth noting that runc update --cpuset-cpus | --cpuset-mems will need some more logic to be able to create new cgroup hierarchies on the fly.
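As a rough illustration of what that extra logic might involve: on cgroup v1, a newly created cpuset cgroup normally starts with empty cpuset.cpus and cpuset.mems (unless cgroup.clone_children is set on the parent), and tasks cannot be attached until both are populated. The sketch below creates the directory if it is missing and seeds empty values from the parent. The helper name and the paths are hypothetical, and moving the container's tasks into the new cgroup is not shown.

```python
import os


def ensure_cpuset_cgroup(parent_dir: str, name: str) -> str:
    """Create parent_dir/name if needed and seed its cpuset files.

    Hypothetical helper: values are copied from the parent only when the
    child's file is missing or empty, so an existing restriction is kept.
    """
    child = os.path.join(parent_dir, name)
    os.makedirs(child, exist_ok=True)
    for knob in ("cpuset.cpus", "cpuset.mems"):
        with open(os.path.join(parent_dir, knob)) as src:
            parent_value = src.read().strip()
        child_knob = os.path.join(child, knob)
        current = ""
        if os.path.exists(child_knob):
            with open(child_knob) as existing:
                current = existing.read().strip()
        if not current:
            # Empty or missing: inherit the parent's setting.
            with open(child_knob, "w") as dst:
                dst.write(parent_value)
    return child
```

After this step, an update could write the requested --cpuset-cpus or --cpuset-mems value into the child and then move the container's tasks there.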