Pre-install nvidia container runtime + drivers on GPU instances #11628
Conversation
olemarkus commented on May 30, 2021 (edited):
- install nvidia container runtime
- install nvidia drivers
- make it opt-in to avoid clash with those who prebake/use gpu operator or similar
- add device plugin as addon
Worth mentioning that we override the containerd configuration override, or at least modify it. kOps only knows whether the instance will have GPUs during nodeup, while cloudup sets the config override to the kOps default config by default (hence the containerd config override will always be set to the assumed "final" config).
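To make that concrete, the change nodeup ends up carrying in the override is essentially the registration of an nvidia runtime handler with containerd. Below is a minimal sketch of the idea, assuming a simple string-append approach; the function name and the exact TOML are illustrative, not the actual kOps implementation:

```go
package main

import "fmt"

// Illustrative containerd CRI snippet registering an "nvidia" runtime handler
// that delegates to nvidia-container-runtime; the real config may differ.
const nvidiaRuntimeSnippet = `
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
    BinaryName = "/usr/bin/nvidia-container-runtime"
`

// appendNvidiaRuntime (hypothetical) takes the config override that cloudup
// passed down and, only on GPU instances, appends the nvidia runtime section.
// Only nodeup knows at this point whether the instance actually has a GPU.
func appendNvidiaRuntime(configOverride string, hasGPU bool) string {
	if !hasGPU {
		return configOverride
	}
	return configOverride + nvidiaRuntimeSnippet
}

func main() {
	fmt.Print(appendNvidiaRuntime("# kOps default containerd config\n", true))
}
```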
The idea is good in general; we should make this easy for people. In particular, here are some of my thoughts:
The above has a challenge if someone creates a mixed instance group with both GPU and non-GPU instance types. I am not sure why anyone would do that, but it is possible, and if the nvidia runtime is used on a non-GPU instance, it will break. That being said, we could do some hardware probing instead, I guess. It just seemed more reliable to describe the instance type.
Yeah, I am aware. I plan on adding other distros as needed and in the interim solve this with some distro checks and documentation.
Thanks. I didn't like the way I did this one; an asset file URL is a much better idea.
If we find a way to do this in cloudup, I'd also like to see if this can be put in the final config aux. I don't like using the override flag as a carry mechanism from cloudup to nodeup.
No need for that, we have validation. If we can validate that instance groups should not mix ARM and AMD instance types, we can make sure GPU instances are validated in the same way.
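A rough sketch of what such a validation could look like (hypothetical helper, not the existing kOps validation code):

```go
package validation

import "fmt"

// validateNoMixedGPU (hypothetical) rejects instance groups that mix GPU and
// non-GPU instance types, mirroring how mixed CPU architectures are rejected.
// hasGPU would be backed by something like DescribeInstanceTypes on AWS.
func validateNoMixedGPU(instanceTypes []string, hasGPU func(string) bool) error {
	gpuCount := 0
	for _, it := range instanceTypes {
		if hasGPU(it) {
			gpuCount++
		}
	}
	if gpuCount > 0 && gpuCount < len(instanceTypes) {
		return fmt.Errorf("instance group mixes GPU and non-GPU instance types: %v", instanceTypes)
	}
	return nil
}
```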
I meant that the code checks for the Debian family instead of being Ubuntu-specific.
👍
I agree that aux would be a better place for it, but to keep this on track we can do it later, when the config aux is merged.
So if we don't allow mixing GPU and non-GPU instance types, building the config in cloudup should be fine.
A prerequisite would be to allow configuring containerd per IG; I'm not sure if that is possible now. Once it is, the package installs can just be done in the …
Right. I think that is a challenge with how the nodeup config is built. Probably the containerd config should be built from the IG and cluster spec and then written to one of the configs, rather than written back to the IG/cluster spec. But we are a long way away from that. I think continuing on the current path makes sense for now. There is nothing here that is exposed to the user, so things can be moved later (maybe even before 1.22 GA).
I didn't do many of the APT changes. Using assets was challenging as they are arch-dependent, which doesn't make sense in this case. It could also run into runtime issues if we use hashes (upstream can change keys), and it is the TLS cert we trust here.
I'll defer the device plugin addon to another PR.
Determining the arch of an instance and whether or not it has a GPU appears to be firmly in the bailiwick of nodeup. I don't see why there's an objection to putting such logic in there. Making the admin configure this through the API just causes them grief. I do not understand or agree with this desire to make nodeup dumb. It should do the things it is well suited for and not the things it is not well suited for.
GPU support is something that very few operators need or use. Checking every node in every cluster for that capability just for this is not exactly ideal. Moreover, this has to work on the various supported platforms, not just AWS.
Can't nodeup do the equivalent of …
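(For context, the kind of local probe being alluded to could be as simple as scanning the PCI bus for NVIDIA's vendor ID. A hedged sketch, Linux-only and not what this PR implements:)

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// hasNvidiaGPU is a hypothetical local probe: it scans /sys/bus/pci/devices
// for a device whose vendor ID is NVIDIA's (0x10de). It avoids cloud API
// calls, but only works from the running instance itself.
func hasNvidiaGPU() (bool, error) {
	vendorFiles, err := filepath.Glob("/sys/bus/pci/devices/*/vendor")
	if err != nil {
		return false, err
	}
	for _, f := range vendorFiles {
		b, err := os.ReadFile(f)
		if err != nil {
			continue
		}
		if strings.TrimSpace(string(b)) == "0x10de" {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	found, err := hasNvidiaGPU()
	fmt.Println(found, err)
}
```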
	ig.Spec.NodeLabels = make(map[string]string)
}
ig.Spec.NodeLabels["kops.k8s.io/gpu"] = "1"
ig.Spec.Taints = append(ig.Spec.Taints, "nvidia.com/gpu:NoSchedule")
I think this is a behavioral change? i.e. workloads that previously ran on GPU instances will need to add this toleration to remain schedulable?
Or is there a webhook or similar that will add this automatically?
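(For reference, the toleration in question looks roughly like this when building a pod spec with the Kubernetes Go API types; workload manifests would express the same thing in YAML:)

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Toleration a GPU workload needs in order to remain schedulable onto
	// nodes carrying the nvidia.com/gpu:NoSchedule taint added above.
	tol := corev1.Toleration{
		Key:      "nvidia.com/gpu",
		Operator: corev1.TolerationOpExists,
		Effect:   corev1.TaintEffectNoSchedule,
	}
	podSpec := corev1.PodSpec{
		Tolerations: []corev1.Toleration{tol},
	}
	fmt.Printf("%+v\n", podSpec.Tolerations)
}
```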
Good catch. I think this one should additionally be gated by NvidiaGPU.enabled. That way one has explicitly subscribed to the kOps way of doing things. I'll add docs to match as well.
@@ -765,6 +765,7 @@ func addNodeupPermissions(p *Policy, enableHookSupport bool) {
	addASLifecyclePolicies(p, enableHookSupport)
	p.unconditionalAction.Insert(
		"ec2:DescribeInstances", // aws.go
		"ec2:DescribeInstanceTypes",
I was originally a little sad that we couldn't get this from the metadata or DescribeInstances. But I came around to your point of view, because InstanceTypes are generic data that won't really be sensitive, and if there's one permission we should get rid of, it's probably DescribeInstances :-)
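For context, the lookup this permission enables is roughly: read the instance type from instance metadata, then ask DescribeInstanceTypes whether that type reports any GPUs. A sketch using the AWS SDK for Go (v1); the actual nodeup code may differ:

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// instanceTypeHasGPU asks EC2 whether the given instance type reports GpuInfo.
// Instance types are generic, non-sensitive data, so this only needs the
// ec2:DescribeInstanceTypes permission.
func instanceTypeHasGPU(svc *ec2.EC2, instanceType string) (bool, error) {
	out, err := svc.DescribeInstanceTypes(&ec2.DescribeInstanceTypesInput{
		InstanceTypes: aws.StringSlice([]string{instanceType}),
	})
	if err != nil {
		return false, err
	}
	if len(out.InstanceTypes) == 0 {
		return false, fmt.Errorf("instance type %q not found", instanceType)
	}
	return out.InstanceTypes[0].GpuInfo != nil, nil
}

func main() {
	sess := session.Must(session.NewSession())
	hasGPU, err := instanceTypeHasGPU(ec2.New(sess), "p3.2xlarge")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("GPU instance:", hasGPU)
}
```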
I think this looks great, and as we agreed in office hours we can nest it under containerd. My one remaining concern is the automatic taint: I'm worried that we'll break upgrading clusters that are using GPU instances until users add that toleration. Or is there some other mechanism at play here, e.g. everyone is already using this toleration, or there's a webhook that adds it?
These concerns should be addressed now.
Thanks @olemarkus. This lgtm now, and we did agree on this at our last discussion; I'm going to approve but hold until after office hours. /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: justinsb. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing …
/lgtm
/hold cancel
…628-origin-release-1.22 Automated cherry pick of #11628: Add nvidia configuration to the api