-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CA-370578 use subsystemId in NVidia GPU matching #4826
CA-370578 use subsystemId in NVidia GPU matching #4826
Conversation
This applies cleanly to the Yangtze branch for a potential backport. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't change the IDL or anything in the DB, so should be ok to backport.
Added comments as suggested. Will squash once accepted. |
So far, we identified a GPU by its deviceId; this turns out to be ambiguous and vgpuConfig.xml may contain multiple GPUs with identical deviceId, causing us to use more vGPU types than we can actually support. This patch introduces a new abstract type "gpu" in the VENDOR signature such that its Nvidia instance can use more information to identify a GPU. When matching a GPU found on the PCI bus with GPUs listed in vgpuConfig.xml we use: deviceId match and subsystemId match or deviceId match and subsystemId in vgpuConfig is 0 subsystemId=0 is used by Nvidia as a wildcard and thus subsystemId=0 is ignored during a match. The matching for other vendors remains unchanged (and is still based on deviceId). Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
48a75b9
to
eb855fb
Compare
@@ -410,6 +413,16 @@ module Vendor_nvidia = struct | |||
; compatible_model_names_on_pgpu: string list | |||
} | |||
|
|||
type gpu = {device_id: int; subsys_id: int} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we don't need the subsystem vendor id at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need it for Nvidia - that is the entire point of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is a subsystem device ID, and a subsystem vendor ID as well, which is not in the type now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. We don't use the subsystem vendor ID right now but could easily extend this to use it as well. We would expect it to match exactly (no wildcard mechanism).
So far, we identified a GPU by its deviceId; this turns out to be ambiguous and vgpuConfig.xml may contain multiple GPUs with identical deviceId, causing us to use more vGPU types than we can actually support.
This patch introduces a new abstract type "gpu" in the VENDOR signature such that its Nvidia instance can use more information to identify a GPU. This is the core of this patch and fix.
When matching a GPU found on the PCI bus with GPUs listed in vgpuConfig.xml we use:
or deviceId match and subsystemId in vgpuConfig is 0
subsystemId=0 is used by Nvidia as a wildcard and thus subsystemId=0 is ignored during a match.
The matching for other vendors remains unchanged (and is still based on deviceId).
Signed-off-by: Christian Lindig christian.lindig@citrix.com