Questions about GPU time-sharing on Kubernetes #404
Comments
The device plugin does not currently have a mechanism to expose different devices as different resource types. It is also not possible to apply sharing settings on a per-GPU basis. Note that the plugin will still allow a sharing setting to be applied to GPUs that may not support this feature, and it effectively reports the same device multiple times to the Kubelet. The behaviour of two containers that are both started on a device where time-slicing is not supported will depend on the application and the device, but should mirror what happens when two applications that access the same device are started on the host.
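To illustrate what "reports the same device multiple times" means in practice, here is a minimal Python sketch. It is not the plugin's actual code: `AdvertisedDevice`, `advertise`, the `::N` ID suffix, and the GPU UUIDs are all hypothetical names chosen for illustration. The sketch only assumes that a single replica count is applied uniformly to every GPU on the node, with no check for whether a given GPU supports time-slicing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdvertisedDevice:
    """One schedulable entry as seen by the Kubelet (hypothetical model)."""
    device_id: str       # ID reported to the Kubelet; the "::N" suffix is illustrative
    physical_uuid: str   # the physical GPU that backs this entry


def advertise(gpu_uuids, replicas):
    """Report every physical GPU `replicas` times.

    The same replica count is applied to all GPUs, which models the
    limitation discussed above: there is no per-GPU sharing setting and
    no check that a GPU supports time-slicing.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        AdvertisedDevice(device_id=f"{uuid}::{i}", physical_uuid=uuid)
        for uuid in gpu_uuids
        for i in range(replicas)
    ]


if __name__ == "__main__":
    # Hypothetical node with two GPUs; the second stands in for an older
    # card that may not support time-slicing.
    for dev in advertise(["GPU-aaaa", "GPU-k80"], replicas=2):
        print(dev.device_id, "->", dev.physical_uuid)
```

Under this model, two pods can be handed different advertised entries that map to the same physical GPU, which is why the resulting behaviour matches two host applications sharing one device.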
Thank you for your response. We now have a clear understanding of the limitations of this plugin. We are using this repo to manage GPUs on Kubernetes, and we have also noticed another repo related to DRA (Dynamic Resource Allocation), which offers greater flexibility and richer functionality. This leads to another question: will there be strong support for using GPUs on Kubernetes through the DRA approach in the future? I'm not sure whether it's appropriate to ask this question here. If not, could you please let me know where I should ask it?
Yes, DRA will be well supported. We see it as the future of GPU support in Kubernetes. Please see:
This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.
As @elezar said, this is not (yet) possible, see
I would also like to enable time-slicing on only a subset of GPUs. Is there any chance we could turn this issue into a feature request, @elezar? Or is this capability never coming to the device plugin, in favor of DRA? That seems quite a while out, though: NVIDIA/k8s-dra-driver#131
1. Issue or feature description
Is it possible to enable time-sharing on selected GPUs within a single node, rather than on all of them?
If not all GPUs on a single node support time-slicing (for example, a Kepler-based K80), what behavior can be expected from the plugin?