[shim] Change NVIDIA GPU detection method #1945
Merged
Use `/dev/nvidiactl`, not the `nvidia-smi` binary, to detect an NVIDIA GPU. Unlike the binary, the devfs file only exists if certain conditions are met. The exact way the `/dev/nvidia*` character device files are created is complicated and setup-specific (involving some of: kernel module, udev, modprobe, nvidia-persistenced, X server, and more), but in general it should be safe to assume that if an NVIDIA GPU is available, then `/dev/nvidiactl` exists.

Run `nvidia-smi` to get GPU info directly on the host, not inside a container. Using Docker is completely unnecessary, as NVIDIA Container Toolkit mounts libs and executables from the host. The dstack-provided Docker image doesn't even contain the `nvidia-smi` binary; it's always a bind-mounted file from the host.

Fixes: #1942
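
For illustration only, the two changes described above might look roughly like the following Go sketch. It is not the code from this PR: the names `deviceExists`, `parseGPUInfo`, and `gpuInfo`, as well as the choice of queried fields (`name`, `memory.total`), are assumptions made for the example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// nvidiaCtlPath is the control device file named in the PR description.
const nvidiaCtlPath = "/dev/nvidiactl"

// gpuInfo is a hypothetical container for the fields queried in this sketch.
type gpuInfo struct {
	Name      string
	MemoryMiB string // kept as a string; nvidia-smi may report non-numeric values
}

// deviceExists reports whether the given path can be stat'ed.
func deviceExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// parseGPUInfo parses CSV output with one "name, memory" row per GPU, as
// produced with --format=csv,noheader,nounits.
func parseGPUInfo(out string) ([]gpuInfo, error) {
	r := csv.NewReader(strings.NewReader(out))
	r.TrimLeadingSpace = true
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	gpus := make([]gpuInfo, 0, len(records))
	for _, rec := range records {
		if len(rec) != 2 {
			return nil, fmt.Errorf("unexpected field count: %d", len(rec))
		}
		gpus = append(gpus, gpuInfo{
			Name:      strings.TrimSpace(rec[0]),
			MemoryMiB: strings.TrimSpace(rec[1]),
		})
	}
	return gpus, nil
}

func main() {
	// Detection step: rely on the device file, not on the presence of the binary.
	if !deviceExists(nvidiaCtlPath) {
		fmt.Println("no NVIDIA GPU detected")
		return
	}
	// Info step: run nvidia-smi directly on the host, without a container.
	out, err := exec.Command("nvidia-smi",
		"--query-gpu=name,memory.total",
		"--format=csv,noheader,nounits").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "nvidia-smi failed:", err)
		os.Exit(1)
	}
	gpus, err := parseGPUInfo(string(out))
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse nvidia-smi output:", err)
		os.Exit(1)
	}
	for _, g := range gpus {
		fmt.Printf("%s: %s MiB\n", g.Name, g.MemoryMiB)
	}
}
```

Separating detection from the info query means a host that has the NVIDIA driver package installed but no usable GPU is reported as having no GPU, instead of failing when `nvidia-smi` is invoked.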