I am facing an issue where my host machine detects the GPU correctly, but inside the Docker container, I get the following error when running nvidia-smi:
Failed to initialize NVML: Driver/library version mismatch
I have tried multiple configurations and different CUDA base images, but I can't resolve this issue. I believe the problem is related to library version conflicts between the host and the container.
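My understanding is that this error means the NVML user-space library the container loads does not match the kernel driver loaded on the host. For reference, these are the host-side version checks I use (standard driver/NVML queries, nothing specific to my image):

```bash
# Version of the NVIDIA kernel module currently loaded on the host
cat /proc/driver/nvidia/version

# Driver version as reported by NVML via nvidia-smi
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Driver / NVML packages installed on the host
dpkg -l | grep -E 'nvidia-driver|libnvidia-'
```

On the host, nvidia-smi works and reports 470.256.02; the mismatch only shows up inside the container.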
Host System (Outside Docker)
OS: Ubuntu 22
GPU: NVIDIA GeForce GTX TITAN Black (GK110B, Compute Capability 3.5)
Driver: 470.256.02
CUDA Version (from nvidia-smi): 11.4 (the highest CUDA version supported by the 470 driver branch)
Docker Version: 27.3.1
NVIDIA Container Toolkit Installed: Yes, version 1.17.4-1
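In case the Docker daemon configuration matters here, these are the standard NVIDIA Container Toolkit commands (from its install guide) that I can re-run to inspect and re-register the nvidia runtime:

```bash
# Show how the Docker daemon is currently configured (runtimes, default runtime)
cat /etc/docker/daemon.json

# Re-register the nvidia runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```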
Container Configuration
Base Images Used (I have tried multiple):
nvidia/cuda:10.2-runtime-ubuntu18.04
nvidia/cuda:10.2-base-ubuntu18.04
nvidia/cuda:10.2-runtime
Container OS: Ubuntu 18.04
CUDA Version inside container: 10.2
NVIDIA Container Toolkit Installed: Yes
Run command:
sudo docker run --gpus all -it --name my_container -v /home/user/my_project:/workspace my_niftypet_runtime
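For completeness, a variant of the same run with the standard NVIDIA runtime environment variables made explicit (these are documented container-toolkit variables; the values below are just the usual defaults, and my_niftypet_runtime is my own image built on the CUDA 10.2 base):

```bash
sudo docker run --gpus all -it \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -v /home/user/my_project:/workspace \
  --name my_container \
  my_niftypet_runtime
```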
Debugging Attempts
Verified NVIDIA Container Toolkit is installed on the host
dpkg -l | grep nvidia-container
Output:
ii libnvidia-container-tools 1.17.4-1
ii libnvidia-container1:amd64 1.17.4-1
ii nvidia-container-toolkit 1.17.4-1
ii nvidia-container-toolkit-base 1.17.4-1
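As an extra sanity check (this is the sample workload from the toolkit install guide, not something specific to my image), I can exercise the runtime with a plain Ubuntu image so that no CUDA user-space libraries from the image can interfere:

```bash
# The nvidia runtime mounts nvidia-smi and the host driver libraries into the
# container, so this works even though the ubuntu image ships no NVIDIA software
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```

If this also fails with the NVML error, the problem would seem to be on the host side rather than in my CUDA 10.2 image.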
Tried forcing the container to use host libraries:
Running the container with:
sudo docker run --gpus all -it --env LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu my_niftypet_runtime
Still getting the NVML error.
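To narrow this down, I can also check which copy of libnvidia-ml the container actually resolves and whether its version matches the host's 470.256.02 (generic ldconfig/filesystem checks, nothing image-specific):

```bash
# Inside the container: which libnvidia-ml is on the loader path, and its version suffix
ldconfig -p | grep libnvidia-ml
ls -l /usr/lib/x86_64-linux-gnu/libnvidia-ml.so*

# On the host, for comparison: where the 470.256.02 copy lives
find /usr -name 'libnvidia-ml.so*' 2>/dev/null
```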
Tried other images:
I attempted using nvidia/cuda:10.2.89-base-ubuntu18.04, but it seems unavailable on Docker Hub.
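If it helps, this is how I check whether a given tag still exists on the registry before building on it (docker manifest inspect only queries the registry and does not pull anything):

```bash
# Errors out if the tag no longer exists on Docker Hub
docker manifest inspect nvidia/cuda:10.2.89-base-ubuntu18.04
docker manifest inspect nvidia/cuda:10.2-base-ubuntu18.04
```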
Questions & Help Needed
How can I ensure that the container correctly uses the host’s NVIDIA libraries to avoid the Driver/library version mismatch error?
Is there any specific Docker image or configuration recommended for older GPUs like the GTX TITAN Black that require CUDA 10.2?
Could my Docker version (27.3.1) or NVIDIA Container Toolkit version (1.17.4-1) be incompatible with my setup?
This issue is blocking my work. Any help would be greatly appreciated!
Thank you in advance! 😊