
GPU not detected inside container (NVML "Driver/library version mismatch" error) #929

Open
jaimeehh opened this issue Feb 16, 2025 · 0 comments

Hello,

I am facing an issue where my host machine detects the GPU correctly, but inside the Docker container, I get the following error when running nvidia-smi:

Failed to initialize NVML: Driver/library version mismatch

I have tried multiple configurations and different CUDA base images, but I can't resolve this issue. I believe the problem is related to library version conflicts between the host and the container.
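My working assumption (which I have not confirmed) is that this NVML error is raised when the userspace `libnvidia-ml.so` that `nvidia-smi` loads does not match the loaded kernel module, e.g. after a driver update without a reboot. A minimal sketch of the check I am doing, with hypothetical version strings standing in for the real outputs of `cat /proc/driver/nvidia/version` and `ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*`:

```shell
#!/bin/sh
# Sketch: extract and compare the kernel-module driver version with the
# version embedded in the userspace library filename. The two strings
# below are hypothetical stand-ins for the real outputs of:
#   cat /proc/driver/nvidia/version
#   ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*
kmod_line="NVRM version: NVIDIA UNIX x86_64 Kernel Module  470.256.02"
lib_file="libnvidia-ml.so.470.239.06"

# Pull out the dotted version numbers.
kmod_ver=$(echo "$kmod_line" | grep -o '[0-9]\+\.[0-9.]\+')
lib_ver=$(echo "$lib_file" | sed 's/libnvidia-ml\.so\.//')

if [ "$kmod_ver" = "$lib_ver" ]; then
  echo "versions match"
else
  echo "driver/library version mismatch: $kmod_ver vs $lib_ver"
fi
```

If the two versions disagree on the host itself, the container error would just be the host problem showing through, and a reboot (or reloading the nvidia kernel modules) would be the first thing to try.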

Host System (Outside Docker)

  • OS: Ubuntu 22
  • GPU: NVIDIA GeForce GTX TITAN Black (GK110B, Compute Capability 3.5)
  • Driver: 470.256.02
  • CUDA Version (from nvidia-smi): 11.4 (the 470.x driver branch supports up to CUDA 11.4)
  • Docker Version: 27.3.1
  • NVIDIA Container Toolkit Installed: Yes, version 1.17.4-1

Container Configuration

  • Base Image Used: (I have tried multiple)
    • nvidia/cuda:10.2-runtime-ubuntu18.04
    • nvidia/cuda:10.2-base-ubuntu18.04
    • nvidia/cuda:10.2-runtime
  • Container OS: Ubuntu 18.04
  • CUDA Version inside container: 10.2
  • NVIDIA Container Toolkit Installed: Yes
  • Run command:
    sudo docker run --gpus all -it --name my_container -v /home/user/my_project:/workspace my_niftypet_runtime
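As a sanity check I am also planning to try a base image whose CUDA version matches the host driver (11.4 for the 470.x branch), to isolate whether the problem is the toolkit plumbing or the 10.2 images specifically. The exact tag below is an assumption on my part; any 11.4.x base tag should serve:

```shell
# Sanity check (sketch): run plain nvidia-smi from a CUDA base image
# matching the host driver's CUDA version. The tag is an assumption.
check_cmd='sudo docker run --rm --gpus all \
  nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi'
echo "$check_cmd"
```

If `nvidia-smi` works there but not in the 10.2 images, that would point at the images rather than the host setup.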

Debugging Attempts

  1. Verified NVIDIA Container Toolkit is installed on the host

    dpkg -l | grep nvidia-container

    Output:

    ii  libnvidia-container-tools                  1.17.4-1
    ii  libnvidia-container1:amd64                 1.17.4-1
    ii  nvidia-container-toolkit                   1.17.4-1
    ii  nvidia-container-toolkit-base              1.17.4-1
    
  2. Tried forcing the container to use host libraries:

    • Running the container with:
      sudo docker run --gpus all -it --env LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu my_niftypet_runtime
    • Still getting the NVML error.
  3. Tried other images:

    • I attempted using nvidia/cuda:10.2.89-base-ubuntu18.04, but it seems unavailable on Docker Hub.
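One more check I intend to run, sketched below with a hypothetical `ldconfig` output line (the real command would be `sudo docker run --rm --gpus all my_niftypet_runtime sh -c 'ldconfig -p | grep libnvidia-ml'`): verify which `libnvidia-ml.so` the dynamic linker resolves inside the container. My understanding is it should be the host driver's library injected by the NVIDIA Container Toolkit, not one baked into the image:

```shell
#!/bin/sh
# Sketch: parse the path that the dynamic linker resolves for
# libnvidia-ml inside the container. The line below is a hypothetical
# stand-in for one line of `ldconfig -p | grep libnvidia-ml` output.
ldconfig_line="libnvidia-ml.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1"

# Strip everything up to and including "=> " to get the resolved path.
resolved=${ldconfig_line##*=> }
echo "$resolved"
```

If the resolved library turns out to be one shipped inside the image rather than the injected host library, that would explain the version mismatch.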

Questions & Help Needed

  • How can I ensure that the container correctly uses the host’s NVIDIA libraries to avoid the Driver/library version mismatch error?
  • Is there any specific Docker image or configuration recommended for older GPUs like the GTX TITAN Black that require CUDA 10.2?
  • Could my Docker version (27.3.1) or NVIDIA Container Toolkit version (1.17.4-1) be incompatible with my setup?

This issue is blocking my work! Any help would be greatly appreciated!

Thank you in advance! 😊
