NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. #8

Closed
Vincent-luo opened this issue Jul 18, 2024 · 6 comments

Comments

@Vincent-luo

When I run nvidia-smi inside the Docker image you provided, I get this error:

NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

How can I fix this error? Thanks!

@warmshao
Owner

Is the NVIDIA driver installed on your machine? Please share more details about your system and graphics card.

@Vincent-luo
Author

Yes, the NVIDIA driver is installed, and nvidia-smi runs fine outside the Docker container. My system is Ubuntu 18.04 and the GPU is a V100.

@warmshao
Owner

Did it run successfully?

@shaoguowen

Hi all, see #9 for reference:

  1. Check the real file names of the driver libraries: ls -lh /usr/lib/x86_64-linux-gnu/
  2. Recreate the libnvidia-ml symlink so it points at the real file on your system: ln -sf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.xxx /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
  3. Do the same for libcuda: ln -sf /usr/lib/x86_64-linux-gnu/libcuda.so.xxx /usr/lib/x86_64-linux-gnu/libcuda.so.1
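
The steps above can be wrapped in a small helper. This is only a sketch: the function name relink_driver_lib is hypothetical, and the version string has to be taken from your own ls output (presumably the one that matches the driver installed on the host). It refuses to link to a missing or zero-byte file.

```shell
# Hypothetical helper: point <dir>/<stem>.so.1 at <dir>/<stem>.so.<version>.
# Arguments: library directory, library stem (e.g. libcuda), driver version.
relink_driver_lib() {
    local dir="$1"
    local stem="$2"
    local version="$3"
    local target="$dir/$stem.so.$version"

    # Refuse to link to a file that is missing or empty.
    if [ ! -s "$target" ]; then
        echo "not found or empty: $target" >&2
        return 1
    fi

    # -f replaces whatever currently sits at <stem>.so.1.
    ln -sf "$target" "$dir/$stem.so.1"
}

# Example calls (commented out; substitute your own version string):
# relink_driver_lib /usr/lib/x86_64-linux-gnu libnvidia-ml 470.103.01
# relink_driver_lib /usr/lib/x86_64-linux-gnu libcuda 470.103.01
```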

@Vincent-luo
Author

Thanks for your help! That works!

@Vincent-luo
Author

Here's how I solved it, in case it helps others:

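First, find the related .so files on your machine. In the listings below, libnvidia-ml.so.1 and libcuda.so.1 both look broken: each is a zero-byte regular file instead of a symlink to the real versioned library.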

> ls -lh /usr/lib/x86_64-linux-gnu/libnvidia-ml.so*
-r-xr-xr-x 1 root root    0 Jul 17 08:41 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
-rwxr-xr-x 1 root root 1.8M Jun 17  2022 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.103.01

> ls -lh /usr/lib/x86_64-linux-gnu/libcuda.so*
lrwxrwxrwx 1 root root  12 Jul 17 08:41 /usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
-r-xr-xr-x 1 root root   0 Jul 17 08:41 /usr/lib/x86_64-linux-gnu/libcuda.so.1
-rwxr-xr-x 1 root root 24M Jun 17  2022 /usr/lib/x86_64-linux-gnu/libcuda.so.470.103.01
-rw-r--r-- 1 root root 21M Feb 27  2023 /usr/lib/x86_64-linux-gnu/libcuda.so.515.105.01
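
The same check can be scripted instead of read off the listing by eye. A minimal sketch, assuming a find that supports -maxdepth (GNU and BSD find do); the function name find_empty_libs is hypothetical:

```shell
# Hypothetical helper: list zero-byte regular files named like shared
# libraries directly inside the given directory. Symlinks are not reported,
# because -type f does not follow them.
find_empty_libs() {
    find "$1" -maxdepth 1 -type f -name '*.so*' -size 0
}

# Example call (commented out):
# find_empty_libs /usr/lib/x86_64-linux-gnu
```

Any libnvidia-ml.so.1 or libcuda.so.1 it prints is an empty placeholder rather than a usable library.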

Then recreate the symlinks so they point at the real versioned libraries. In my case:

ln -sf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.103.01 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
ln -sf /usr/lib/x86_64-linux-gnu/libcuda.so.470.103.01 /usr/lib/x86_64-linux-gnu/libcuda.so.1
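
To confirm the links took effect before rerunning nvidia-smi, a quick check along these lines may help. It is a sketch only, and check_driver_link is a hypothetical name:

```shell
# Hypothetical helper: succeed only if <dir>/<stem>.so.1 is a symlink
# that resolves to an existing, non-empty file.
check_driver_link() {
    local link="$1/$2.so.1"
    # -L tests the link itself; -s follows it and tests the target's size.
    [ -L "$link" ] && [ -s "$link" ]
}

# Example calls (commented out):
# check_driver_link /usr/lib/x86_64-linux-gnu libnvidia-ml && echo "libnvidia-ml ok"
# check_driver_link /usr/lib/x86_64-linux-gnu libcuda && echo "libcuda ok"
```

If both checks pass, run nvidia-smi again inside the container.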

That's it. Have fun!
