docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. #1243
Comments
I ran into the same issue. |
This Solved My Problem.
|
Hi @joshuafc, for me the problem was the line
|
Same thing happened here. Had an issue with an apt repository (unrelated to |
I had the same error with Docker via (FYI: When you uninstall docker with snap, perhaps uninstall takes forever somehow. I could manage to handle that problem with |
A new release of nvidia-docker was posted on Tuesday 19-May. Can you try the newest version and let us know if you are still having this issue? |
I also tried to get it running with a I am getting the same as mentioned at the top. No idea how to restart the daemon under snap, but even restarting the whole computer didn't work. |
@isolin My workaround on Ubuntu 18.04 was to uninstall the snap installed Docker and install docker using apt:
|
Wow. Didn't notice that. It worked! |
@
It also works for me!!! Thank you~ |
Thanks. This was my problem also. |
Hi! I have the exact same problem. As of Aug 2020 it's still broken. I've done everything under the sun to fix it, including recompiling Microsoft's kernel for Ubuntu; same problem. It does not see the NVIDIA video card under lspci. Need help getting this fixed ASAP. Thanks! |
The problem seems to be on the Windows side, in the device path between the two pieces |
As per your log you're also using Kudos to @ogukei & @ricklentz for pointing me to the solution 👌🏻 |
henry2man, can you clarify what you mean by the apt docker version? I don't think I am using snaps, but maybe I am mistaken. Do you have a link to directions? Thanks in advance! |
@wanfuse123 in your logs there are paths related to snap package system, so probably, like me before, you're using the snap Docker version in Ubuntu. In order to fix the issue you have to uninstall the snap version (I don't remember the exact command but it's pretty straightforward) and then install docker.io using "sudo apt install ..." Hope this helps. |
I removed the snap component in the guest VM with sudo apt autoremove --purge snapd* and it still installs, so I don't think this has anything to do with it, but please correct me if I am wrong. If snapd isn't installed, then I don't think it's possible. Let me know! |
It worked for about 3 days and suddenly stopped working, no idea what happened though. |
OMG! My mistake, I assumed you already knew what snap is. "Snap is a software packaging and deployment system developed by Canonical for the operating systems that use the Linux kernel." https://en.wikipedia.org/wiki/Snap_(package_manager) We can say that "snap" is something like "APT", a tool you can use to install and manage software packages. So, in this exact case you have to remove the docker snap package using:
And then, you have to install docker.io version using apt. You'll find detailed instructions here: https://docs.docker.com/engine/install/ubuntu/ |
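For anyone unsure which Docker they are running, here is a minimal sketch of the snap-to-apt switch described above. It assumes a snap-installed Docker shows up on PATH under /snap/ (typically /snap/bin/docker). The function names are made up for illustration, and the removal command is the standard `snap remove` form, not necessarily the exact command posted in this thread. The script only prints the suggested commands; it does not run them.

```shell
#!/bin/sh
# Classify a Docker binary path. Assumption: snap-installed binaries
# are exposed under /snap/ (for example /snap/bin/docker).
docker_install_source() {
    case "$1" in
        "") echo "missing" ;;
        /snap/*|/var/lib/snapd/*) echo "snap" ;;
        *) echo "other" ;;
    esac
}

# Print (do not run) the commands for switching from snap to apt.
# docker.io is the apt package named in the thread.
suggest_fix() {
    if [ "$(docker_install_source "$1")" = "snap" ]; then
        echo "sudo snap remove docker"
        echo "sudo apt install docker.io"
    fi
}

# Inspect whichever docker client is first on PATH, if any.
docker_path="$(command -v docker 2>/dev/null || true)"
echo "docker source: $(docker_install_source "$docker_path")"
suggest_fix "$docker_path"
```

If the script reports "other", Docker was most likely installed some other way (apt, the Docker installation script, and so on) and the snap explanation probably does not apply.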
I had the same issue, I solved it using these steps.
Lastly, make sure to restart docker
Now you should be able to get the nvidia-smi using docker
These steps worked for me on Ubuntu 20.04
|
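To check whether a fix actually took effect, something like the following sketch may help. It restarts nothing by itself: it defines a helper that recognizes the error text from this issue's title, plus a function that runs the test container from the original report and inspects its output. The function names are hypothetical, and the image tag nvidia/cuda:10.0-base is taken from the original report and may no longer be published.

```shell
#!/bin/sh
# Return success if the text contains the error from this issue's title.
is_gpu_driver_error() {
    case "$1" in
        *'could not select device driver'*'capabilities: [[gpu]]'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the test container from the original report and inspect its output.
# Not called automatically; invoke it by hand after restarting Docker.
check_gpu_container() {
    output="$(docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi 2>&1)"
    if is_gpu_driver_error "$output"; then
        echo "The GPU error from this issue is still present" >&2
        return 1
    fi
    printf '%s\n' "$output"
}
```

Typical use would be `sudo systemctl restart docker` followed by `check_gpu_container`, run as a user who is allowed to talk to the Docker daemon.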
Figured out the cause. The BIOS has to have the onboard video card enabled even if you're using the NVIDIA card. Not sure why, but hope this helps someone else! |
From the NVIDIA CUDA/WSL 2 documentation:
|
This usually works, but in my case when I had the problem, I had disabled
the onboard video card IN BIOS previously, and did a few other tweaks in
there too. Then I installed the correct Windows 10 NVIDIA beta drivers on
the host and it started working!
Hope this helps some other person!
…On Sat, Oct 3, 2020, 9:17 AM strarsis ***@***.***> wrote:
From the NVIDIA CUDA/WSL 2 documentation
<https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-docker>
:
Use the Docker installation script to install Docker for your choice of
WSL 2 Linux distribution. Note that NVIDIA Container Toolkit does not yet
support Docker Desktop WSL 2
<https://docs.docker.com/docker-for-windows/wsl/> backend.
|
Man, you used snap to install docker? Gutsy move. Considering snap's apparmor madness and the fact that it's already partitioning the crud out of each install - it just feels like way too many layers of wtf for me. I'd be interested to hear how that's going for you. |
Thanks it worked for me |
FYI, just my stupid experience: this error happened when I was on Ubuntu 20.04 but ran a command meant for Ubuntu 18.04, such as
So, if you are as stupid as I am, please check whether you ran any command that does not match your environment. |
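A quick way to catch this kind of mismatch is to print the distribution string before pasting repository commands. The sketch below builds the same "$ID$VERSION_ID" value used by the install commands later in this thread. The function name and the example value "ubuntu18.04" are only illustrative.

```shell
#!/bin/sh
# Build the "$ID$VERSION_ID" string from an os-release style file.
# The file is sourced in a subshell so its variables do not leak out.
distribution_id() {
    os_release_file="${1:-/etc/os-release}"
    ( . "$os_release_file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Compare the host against the release a copied command was written for.
# "ubuntu18.04" is an illustrative value, not a recommendation.
expected="ubuntu18.04"
if [ -r /etc/os-release ]; then
    actual="$(distribution_id /etc/os-release)"
    if [ "$actual" != "$expected" ]; then
        echo "This host is $actual, but the command targets $expected" >&2
    fi
fi
```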
I am going to close this since the original poster has solved his issue and the thread is now degenerating into a mix of other issues from other people. If you encounter a new issue related to this, please feel free to open a new ticket. |
I found that:
did not fix the issue. However, rebooting my server with For reference, I had a slightly different error:
|
After installing
Your service may instead be called
|
Same error on ubuntu 20.04:
even after fixed by downgrading docker from
(see install a specific version on docker website) |
This gist helped me : https://gist.github.com/nathzi1505/d2aab27ff93a3a9d82dada1336c45041 |
Really, the instructions here are broken: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian
They still suggest to install If you're on the latest stable ubuntu and upgraded (also, just use latest stable docker), you don't need it anymore.
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
just works. All you need is an nvidia gpu and a working nvidia driver. |
you can try to restart the service of docker. |
None of the solutions above worked for me. I used the latest Docker version 20 and it failed with this error whenever I used the "gpus" option. I finally solved it by uninstalling Docker and following the steps for nvidia docker2 to reinstall Docker: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian. |
None of the solutions above worked in my case. Finally, I ran the first step of the Setting up Docker documentation again: that was what solved my problem. |
I solved the same problem by just reinstalling docker: |
Removing the snap version, rebooting and then reinstalling with apt solved the issue for me. |
But if I do not have permission to run |
Why don't you have permissions? |
Hello! I ran into the same problem while trying to launch a TensorRT container, but none of the above solutions worked for me. Maybe I'm doing something wrong, but I don't know how to check it, and I tried to follow all the instructions and prerequisites for this. Thank you in advance! |
So just to check,
If you're sure all this is fine, then it might be an issue with the docker image, but first just let me know :) |
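As a rough preflight for this kind of check, the sketch below reports which of the tools mentioned in this issue are on PATH. It is not the list of checks from the comment above (that list did not survive here); the tool names come from the original report, and the function name is made up for illustration.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Tools referenced in the original report. A missing nvidia-container-cli
# matches the "command not found" symptom described there.
for tool in nvidia-smi docker nvidia-container-cli; do
    check_cmd "$tool" || true
done
```

A "found" result only shows that the binary exists; it says nothing about whether the driver or the Docker daemon is actually working.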
+-----------------------------------------------------------------------------+ |
Everything looks fine: CUDA is picked up, drivers are fine. Super weird. I've experienced this issue a lot, and those steps tend to work, or just restarting the docker service. There must be a deeper issue here which I haven't discovered yet |
Are you really supposed to have those nvidia packages installed for an AMD gpu? I'm on arch linux and I have the built in apu of a 5700g on one machine and a 5700xt on another machine and I get this error |
Well, you can't use an AMD GPU with CUDA, so you won't be able to use it with Docker |
It works for Ubuntu 22.04 |
For me, in nvidia-docker2 and Ubuntu 22.04, this command resolved the issue:
|
1. Issue or feature description
I am getting an error message
when I run
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
after the installation of the NVIDIA toolkit. The installation does not show any error message, but I get nvidia-container-cli: command not found when running nvidia-container-cli -V.
2. Steps to reproduce the issue
I installed the NVIDIA container toolkit using the code in the Ubuntu section on the main page.
I think that the NVIDIA drivers are installed, since I have used the GPUs with CUDA before without problem and
nvidia-smi
runs without problem (see below).
3. Information to attach (optional if deemed irrelevant)
Some nvidia-container information:
nvidia-container-cli -k -d /dev/tty info
I get
nvidia-container-cli: command not found
when I run the above command.
Kernel version from
uname -a
Linux dechter 4.15.0-91-generic #92~16.04.1-Ubuntu SMP Fri Feb 28 14:57:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Any relevant kernel output lines from
dmesg
nvidia-smi -a
docker version
dpkg -l '*nvidia*'
or rpm -qa '*nvidia*'
NVIDIA container library version from
nvidia-container-cli -V
This returns an error message:
nvidia-container-cli: command not found
Docker command, image and tag used
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
I get the following output when running
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
: