Error: nvidia-docker2 : Depends: nvidia-container-runtime (>= 3.4.0) #1388
Comments
It looks like you may have an old nvidia container stack installed (since you are able to run nvidia-container-cli successfully), but it was never installed as part of the current stack. Can you try uninstalling libnvidia-container1? This shouldn't be necessary, but it's worth a shot. Also, are you on a DGX machine? |
@klueska I hit the same issue on a Juno laptop, so it's not hardware-specific. Following the instructions, the fetched package lists point into the 18.04 repo. |
@AlexMikhalev the fact that it points into the 18.04 repo is expected (the 20.04 entries resolve to the same packages). Are you also seeing problems with the install itself, though? I have tested it multiple times in various environments and am not able to reproduce the issue. |
@klueska I uninstalled the libnvidia-container1, but I still got the same error. I am not on a DGX machine. This is the list of Nvidia packages from "dpkg -l 'nvidia'"
|
@JingL1014 Those are just the packages you have installed. Can you show me the list of packages available:
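A sketch of what that check could look like (standard apt tooling; the exact command asked for here is an assumption):
apt-cache policy nvidia-docker2 nvidia-container-runtime libnvidia-container1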
Mine shows:
|
Mine shows:
|
So it clearly shows the required version (>= 3.4.0) as available.
Can you manually install nvidia-container-runtime and see which version gets installed?
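A sketch of that step (standard apt commands; the exact invocation is an assumption):
sudo apt-get install -y nvidia-container-runtime
apt-cache policy nvidia-container-runtime   # shows which version was actually pulled in, and from which repo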
|
Trying to install nvidia-container-runtime:
|
@AlexMikhalev It looks like you have added a ppa repository from system76. In addition to that, it appears that they don't actually include the nvidia-container-runtime version that nvidia-docker2 needs. You need to either remove that ppa or give the NVIDIA repositories a higher priority.
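A sketch of what such a priority bump could look like (the file name and priority value are illustrative, not from the original comment):
# /etc/apt/preferences.d/nvidia-docker-pin-1002
Package: *
Pin: origin nvidia.github.io
Pin-Priority: 1002
A pin-priority above 1000 lets the nvidia.github.io repos win even against versions already installed from another repo.
|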
Hi! I am having the exact same issue. I am using pop-os (based on Ubuntu 20.04, distributed by system76), and I have previously installed and used nvidia-docker2 on machines with the same OS, following the normal Ubuntu guide (changing the distribution to match the Ubuntu version, ubuntu20.04, on the other machines as well). I get what you are saying about the system76 repos, even though it is weird that it worked fine about 2 months ago. Anyway, I could try to give the NVIDIA repos higher priority, but I don't know how to do it. Could you please guide me on this? Thank you. |
@sebautistam I don't know the details of what it looks like in pop-os, but I'm guessing you have some files under /etc/apt/preferences.d giving the pop-os repositories a higher priority. |
There is a file in that directory with this inside:
|
You will need to adjust these (or add more rules for the nvidia repos) according to this: |
I have modified the preferences file; it now contains three Package: * stanzas, and the nvidia-docker2 install goes through. Do you know the name of the package to include in the preferences file, just to be more specific and avoid future problems with other packages? Thank you! |
@sebautistam Sorry, I'm not that familiar with these preference files. I just know they've caused problems for people in the past, so I figured it was related here. |
I figured out that in my case the problem is that version 1.2.0+ds-0lambda1 of libnvidia-container1 gets installed automatically.
So I manually installed version 1.3.0-1 and then manually installed the other packages using the following commands. Now nvidia-docker is successfully installed:
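A sketch of that sequence (the 1.3.0-1 and >= 3.4.0 versions come from this thread; the exact versions of the intermediate packages are assumptions):
sudo apt-get install libnvidia-container1=1.3.0-1 libnvidia-container-tools=1.3.0-1
sudo apt-get install nvidia-container-toolkit=1.3.0-1
sudo apt-get install nvidia-container-runtime=3.4.0-1
sudo apt-get install nvidia-docker2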
|
@JingL1014 I followed your advice successfully until the last line:
|
@AlexMikhalev It seems that version 3.1.4 of nvidia-container-runtime is about to be installed; you can probably try to manually install version 3.4.0 first and then retry the nvidia-docker2 install.
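Roughly (the Debian revision suffix on the version string is an assumption):
sudo apt-get install nvidia-container-runtime=3.4.0-1
sudo apt-get install nvidia-docker2
|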
Fixed. The issue was the priority of the system76 apt repo over the NVIDIA one. |
Hi @AlexMikhalev, in which file did you change the repo priority? /etc/apt/sources.list? I could not find a system76 entry there. |
I tried to install nvidia-container-runtime for the same issue, but it asked me to install nvidia-container-toolkit. I tried to install that and it said it was already installed, yet it still did not let me install nvidia-container-runtime.
(base) JJteam@lambda-quad:~$ sudo apt-get install -y nvidia-container-runtime
The following packages have unmet dependencies: |
@ffahmed check which pin files you have under /etc/apt/preferences.d/ and what priorities they set. |
Thanks @AlexMikhalev for the prompt reply! I don't have those files in /etc/apt/preferences.d/. cat /etc/apt/preferences.d/cuda-repository-pin-600 shows pins for Package: nsight-systems and Package: *.
|
@ffahmed Which repo is your currently installed version coming from? |
Thanks @klueska @AlexMikhalev, both of you. I created a pin file with vi /etc/apt/preferences.d/nvidia-docker-pin-1002. Then I didn't need to install nvidia-container-toolkit first; I was able to directly install nvidia-docker2.
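A quick check that the pin is effective, plus the install (the apt-cache step is an assumption, not part of the original comment):
apt-cache policy nvidia-docker2 nvidia-container-toolkit   # candidate versions should now come from nvidia.github.io
sudo apt-get install -y nvidia-docker2
|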
@ffahmed thank you. Your solution worked for me. You saved my day. |
Thanks.. @klueska @AlexMikhalev @ffahmed |
For me the following worked: adding a Package: * pin for the NVIDIA repos under /etc/apt/preferences.d, then I ran the install again. |
This does not solve my issue! I get the same exact error as before:
I am on PopOS 22.04 and trying to run this image: https://github.com/atinfinity/nvidia-egl-desktop-ros2/blob/main/foxy/Dockerfile |
@sandman can you check the version of the NVIDIA Container CLI that you are using? It may be that you are not using a version that supports cgroupv2.
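For reference, the version can be checked with:
nvidia-container-cli -V
|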
@elezar My version is:
Does this support cgroupv2? |
There was a cgroup-related fix released in v1.8.1, so updating to at least that version (or to 1.10.0) is recommended.
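A minimal way to pull in a newer version, assuming the NVIDIA apt repo is already configured (the exact package set is an assumption):
sudo apt-get update
sudo apt-get install --only-upgrade libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit
sudo systemctl restart docker
|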
Thanks @elezar ! I got it to work by installing 1.10.0 |
How did you manually install 1.10.0? @sandman |
@jbartolozzi I have created a gist with the steps: https://gist.github.com/sandman/3777b07f69e117aa8bf1adede26a4e36 |
Ditto Pop-os 22.04: |
@adwaykanhere @intrainepha My Gist applies for 1.10.0 (for host PopOs 22.04 and Container running Ubuntu 20.04). I did not test 1.11.0. |
@fgoodwin @intrainepha @sandman It works with 1.11.0. I removed nvidia-docker2 and its dependencies and then reinstalled them.
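Roughly (the exact package set removed is an assumption):
sudo apt-get remove nvidia-docker2 nvidia-container-toolkit nvidia-container-runtime libnvidia-container-tools libnvidia-container1
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker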
|
@Winand this still does not work for me. I removed:
And installed:
And I am still getting the error:
|
@Winand the fix you mentioned does not work for me either (cli-version==1.11.0 on PopOS 22.04). |
@mathisc do you have docker-ce or Docker Desktop? |
@Winand I don't have Docker Desktop, only docker-ce. I got it working with a quick and dirty fix, though:
|
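For reference, a widely used quick-and-dirty workaround for the cgroup devices error (an assumption here; it may or may not match the exact fix above) is to disable cgroup handling in libnvidia-container and pass the device nodes in explicitly:
# /etc/nvidia-container-runtime/config.toml
[nvidia-container-cli]
no-cgroups = true

# then pass the device nodes explicitly when running a container, e.g.:
docker run --rm --gpus all \
  --device /dev/nvidiactl --device /dev/nvidia-uvm \
  --device /dev/nvidia-uvm-tools --device /dev/nvidia0 \
  <your-cuda-image> nvidia-smi
This sidesteps the devices-cgroup setup rather than fixing it, so upgrading to a libnvidia-container release with cgroup v2 support remains the cleaner route.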
@mathisc your "dirty" fix worked for me, thanks |
Thanks! this is the only one that worked for me |
@mathisc Your quick and dirty fix is the only one that worked for me! |
this workaround was the only one that worked for me; is there any official solution for this issue? |
I'm still stuck on this problem.
docker: Error response from daemon: failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: container error: cgroup subsystem devices not found: unknown.
PopOS 22.04. Any new idea that may help? |
I have the same exact system and am experiencing this problem as well. |
Followup: I was able to get the following output after overwriting the sources.list again from the installation guide with:
(notice the hardcoded ubuntu22.04). Then I installed nvidia-docker2 and restarted docker, roughly as below.
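A sketch of the sequence (the repo line assumes the per-distribution layout shown earlier in this issue, with ubuntu22.04 hardcoded; the keyring path follows a later comment in this thread):
# /etc/apt/sources.list.d/nvidia-container-toolkit.list
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/$(ARCH) /

sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker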
Successful Output:
|
After multiple frustrating attempts to follow this advice, I realised the issue was |
For anybody out there using a TUXEDO with TUXEDO OS, I fixed mine by simply adding the NVIDIA libnvidia-container repo and doing an apt upgrade. So, with one command:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
&& \
sudo apt-get update && sudo apt upgrade -y |
Thanks @criadoperez. As you pointed out, please refer to the updated installation documentation, where the generic stable/deb repository replaces the per-distribution lists. Please create an issue against https://github.com/NVIDIA/nvidia-container-toolkit if there are still problems.
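Under the updated instructions the flow is roughly the following (see the linked documentation for the authoritative steps; nvidia-ctk ships with the toolkit):
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
|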
1. Issue or feature description
I am following the instructions on GitHub to install nvidia-docker on Ubuntu 20.04 but failed with the following error. Could you help me to identify the problem? Thank you!
sudo apt-get update
Hit:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu20.04/amd64 InRelease
Hit:2 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu20.04/amd64 InRelease
Hit:3 https://nvidia.github.io/nvidia-docker/ubuntu20.04/amd64 InRelease
Get:4 https://download.docker.com/linux/ubuntu focal InRelease [36.2 kB]
Hit:5 http://security.ubuntu.com/ubuntu focal-security InRelease
Hit:6 http://archive.lambdalabs.com/ubuntu focal InRelease
Hit:7 http://archive.ubuntu.com/ubuntu focal InRelease
Hit:8 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:9 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Fetched 36.2 kB in 1s (46.2 kB/s)
Reading package lists... Done
sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
nvidia-docker2 : Depends: nvidia-container-runtime (>= 3.4.0) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
2. Steps to reproduce the issue
sudo apt-get install -y nvidia-docker2
3. Information to attach (optional if deemed irrelevant)
nvidia-container-cli -k -d /dev/tty info
I0923 20:39:55.953720 464021 nvc.c:282] initializing library context (version=1.2.0, build=)
I0923 20:39:55.953761 464021 nvc.c:256] using root /
I0923 20:39:55.953766 464021 nvc.c:257] using ldcache /etc/ld.so.cache
I0923 20:39:55.953770 464021 nvc.c:258] using unprivileged user 4163:4163
I0923 20:39:55.953786 464021 nvc.c:299] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0923 20:39:55.953881 464021 nvc.c:301] dxcore initialization failed, continuing assuming a non-WSL environment
W0923 20:39:55.956568 464022 nvc.c:187] failed to set inheritable capabilities
W0923 20:39:55.956616 464022 nvc.c:188] skipping kernel modules load due to failure
I0923 20:39:55.956875 464023 driver.c:101] starting driver service
I0923 20:39:55.959606 464021 nvc_info.c:679] requesting driver information with ''
I0923 20:39:55.960768 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.450.57
I0923 20:39:55.960809 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.450.57
I0923 20:39:55.960831 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.450.57
I0923 20:39:55.960854 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.450.57
I0923 20:39:55.960889 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.450.57
I0923 20:39:55.960923 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.450.57
I0923 20:39:55.960945 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.450.57
I0923 20:39:55.960965 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.450.57
I0923 20:39:55.961000 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.450.57
I0923 20:39:55.961033 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.450.57
I0923 20:39:55.961054 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.450.57
I0923 20:39:55.961074 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.450.57
I0923 20:39:55.961095 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.450.57
I0923 20:39:55.961128 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.450.57
I0923 20:39:55.961161 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.450.57
I0923 20:39:55.961182 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.450.57
I0923 20:39:55.961203 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.450.57
I0923 20:39:55.961235 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.450.57
I0923 20:39:55.961257 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.450.57
I0923 20:39:55.961295 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.450.57
I0923 20:39:55.961534 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.450.57
I0923 20:39:55.961646 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.450.57
I0923 20:39:55.961669 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.450.57
I0923 20:39:55.961692 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.450.57
I0923 20:39:55.961716 464021 nvc_info.c:168] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.450.57
I0923 20:39:55.961757 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.450.57
I0923 20:39:55.961790 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.450.57
I0923 20:39:55.961827 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.450.57
I0923 20:39:55.961864 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.450.57
I0923 20:39:55.961887 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.450.57
I0923 20:39:55.961923 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-ifr.so.450.57
I0923 20:39:55.961957 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.450.57
I0923 20:39:55.961989 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.450.57
I0923 20:39:55.962009 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.450.57
I0923 20:39:55.962032 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.450.57
I0923 20:39:55.962071 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.450.57
I0923 20:39:55.962105 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.450.57
I0923 20:39:55.962125 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.450.57
I0923 20:39:55.962146 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvidia-allocator.so.450.57
I0923 20:39:55.962182 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.450.57
I0923 20:39:55.962229 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libcuda.so.450.57
I0923 20:39:55.962272 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.450.57
I0923 20:39:55.962295 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.450.57
I0923 20:39:55.962318 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.450.57
I0923 20:39:55.962340 464021 nvc_info.c:168] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.450.57
W0923 20:39:55.962361 464021 nvc_info.c:349] missing library libnvidia-fatbinaryloader.so
W0923 20:39:55.962366 464021 nvc_info.c:349] missing library libvdpau_nvidia.so
W0923 20:39:55.962373 464021 nvc_info.c:353] missing compat32 library libnvidia-cfg.so
W0923 20:39:55.962379 464021 nvc_info.c:353] missing compat32 library libnvidia-fatbinaryloader.so
W0923 20:39:55.962384 464021 nvc_info.c:353] missing compat32 library libnvidia-ngx.so
W0923 20:39:55.962389 464021 nvc_info.c:353] missing compat32 library libvdpau_nvidia.so
W0923 20:39:55.962395 464021 nvc_info.c:353] missing compat32 library libnvidia-rtcore.so
W0923 20:39:55.962400 464021 nvc_info.c:353] missing compat32 library libnvoptix.so
W0923 20:39:55.962407 464021 nvc_info.c:353] missing compat32 library libnvidia-cbl.so
I0923 20:39:55.968551 464021 nvc_info.c:275] selecting /usr/bin/nvidia-smi
I0923 20:39:55.968574 464021 nvc_info.c:275] selecting /usr/bin/nvidia-debugdump
I0923 20:39:55.968597 464021 nvc_info.c:275] selecting /usr/bin/nvidia-persistenced
I0923 20:39:55.968612 464021 nvc_info.c:275] selecting /usr/bin/nvidia-cuda-mps-control
I0923 20:39:55.968631 464021 nvc_info.c:275] selecting /usr/bin/nvidia-cuda-mps-server
I0923 20:39:55.968652 464021 nvc_info.c:437] listing device /dev/nvidiactl
I0923 20:39:55.968657 464021 nvc_info.c:437] listing device /dev/nvidia-uvm
I0923 20:39:55.968663 464021 nvc_info.c:437] listing device /dev/nvidia-uvm-tools
I0923 20:39:55.968667 464021 nvc_info.c:437] listing device /dev/nvidia-modeset
I0923 20:39:55.968695 464021 nvc_info.c:316] listing ipc /run/nvidia-persistenced/socket
W0923 20:39:55.968712 464021 nvc_info.c:320] missing ipc /tmp/nvidia-mps
I0923 20:39:55.968717 464021 nvc_info.c:744] requesting device information with ''
I0923 20:39:55.975153 464021 nvc_info.c:627] listing device /dev/nvidia0 (GPU-b4284e5d-adf4-2a5e-69dd-f53c99fc475d at 00000000:01:00.0)
I0923 20:39:55.981478 464021 nvc_info.c:627] listing device /dev/nvidia1 (GPU-c2e07576-ea0a-33b0-1622-f8c2132c2086 at 00000000:21:00.0)
I0923 20:39:55.988026 464021 nvc_info.c:627] listing device /dev/nvidia2 (GPU-ce68be3f-afa6-1eb5-a43c-27640ca76732 at 00000000:4b:00.0)
I0923 20:39:55.994670 464021 nvc_info.c:627] listing device /dev/nvidia3 (GPU-b74b3210-8285-2858-0bd7-5fb7e2d40cba at 00000000:4c:00.0)
NVRM version: 450.57
CUDA version: 11.0
Device Index: 0
Device Minor: 0
Model: Quadro RTX 6000
Brand: Quadro
GPU UUID: GPU-b4284e5d-adf4-2a5e-69dd-f53c99fc475d
Bus Location: 00000000:01:00.0
Architecture: 7.5
Device Index: 1
Device Minor: 1
Model: Quadro RTX 6000
Brand: Quadro
GPU UUID: GPU-c2e07576-ea0a-33b0-1622-f8c2132c2086
Bus Location: 00000000:21:00.0
Architecture: 7.5
Device Index: 2
Device Minor: 2
Model: Quadro RTX 6000
Brand: Quadro
GPU UUID: GPU-ce68be3f-afa6-1eb5-a43c-27640ca76732
Bus Location: 00000000:4b:00.0
Architecture: 7.5
Device Index: 3
Device Minor: 3
Model: Quadro RTX 6000
Brand: Quadro
GPU UUID: GPU-b74b3210-8285-2858-0bd7-5fb7e2d40cba
Bus Location: 00000000:4c:00.0
Architecture: 7.5
I0923 20:39:55.994743 464021 nvc.c:337] shutting down library context
I0923 20:39:55.995575 464023 driver.c:156] terminating driver service
I0923 20:39:55.995902 464021 driver.c:196] driver service terminated successfully
Kernel version from
uname -a
Linux mlrgpu07 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Any relevant kernel output lines from
dmesg
Driver information from
nvidia-smi -a
Driver Version : 450.57
CUDA Version : 11.0
Docker version from
docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:52 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:20 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.7
GitCommit: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
dpkg -l '*nvidia*'
or rpm -qa '*nvidia*'
nvidia-container-cli -V
version: 1.2.0
build date: 2020-07-09T02:45+00:00
build revision:
build compiler: gcc-5 5.4.0 20160609
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections -Wl,-Bsymbolic-functions -Wl,-z,relro