This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

error waiting for container: context canceled when running any docker image with --runtime=nvidia #1209

Closed
HemaZ opened this issue Mar 5, 2020 · 9 comments

Comments

@HemaZ

HemaZ commented Mar 5, 2020

1. Issue or feature description

Recently I have been getting the following error whenever I try to run any Docker image with the NVIDIA runtime (--runtime=nvidia):

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: input error: parse user/group failed: no such file or directory\\\\n\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled 
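The nested quoting in the message above makes the actual failure hard to spot. As a reading aid, here is a minimal Python sketch that pulls out the hook's stderr text and splits it into its parts. The function name `innermost_hook_error` is made up for this illustration, and the parsing assumes the hook output contains no backslashes or double quotes of its own.

```python
import re

# The error text exactly as pasted in the report above.
DOCKER_ERROR = r'''docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: input error: parse user/group failed: no such file or directory\\\\n\\\"\"": unknown.'''


def innermost_hook_error(message):
    """Return the hook's stderr text split on ': ', or [] if there is none."""
    # The stderr text ends at the first escaped newline or quote, so it is
    # enough to stop at the first backslash or double quote.
    match = re.search(r'stderr: ([^\\"]*)', message)
    if match is None:
        return []
    return match.group(1).strip().split(": ")


if __name__ == "__main__":
    # Print each layer of the message on its own indented line.
    for depth, part in enumerate(innermost_hook_error(DOCKER_ERROR)):
        print("  " * depth + part)
```

Read this way, the failure is reported by nvidia-container-cli itself as an input error while parsing a user/group value, before the container process ever starts.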

3. Information to attach (optional if deemed irrelevant)

  • Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
  • Kernel version from uname -a

Linux hema 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Any relevant kernel output lines from dmesg
  • Driver information from nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Thu Mar 5 02:32:24 2020
Driver Version : 440.33.01
CUDA Version : 10.2

Attached GPUs : 1
GPU 00000000:03:00.0
Product Name : GeForce 920MX
Product Brand : GeForce
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371
Minor Number : 0
VBIOS Version : 82.08.5A.00.0D
MultiGPU Board : No
Board ID : 0x300
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x03
Device : 0x00
Domain : 0x0000
Device Id : 0x134F10DE
Bus Id : 00000000:03:00.0
Sub System Id : 0x39F117AA
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 4x
Current : 4x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 687000 KB/s
Rx Throughput : 5000 KB/s
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : N/A
HW Power Brake Slowdown : N/A
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 2004 MiB
Used : 449 MiB
Free : 1555 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 3 MiB
Free : 253 MiB
Compute Mode : Default
Utilization
Gpu : 23 %
Memory : 14 %
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Temperature
GPU Current Temp : 35 C
GPU Shutdown Temp : 99 C
GPU Slowdown Temp : 94 C
GPU Max Operating Temp : 98 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 993 MHz
SM : 993 MHz
Memory : 900 MHz
Video : 973 MHz
Applications Clocks
Graphics : 967 MHz
Memory : 900 MHz
Default Applications Clocks
Graphics : 965 MHz
Memory : 900 MHz
Max Clocks
Graphics : 993 MHz
SM : 993 MHz
Memory : 900 MHz
Video : 973 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes
Process ID : 1285
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 34 MiB
Process ID : 1520
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 102 MiB
Process ID : 7745
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 144 MiB
Process ID : 7935
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 158 MiB
Process ID : 8553
Type : G
Name : /usr/lib/firefox/firefox
Used GPU Memory : 0 MiB
Process ID : 8727
Type : G
Name : /usr/lib/firefox/firefox
Used GPU Memory : 0 MiB
Process ID : 26816
Type : G
Name : /usr/lib/firefox/firefox
Used GPU Memory : 0 MiB

  • Docker version from docker version

Client: Docker Engine - Community
Version: 19.03.6
API version: 1.40
Go version: go1.12.16
Git commit: 369ce74a3c
Built: Thu Feb 13 01:27:49 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 19.03.6
API version: 1.40 (minimum version 1.12)
Go version: go1.12.16
Git commit: 369ce74a3c
Built: Thu Feb 13 01:26:21 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.12
GitCommit: 35bd7a5f69c13e1563af8a93431411cd9ecf5021
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683

  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==========================-==================-==================-==========================================================
un libgldispatch0-nvidia (no description available)
ii libnvidia-cfg1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA binary OpenGL/GLX configuration library
un libnvidia-cfg1-any (no description available)
un libnvidia-common (no description available)
ii libnvidia-common-440 440.59-0ubuntu0.18 all Shared files used by the NVIDIA libraries
rc libnvidia-compute-435:amd6 435.21-0ubuntu0.18 amd64 NVIDIA libcompute package
ii libnvidia-compute-440:amd6 440.33.01-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.0.7-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.0.7-1 amd64 NVIDIA container runtime library
un libnvidia-decode (no description available)
ii libnvidia-decode-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA Video Decoding runtime libraries
un libnvidia-encode (no description available)
ii libnvidia-encode-440:amd64 440.33.01-0ubuntu1 amd64 NVENC Video Encoding runtime library
un libnvidia-fbc1 (no description available)
ii libnvidia-fbc1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
un libnvidia-gl (no description available)
ii libnvidia-gl-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un libnvidia-ifr1 (no description available)
ii libnvidia-ifr1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
un libnvidia-ml1 (no description available)
un nvidia-304 (no description available)
un nvidia-340 (no description available)
un nvidia-384 (no description available)
un nvidia-390 (no description available)
un nvidia-common (no description available)
rc nvidia-compute-utils-435 435.21-0ubuntu0.18 amd64 NVIDIA compute utilities
ii nvidia-compute-utils-440 440.33.01-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-container-runtime 3.1.4-1 amd64 NVIDIA container runtime
un nvidia-container-runtime-h (no description available)
ii nvidia-container-toolkit 1.0.5-1 amd64 NVIDIA container runtime hook
rc nvidia-dkms-435 435.21-0ubuntu0.18 amd64 NVIDIA DKMS package
ii nvidia-dkms-440 440.33.01-0ubuntu1 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel (no description available)
un nvidia-docker (no description available)
rc nvidia-docker2 2.2.2-1 all nvidia-docker CLI wrapper
ii nvidia-driver-440 440.33.01-0ubuntu1 amd64 NVIDIA driver metapackage
un nvidia-driver-binary (no description available)
un nvidia-kernel-common (no description available)
rc nvidia-kernel-common-435 435.21-0ubuntu0.18 amd64 Shared files used with the kernel module
ii nvidia-kernel-common-440 440.33.01-0ubuntu1 amd64 Shared files used with the kernel module
un nvidia-kernel-source (no description available)
un nvidia-kernel-source-435 (no description available)
ii nvidia-kernel-source-440 440.33.01-0ubuntu1 amd64 NVIDIA kernel source package
un nvidia-legacy-304xx-vdpau- (no description available)
un nvidia-legacy-340xx-vdpau- (no description available)
un nvidia-libopencl1-dev (no description available)
ii nvidia-modprobe 440.33.01-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
un nvidia-opencl-icd (no description available)
un nvidia-persistenced (no description available)
ii nvidia-prime 0.8.8.2 all Tools to enable NVIDIA's Prime
ii nvidia-settings 440.44-0ubuntu0.18 amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary (no description available)
un nvidia-smi (no description available)
un nvidia-utils (no description available)
ii nvidia-utils-440 440.33.01-0ubuntu1 amd64 NVIDIA driver support binaries
un nvidia-vdpau-driver (no description available)
ii xserver-xorg-video-nvidia- 440.33.01-0ubuntu1 amd64 NVIDIA binary Xorg driver
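In `dpkg -l` output, `ii` marks an installed package, `rc` marks a removed package whose configuration files remain, and `un` marks a package that is not installed. The listing above contains several `rc` entries (including nvidia-docker2) and more than one version string among the installed 440 packages. The following Python sketch groups such a listing so those two things are easy to see; the function names are hypothetical, and nothing in this thread establishes that either observation caused the failure.

```python
from collections import defaultdict

# A few rows copied from the listing above.
SAMPLE = """
||/ Name Version Architecture Description
ii libnvidia-common-440 440.59-0ubuntu0.18 all Shared files used by the NVIDIA libraries
rc libnvidia-compute-435:amd6 435.21-0ubuntu0.18 amd64 NVIDIA libcompute package
ii libnvidia-compute-440:amd6 440.33.01-0ubuntu1 amd64 NVIDIA libcompute package
un nvidia-docker (no description available)
rc nvidia-docker2 2.2.2-1 all nvidia-docker CLI wrapper
ii nvidia-driver-440 440.33.01-0ubuntu1 amd64 NVIDIA driver metapackage
"""


def parse_dpkg_list(text):
    """Return (status, name, version) tuples for the package rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Only rows starting with one of these status codes are handled;
        # header and separator lines are skipped.
        if len(parts) < 2 or parts[0] not in {"ii", "rc", "un"}:
            continue
        status, name = parts[0], parts[1]
        # Rows for packages that are not installed have no version column.
        version = parts[2] if status != "un" and len(parts) > 2 else None
        rows.append((status, name, version))
    return rows


def summarize(rows):
    """Return (names left in rc state, installed names grouped by version)."""
    leftovers = [name for status, name, _ in rows if status == "rc"]
    installed = defaultdict(list)
    for status, name, version in rows:
        if status == "ii":
            installed[version].append(name)
    return leftovers, dict(installed)


if __name__ == "__main__":
    leftovers, installed = summarize(parse_dpkg_list(SAMPLE))
    print("removed, config files remain:", leftovers)
    for version, names in installed.items():
        print(version, "->", names)
```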

  • NVIDIA container library version from nvidia-container-cli -V

version: 1.0.7
build date: 2020-01-21T18:59+00:00
build revision: b71f87c04b8eca8a16bf60995506c35c937347d9
build compiler: x86_64-linux-gnu-gcc-7 7.4.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

  • NVIDIA container library logs (see troubleshooting)
  • Docker command, image and tag used
@situjie68

I have the same problem, have you solved it?

@HemaZ
Author

HemaZ commented Mar 8, 2020

@situjie68 not yet

@RenaudWasTaken
Contributor

Hello!

Sorry for the delay. Can you attach the output of nvidia-container-cli -k -d /dev/tty info?

@RenaudWasTaken
Contributor

Also, can you attach the logs from the library? See https://github.com/NVIDIA/nvidia-docker/wiki/Troubleshooting

@HemaZ
Author

HemaZ commented Mar 10, 2020

> Hello!
>
> Sorry for the lag, can you attach nvidia-container-cli -k -d /dev/tty info ?

 -- WARNING, the following logs are for debugging purposes only --

I0310 09:08:55.563473 6875 nvc.c:281] initializing library context (version=1.0.7, build=b71f87c04b8eca8a16bf60995506c35c937347d9)
I0310 09:08:55.563675 6875 nvc.c:255] using root /
I0310 09:08:55.563719 6875 nvc.c:256] using ldcache /etc/ld.so.cache
I0310 09:08:55.563761 6875 nvc.c:257] using unprivileged user 1000:1000
W0310 09:08:55.583091 6876 nvc.c:186] failed to set inheritable capabilities
W0310 09:08:55.583283 6876 nvc.c:187] skipping kernel modules load due to failure
I0310 09:08:55.584072 6877 driver.c:133] starting driver service
I0310 09:08:55.649102 6875 nvc_info.c:438] requesting driver information with ''
I0310 09:08:55.653326 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.440.33.01
I0310 09:08:55.653660 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.440.33.01
I0310 09:08:55.654130 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.440.33.01 over /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.440.33.01
I0310 09:08:55.654991 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.440.33.01
I0310 09:08:55.656254 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.440.33.01
I0310 09:08:55.657034 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.440.33.01
I0310 09:08:55.658106 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.440.33.01
I0310 09:08:55.658330 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.33.01
I0310 09:08:55.659438 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.440.33.01
I0310 09:08:55.660312 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.440.33.01
I0310 09:08:55.660488 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.440.33.01
I0310 09:08:55.660660 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.440.33.01
I0310 09:08:55.661745 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.440.33.01
I0310 09:08:55.663038 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.440.33.01
I0310 09:08:55.664025 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.440.33.01
I0310 09:08:55.664374 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.440.33.01
I0310 09:08:55.665356 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.440.33.01
I0310 09:08:55.665615 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.440.33.01
I0310 09:08:55.666898 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.440.33.01
I0310 09:08:55.668211 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.440.33.01
I0310 09:08:55.671062 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.440.33.01
I0310 09:08:55.672322 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.440.33.01
I0310 09:08:55.673303 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.440.33.01
I0310 09:08:55.674175 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.440.33.01
I0310 09:08:55.674416 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.440.33.01
I0310 09:08:55.675935 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.440.48.02
I0310 09:08:55.676557 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.440.48.02
I0310 09:08:55.677567 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-opencl.so.440.48.02
I0310 09:08:55.678989 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-ml.so.440.48.02
I0310 09:08:55.680268 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-fbc.so.440.48.02
I0310 09:08:55.681430 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-fatbinaryloader.so.440.48.02
I0310 09:08:55.682423 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-encode.so.440.48.02
I0310 09:08:55.683852 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-compiler.so.440.48.02
I0310 09:08:55.685621 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvcuvid.so.440.48.02
I0310 09:08:55.687487 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libcuda.so.440.48.02
W0310 09:08:55.687926 6875 nvc_info.c:303] missing library libvdpau_nvidia.so
W0310 09:08:55.687983 6875 nvc_info.c:307] missing compat32 library libnvidia-ml.so
W0310 09:08:55.688028 6875 nvc_info.c:307] missing compat32 library libnvidia-cfg.so
W0310 09:08:55.688071 6875 nvc_info.c:307] missing compat32 library libcuda.so
W0310 09:08:55.688092 6875 nvc_info.c:307] missing compat32 library libnvidia-opencl.so
W0310 09:08:55.688112 6875 nvc_info.c:307] missing compat32 library libnvidia-ptxjitcompiler.so
W0310 09:08:55.688146 6875 nvc_info.c:307] missing compat32 library libnvidia-fatbinaryloader.so
W0310 09:08:55.688185 6875 nvc_info.c:307] missing compat32 library libnvidia-compiler.so
W0310 09:08:55.688222 6875 nvc_info.c:307] missing compat32 library libvdpau_nvidia.so
W0310 09:08:55.688258 6875 nvc_info.c:307] missing compat32 library libnvidia-encode.so
W0310 09:08:55.688289 6875 nvc_info.c:307] missing compat32 library libnvidia-opticalflow.so
W0310 09:08:55.688327 6875 nvc_info.c:307] missing compat32 library libnvcuvid.so
W0310 09:08:55.688364 6875 nvc_info.c:307] missing compat32 library libnvidia-eglcore.so
W0310 09:08:55.688395 6875 nvc_info.c:307] missing compat32 library libnvidia-glcore.so
W0310 09:08:55.688437 6875 nvc_info.c:307] missing compat32 library libnvidia-tls.so
W0310 09:08:55.688467 6875 nvc_info.c:307] missing compat32 library libnvidia-glsi.so
W0310 09:08:55.688503 6875 nvc_info.c:307] missing compat32 library libnvidia-fbc.so
W0310 09:08:55.688539 6875 nvc_info.c:307] missing compat32 library libnvidia-ifr.so
W0310 09:08:55.688562 6875 nvc_info.c:307] missing compat32 library libnvidia-rtcore.so
W0310 09:08:55.688603 6875 nvc_info.c:307] missing compat32 library libnvoptix.so
W0310 09:08:55.688636 6875 nvc_info.c:307] missing compat32 library libGLX_nvidia.so
W0310 09:08:55.688671 6875 nvc_info.c:307] missing compat32 library libEGL_nvidia.so
W0310 09:08:55.688706 6875 nvc_info.c:307] missing compat32 library libGLESv2_nvidia.so
W0310 09:08:55.688744 6875 nvc_info.c:307] missing compat32 library libGLESv1_CM_nvidia.so
W0310 09:08:55.688782 6875 nvc_info.c:307] missing compat32 library libnvidia-glvkspirv.so
W0310 09:08:55.688818 6875 nvc_info.c:307] missing compat32 library libnvidia-cbl.so
I0310 09:08:55.691823 6875 nvc_info.c:233] selecting /usr/bin/nvidia-smi
I0310 09:08:55.691970 6875 nvc_info.c:233] selecting /usr/bin/nvidia-debugdump
I0310 09:08:55.692077 6875 nvc_info.c:233] selecting /usr/bin/nvidia-persistenced
I0310 09:08:55.692573 6875 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-control
I0310 09:08:55.692776 6875 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-server
I0310 09:08:55.692941 6875 nvc_info.c:370] listing device /dev/nvidiactl
I0310 09:08:55.692983 6875 nvc_info.c:370] listing device /dev/nvidia-uvm
I0310 09:08:55.693076 6875 nvc_info.c:370] listing device /dev/nvidia-uvm-tools
I0310 09:08:55.693149 6875 nvc_info.c:370] listing device /dev/nvidia-modeset
I0310 09:08:55.693345 6875 nvc_info.c:274] listing ipc /run/nvidia-persistenced/socket
W0310 09:08:55.693451 6875 nvc_info.c:278] missing ipc /tmp/nvidia-mps
I0310 09:08:55.693490 6875 nvc_info.c:494] requesting device information with ''
I0310 09:08:55.702120 6875 nvc_info.c:524] listing device /dev/nvidia0 (GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371 at 00000000:03:00.0)
NVRM version:   440.33.01
CUDA version:   10.2

Device Index:   0
Device Minor:   0
Model:          GeForce 920MX
Brand:          GeForce
GPU UUID:       GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371
Bus Location:   00000000:03:00.0
Architecture:   5.0
I0310 09:08:55.702325 6875 nvc.c:318] shutting down library context
I0310 09:08:55.703546 6877 driver.c:192] terminating driver service
I0310 09:08:55.735740 6875 driver.c:233] driver service terminated successfully
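One detail in the log above is that the 64-bit libraries it selects carry version 440.33.01 while the i386 libraries it skips carry 440.48.02. A small Python sketch, assuming the log line format shown here and using a hypothetical function name, can collect the library versions per action so mixed versions stand out. Whether this mismatch has anything to do with the failure is not established in this thread.

```python
import re
from collections import defaultdict

# Three lines copied from the log above.
SAMPLE_LOG = """
I0310 09:08:55.653326 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.440.33.01
I0310 09:08:55.671062 6875 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.440.33.01
I0310 09:08:55.687487 6875 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libcuda.so.440.48.02
"""

# Captures the action, the library path, and the version after ".so.".
LIB_LINE = re.compile(r"\] (selecting|skipping) (\S+\.so\.(\d+(?:\.\d+)+))")


def versions_by_action(log_text):
    """Map 'selecting' / 'skipping' to the set of library versions seen."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = LIB_LINE.search(line)
        if match:
            action, _path, version = match.groups()
            seen[action].add(version)
    return dict(seen)


if __name__ == "__main__":
    for action, versions in versions_by_action(SAMPLE_LOG).items():
        print(action, sorted(versions))
```

Lines that select binaries or list devices have no `.so.<version>` suffix and are ignored by the pattern.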

@HemaZ
Author

HemaZ commented Mar 10, 2020

@RenaudWasTaken
Contributor

@HemaZ you attached the container-runtime log but not the libnvidia-container log :)
There are two debug settings in the config file that you need to enable.

@HemaZ
Author

HemaZ commented Mar 19, 2020

> @HemaZ you attached the container-runtime log but not the libnvidia-container log :)
> There are 2 debug statements in the file you need to enable

Sorry for the delay. The attached file is the only one that was generated.

The other one, which should be at /var/log/nvidia-container-toolkit.log, was not generated.

Here is my config file, /etc/nvidia-container-runtime/config.toml:

#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
user = "root:vglusers"
ldconfig = "@/sbin/ldconfig.real"

[nvidia-container-runtime]
debug = "/var/log/nvidia-container-runtime.log"
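The thread never confirms a root cause, but the error text mentions a failed user/group parse and the config above sets user = "root:vglusers". One thing that may be worth checking is whether both names resolve on the host. The Python sketch below does that under the assumption that the failure is a name lookup; `find_cli_user` and `resolves` are names invented for this example, and numeric IDs are deliberately not handled.

```python
import grp
import pwd
import re

# The relevant part of the config posted above.
SAMPLE_CONFIG = """
[nvidia-container-cli]
environment = []
#no-cgroups = false
user = "root:vglusers"

[nvidia-container-runtime]
debug = "/var/log/nvidia-container-runtime.log"
"""


def find_cli_user(config_text):
    """Return the active `user` value under [nvidia-container-cli], or None."""
    section = None
    for raw in config_text.splitlines():
        line = raw.strip()
        # Commented-out settings such as '#user = ...' are ignored.
        if not line or line.startswith("#"):
            continue
        header = re.fullmatch(r"\[([^\]]+)\]", line)
        if header:
            section = header.group(1)
            continue
        match = re.fullmatch(r'user\s*=\s*"([^"]*)"', line)
        if match and section == "nvidia-container-cli":
            return match.group(1)
    return None


def resolves(spec):
    """Report whether each name in 'user:group' exists on this host."""
    user, _, group = spec.partition(":")

    def known(name, lookup):
        try:
            lookup(name)
            return True
        except KeyError:
            return False

    return {"user": known(user, pwd.getpwnam), "group": known(group, grp.getgrnam)}


if __name__ == "__main__":
    spec = find_cli_user(SAMPLE_CONFIG)
    print(spec, resolves(spec) if spec else "no user setting found")
```

If either entry comes back False on the affected machine, commenting out the `user` line or pointing it at an existing user and group would be a cheaper experiment than a full reinstall.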

@HemaZ
Author

HemaZ commented May 12, 2020

For anyone hitting the same problem: I uninstalled all NVIDIA-related packages, the drivers, CUDA, and Docker, then installed everything again, and this solved the problem for me.
