This repository has been archived by the owner on Jan 22, 2024. It is now read-only.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
3. Information to attach (optional if deemed irrelevant)
Kernel version from uname -a
Linux hostname 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Any relevant kernel output lines from dmesg
[Mon Oct 29 12:07:38 2018] device vethcd22dec entered promiscuous mode
[Mon Oct 29 12:07:39 2018] docker0: port 1(vethcd22dec) entered forwarding state
[Mon Oct 29 12:07:39 2018] docker0: port 1(vethcd22dec) entered forwarding state
[Mon Oct 29 12:07:39 2018] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 236
[Mon Oct 29 12:07:39 2018] NVRM: GPU at PCI:0000:8a:00: GPU-aa1a2c8f-eacf-29be-47b0-fbf8c6491ed9
[Mon Oct 29 12:07:39 2018] NVRM: GPU Board Serial Number: 0321116070750
[Mon Oct 29 12:07:39 2018] NVRM: Xid (PCI:0000:8a:00): 48, An uncorrectable double bit error (DBE) has been detected on GPU in the framebuffer at partition 4, subpartition 1.
[Mon Oct 29 12:07:39 2018] docker0: port 1(vethcd22dec) entered disabled state
[Mon Oct 29 12:07:39 2018] docker0: port 1(vethcd22dec) entered disabled state
[Mon Oct 29 12:07:39 2018] device vethcd22dec left promiscuous mode
[Mon Oct 29 12:07:39 2018] docker0: port 1(vethcd22dec) entered disabled state
[Mon Oct 29 12:07:40 2018] NVRM: Xid (PCI:0000:8a:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000002ce8ac).
[Mon Oct 29 12:39:50 2018] device vetha1169e8 entered promiscuous mode
[Mon Oct 29 12:39:50 2018] docker0: port 1(vetha1169e8) entered forwarding state
[Mon Oct 29 12:39:50 2018] docker0: port 1(vetha1169e8) entered forwarding state
[Mon Oct 29 12:39:50 2018] docker0: port 1(vetha1169e8) entered disabled state
[Mon Oct 29 12:39:50 2018] docker0: port 1(vetha1169e8) entered disabled state
[Mon Oct 29 12:39:50 2018] device vetha1169e8 left promiscuous mode
[Mon Oct 29 12:39:50 2018] docker0: port 1(vetha1169e8) entered disabled state
Driver information from nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon Oct 29 12:28:27 2018
Driver Version : 390.30
Attached GPUs : 8
GPU 00000000:05:00.0
Product Name : Tesla K80
Product Brand : Tesla
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 1920
Driver Model
Current : N/A
Pending : N/A
......
Docker version from docker version
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:23:03 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:25:29 2018
OS/Arch: linux/amd64
Experimental: false
NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
output of rpm -qa '*nvidia*'
[Mon Oct 29 12:07:39 2018] NVRM: Xid (PCI:0000:8a:00): 48, An uncorrectable double bit error (DBE) has been detected on GPU in the framebuffer at partition 4, subpartition 1.
[Mon Oct 29 12:07:40 2018] NVRM: Xid (PCI:0000:8a:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000002ce8ac).
One of your GPUs hit a DBE (double bit error); you should reboot to retire the faulting page.
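To pull only the Xid events out of a long kernel log, something like the following may help. It is a minimal sketch that assumes the log lines have the same `NVRM: Xid (PCI:...): <code>, ...` shape as the ones quoted in this issue; `xid_events` is a hypothetical helper name, not part of any NVIDIA tool.

```shell
# Print "<pci-address> <xid-code>" for every NVRM Xid line read from stdin.
# Assumes the line format seen in this issue's dmesg output.
xid_events() {
  sed -n 's/.*NVRM: Xid (PCI:\([^)]*\)): \([0-9][0-9]*\),.*/\1 \2/p'
}

# Demo on the two lines quoted from the reporter's dmesg output.
xid_events <<'EOF'
[Mon Oct 29 12:07:39 2018] NVRM: Xid (PCI:0000:8a:00): 48, An uncorrectable double bit error (DBE) has been detected on GPU in the framebuffer at partition 4, subpartition 1.
[Mon Oct 29 12:07:40 2018] NVRM: Xid (PCI:0000:8a:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000002ce8ac).
EOF
# prints:
#   0000:8a:00 48
#   0000:8a:00 63
```

On a live host the same function can be fed from `dmesg | xid_events` (reading the kernel log may require root). After the reboot, `nvidia-smi -q -d PAGE_RETIREMENT` should report the retired page counts.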
Also check that your driver is installed properly and that libcuda.so.1 matches your driver version.
As mentioned above, check that your driver is installed properly. Try launching a CUDA application outside of containers, such as one of the CUDA samples; it will fail if the driver is not installed properly.
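One way to check the libcuda.so.1 point is to compare the driver version with the file that libcuda.so.1 resolves to. The sketch below assumes the usual Linux driver layout, where libcuda.so.1 is a symlink to a file named libcuda.so.<driver version>. `libcuda_matches_driver` is a hypothetical helper, and the library path in the comments is an assumption that varies by distribution.

```shell
# Succeed when the resolved libcuda file name ends in the given driver version.
# $1: driver version, e.g. 390.30
# $2: resolved libcuda path, e.g. /usr/lib64/libcuda.so.390.30
libcuda_matches_driver() {
  case "$(basename "$2")" in
    "libcuda.so.$1") return 0 ;;
    *) return 1 ;;
  esac
}

# Demo with the driver version reported in this issue and an illustrative path.
if libcuda_matches_driver 390.30 /usr/lib64/libcuda.so.390.30; then
  echo "match"
else
  echo "mismatch"
fi
# prints: match

# On a live host the inputs could be gathered like this (library path is an assumption):
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   target=$(readlink -f /usr/lib64/libcuda.so.1)
#   libcuda_matches_driver "$driver" "$target" || echo "libcuda does not match driver $driver"
```

A mismatch usually points to leftovers from an older driver install, which is worth ruling out before looking at the container runtime.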
Closing as there wasn't any follow-up. Feel free to re-open if you have any more information.
1. Issue or feature description
I am not able to use nvidia-docker2.0 with driver version 390.30
and get the following error message:
2. Steps to reproduce the issue
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
3. Information to attach (optional if deemed irrelevant)
uname -a
dmesg
nvidia-smi -a
docker version
dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
output of rpm -qa '*nvidia*'
nvidia-container-cli -V
cat /var/log/nvidia-container-runtime-hook.log