This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

driver 367: invalid cross-device link #184

Closed
Wildcarde opened this issue Aug 26, 2016 · 6 comments
Comments

@Wildcarde

It looks like you are explicitly adding support for different drivers as the toolkit progresses, with the latest being 364. Unfortunately, the Titan XP requires 367 to be usable, and 1.0.0-rc3 doesn't support that yet.

Attempting to run nvidia-docker's nvidia-smi test fails, generating the following on the command line and in journalctl (on Ubuntu 16.04 LTS):

root@ubuntu1604:~# nvidia-docker run --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: create nvidia_driver_367.35: VolumeDriver.Create: internal error, check logs for details.

root@ubuntu1604:~# journalctl -n -u nvidia-docker
Aug 26 17:00:33 ubuntu1604 systemd[1]: Starting NVIDIA Docker plugin...
Aug 26 17:00:33 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:33 Loading NVIDIA unified memory
Aug 26 17:00:33 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:33 Loading NVIDIA management library
Aug 26 17:00:33 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:33 Discovering GPU devices
Aug 26 17:00:33 ubuntu1604 systemd[1]: Started NVIDIA Docker plugin.
Aug 26 17:00:36 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:36 Provisioning volumes at /var/lib/nvidia-docker/volumes
Aug 26 17:00:36 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:36 Serving plugin API at /var/lib/nvidia-docker
Aug 26 17:00:36 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:36 Serving remote API at localhost:3476
Aug 26 17:00:47 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:47 Received create request for volume 'nvidia_driver_367.35'
Aug 26 17:00:47 ubuntu1604 nvidia-docker-plugin[11889]: /usr/bin/nvidia-docker-plugin | 2016/08/26 17:00:47 Error: link /usr/lib/nvidia-367/bin/nvidia-cuda-mps-control /var/lib/nvidia-docker/volumes/nvidia_driver/367.35/bin/nvidia-cuda-mps-control: invalid cross-device link
@flx42
Member

flx42 commented Aug 26, 2016

Driver 367 actually works fine; you have a different problem. Look at issue #133.
It's a known limitation mentioned on our wiki.

For systemd, look at this comment.
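
For context: "invalid cross-device link" is the message for the EXDEV error, which the kernel returns when a hard link is attempted between two different filesystems. The log above shows the plugin linking from /usr/lib/nvidia-367 into /var/lib/nvidia-docker/volumes, so this is presumably the limitation being referred to. A minimal sketch of how one might check this up front follows. The `same_fs` helper is hypothetical (it is not part of nvidia-docker), it assumes GNU coreutils `stat`, and a matching device ID is a necessary condition rather than a guarantee (separate bind mounts of one filesystem can still fail).

```shell
#!/bin/sh
# Hypothetical helper, not part of nvidia-docker.
# Returns 0 if both paths report the same device ID, 1 if they differ,
# and 2 if either path cannot be inspected.
# Assumes GNU coreutils `stat` (as shipped on Ubuntu).
same_fs() {
    a=$(stat -c %d "$1") || return 2
    b=$(stat -c %d "$2") || return 2
    [ "$a" = "$b" ]
}

# Parent directories of the driver files and the volume path from the log.
src=/usr/lib
dst=/var/lib

same_fs "$src" "$dst"
case $? in
    0) echo "$src and $dst share a filesystem: hard links can work" ;;
    1) echo "$src and $dst are on different filesystems: hard links fail with EXDEV" ;;
    *) echo "could not inspect $src or $dst" ;;
esac
```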

@flx42 flx42 changed the title Support for driver 367 and Titan XP (pascal) driver 367: invalid cross-device link Aug 26, 2016
@Wildcarde
Author

Thanks for the quick follow-up on this. It looks like this is a result of the way the integrator decided to install the OS and drives:

Filesystem      Size  Used Avail Use% Mounted on
udev            252G     0  252G   0% /dev
tmpfs            51G   11M   51G   1% /run
/dev/sda5       9.8G  4.4G  4.9G  48% /
/dev/sda6        30G   11G   18G  37% /usr
tmpfs           252G  268K  252G   1% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           252G     0  252G   0% /sys/fs/cgroup
/dev/sda7       9.8G  4.8G  4.5G  52% /var
/dev/sda8       9.8G   23M  9.2G   1% /tmp
/dev/sda10      129G   11G  113G   9% /home
/dev/sda1       477M  101M  347M  23% /boot
tmpfs            51G   40K   51G   1% /run/user/1000

For now, forcing it to /usr/local/nvidia-docker and making subfolders for nvidia_driver/367.35 seems to have fixed the driver issue itself. Now I'm running into path issues where it can't find nvidia-smi (or any other programs in the path) inside the container.

root@ubuntu1604:~# nvidia-docker run --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: oci runtime error: exec: "nvidia-smi": executable file not found in $PATH.
root@ubuntu1604:~# nvidia-docker run --rm hello-world

Hello from Docker!
To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker Hub account:
 https://hub.docker.com

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

root@ubuntu1604:~# nvidia-docker run --rm nvidia/cuda "echo test"
docker: Error response from daemon: oci runtime error: exec: "echo test": executable file not found in $PATH.

I suspect this will similarly be an issue with the vendor config, but I will have to pursue that next week.
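
One note on the last command: the "echo test" failure looks unrelated to the driver volume. Quoting "echo test" hands Docker a single word, so it searches $PATH for an executable literally named `echo test`. The sketch below only illustrates the shell's word splitting; `count_args` is a hypothetical helper introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: print how many separate arguments were received.
count_args() {
    echo "$#"
}

count_args "echo test"   # prints 1: a single argument containing a space
count_args echo test     # prints 2: a command name plus one argument
```

Dropping the quotes (`nvidia-docker run --rm nvidia/cuda echo test`) should give Docker a command name it can look up, so that form is the better test of whether ordinary programs are reachable in the container.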

@flx42
Member

flx42 commented Aug 26, 2016

> making subfolders for nvidia_driver/367.35

Did you mkdir this directory? This is a bad idea: Docker will believe the volume already exists, but it's actually empty. This folder should contain nvidia-smi, which is probably why it can't be found.
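
A quick way to confirm that diagnosis might look like the sketch below. The `volume_populated` helper is hypothetical, and the `bin/nvidia-smi` location inside the volume is an assumption based on the `bin/` path shown in the plugin log earlier in the thread.

```shell
#!/bin/sh
# Hypothetical check: does the volume directory contain an executable nvidia-smi?
# Assumes the driver binaries sit under bin/, as the plugin log suggests.
volume_populated() {
    [ -x "$1/bin/nvidia-smi" ]
}

# Directory created by hand in the comment above.
vol=/usr/local/nvidia-docker/nvidia_driver/367.35

if volume_populated "$vol"; then
    echo "volume looks populated: $vol"
else
    echo "volume is empty or incomplete: $vol"
fi
```

If the directory turns out to be empty, removing it along with the stale volume (for example `docker volume rm nvidia_driver_367.35`, using the volume name from the error message) should let the plugin recreate it. That cleanup is a suggestion, not a step confirmed in this thread.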

@Wildcarde
Author

I had; the system was actually reporting a permission-denied error when trying to create them itself.

@flx42
Member

flx42 commented Aug 26, 2016

nvidia-docker-plugin is started as user nvidia-docker, so you need to grant the proper permissions to this user before starting the daemon.
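
As a rough sketch of that check, the snippet below tests whether the custom volume directory is owned by the nvidia-docker user and prints a suggested fix if not. The `owned_by_uid` helper is hypothetical, the directory is the one mentioned earlier in the thread, and handing ownership over with `chown` is only one assumed way to grant the permissions (group or ACL permissions would also work). It assumes GNU coreutils `stat`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is owned by numeric user ID $2.
# Returns 2 if the directory cannot be inspected.
owned_by_uid() {
    owner=$(stat -c %u "$1" 2>/dev/null) || return 2
    [ "$owner" = "$2" ]
}

# Custom volume location used earlier in this thread.
dir=/usr/local/nvidia-docker

if uid=$(id -u nvidia-docker 2>/dev/null); then
    if owned_by_uid "$dir" "$uid"; then
        echo "$dir is owned by nvidia-docker"
    else
        echo "$dir is missing or not owned by nvidia-docker"
        echo "one option: sudo chown -R nvidia-docker $dir"
    fi
else
    echo "user nvidia-docker does not exist on this machine"
fi
```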

@Wildcarde
Author

And success! Thanks for the help!
