This repository has been archived by the owner on Jan 22, 2024. It is now read-only.
Hello everyone
1. Issue or feature description
I'm working on using OpenStack Magnum and Kubernetes to get GPU-aware Docker deployments. I adapted a Fedora Atomic image (which provides all of the requirements for OpenStack Magnum) and installed the requirements for GPU use (nvidia-driver, cuda/cudnn, nvidia-docker2). The system upgrades, including nvidia-docker2, were installed via the rpm-ostree package manager; nvidia-driver and cuda were not (but for cuda/cudnn it's just file copying, if I'm right). Fedora Atomic doesn't support dkms or akmod, so I installed nvidia-driver from the runfile last, avoiding kernel upgrades that would break the install. nvidia-smi and the cuda samples work. I assumed that the lack of dkms support only matters for kernel updates, but maybe it affects nvidia-docker2 as well?
2. Steps to reproduce the issue
docker run --rm nvidia/cuda nvidia-smi
Output:
container_linux.go:247: starting container process caused "process_linux.go:362: container init caused "rootfs_linux.go:54: mounting \"cgroup\" to rootfs \"/var/lib/docker/overlay2/72f8a07a4857ee246cd4b69b0a1c110253367c1eb72f616c44654335faabffe9/merged\" at \"/sys/fs/cgroup\" caused \"no subsystem for mount\"""
/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:362: container init caused "rootfs_linux.go:54: mounting \"cgroup\" to rootfs \"/var/lib/docker/overlay2/72f8a07a4857ee246cd4b69b0a1c110253367c1eb72f616c44654335faabffe9/merged\" at \"/sys/fs/cgroup\" caused \"no subsystem for mount\""".
Or
mkdir /mycontainer
cd /mycontainer
mkdir rootfs
docker export $(docker create busybox) | tar -C rootfs -xvf -
nvidia-container-runtime spec
nvidia-container-runtime run 1
Output:
container_linux.go:247: starting container process caused "process_linux.go:362: container init caused "rootfs_linux.go:54: mounting \"cgroup\" to rootfs \"/mycontainer/rootfs\" at \"/sys/fs/cgroup\" caused \"no subsystem for mount\"""
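Both failures are reported while the runtime is mounting "cgroup" at /sys/fs/cgroup, so it may be worth looking at how the host's cgroup mounts appear in /proc/self/mountinfo. The following is only a diagnostic sketch, assuming a Linux host with cgroup v1 mounts; `list_cgroup_mounts` is a hypothetical helper name and is not part of nvidia-docker2 or nvidia-container-runtime. It prints each cgroup mount point together with its super options, which normally name the controllers attached to that mount.

```shell
#!/bin/sh
# Hypothetical helper: print "<mount point> <super options>" for every
# mount of type "cgroup" found in a mountinfo-formatted file.
# Assumption: the file uses the /proc/[pid]/mountinfo layout, i.e.
#   id parent major:minor root mountpoint options [optional...] - fstype source superopts
list_cgroup_mounts() {
    mountinfo="${1:-/proc/self/mountinfo}"
    awk '
        {
            # Optional fields (if any) start at field 7; find the "-" separator.
            sep = 0
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { sep = i; break }
            }
            # Skip malformed lines and anything that is not a cgroup (v1) mount.
            if (sep == 0 || $(sep + 1) != "cgroup") next
            # Field 5 is the mount point; sep + 3 holds the super options.
            print $5, $(sep + 3)
        }
    ' "$mountinfo"
}
```

Calling `list_cgroup_mounts` with no argument reads /proc/self/mountinfo on the host. A cgroup mount whose options name no controller would be a candidate to compare against the error message, although that link is an assumption rather than something confirmed here.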
3. Information
more /sys/fs/cgroup/devices/devices.list
Output:
a *:* rwm
Disabling SELinux doesn't improve anything.
uname -a
Linux fedora.novalocal 4.15.4-200.fc26.x86_64 #1 SMP Mon Feb 19 19:43:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
dmesg
[ 5875.711206] docker0: port 1(veth598691a) entered blocking state
[ 5875.713356] docker0: port 1(veth598691a) entered disabled state
[ 5875.720088] device veth598691a entered promiscuous mode
[ 5875.727272] IPv6: ADDRCONF(NETDEV_UP): veth598691a: link is not ready
[ 5875.730830] IPv6: ADDRCONF(NETDEV_UP): veth6d3b5d8: link is not ready
[ 5875.734218] IPv6: ADDRCONF(NETDEV_UP): veth6d3b5d8: link is not ready
[ 5875.736882] IPv6: ADDRCONF(NETDEV_CHANGE): veth6d3b5d8: link becomes ready
[ 5875.739504] IPv6: ADDRCONF(NETDEV_CHANGE): veth598691a: link becomes ready
[ 5875.742263] docker0: port 1(veth598691a) entered blocking state
[ 5875.744508] docker0: port 1(veth598691a) entered forwarding state
[ 5876.210364] docker0: port 1(veth598691a) entered disabled state
[ 5876.215538] device veth598691a left promiscuous mode
[ 5876.218455] docker0: port 1(veth598691a) entered disabled state
nvidia-smi -a
nvidia-smi.txt
docker version
docker-info.txt
dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
libnvidia-container-tools-1.0.0-0.1.alpha.3.x86_64
libnvidia-container1-1.0.0-0.1.alpha.3.x86_64
nvidia-container-runtime-1.1.1-1.docker1.13.1.x86_64
nvidia-docker2-2.0.2-1.docker1.13.1.noarch
nvidia-container-cli -V
version: 1.0.0
build date: 2018-01-11T00:23+0000
build revision: 4a618459e8ba522d834bb2b4c665847fae8ce0ad
build compiler: gcc 4.8.5 20150623 (Red Hat 4.8.5-16)
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections