This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

building images with nvidia-docker #595

Closed
cobie8a opened this issue Jan 5, 2018 · 15 comments

@cobie8a

cobie8a commented Jan 5, 2018

1. Building images with nvidia-docker

2. In the past, I was able to just call nvidia-docker build. With version 2 requiring the '--runtime' flag, docker build does not recognize it (although it works just fine as 'docker run --runtime'). I have not seen anything in the documentation about building with nvidia-docker version 2. Please advise.

@cobie8a
Author

cobie8a commented Jan 5, 2018

I've followed the instructions [https://github.com/nvidia/nvidia-container-runtime#docker-engine-setup] to register the runtime but still cannot set the default runtime to 'nvidia'. If I stop docker.service and run 'sudo dockerd --default-runtime=nvidia &', the default runtime is set to 'nvidia', but when I try to restart the service, it fails.

Please help!

@flx42
Member

flx42 commented Jan 5, 2018

Do you need GPU support during docker build? If not, you can just use docker build.

With version 1.0, nvidia-docker build was not doing anything special.

@cobie8a
Author

cobie8a commented Jan 6, 2018

Sounds good, I'll give it a try. Primarily, I thought nvidia-docker provided GPU passthrough support for building Caffe with GPU, which I do see in the build logs.

@flx42
Member

flx42 commented Jan 6, 2018

You don't need to have a GPU machine to build a GPU project. The compiler (nvcc) doesn't need to run GPU code; it only needs to know which GPU families you will target.
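For illustration (a sketch, not taken from this thread; the source file name is hypothetical), the target GPU families are passed to nvcc as compile flags, so no GPU needs to be present at build time:

nvcc -gencode arch=compute_61,code=sm_61 \
     -gencode arch=compute_70,code=sm_70 \
     -c kernels.cu -o kernels.o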

@xkszltl

xkszltl commented Mar 19, 2018

Actually I think it's really important to have runtime support for docker build.

The reason is testing:
if we want to run unit tests after compiling GPU-related tools, we'll have to get GPU access somehow.

@RuRo

RuRo commented Feb 27, 2019

This is really quite important. Many tools require the presence of hardware to be configured correctly.
Please either fix this or provide a workaround for building with tools that refuse to compile without libcuda, etc.

@RenaudWasTaken
Contributor

Set the default runtime to NVIDIA

@RuRo

RuRo commented Feb 27, 2019

Set the default runtime to NVIDIA

I don't have access to /etc/docker/daemon.json on the system. I am assuming there is no 'per-user' default for this, since it's a daemon setting. Am I missing something?

@icolwell-as

icolwell-as commented Aug 8, 2019

I ran into this same issue trying to compile something that uses tensorflow in a xenial-based image. tensorflow was complaining:

ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

I was able to get my docker builds to work by setting the default runtime as @RenaudWasTaken suggested. I didn't really know how to do this until I googled around and figured it out. Perhaps this may help others:

  1. Edit/create /etc/docker/daemon.json with the content below:
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
  2. Install the nvidia-container-runtime package. I had followed the instructions here, but it seems nvidia-container-runtime isn't installed by default.
sudo apt-get install nvidia-container-runtime
  3. sudo systemctl restart docker.service
  4. Try your docker build again.

Related Links:
https://github.com/nvidia/nvidia-container-runtime#docker-engine-setup
https://docs.nvidia.com/dgx/nvidia-container-runtime-upgrade/index.html#using-nv-container-runtime
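As a quick sanity check (a minimal sketch; exact output varies by Docker version), you can confirm the default runtime before retrying the build:

docker info | grep -i runtime
# expected to include something like:
# Runtimes: nvidia runc
# Default Runtime: nvidia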

@RenaudWasTaken
Contributor

Another solution, if your docker build is just doing compilation, is to use the stubs in /usr/local/cuda/lib64/stubs/.

@dancingpipi

(quoting @icolwell-as's default-runtime instructions above)

mark!

@RenaudWasTaken
Contributor

RenaudWasTaken commented Oct 21, 2019

@z13974509906 the recommended path is to build CUDA code during docker build time and run CUDA code during docker run time :)

You wouldn't need libcuda.so in that case and can use the stubs at build time.

@kevindoran

To build using the stubs, you need to make the stubs path known to the linker. One option is to add the path to the LIBRARY_PATH environment variable. (LD_LIBRARY_PATH is for runtime linking, whereas LIBRARY_PATH is used for compile-time linking.) Example:

ENV LIBRARY_PATH $LIBRARY_PATH:/usr/local/cuda/lib64/stubs
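Putting it together, a minimal Dockerfile sketch (assuming a CUDA devel base image and a hypothetical make-based build; the real libcuda.so.1 is injected by the NVIDIA runtime at docker run time, not baked into the image):

FROM nvidia/cuda:10.0-devel-ubuntu16.04
# Expose the driver stub to the compile-time linker only.
ENV LIBRARY_PATH $LIBRARY_PATH:/usr/local/cuda/lib64/stubs
COPY . /src
WORKDIR /src
# Hypothetical build step that links against -lcuda through the stub.
RUN make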

@wuyuanyi135

(quoting @icolwell-as's default-runtime instructions above)

This solved my problem where torch.cuda.is_available() returns False.

@xkszltl

xkszltl commented Nov 16, 2020

@icolwell-as

  3. sudo systemctl restart docker.service
  4. Try your docker build again.

You don't need to restart the daemon; sudo killall -s HUP dockerd is usually enough.
Despite how it sounds, it won't kill anything:
it sends SIGHUP to dockerd, and the signal handler reloads the config JSON.
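For example (an equivalent sketch, if killall isn't available on the host):

sudo kill -HUP "$(pidof dockerd)"   # dockerd re-reads /etc/docker/daemon.json on SIGHUP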
