This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

cmake fails unable to find cuda library while building an image #1033

Closed
dittothat opened this issue Jul 31, 2019 · 7 comments

Comments


dittothat commented Jul 31, 2019

1. Issue or feature description

I have created a Dockerfile to containerize some medical image processing code. With nvidia-docker2 I was able to use the file to build an image without issue. When I attempt to build that image on a different machine with the latest Docker (19.03) and the latest nvidia-docker, it fails on step 8/8 when cmake cannot find CUDA_CUDA_LIBRARY. When I run the step 7/8 image in a bash shell, I can copy and paste the cmake command (line 86 of the Dockerfile) that failed during the build, and it configures and then compiles fine in the image. My conception of Docker containers and images is being strained by this issue: I don't understand why a RUN command could fail while the same command run in the image works.

Debugging a bit in a container started with "docker run --gpus all -it fetalrecon /bin/bash", cmake finds CUDA_CUDA_LIBRARY at /usr/lib/x86_64-linux-gnu/libcuda.so. But when I hard-code that location in the Dockerfile's cmake command (i.e. I use the commented line 87 in the Dockerfile linked above), cmake gives the error "No rule to make target '/usr/lib/x86_64-linux-gnu/libcuda.so', needed by '../bin/SVRreconstructionGPU'.", which makes me believe that the library doesn't actually exist in the "build image".
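For reference, the hard-coded variant would presumably look something like this (a hypothetical reconstruction of the Dockerfile's line 87; CUDA_CUDA_LIBRARY is the cache variable from CMake's FindCUDA module, set explicitly instead of being auto-detected):

# Hypothetical reconstruction of the commented line 87: pin the driver library
# path instead of letting FindCUDA locate it.
cmake -DCUDA_SDK_ROOT_DIR:PATH=/usr/local/cuda-9.1/samples \
      -DCUDA_CUDA_LIBRARY=/usr/lib/x86_64-linux-gnu/libcuda.so ..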

2. Steps to reproduce the issue

git clone git@github.com:dittothat/dockerfetalrecon.git
cd dockerfetalrecon
docker build -t fetalrecon .

This fails when cmake cannot find the CUDA libraries needed to compile. Next, comment out line 86 and uncomment line 87 in the Dockerfile and rebuild:

docker build -t fetalrecon .

This fails when the library really cannot be found. Now start a container from the image created by step 7/8 of the build:

docker run --gpus all -it fetalrecon /bin/bash

Then, in the container:

cd /usr/src/fetalReconstruction/source/build
cmake -DCUDA_SDK_ROOT_DIR:PATH=/usr/local/cuda-9.1/samples ..
make

Everything compiles just fine (though sometimes I must run make a second time to get past a linking error with niftiio toward the end; I'm still trying to figure out what is going on there).
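A minimal way to observe the discrepancy directly (a sketch; docker build uses the daemon's default runtime, so forcing --runtime=runc approximates the build-time environment):

# With the default runc runtime (what docker build uses), the driver library is absent:
docker run --rm --runtime=runc fetalrecon ls -l /usr/lib/x86_64-linux-gnu/libcuda.so
# With the NVIDIA runtime, the host driver's libcuda.so is injected at container start:
docker run --rm --gpus all fetalrecon ls -l /usr/lib/x86_64-linux-gnu/libcuda.so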

3. Information to attach (optional if deemed irrelevant)


kiendang commented Aug 3, 2019

To use the NVIDIA runtime with docker build, you need to make it the default runtime. Just put "default-runtime": "nvidia" in /etc/docker/daemon.json.

guptaNswati (Contributor) commented Aug 6, 2019

Yes, the library won't be present at build time unless you mount it inside the container. You can either do a docker run --gpus, do the rest of the build inside the container, and then do a docker commit, or use the -v option to mount it manually. Hope this helps. Closing now.
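A rough sketch of the commit-based workaround (container ID and image tag are placeholders):

# Build interactively with GPUs available, then snapshot the result as a new image.
docker run --gpus all -it fetalrecon /bin/bash
#   ...inside the container:
#   cd /usr/src/fetalReconstruction/source/build
#   cmake -DCUDA_SDK_ROOT_DIR:PATH=/usr/local/cuda-9.1/samples .. && make
docker ps -lq                                    # ID of the most recently created container
docker commit <container-id> fetalrecon:built    # placeholder tag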

dittothat (Author) commented Aug 9, 2019

To use the NVIDIA runtime with docker build, you need to make it the default runtime. Just put "default-runtime": "nvidia" in /etc/docker/daemon.json.

Ok, this makes sense. I tried adding the line you suggest to daemon.json, but Docker will not start with the modified config file. With the latest nvidia-docker working alongside Docker 19.03.1, the nvidia runtime doesn't appear to be registered (i.e. dockerd --default-runtime=nvidia returns "specified default runtime 'nvidia' does not exist"). I am cautious about relying on the documentation in the wiki given that it now spans three nvidia-docker versions. Is it necessary, and are there updated instructions, for registering the nvidia runtime with the latest nvidia-docker? I suppose editing daemon.json as described may no longer be the accepted method for configuring the default runtime during docker build.

Yes, the library won't be present at build time unless you mount it inside the container.

Can you give any more details about where to find the appropriate library to mount and compile against? Since the beauty of nvidia-docker is that it is host-driver agnostic to some extent, it seems to me that the CUDA libraries I mount should correspond to the CUDA version in the specific nvidia-docker image I have selected. Perhaps I am wrong.

docker run --gpus and do the rest of the build inside the container and then do a docker commit

This seems to deviate wildly from Docker best practices. I know it will work, but I would love to get docker build and a Dockerfile working properly for my use case. That the CUDA libraries are not mounted during build seems like a problem to me.


kiendang commented Aug 9, 2019

Ok, this makes sense. I tried adding the line you suggest to daemon.json, but Docker will not start with the modified config file. With the latest nvidia-docker working alongside Docker 19.03.1, the nvidia runtime doesn't appear to be registered (i.e. dockerd --default-runtime=nvidia returns "specified default runtime 'nvidia' does not exist"). I am cautious about relying on the documentation in the wiki given that it now spans three nvidia-docker versions. Is it necessary, and are there updated instructions, for registering the nvidia runtime with the latest nvidia-docker? I suppose editing daemon.json as described may no longer be the accepted method for configuring the default runtime during docker build.

You have to install the nvidia-container-runtime package on your host, then put this in /etc/docker/daemon.json:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
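One gotcha, confirmed further down this thread: the daemon has to be restarted before the change takes effect. A quick way to apply and verify (assuming a systemd host):

sudo systemctl restart docker
docker info | grep -i runtime    # should list nvidia and report it as the default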

dittothat (Author) commented

That's excellent, it worked nicely. Thank you very much for your help.


reconlabs-sergio commented Sep 1, 2023

For the life of me, I cannot get this approach to work. Is there something that overrides the default runtime? Is there a way to debug which runtime is getting used?
I'm starting FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel and I need CUDA available to compile a few libraries in the build stage. If I run the container, it's there, but CUDA is not available during the build stage.
Any clues?
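One way to check which runtime the daemon is actually defaulting to (a sketch; docker info exposes this field via its Go-template output):

docker info --format '{{.DefaultRuntime}}'    # prints e.g. "runc" or "nvidia"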


reconlabs-sergio commented Sep 1, 2023

I didn't know I should restart the daemon for the changes to take effect.

After modifying daemon.json, sudo systemctl restart docker did the trick.
