This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Newby question to CUDA container and ssh #36

Closed
spalkovits opened this issue Jan 18, 2016 · 6 comments

@spalkovits

Hello,
I have a machine with a proper CUDA and Docker installation. When I start an interactive container and run, for example, nvidia-smi -l, everything looks fine. However, when I add an SSH server so that other users can later use CUDA as well (without knowing about Docker), nvidia-smi fails in the same container, although the binary is there.
I read about the nvidia-docker-plugin, but I think I need step-by-step instructions on how to use it.
Regards,
Stefan

@3XX0
Member

3XX0 commented Jan 18, 2016

I'm not sure I understood your problem correctly.
Where is sshd living? On the host or in the container? Are you using NV_HOST?
Can you give us the list of commands you issued with their respective output? It would help us reproduce the error.

3XX0 added the question label Jan 18, 2016
@spalkovits
Author

Hello,
I did the following:

Prerequisites:

  • Docker is installed properly on my Ubuntu 14.04 machine; the "Hello World" container works as expected.
  • The NVIDIA driver on the host machine is working properly. I installed it following the instructions at http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html and everything works fine.
  • nvidia-docker is installed properly following the instructions at https://github.com/NVIDIA/nvidia-docker. When I build and run the "nvidia-smi" example, I get the expected output.
  • The nvidia-docker-plugin is installed and working. When I run "sudo nvidia-docker-plugin -l :3476" and then "curl localhost:3476/v1.0/gpu/info", I get the expected output.

Finally my problem:

My Dockerfile looks like this:

FROM cuda
RUN apt-get update && apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN echo 'root:screencast' | chpasswd
RUN sed -i 's/PermitRootLogin without-password/PermitRootLogin yes/' /etc/ssh/sshd_config

# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd

ENV NOTVISIBLE "in users profile"
RUN echo "export VISIBLE=now" >> /etc/profile

EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]

I changed the password, but that should not be an issue. Then I build the image with Docker: "docker build -t image_name_goes_here ."

When I start the container interactively with "nvidia-docker run -it --name name_goes_here -p 10022:22 image_name_goes_here /bin/bash", I can run "nvidia-smi -q" and get the expected output.

BUT when I ssh into the same running container, even "which nvidia-smi" fails, although the binary is in the right place.

Any ideas what I missed to get the desired behavior? I want the SSH-container solution because I do not want every user to work on the host machine, though I know it does not completely fit the Docker philosophy.

Regards,

Stefan

@3XX0
Member

3XX0 commented Jan 19, 2016

Your issue comes from the fact that the CUDA environment is not passed to the SSH session.
You need to export it in your /etc/profile as shown in your example.
The following should do the trick:

RUN echo "export PATH=$PATH" >> /etc/profile && \
    echo "ldconfig" >> /etc/profile
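
To see why this works, here is a minimal sketch that can be run anywhere, without a GPU. It assumes that the SSH session starts from a fresh environment rather than the one the container was launched with, and it uses a temporary file as a stand-in for /etc/profile. The directories in demo_path are assumed example locations for the CUDA and driver binaries, not values read from the image, and the ldconfig step is not modeled.

```shell
#!/bin/sh
# Hypothetical PATH as the container process would see it (assumed paths).
demo_path="/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/bin:/bin"

# Temporary stand-in for /etc/profile so nothing on the system is modified.
profile=$(mktemp)

# 1. A process started with the container environment,
#    comparable to "nvidia-docker run ... /bin/bash".
inherited=$(PATH="$demo_path" /bin/sh -c 'echo "$PATH"')

# 2. A process started from a wiped environment with a default PATH,
#    roughly what an SSH login session gets.
fresh=$(env -i PATH="/usr/bin:/bin" /bin/sh -c 'echo "$PATH"')

# 3. The same wiped environment, but the shell sources a profile
#    that has the full PATH written into it.
echo "export PATH=$demo_path" >> "$profile"
restored=$(env -i PATH="/usr/bin:/bin" /bin/sh -c '. "$1"; echo "$PATH"' sh "$profile")

echo "inherited: $inherited"
echo "fresh:     $fresh"
echo "restored:  $restored"

rm -f "$profile"
```

In the RUN line above, $PATH is expanded while the image is built, so the full PATH is written into /etc/profile as a literal string and every later login shell picks it up.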

@spalkovits
Author

Indeed that solved it. Thank you very much.

May I add another two questions then:

I hope my questions are not too abstract.

Regards,

Stefan

@3XX0
Member

3XX0 commented Jan 19, 2016

  1. The documentation of nvidia-docker and nvidia-docker-plugin explains it. The plugin is needed if you want to deploy NVIDIA Docker on a remote host (say AWS) or if you don't want to set up your volumes manually.
  2. You can; however, your GPU processes will have to share the GPU. You can use NVIDIA MPS for that purpose.

@spalkovits
Author

Thanks a lot. I think I can go on with your information.

Regards,

Stefan
