This repository has been archived by the owner on Apr 26, 2022. It is now read-only.

GPU branch #27

Open
jakirkham opened this issue Mar 2, 2017 · 7 comments

@jakirkham
Member

jakirkham commented Mar 2, 2017

While we may not be able to handle building Caffe with CUDA and cuDNN support ourselves, I would like us to coordinate a branch that one can build offline for one's own use with these features. Am opening this issue so we can discuss this.

cc @sdvillal

@jakirkham
Member Author

Related to this I had been thinking about whether we should have recipes for CUDA and cuDNN even if we don't build or upload packages for them. That way, one could more easily track and maintain these dependencies locally.

@sdvillal

sdvillal commented Mar 2, 2017

Although they need more testing and there are still some rough edges, we are quite happy now with our conda-forge-friendly CUDA packages of caffe and neon. Our loopbio channel is layered on top of conda-forge to provide us with builds that would be hard to integrate into conda-forge itself, be it because they depend on CUDA or because they require very specific features whose inclusion in the conda-forge channel would be hard to motivate. Since our packages' feature sets are usually supersets of the same packages in conda-forge, they play nicely with the conda-forge ecosystem.

So the first thing that is needed, at least for Linux (we do not care about anything else), is an nvidia-docker-powered docker image. Ours is just an update to this PR plus a small modification to make everything work well. It is not a big deal and we would be happy to contribute it back if you think it would be useful.
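
Such an image could be sketched roughly like this; this is only an illustration, not the actual loopbio image, and the base image and toolkit package names are assumptions (setting up NVIDIA's package repository is omitted):

```dockerfile
# Hypothetical sketch: a conda-forge-style build image usable with
# nvidia-docker. The host GPU driver is mounted at run time by
# nvidia-docker; only the CUDA toolkit lives inside the image.
FROM condaforge/linux-anvil

# Assumes NVIDIA's yum repository has been configured beforehand.
RUN yum install -y cuda-toolkit-8-0 && yum clean all

ENV PATH=/usr/local/cuda/bin:$PATH \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```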

Then, for each relevant package we maintain two branches, CPU and GPU. In the GPU branch we keep these lines in conda-forge.yml; they instruct conda-smithy to use the nvidia images. These recipes build nicely as long as the machine running them has nvidia-docker installed. Of course, this means that they fail in the public CIs. The wishful-thinking ideal is that conda-forge would be provided with a GPU-powered CI server; the GPU does not need to be anything fancy. Some big libraries, like tensorflow or pytorch, run their GPU tests on custom machines with jenkins.
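
For illustration, such a conda-forge.yml override might look like the following; the image name is a placeholder, not the actual loopbio one:

```yaml
# Hypothetical conda-forge.yml snippet: point conda-smithy at a
# CUDA-enabled build image instead of the default linux-anvil.
# The feedstock's build scripts must then be run via nvidia-docker.
docker:
  image: loopbio/linux-anvil-cuda:8.0
```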

An alternative would be to disable all GPU tests, not use nvidia-docker, and let the build pass as long as it compiles. I think that would not be a good way to go, but it would definitely enable building these packages in the conda-forge CIs.

We also make these packages require the environment to provide a "cuda" feature. If that feature is not present in the environment, the CPU version gets installed. You can decide what it means for a feature to be present. In our case, we just have an empty package that tracks the feature, and we take care ourselves of having whatever else is needed in good shape.
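
A rough sketch of such an empty tracking package, assuming conda-build's features/track_features mechanism (names are illustrative):

```yaml
# Hypothetical meta.yaml: an empty package whose only job is to
# advertise the "cuda" feature to the solver.
package:
  name: cuda-feature
  version: "8.0"

build:
  track_features:
    - cuda
```

Installing this package makes the feature present in the environment, which steers the solver toward builds that require it.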

Of course, you can instead decide to provide all the runtime dependencies, and that is probably what most users would appreciate. But it could be unwanted by some of us, so I think there should in any case always be a way of installing cuda-dependent packages without installing anything cuda-related in the environment.

So providing runtime dependencies is a tricky, hairy issue. In order of difficulty, we would need to provide the CUDA runtime (as needed by a CUDA opencv build), the cuDNN runtime (as needed by a cuDNN-accelerated caffe) and CUDA itself (as required by pycuda). The trickiness comes, as you know very well, from unclear or click-through licenses and other technical difficulties (e.g. providing matching compilers). If licensing gets cleared up, I think distributing the CUDA and cuDNN runtimes is perfectly possible; the pytorch channel is an example, as they distribute all three of MKL, CUDA and cuDNN. Their recipes for doing so, if I understand them correctly, are fairly simple.

In any case, whether anything nvidia-related can make it into conda-forge or not, I agree with you: just providing standard means to build these kinds of packages is both desirable and easy.

@sdvillal

sdvillal commented Mar 8, 2017

I have revamped our cuda packages. In particular, as a proof of concept, I'm now building three packages for the GPU branch of our caffe package, to get some convenience while coping with some shortcomings of conda's features mechanism.

This is an example environment:

```yaml
name: caffe

channels:
  - loopbio
  - conda-forge
  - defaults

dependencies:

  # Need to install numpy explicitly atm
  - numpy

  # This declares "we want gpu support".
  - cuda-feature=8.0
  - cudnn-feature=5.1

  # This installs caffe with GPU support
  - caffe-cuda

  # This would install caffe without GPU support
  # - caffe
```

```shell
conda env create -n caffe -f caffe-environment.yaml
source activate caffe
python -c "import caffe; caffe.set_mode_gpu(); caffe.set_device(0)"
```

I'm making the GPU version of any package that has a CPU-only counterpart depend on a per-package cuda feature (in particular here, caffe-cuda asks for the "caffe_cuda" feature). This allows installing the CPU and GPU versions of different packages together (for example, caffe with CUDA and CPU-only neon) in the same environment.
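
The consumer side of that wiring could be sketched like so; this is a hypothetical meta.yaml fragment, with the feature name taken from the comment above:

```yaml
# Hypothetical: caffe-cuda requires a package-specific feature rather
# than a global "cuda" one, so a CPU-only build of another library can
# coexist with a GPU caffe in the same environment.
build:
  features:
    - caffe_cuda
```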

@sdvillal

@msarahan I have noticed that there is now a package for cudnn (plus the CUDA runtime, although only 7.5 at the moment) in the main anaconda channel. Are these going to be actively supported now? Are they amenable to being added to conda-forge? If so, I can give a hand and then try to get our caffe (and other DL library) recipes into CF.

@jjhelmus
Contributor

@sdvillal GPU-accelerated versions of tensorflow, caffe and keras are available in the defaults channel for Linux. These, as well as some additional GPU-accelerated packages (PyTorch, caffe2?, ...), will be added and updated as time permits. They depend on the cudatoolkit and cudnn packages, which we are currently pinning to CUDA 7.5. There is no timeline for an update to 8.0, but it should be possible.

As time permits, I will try to move some of these recipes to conda-forge and would welcome any help with this or with recipes for other DL libraries. I would give a word of caution to anyone looking to package up the CUDA runtime and cuDNN: neither is open source, and they have proprietary licenses which must be followed unless other arrangements are made.

@msarahan
Member

Thanks @jjhelmus. To clarify further, we have obtained explicit permission from NVIDIA to redistribute CUDNN.

We intend to support CUDNN going forward, and will continue supporting the cuda runtimes in the cudatoolkit package. We don't yet have permission to ship the developer tools (nvcc).

The hard part about supporting CUDA 8.0 is that we don't have good ways of building the same package with different dependency versions. That is a major aim of conda-build 3, but it will also take some work on the conda side to allow specification of arbitrary metadata (and then some agreed-upon community standards for what the metadata keys need to be and what their values are expected to be).

Conda-build 3 is in beta right now. You can get it from the conda-canary channel if you'd like.
The new changes in Conda came in with conda/conda#4158 - there is probably going to be an alpha conda 4.4 release. We don't have the exact metadata storage or specification mechanism in place yet, so I don't think it's quite ready to play with. It's close, though.
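
The "same package, different dependency versions" part can be sketched with conda-build 3's variant configuration; a minimal illustration, assuming the recipe's requirements reference the `cudatoolkit` variant key:

```yaml
# Hypothetical conda_build_config.yaml: conda-build 3 renders and
# builds the recipe once per listed value of the variant key, so one
# recipe yields a CUDA 7.5 build and a CUDA 8.0 build.
cudatoolkit:
  - "7.5"
  - "8.0"
```

In the recipe itself, a requirement line such as `- cudatoolkit {{ cudatoolkit }}` would then pick up each variant value in turn.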

@jakirkham
Member Author

This should be possible today if it is of interest to people. Please see the CUDA docs on how to do this.
