
Try to extract Convolution code from cuda-convnet2 #830

Closed
sguada opened this issue Jul 30, 2014 · 7 comments

Comments

@sguada (Contributor) commented Jul 30, 2014

According to some benchmarks, Caffe had the fastest convolution implementation until the recent release of cuda-convnet2:
https://github.com/soumith/convnet-benchmarks

So maybe someone would like to look into that code and try to extract the convolution routines for use in Caffe.
https://code.google.com/p/cuda-convnet2/

@bhack (Contributor) commented Jul 30, 2014

It could be interesting to see this benchmark with #544, but that approach is still CPU-only.

@kloudkl (Contributor) commented Aug 6, 2014

cuda-convnet2 has three major new features relative to cuda-convnet:

  1. Improved training times on Kepler-generation Nvidia GPUs (Geforce Titan, K20, K40).
  2. Multi-GPU training support implementing data parallelism, model parallelism, and the hybrid approach described in One weird trick for parallelizing convolutional neural networks [1].
  3. Less-polished code and incomplete (but improving) documentation.

[1] Alex Krizhevsky. One weird trick for parallelizing convolutional neural networks. arXiv:1404.5997 [cs.NE]

@kloudkl (Contributor) commented Aug 6, 2014

The benchmark page says "Caffe is fastest forward+backward". What is the "banded approach for im2col" mentioned in the comments there?
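For readers unfamiliar with the term: Caffe lowers convolution to matrix multiplication by unrolling each input window into a column with im2col, then calling a single GEMM. The thread doesn't explain the "banded" variant, but plain im2col can be sketched as follows (a single-channel, stride-1, no-padding NumPy illustration; the function name and layout are illustrative, not Caffe's actual C++/CUDA code):

```python
import numpy as np

def im2col(image, k):
    """Unroll every k x k patch of a single-channel image into a
    column, so that convolution becomes one matrix multiply.
    Illustrative sketch only: Caffe's real im2col also handles
    multiple channels, stride, and padding."""
    h, w = image.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = image[i:i + k, j:j + k].ravel()
    return cols

# Convolution as GEMM: flattened filter (1 x k*k) times the columns.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))  # box filter, chosen for easy inspection
out = (kernel.ravel() @ im2col(image, 3)).reshape(2, 2)
print(out)  # each entry is the sum of one 3x3 window
```

The trade-off this makes is memory for speed: the unrolled matrix duplicates overlapping pixels, which is presumably what a "banded" layout would try to reduce.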

@kloudkl (Contributor) commented Aug 6, 2014

There has been a lot of interest in running Caffe on multiple GPUs (#194, #301, #423, #519, #547, #630, #653). Alex is keeping ahead of Caffe by implementing data parallelism, model parallelism, and the hybrid approach. Can something be done to catch up?
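As background on the first of those schemes: synchronous data parallelism splits each minibatch across workers (one per GPU), each worker computes gradients on its shard, and the averaged gradient drives one shared weight update. A toy single-process sketch of that idea (plain NumPy, a made-up least-squares model; nothing here is Caffe's or cuda-convnet2's actual code):

```python
import numpy as np

def shard_gradient(w, x, y):
    # Gradient of the per-sample least-squares loss
    # 0.5 * ||x @ w - y||^2 / n on one worker's shard.
    return x.T @ (x @ w - y) / len(x)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))          # one minibatch of 8 samples
w_true = np.array([1.0, -2.0, 0.5])  # target weights to recover
y = x @ w_true

w = np.zeros(3)
for _ in range(2000):
    # Each "GPU" sees half the batch; the gradients are averaged
    # before a single shared update, giving the same step a lone
    # worker would take on the full minibatch.
    grads = [shard_gradient(w, x[:4], y[:4]),
             shard_gradient(w, x[4:], y[4:])]
    w -= 0.1 * np.mean(grads, axis=0)
```

Model parallelism instead splits the layers (or filters) of one network across GPUs, and the hybrid approach of [1] uses data parallelism for the convolutional layers and model parallelism for the fully connected ones.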

@rodrigob (Contributor) commented Aug 6, 2014

The memory consumption aspect #852 should also be considered.

@shelhamer (Member) commented

For parallelism, please discuss at #876. It's certainly planned, but given the freedom in this direction, don't hesitate to attempt parallelism after your own fashion for comparison.

@shelhamer (Member) commented

Closing -- cuDNN and its Caffe layer integration supersede the custom cuda-convnet2 kernels.
