Shared col buffer between Convolution Layers #520
Conversation
Conflicts: include/caffe/vision_layers.hpp src/caffe/layers/conv_layer.cpp src/caffe/net.cpp
Conflicts: src/caffe/layers/conv_layer.cpp
Although the memory reduction for 227x227 inputs is small, the savings are substantial for bigger inputs.
```diff
@@ -102,6 +102,8 @@ class ConvolutionLayer : public Layer<Dtype> {
   }
   virtual inline int ExactNumBottomBlobs() const { return 1; }
   virtual inline int ExactNumTopBlobs() const { return 1; }
+  virtual Blob<Dtype>* col_buffer() { return &col_buffer_; }
+  virtual bool share_col_buffer() { return share_col_buffer_; }
```
`share_col_buffer_` looks too similar to `shared_col_buffer_`, and it sounds like a verb rather than a noun. Wouldn't `is_col_buffer_shared_` make it easier for users to quickly see that it is a boolean variable?
@jeffdonahue although in the future we may want to handle
Can you give some details about what kind of memory savings you've seen, and with what kinds of architectures? I still think it's pretty messy to have the net aware of specific layer-type implementations, and it doesn't really seem like it would be much more work to avoid that messiness. Couldn't we just add an appropriate method to the base `Layer` interface?
+1 for doing this nicely now and then not having to look at it again. (Sent by email on Wed, Jul 2, 2014, in reply to Jeff Donahue's comment.)
For the usual input size of 227x227 the savings are small, around 50MB, but for bigger input sizes they are large: for example, for images of 1723x1723 the savings are 665MB.

I like the idea of having a more general way to handle temporary buffers. One question is which layers should share them.

Right now I don't have the time to do that change, but feel free to contribute.
(Force-pushed from 4278286 to c01f07a.)
Although trimming the memory use of the Caffe convolution and having a general mechanism for temporaries would be helpful, the cuDNN convolution reduces memory usage too. Note, however, that the Caffe matrix-multiplication approach can be faster in certain cases when everything fits in memory, so a shared buffer is still valuable.
In a post-#594 world, I do believe sharing the column buffer is much simpler:

```diff
diff --git a/include/caffe/vision_layers.hpp b/include/caffe/vision_layers.hpp
index f94ac07..8b4373c 100644
--- a/include/caffe/vision_layers.hpp
+++ b/include/caffe/vision_layers.hpp
@@ -61,7 +61,7 @@ class ConvolutionLayer : public Layer<Dtype> {
   bool bias_term_;
   // For the Caffe matrix multiplication convolution.
   int M_, K_, N_;
-  Blob<Dtype> col_buffer_;
+  static Blob<Dtype> col_buffer_;
   Blob<Dtype> bias_multiplier_;
 };

diff --git a/src/caffe/layers/conv_layer.cpp b/src/caffe/layers/conv_layer.cpp
index f4f2b6b..a6649ed 100644
--- a/src/caffe/layers/conv_layer.cpp
+++ b/src/caffe/layers/conv_layer.cpp
@@ -9,6 +9,9 @@
 namespace caffe {

 template <typename Dtype>
+Blob<Dtype> ConvolutionLayer<Dtype>::col_buffer_;
+
+template <typename Dtype>
 void ConvolutionLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
     const vector<Blob<Dtype>*>& top) {
   ConvolutionParameter conv_param = this->layer_param_.convolution_param();
```

No worries about computing the max size or coordinating through the `Net`.
Solved by #1291
This PR replaces #517.

Inspired by conversations with @forresti about using a shared col buffer between different ConvolutionLayers, this PR creates a `shared_col_buffer` in `Net` to be shared by the `ConvolutionLayer`s. The size of the `shared_col_buffer_` will be the maximum of the sizes of the shared `col_buffer`s.

By default, any ConvolutionLayer now uses the shared col buffer, but it can keep its own col buffer by setting the corresponding `ConvolutionParam.share_col_buffer` to `false` in the corresponding `prototxt`.

This PR further reduces the memory consumption of Caffe.
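A hypothetical prototxt fragment showing how the opt-out described above might look. The `share_col_buffer` field follows this PR's proposed `ConvolutionParameter` addition; the surrounding layer syntax is only illustrative and the exact field names may differ from what was merged upstream.

```protobuf
# Illustrative only: opt one convolution layer out of the shared col buffer.
layer {
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    share_col_buffer: false  # this layer keeps its own private col_buffer_
  }
}
```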