Deconv #269

Conversation
          expandInData += subK * subN;
        }
      }
    }
- Actually, the forward pass of a deconv layer is the backward pass of a convolution layer, and its backward pass is the convolution layer's forward pass. So it's better to reuse the implementation of the convolution layer; you can refer to how Caffe implements this.
- The name DeConvLayer may be better.
Hi @qingqing01 Thanks a lot for your comments!
- I did reuse most of the code from the convolution layer and swapped the forward and backward passes. However, there are still some subtle differences that keep me from directly using the original convolution-layer code. For example, the multiple inputs are on the convolved side rather than the image side, and the biases are added on different sides too (you can compare ExpandConvLayer.cpp and ExpandConvTransLayer.cpp for details). So if we want to use the same code for both conv and deconv, we would need to refactor the conv layer too.
- There was a discussion that "convolution transpose" might be more technically correct here, which is why TensorFlow changed the name from "tf.nn.deconv2d" to "tf.nn.conv2d_transpose". You can also see the references I attached below.
Transpose convolution layer for tensorflow (was deconvolution) tensorflow/tensorflow#256 (comment)
http://datascience.stackexchange.com/questions/6107/what-are-deconvolutional-layers
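To make the duality concrete, here is a minimal numpy sketch (illustrative only, not the PaddlePaddle code): conv2d_transpose scatter-adds each value back through the filter window, which is exactly how the gradient of a plain convolution reaches its input, so the two operations are adjoint.

import numpy as np

def conv2d(x, w, stride=1):
    # Plain valid convolution, single channel, no padding.
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    y = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            y[i, j] = np.sum(x[i*stride:i*stride+kh, j*stride:j*stride+kw] * w)
    return y

def conv2d_transpose(y, w, stride=1):
    # Identical to backpropagating y to conv2d's input:
    # scatter-add each y value through the filter window.
    kh, kw = w.shape
    x = np.zeros(((y.shape[0] - 1) * stride + kh, (y.shape[1] - 1) * stride + kw))
    for i in range(y.shape[0]):
        for j in range(y.shape[1]):
            x[i*stride:i*stride+kh, j*stride:j*stride+kw] += y[i, j] * w
    return x

# Adjoint check: <conv(x, w), y> == <x, conv_transpose(y, w)>
rng = np.random.RandomState(0)
x, w, y = rng.randn(6, 6), rng.randn(3, 3), rng.randn(4, 4)
assert np.isclose(np.sum(conv2d(x, w) * y), np.sum(x * conv2d_transpose(y, w)))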
I see; it's better to refactor the code. The shared code could be extracted into forwardWeight/backwardWeight and forwardBias/backwardBias (or other suitable names), and then these functions called from both ExpandConvLayer and ExpandConvTransLayer.
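As a rough illustration of that extraction (a Python skeleton only; the real classes are C++, and the method bodies are elided):

class ExpandConvBaseLayer:
    # Kernels shared by both layers; conv and conv-transpose differ only
    # in the direction in which each kernel is invoked.
    def forwardWeight(self, src, dst): ...   # expand (im2col) + GEMM
    def backwardWeight(self, src, dst): ...  # GEMM + shrink (col2im)
    def forwardBias(self, out): ...
    def backwardBias(self, grad): ...

class ExpandConvLayer(ExpandConvBaseLayer):
    def forward(self, in_, out):
        self.forwardWeight(in_, out)
        self.forwardBias(out)

class ExpandConvTransLayer(ExpandConvBaseLayer):
    # The transposed layer reuses the same kernels, mirrored.
    def forward(self, in_, out):
        self.backwardWeight(in_, out)
        self.forwardBias(out)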
Hi @qingqing01
@@ -27,6 +27,9 @@ class ConvBaseLayer : public Layer {
protected:
  typedef std::vector<int> IntV;

  /// True if it's convolution layer, false if it's deconv layer
  bool isConv_;
Rename the variable to isDeConv or something else; since both classes are conv layers, isConv is not an appropriate name.
done
  /// The spatial dimensions of height of output feature map.
  IntV outputH_;
  /// The spatial dimensions of width of output feature map.
  IntV outputW_;
The four variables imgSizeH_, imgSizeW_, outputH_, and outputW_ already exist in ConvBaseLayer; they can be inherited directly instead of being declared again.
The variables in ConvBaseLayer are imgSize_ and outputX_, which do not distinguish width from height.
Sorry, I didn't realize that @qingqing01 had made a change to ConvBaseLayer and ExpandConvLayer.
/**
 * @brief A subclass of ConvBaseLayer that is a superclass of both
 * ExpandConvLayer and ExpandConvTransLayer
 */
Since it is the superclass of both ExpandConvLayer and ExpandConvTransLayer, don't name it ConvBaseLayerCpu. It could be called ExpandConvBaseLayer or something else; the name should reflect that relationship.
I have renamed it to ExpandConvBaseLayer
  /* The expandInput_ and transOutValue_ are used for CPU expand conv calc */
  /// Expand one sample at a time. shape:
  /// (numChannels * filterPixels_, outputSizeH * outputSizeW)
For the comments on lines 46-48, use either /// throughout or /* */ throughout.
done
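For reference, the expandInput_ shape documented in the diff above is a standard im2col expansion. A minimal numpy sketch of that expansion, assuming stride 1 and no padding (names here are illustrative, not the layer's actual code):

import numpy as np

def im2col(img, filter_h, filter_w):
    # Expand one (channels, H, W) sample into a matrix of shape
    # (channels * filter_h * filter_w, out_h * out_w), matching the
    # shape comment on expandInput_ above.
    c, h, w = img.shape
    out_h, out_w = h - filter_h + 1, w - filter_w + 1
    cols = np.zeros((c * filter_h * filter_w, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[:, i:i + filter_h, j:j + filter_w]
            cols[:, i * out_w + j] = patch.ravel()
    return cols

# A single GEMM then evaluates every filter at every output position:
img = np.random.randn(3, 8, 8)            # (channels, H, W)
cols = im2col(img, 3, 3)                  # (3*3*3, 6*6)
filters = np.random.randn(16, 3 * 3 * 3)  # (numFilters, filterPixels*channels)
out = filters @ cols                      # (16, 36) output feature maps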
@@ -166,7 +70,7 @@ void ExpandConvLayer::forward(PassType passType) {
    image = prevLayer->getOutputValue();
    for (size_t off = 0; off < image->getHeight(); off++) {
      REGISTER_TIMER_INFO("expandFwdOnce", getName().c_str());
-     expandFwdOnce(image, i, off);
+     expandFwdOnce(image, getOutputValue(), i, off);
Outside the for loop, write MatrixPtr outV = getOutputValue(); and use outV directly inside the loop. That saves image->getHeight() pointer accesses.
done
@@ -218,109 +98,16 @@ void ExpandConvLayer::backward(const UpdateCallback &callback) {

  for (size_t i = 0; i != inputLayers_.size(); ++i) {
    /* First, calculate the input layers error */
    bpropActs(outGrad, i);
    if (NULL != getPrev(i)->getOutputGrad()) {
Change the if statement to if (getPrev(i)->getOutputGrad()); this needs to be fixed in multiple places.
done
  resetOutput(batchSize, getSize());

  MatrixPtr output = nullptr;
  for (size_t i = 0; i != inputLayers_.size(); ++i) {
The loop's termination condition should be written as i < inputLayers_.size(); this needs to be fixed in multiple places.
done
@@ -1067,6 +1067,37 @@ def parse_conv(conv, input_layer_name, conv_conf):
        1 + int(math.ceil((2 * conv.padding + conv_conf.img_size \
            - conv.filter_size) / float(conv.stride)))


def parse_convt(conv, input_layer_name, conv_conf, num_filters):
Renaming it to parse_conv_trans would be easier to understand. In the implementation, some of lines 1072-1098 could reuse the parse_conv code.
done
@wrap_bias_attr_default()
@wrap_act_default(act=ReluActivation())
@layer_support(DROPOUT)
def img_convTrans_layer(input, filter_size, num_filters,
For img_convTrans_layer: could this be implemented by adding a trans parameter to img_conv_layer instead?
done
merge conflict
@@ -262,6 +262,8 @@ void testConvLayer(const string& type, bool trans, bool useGpu) {
                    config.layerConfig.num_filters());

  testLayerGrad(config, "conv", 100, trans, useGpu);
  // Use small batch_size and useWeight=true to test biasGrad
  testLayerGrad(config, "conv", 2, trans, useGpu, true, 0.02);
batch_size=2 is too small; use 10 or 100.
The reason I set the batch size to 2 is that the bug in calculating bpropSharedBias cannot be caught with a large batch size, because the errors average out.
What bug was it? Could you paste it here?
void ExpandConvBaseLayer::bpropSharedBias(MatrixPtr biases, MatrixPtr v) {
  size_t mapW = getOutputSize() / numFilters_;
  size_t mapH = v->getElementCnt() / mapW;
  MatrixPtr vTmp = Matrix::create(v->getData(), mapH, mapW, false, useGpu_);

  Matrix::resizeOrCreate(transOutValue_, mapW, mapH, false, useGpu_);
  vTmp->transpose(transOutValue_, false);  // false means no memory allocation
  transOutValue_->reshape(transOutValue_->getElementCnt() / numFilters_,
                          numFilters_);
  biases->collectBias(*transOutValue_, 1.0f);
}
The last two lines used to be:

vTmp->reshape(transOutValue_->getElementCnt() / numFilters_,
              numFilters_);
biases->collectBias(*vTmp, 1.0f);
So you mean that after this change the bug no longer occurs? Then please go ahead and fix it.
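For readers following along, the effect of the missing transpose is easy to reproduce with a small numpy sketch (the layout is assumed from the code above: each sample's row stores its numFilters_ maps back to back):

import numpy as np

batch, nf, map_size = 2, 4, 9               # a small batch exposes the bug
v = np.random.randn(batch, nf * map_size)   # output grad, maps back to back

# Correct shared-bias gradient: sum each filter's map over batch and space.
correct = v.reshape(batch, nf, map_size).sum(axis=(0, 2))

# The old code path (reshape without transposing first): consecutive pixels
# of a single filter's map get spread across all nf bias slots.
buggy = v.reshape(-1, nf).sum(axis=0)

# The fixed code path (transpose, then reshape) matches `correct`:
fixed = v.reshape(batch * nf, map_size).T.reshape(-1, nf).sum(axis=0)

print(np.allclose(correct, buggy))   # False: the gradients disagree
print(np.allclose(correct, fixed))   # True

With a large batch the buggy sums still differ from the correct ones, but a numerical gradient check may no longer notice, which is why a batch size of 2 was used in the test.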
    (outputSize - 1) * stride + filterSize - 2 * padding - stride + 1;
  } else {
    imageSize = (outputSize - 1) * stride + filterSize - 2 * padding;
  }
Here I feel that writing out the two equations explicitly is better.
You can call the outputSize function in mathUtils.h directly (it just takes one extra final caffeMode_ argument) instead of rewriting it here. Also, please move the imageSize function into mathUtils.h as well.
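For reference, a Python sketch of the formula pair under discussion, with the forward direction taken from the parse_conv excerpt above and the inverse from the diff; the exact caffeMode_ semantics in mathUtils.h are assumed here:

import math

def output_size(img_size, filter_size, padding, stride, caffe_mode):
    # Forward direction (image -> feature map).
    if caffe_mode:
        return (img_size - filter_size + 2 * padding) // stride + 1
    return 1 + int(math.ceil((2 * padding + img_size - filter_size)
                             / float(stride)))

def image_size(out_size, filter_size, padding, stride, caffe_mode):
    # Inverse direction (feature map -> image), mirroring the two
    # branches shown in the diff above.
    if caffe_mode:
        return (out_size - 1) * stride + filter_size - 2 * padding
    return (out_size - 1) * stride + filter_size - 2 * padding - stride + 1

# The round trip is exact in both modes, e.g.:
assert output_size(image_size(14, 3, 1, 2, True), 3, 1, 2, True) == 14
assert output_size(image_size(14, 3, 1, 2, False), 3, 1, 2, False) == 14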
Hi @luotao1 I have modified my code per your suggestions. @qingqing01, when I rebased my code I realized that you had modified ConvBaseLayer and ExpandConvBaseLayer at the same time (pull#218). I have merged my changes with yours, so could you please take a look as well, to make sure that I didn't break your code? Thanks!
Note that 'All checks have failed'; you can click 'Details' to see the log.
@@ -74,6 +77,8 @@ class ConvBaseLayer : public Layer {
  /// of output size.
  bool caffeMode_;


The two extra blank lines (lines 80-81) can be removed.
done
 * convTrans, and in other functions too.
 * */
  int channel;
  int nf;
Rename the nf variable to numFilters; it's more direct.
done
  /* Initialize the projection */
  for (auto &inputConfig : config_.inputs()) {
    const ConvConfig &conf = inputConfig.conv_conf();
    nf = (!isDeconv_) ? numFilters_ : conf.channels();
Write it as nf = isDeconv_ ? conf.channels() : numFilters_; the same applies to line 41.
done
@@ -36,7 +36,7 @@
           "pooling_layer", "lstmemory", "last_seq", "first_seq",
           "cos_sim", "hsigmoid", "conv_projection",
           "regression_cost", 'classification_cost', "LayerOutput",
-          'img_conv_layer', 'img_pool_layer', 'batch_norm_layer',
+          'img_conv_layer', 'img_convTrans_layer', 'img_pool_layer', 'batch_norm_layer',
Remove img_convTrans_layer here, otherwise the unit tests will fail.
done
@@ -1105,6 +1105,37 @@ def parse_conv(conv, input_layer_name, conv_conf):
        1 + int(math.ceil((2 * conv.padding + conv_conf.img_size \
            - conv.filter_size) / float(conv.stride)))


def parse_conv_trans(conv, input_layer_name, conv_conf, num_filters):
parse_conv_trans and parse_conv share a lot of code; could a trans parameter be added to parse_conv instead? Likewise, ConvTransLayerBase and ConvLayerBase share a lot of code; could a trans parameter be added to ConvLayerBase?
Done for parse_conv_trans. However, for ConvTransLayerBase I am not sure I can do the same thing, since it has a decorator and the code overlap is not that large.
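As a sketch of the flag-based sharing adopted for the parser (reusing output_size and image_size from the earlier sketch; field names are taken from the diffs above and everything else is assumed):

def parse_conv(conv, input_layer_name, conv_conf, num_filters, trans=False):
    # One parser for both directions; `trans` flips which side of the
    # size relation is derived from the other.
    conv_conf.filter_size = conv.filter_size
    conv_conf.padding = conv.padding
    conv_conf.stride = conv.stride
    if not trans:
        # Plain conv: the output map size follows from the input image.
        conv_conf.output_x = output_size(conv_conf.img_size, conv.filter_size,
                                         conv.padding, conv.stride, True)
    else:
        # Conv-transpose: the input image size follows from the output map.
        conv_conf.img_size = image_size(conv_conf.output_x, conv.filter_size,
                                        conv.padding, conv.stride, True)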
…s with pull#218 commit 45c81a4
@luotao1 @qingqing01 The build on Mac is not stable: build #724 failed and build #725 passed, although they have an identical code base.
Implement CPU version of convolution transpose (deconv) layer using expand method.
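To summarize the technique, a hedged numpy sketch of the expand method in the transposed direction (stride 1, no padding; all names are illustrative): one GEMM against the transposed filter matrix produces the expanded columns, and a col2im scatter-add collapses them into the larger output image, which is exactly the reverse of the im2col-plus-GEMM forward pass of ExpandConvLayer.

import numpy as np

def col2im(cols, channels, fh, fw, oh, ow):
    # Scatter-add columns back into an image: the inverse of im2col and
    # the core of the expand-method conv-transpose.
    img = np.zeros((channels, oh + fh - 1, ow + fw - 1))
    for i in range(oh):
        for j in range(ow):
            patch = cols[:, i * ow + j].reshape(channels, fh, fw)
            img[:, i:i + fh, j:j + fw] += patch
    return img

def conv_transpose_expand(x, w, channels, fh, fw):
    # x: (num_filters, h, w) input maps; w: (num_filters, channels*fh*fw).
    nf, h, ww = x.shape
    cols = w.T @ x.reshape(nf, h * ww)    # one GEMM: (channels*fh*fw, h*ww)
    return col2im(cols, channels, fh, fw, h, ww)

x = np.random.randn(16, 6, 6)             # incoming feature maps
w = np.random.randn(16, 3 * 3 * 3)        # (num_filters, channels*fh*fw)
y = conv_transpose_expand(x, w, 3, 3, 3)  # (3, 8, 8): the enlarged image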