Fix transposed convolution in CPU w/o MKLDNN. #14031
Conversation
@zhreshold @thomelane Please help to review. Thanks!
Can you verify the result in unittest?
@zhreshold unit test added.
@mxnet-label-bot add [pr-awaiting-review]
src/operator/nn/deconvolution-inl.h
Outdated
@@ -226,22 +227,24 @@ class DeconvolutionOp {
  CHECK_EQ(in_data.size(), expected);
  CHECK_EQ(out_data.size(), 1U);
  Stream<xpu> *s = ctx.get_stream<xpu>();
#if defined(__CUDACC__)
  CHECK_EQ(s->blas_handle_ownership_, Stream<xpu>::OwnHandle)
      << "Must init CuBLAS handle in stream";
"cuBLAS" is the official abbreviation :)
@@ -503,6 +503,40 @@ def test_deconv():
# layer = nn.Conv3DTranspose(16, (3, 3, 3), layout='NDHWC', in_channels=4)
# # check_layer_forward(layer, (1, 10, 10, 10, 4))

@with_seed()
def test_deconv_dilation():
Since deconv is a really important op, I suggest revisiting the original deconv test cases and adding dilation > 1 cases alongside the old tests. This ensures better coverage than this single test case.
Feel free to keep this unittest as well, which LGTM.
Good catch! Looks good.
@@ -485,7 +455,6 @@ class DeconvolutionOp {
  DeconvolutionParam param_;
  mshadow::Shape<2> shape_colunit_;
  mshadow::Shape<3> shape_dstunit_;
  index_t nstep_;
Can you please tell me why this was removed?
The `col2im` method does not support such a step.
const index_t nbatch = data.size(0);
Tensor<xpu, 1, DType> workspace =
    ctx.requested[deconv::kTempSpace].get_space_typed<xpu, 1, DType>(
        Shape1(this->InitTemp(out.shape_, data.shape_)), s);
for (index_t i = 0; i < nbatch; i += nstep_) {
Do you know what was "nstep_" doing earlier? It would help understand the problem with the earlier code.
The `col2im` method does not support such a step.
const index_t nbatch = data.size(0);
Tensor<xpu, 1, DType> workspace =
    ctx.requested[deconv::kTempSpace].get_space_typed<xpu, 1, DType>(
        Shape1(this->InitTemp(grad.shape_, data.shape_)), s);
for (index_t i = 0; i < nbatch; i += nstep_) {
  const index_t step = std::min(nstep_, nbatch - i);
Again can you tell what was the purpose of "step" in the previous code?
I think it was used to convert multiple batches of image data into columns at once in the previous library. However, that is not supported by the `col2im` method.
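To illustrate the per-sample constraint being discussed (this is a hedged NumPy sketch, not the actual mshadow/MXNet code; the function name and layout are illustrative): an `im2col`-style routine lays its column buffer out for a single image, so the batch loop has to advance one sample at a time rather than by an `nstep_`-sized chunk.

```python
import numpy as np

def im2col(img, k, stride, dilation):
    # img: (C, H, W) -- one image; a multi-sample "nstep" chunk would not
    # fit this column layout, which is why the new loop is per-sample.
    C, H, W = img.shape
    keff = dilation * (k - 1) + 1          # effective (dilated) kernel extent
    out_h = (H - keff) // stride + 1
    out_w = (W - keff) // stride + 1
    cols = np.empty((C * k * k, out_h * out_w), dtype=img.dtype)
    idx = 0
    for c in range(C):
        for ki in range(k):
            for kj in range(k):
                # gather the dilated patch positions for this kernel tap
                patch = img[c,
                            ki * dilation : ki * dilation + stride * out_h : stride,
                            kj * dilation : kj * dilation + stride * out_w : stride]
                cols[idx] = patch.ravel()
                idx += 1
    return cols
```

With dilation > 1 the row/column offsets of each kernel tap are scaled by the dilation factor, which is the part the old `unpack_patch2col` handled incorrectly.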
The code changes are consistent with the optimized way to perform the deconv operation on CPU only, but I have some questions that will help me understand what was happening earlier and why it was that way. Other than that, your code is correct and precise. Good work!
@apeforest can you please rebase and resolve the merge conflicts?
@apeforest Could you please have a look at the CI failures?
@apeforest Gentle ping...
@apeforest Can you take a look at the failing CI build?
@apeforest Could you please provide an update on this PR about your progress and thoughts so that the other community members can benefit from it. Thanks!
@karan6181 The new function im2col has a different signature and calling sequence from the old col_unpack(). The changes failed a few unit tests and I ended up re-implementing the operator itself. Given that MKLDNN is currently the default on CPU and it has no issue with the Conv2DTranspose operator, I would like to treat this issue as lower priority and get it completed in a few weeks.
@mxnet-label-bot Update[pr-work-in-progress] @apeforest Can you look into the CI failures?
@apeforest Hi! Any update on this PR? The PR is important :)
Description
The transposed convolution operator on CPU w/o MKLDNN does not work properly when dilation is set. This is because the mshadow library functions `unpack_patch2col` and `pack_col2patch` generate incorrect results with the dilation parameter. This PR replaces these two functions with the MXNet native functions `im2col` and `col2im`.
This PR fixes issue #11203
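For readers unfamiliar with the approach, here is a minimal NumPy sketch of the idea behind the fix: the forward pass of a dilated transposed convolution computed as a matrix product followed by a `col2im`-style scatter-add. This is not the MXNet source; the function names and the `(Cin, Cout, k, k)` weight layout are assumptions made for illustration.

```python
import numpy as np

def col2im(cols, C, H, W, k, stride, dilation):
    # Inverse of im2col with overlap-add: scatter each column row back
    # into the (C, H, W) image at its dilated kernel-tap offsets.
    img = np.zeros((C, H, W), dtype=cols.dtype)
    keff = dilation * (k - 1) + 1
    out_h = (H - keff) // stride + 1
    out_w = (W - keff) // stride + 1
    idx = 0
    for c in range(C):
        for ki in range(k):
            for kj in range(k):
                img[c,
                    ki * dilation : ki * dilation + stride * out_h : stride,
                    kj * dilation : kj * dilation + stride * out_w : stride] += \
                    cols[idx].reshape(out_h, out_w)
                idx += 1
    return img

def deconv2d(x, w, stride=1, dilation=1):
    # x: (Cin, Hin, Win); w: (Cin, Cout, k, k)  -- hypothetical layout.
    Cin, Hin, Win = x.shape
    _, Cout, k, _ = w.shape
    # Output spatial size of a (padding-free) dilated transposed convolution.
    Hout = stride * (Hin - 1) + dilation * (k - 1) + 1
    Wout = stride * (Win - 1) + dilation * (k - 1) + 1
    # Columns = W^T @ x, then scatter-add the columns into the output image.
    cols = w.reshape(Cin, -1).T @ x.reshape(Cin, -1)   # (Cout*k*k, Hin*Win)
    return col2im(cols, Cout, Hout, Wout, k, stride, dilation)
```

The key point is that the dilation factor scales the scatter offsets inside `col2im`; an implementation that ignores it (as the old `pack_col2patch` path effectively did) writes contributions to the wrong output locations.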
Passed the local test in the issue:
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.