MKLDNN layout: Support for sum operator #11102
Conversation
Force-pushed from da0b3f4 to 6c7c210
Force-pushed from d99777e to fcd0dd1
@luotao1, @tensor-tang The code is ready for the code-review process.
@tensor-tang This creates an out-of-place sum primitive descriptor for a sum of n inputs, each multiplied by a scale, with the resulting output_desc memory descriptor. This code supports only LoDTensor as the input format; otherwise it rolls back to the naive version of the code. In the near future I will implement support for the other case.
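For reference, the arithmetic the MKLDNN sum primitive performs here is an out-of-place scaled reduction: dst[j] = sum_i scale[i] * src[i][j]. A minimal sketch of that semantics without the library (the helper name `ScaledSum` is hypothetical, purely for illustration):

```cpp
#include <cstddef>
#include <vector>

// Out-of-place scaled sum over n inputs, mirroring what the MKLDNN sum
// primitive computes: dst[j] = sum_i scale[i] * src[i][j].
std::vector<float> ScaledSum(const std::vector<std::vector<float>>& srcs,
                             const std::vector<float>& scales) {
  std::vector<float> dst(srcs.empty() ? 0 : srcs[0].size(), 0.0f);
  for (std::size_t i = 0; i < srcs.size(); ++i) {
    for (std::size_t j = 0; j < dst.size(); ++j) {
      dst[j] += scales[i] * srcs[i][j];  // accumulate scaled input i
    }
  }
  return dst;  // freshly allocated: out-of-place with respect to the inputs
}
```

The real operator builds a `sum_primitive_desc` from the per-input scales and memory descriptors instead of looping in plain C++.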
Do we have any unit test for these changes, since the use_mkldnn flag was added?
```cpp
if (src_tz.size() == 1 && (input_format == memory::format::nchw ||
                           input_format == memory::format::nhwc)) {
  input_format = memory::format::x;
}
```
How about an else branch? Then input_format is not set; should it be undef or any?
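The reviewer's suggestion can be sketched without the library, using a mock enum in place of mkldnn's `memory::format` (the names `Format` and `NormalizeFormat` are hypothetical, only to show the branch structure):

```cpp
#include <cstddef>

// Mock of mkldnn's memory::format values used in the snippet above.
enum class Format { undef, any, x, nchw, nhwc };

// Collapse a 4-D layout tag to the 1-D format `x` when the tensor is 1-D;
// any other combination is left untouched, which is the implicit `else`.
Format NormalizeFormat(std::size_t ndims, Format fmt) {
  if (ndims == 1 && (fmt == Format::nchw || fmt == Format::nhwc)) {
    return Format::x;  // 1-D data cannot keep a 4-D layout tag
  }
  return fmt;  // else branch: keep the format as-is
}
```

Whether the fall-through should instead force `Format::any` or `Format::undef` is exactly the reviewer's open question.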
Basically, the format is set at least once. It means I expect the size of the format to be different from 1 or 2; mainly it is 4. The code is triggered in this place:
```cpp
auto& input0 = in_vars[0]->Get<LoDTensor>();
PADDLE_ENFORCE(input0.layout() == DataLayout::kMKLDNN &&
                   input0.format() != memory::format::format_undef,
               "Wrong layout/format for inputs[0]");
```
What about the other inputs? Why check only the first one?
The other inputs are checked inside the loop: lines 96-98.
```cpp
auto& input = in_vars[i]->Get<LoDTensor>();
PADDLE_ENFORCE(input.layout() == DataLayout::kMKLDNN &&
                   input.format() != memory::format::format_undef,
               "Wrong layout/format for inputs");
```
So it will throw an error if the format is not set. But it seems you only set one input, or does it mean you only accept one input?
If we want the best performance and a fluent flow in our graph of primitives, the input format should be compatible with the other primitives. When we have MKLDNN primitives we can assume that the format is always defined as an MKLDNN format.
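The invariant being relied on here, that every input carries the MKLDNN layout and a defined format, can be sketched as one validation pass. This is a mock with stand-in enums, not Paddle's actual types; the names `Layout`, `Format`, `Input`, and `AllInputsMKLDNN` are hypothetical:

```cpp
#include <vector>

// Mock stand-ins for DataLayout and memory::format.
enum class Layout { kPlain, kMKLDNN };
enum class Format { undef, nchw };

struct Input {
  Layout layout;
  Format format;
};

// True only if every input has the MKLDNN layout and a defined format,
// which is the precondition the PADDLE_ENFORCE checks assert per input.
bool AllInputsMKLDNN(const std::vector<Input>& ins) {
  for (const Input& in : ins) {
    if (in.layout != Layout::kMKLDNN || in.format == Format::undef) {
      return false;  // one bad input invalidates the whole sum
    }
  }
  return true;
}
```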
```cpp
std::shared_ptr<memory> dst_mem;
if (in_place)
  dst_mem.reset(new memory(sum_pd.dst_primitive_desc()));
```
Maybe you should keep the style:

```cpp
if (...) {
  ...
} else {
  ...
}
```
Done.
```cpp
auto sum_prim = mkldnn::sum(sum_pd, inputs, *dst_mem);
output_format =
    (memory::format)sum_pd.dst_primitive_desc().desc().data.format;
```
Maybe you could use a function to get the format from the primitive descriptor.
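The repeated cast expression can be wrapped in a small named helper. A sketch with mock stand-ins for the mkldnn types (the names `PrimitiveDesc`, `MemoryDesc`, and `GetFormat` are hypothetical; the real types come from mkldnn):

```cpp
// Mock stand-ins for the mkldnn types, just to show the extracted helper.
enum class Format { undef, nchw, nChw8c };

struct MemoryDesc {
  struct { Format format; } data;
};

struct PrimitiveDesc {
  MemoryDesc desc() const { return {{Format::nChw8c}}; }
};

// One named helper replaces the repeated
// (memory::format)pd.desc().data.format casts at each call site.
Format GetFormat(const PrimitiveDesc& pd) { return pd.desc().data.format; }
```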
Done
```cpp
#ifdef PADDLE_WITH_MKLDNN
  if (library == framework::LibraryType::kPlain &&
      platform::CanMKLDNNBeUsed(ctx)) {
```
Maybe this CanMKLDNNBeUsed could directly return false ifndef PADDLE_WITH_MKLDNN. Then we do not need to add https://github.com/PaddlePaddle/Paddle/pull/11102/files#diff-a8d897ab383e5245127bb03da3d9830cR74 everywhere.
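The suggestion is to move the preprocessor guard inside the predicate so call sites stay `#ifdef`-free. A sketch under that assumption (the `Context` struct is a hypothetical stand-in for the real execution context argument):

```cpp
// Hypothetical stand-in for the execution context the real function takes.
struct Context {
  bool use_mkldnn;
};

// When the build has no MKLDNN, the predicate itself compiles to `false`,
// so kernel-selection code needs no #ifdef of its own.
bool CanMKLDNNBeUsed(const Context& ctx) {
#ifdef PADDLE_WITH_MKLDNN
  return ctx.use_mkldnn;
#else
  return false;  // no MKLDNN in this build: never select the MKLDNN kernel
#endif
}
```

With this, the `#ifdef PADDLE_WITH_MKLDNN` wrapper around each `CanMKLDNNBeUsed(ctx)` call becomes unnecessary.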
@tensor-tang I think that is a splendid way to get slightly cleaner checking of when the primitive can be executed. If it is not a problem I will take advantage of this idea and make it a separate pull request.
Force-pushed from 707e7af to 60623a6
@luotao1 Could you continue checking this code, please?
Could you add a unit test in test_sum_mkldnn_op.py?
Force-pushed from 60623a6 to 6715524
Force-pushed from 6715524 to b88cda8
@luotao1 Done
LGTM
Thanks @mozga-intel
Waiting for support of the MKLDNN layout.
Please have a look at this sum operator supported by the MKLDNN layout and assess whether we can keep this design of the code.
This code uses the implementation of layout that is available in the last pull request. Therefore some of the functions in this code are not available in this pull request, but are available here.
This version of the code can be merged into the main branch when the pull request with the layout is accepted.
The concept of splitting the long code was proposed by @luotao1.
This pull request is related to #11040.