-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[oneDNN] Fix to issue #34554 #34623
[oneDNN] Fix to issue #34554 #34623
Conversation
@chenwhql Could you please approve PR-CI-APPROVAL. Its failure is a false positive due to fact that mechanism of checking PADDLE_ENFORCE_** works only on added lines of code. If we modify existing PADDLE_ENFORCE_** that spawn on multiple lines then CI mechanism will only see part of this call and pattern of VALID check may fail. |
@OliverLPH Could you please review it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
paddle/fluid/platform/mkldnn_reuse.h
Outdated
PADDLE_ENFORCE_NOT_NULL( | ||
bwd_w_pd_, | ||
platform::errors::Unavailable( | ||
"Error: BWD_W_PD should be set when getting BWD grad of weights.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error:
here is needless, other similar cases are same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
paddle/fluid/platform/mkldnn_reuse.h
Outdated
|
||
std::shared_ptr<TBackward_params> AcquireBackwardWeightsPrimitive() { | ||
PADDLE_ENFORCE_NOT_NULL(bwd_w_pd_, platform::errors::Unavailable( | ||
"Error: BWD_PD should be set when " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not very sure.. if user see this error message, what changes they need to make? does the user understand BWD_PD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User of this API is a developer of oneDNN PaddlePaddle operator.This error should be understandable as it implies than some other calls of this API should be called before this one when implementing operator.
mkldnn_engine, cpu_place) { | ||
PADDLE_ENFORCE_EQ( | ||
out_grad->dims(), in_x_grad->dims(), | ||
platform::errors::InvalidArgument("The shape of softmax_grad's input " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we recommend that the expected value and actual value are also reported in the error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
paddle/fluid/platform/mkldnn_reuse.h
Outdated
: platform::MKLDNNHandlerNoCachingT<T, dnnl::binary>(engine, cpu_place) { | ||
PADDLE_ENFORCE_EQ( | ||
x->layout(), DataLayout::kMKLDNN, | ||
platform::errors::InvalidArgument("Wrong layout set for X tensor.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend give expected value and actual value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
// For Inplace src and and dst are the same memory object | ||
// (jczaja) UT mechanics is testing inplace for this op | ||
// regardless shapes, which is wrong when X is to be broadcasted as output | ||
// is of bigger shape that X. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify this comment, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. Comment was updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@chenwhql Please continue your review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for PADDLE_ENFORCE
@chenwhql Please approve PR-CI-APPROVAL PR-CI-APPROVAL is failing as I modified first and third line of PADDLE_ENFORCE_EQ (bold black): And problem is that script evaluating this change is only recieving modified lines . So it will not recieve : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
hi @jczaja , I compiled this PR and test #34554, there are still some models will crash. when ./build/model_test --single_thread=false --single_instance=false --test_groups="3" could you please try re-test these models in your environment? |
@OliverLPH Hi, I will test given commandline and fix if there is any crash |
@jczaja this commit cause paddle compile failed, ci passed may be related by ccatch, so we revert this pr temporarily |
This reverts commit 0a5c99e.
@chenwhql Could you please share a full log of this build that failed? |
PR types
Bug fixes
PR changes
OPs
Describe
It is a fix to #34554 where models are using inplace execution. This inplace execution does not work well with caching mechanism so this PR is disabling caching for inplace capable ops: softmax, elementwise_*** and activation.