Conversation
Is this PR to fix the problem in #10089?
Does it fix this as well?
@zheng-da test_bce_loss passes with the fix: OK
@jinhuang415 I think we can enable all activations in this PR and close the previous #10089.
@pengzhao-intel Added a change to enable all MKLDNN activations.
Agreed. I have closed #10089.
@@ -82,7 +82,7 @@ void ActivationGradComputeExCPU(const nnvm::NodeAttrs& attrs,
   const ActivationParam& param = nnvm::get<ActivationParam>(attrs.parsed);
   if (SupportMKLDNN(inputs[0])) {
     MKLDNN_OPCHECK_INIT(true, outputs.size(), inputs, outputs);
-    MKLDNNActivationBackward(attrs, ctx, inputs[0], inputs[1], req[0],
+    MKLDNNActivationBackward(attrs, ctx, inputs[0], inputs[2], req[0],
                              outputs[0]);
inputs[2] doesn't exist. You need to modify ActivationGrad to pass the input data to backward.
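For context, here is a minimal sketch of the kind of change being requested, assuming the ActivationGrad functor / MakeGradNode pattern used for registering the activation gradient node in MXNet (the guard macro, index layout, and exact helper names are assumptions based on that pattern, not a verbatim excerpt of this PR; it relies on MXNet/nnvm headers rather than being standalone):

```cpp
// Sketch: append the forward input (activation::kData) to the gradient
// node's inputs so the backward pass can see the original data.
// Assumes mxnet/nnvm headers providing nnvm::NodeEntry, nnvm::NodePtr,
// MakeGradNode, and the activation enums.
struct ActivationGrad {
  const char *op_name;
  std::vector<nnvm::NodeEntry> operator()(const nnvm::NodePtr& n,
                                          const std::vector<nnvm::NodeEntry>& ograds) const {
    // heads[0] = output gradient, heads[1] = forward output (kOut)
    std::vector<nnvm::NodeEntry> heads(ograds.begin(), ograds.end());
    heads.emplace_back(nnvm::NodeEntry{n, activation::kOut, 0});
#if MXNET_USE_MKLDNN == 1
    // Also pass the forward input (kData) so the MKLDNN backward path can
    // read it as inputs[2]; the exact index is an assumption here.
    heads.push_back(n->inputs[activation::kData]);
#endif
    return MakeGradNode(op_name, n, heads, n->attrs.dict);
  }
};
```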
Thanks for the comments. I have updated the diff to pass the input data and updated a few other functions as well.
Looks good to me now.
src/operator/nn/mkldnn/mkldnn_act.cc (Outdated)
  // problems.
  return param.act_type == activation::kReLU;
#if 0
  return param.act_type == activation::kReLU
      || param.act_type == activation::kSigmoid
      || param.act_type == activation::kSoftReLU;
Why is tanh not enabled?
@TaoLv I added support for tanh and it passes the UT.
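For reference, a sketch of what the check presumably looks like with all four types enabled; the function name SupportMKLDNNAct and the activation::kTanh enum value are assumptions based on the snippet above and this discussion, not a verbatim excerpt of the final diff:

```cpp
// Presumed final form of the MKLDNN support check once tanh is enabled too.
bool SupportMKLDNNAct(const ActivationParam& param) {
  return param.act_type == activation::kReLU
      || param.act_type == activation::kSigmoid
      || param.act_type == activation::kSoftReLU
      || param.act_type == activation::kTanh;
}
```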
Tests for triggering previous bugs?
There is no previous bug. MKLDNN was just disabled for these cases.
OK. Feel free to dismiss my previous review in this PR if the change is already covered by tests.
@szha @piiswrong The related test case is tests/python/gpu/test_operator_gpu.py:test_activation_with_type. Previously it would fail if we enabled sigmoid/softrelu/tanh for MKLDNN; after the fix the test passes, so we can enable sigmoid/softrelu/tanh for MKLDNN now.
@piiswrong MKLDNN activation backward uses the input data (activation::kData) to compute in_grad, but the original code provided the output data (activation::kOut). That's why it only worked for relu (and I'm not sure why even that always works). This PR fixes this bug, and now we can use MKLDNN for the other activation types.
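To make the "works only for relu" point concrete, here is a small standalone C++ illustration (not MXNet or MKLDNN code, just the math) of a backward step that recomputes the forward pass from whatever it is given as the source data; the specific numbers are only for demonstration:

```cpp
// Feeding the forward *output* to a backward routine that expects the
// forward *input* is only harmless for ReLU, because the sign of the
// output matches the sign of the input; for sigmoid the gradient changes.
#include <cmath>
#include <cstdio>

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// MKLDNN-style backward: recompute the forward from "src", then derive the gradient.
static double sigmoid_grad_from_src(double src) {
  double y = sigmoid(src);
  return y * (1.0 - y);
}
static double relu_grad_from_src(double src) { return src > 0.0 ? 1.0 : 0.0; }

int main() {
  double x = 2.0;
  double y_sig = sigmoid(x);          // forward output of sigmoid
  double y_relu = x > 0.0 ? x : 0.0;  // forward output of relu

  // Correct: pass the input. Buggy: pass the output where the input is expected.
  std::printf("sigmoid grad (src=x): %f  (src=y, buggy): %f\n",
              sigmoid_grad_from_src(x), sigmoid_grad_from_src(y_sig));
  // For ReLU both choices yield the same gradient mask.
  std::printf("relu grad    (src=x): %f  (src=y, buggy): %f\n",
              relu_grad_from_src(x), relu_grad_from_src(y_relu));
  return 0;
}
```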
-            assert_almost_equal(out1.asnumpy(), out2.asnumpy())
-            assert_almost_equal(out1.asnumpy(), out3.asnumpy())
+            assert_almost_equal(out1.asnumpy(), out2.asnumpy(), rtol=1e-3)
+            assert_almost_equal(out1.asnumpy(), out3.asnumpy(), rtol=1e-3)
Maybe we should set a smaller tolerance. Changing from 1e-5 to 1e-3 seems like a big jump.
It would still fail intermittently if set to 1e-4. Checking another similar function, check_consistency(), its threshold is set to 1e-3 for FP32; since the data type in this test_lambda() case is FP32 (mx.nd.random.uniform() outputs FP32 by default), I set rtol to 1e-3 as well.
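As a rough illustration of what the rtol change means, here is a toy C++ check using a generic relative-tolerance formula; the exact formula in MXNet's assert_almost_equal/check_consistency may differ, and the sample values are hypothetical:

```cpp
// Toy relative-tolerance comparison: the same ~2e-5 discrepancy fails at
// rtol=1e-5 but passes at rtol=1e-3. Values below are made up for FP32 scale.
#include <cmath>
#include <cstdio>

static bool close(float a, float b, float rtol, float atol = 1e-8f) {
  return std::fabs(a - b) <= atol + rtol * std::fabs(b);
}

int main() {
  float ref = 0.761594f;  // e.g. tanh(1.0) in FP32
  float got = 0.761614f;  // hypothetical MKLDNN result, off by ~2e-5
  std::printf("rtol=1e-5: %s\n", close(got, ref, 1e-5f) ? "pass" : "fail");
  std::printf("rtol=1e-3: %s\n", close(got, ref, 1e-3f) ? "pass" : "fail");
  return 0;
}
```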
* Fix MKLDNN sigmoid/softrelu issue
* Enable Sigmoid and SoftRelu for MKLDNN
* Add activation kData for backward calculation for MKLDNN
* Add tanh support for MKLDNN activation
* Adjust rtol to pass tanh tests for MKLDNN
Description
This PR fixes the sigmoid and softrelu test failures for MKLDNN (related test case: tests/python/gpu/test_operator_gpu.py:test_activation_with_type). The MKLDNN eltwise backward primitive requires the activation's input data rather than its output data (it re-runs the activation forward on the input data and computes the gradient from that result), so the code needs to be changed to pass the input data. Tests pass for sigmoid/relu/softrelu after the fix.
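To show where that requirement comes from at the API level, here is a rough sketch in the style of the old MKL-DNN 0.x C++ interface that MXNet used at the time: the backward descriptor is built around the data (input) memory descriptor and the backward primitive consumes the forward src plus diff_dst, not the forward output. Constructor signatures, enum names, and shapes here are written from memory and should be treated as assumptions rather than an excerpt of mkldnn_act.cc:

```cpp
// Hedged sketch of MKL-DNN 0.x eltwise forward/backward descriptor setup.
#include <mkldnn.hpp>

int main() {
  using namespace mkldnn;
  engine eng(engine::cpu, 0);

  // Example shape only; the real code derives this from the NDArray.
  memory::desc data_md({32, 16, 8, 8}, memory::data_type::f32,
                       memory::format::nchw);

  // Forward descriptor doubles as the "hint" for the backward primitive_desc.
  eltwise_forward::desc fwd_desc(prop_kind::forward_training,
                                 algorithm::eltwise_logistic, data_md, 0.f, 0.f);
  eltwise_forward::primitive_desc fwd_pd(fwd_desc, eng);

  // Backward is defined over the *data* (input) descriptor; the resulting
  // primitive takes src (forward input) and diff_dst, and writes diff_src.
  eltwise_backward::desc bwd_desc(algorithm::eltwise_logistic, data_md,
                                  data_md, 0.f, 0.f);
  eltwise_backward::primitive_desc bwd_pd(bwd_desc, eng, fwd_pd);
  (void)bwd_pd;
  return 0;
}
```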
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.