Properly handling custom op exception by modify engine #14693
@junrushao1994 could you take a look too?
LGTM. Thank you for the fix!
Thanks for the nice work! Added a comment for tests; otherwise LGTM.
LGTM
Irrelevant flaky test in Julia:
* fix custom except handling by modify engine
* add test
* fix lint
* update test
* fix test
* trigger CI
@arcadiaphy Can you take a look at this failure? http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-14783/3/pipeline#step-676-log-1586
@TaoLv Sure, let me have a look.
Description
#14522 reports problems with exception handling in custom ops. #14575 tries to fix them by modifying the custom op implementation, which causes blocking in the custom thread.
This PR tries to address the issue by modifying the engine:
* A `Throw` method is added in the engine to re-throw the exception associated with a var.
* `asnumpy` or `wait_to_read` in a custom op can still block the custom thread, so an unlimited number of threads is still needed to avoid deadlock.

Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
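The idea in the Description (an exception raised inside a custom op is recorded on the output var, passed along the dependency chain, and re-thrown when the var is read) can be illustrated with a toy model. This is only a sketch under stated assumptions, not MXNet's engine code: `ToyEngine`, `Var`, `push`, and `throw` are hypothetical names, and everything runs synchronously in a single thread, so it does not model the custom thread or the deadlock concern.

```python
class Var:
    """Hypothetical stand-in for an engine variable that can carry an exception."""

    def __init__(self, value=None):
        self.value = value
        self.exception = None


class ToyEngine:
    """Hypothetical, synchronous model of exception propagation through vars."""

    def push(self, fn, read_vars, write_var):
        # If any input var already carries an exception, pass it along the
        # dependency chain instead of running the operation.
        for var in read_vars:
            if var.exception is not None:
                write_var.exception = var.exception
                return
        try:
            write_var.value = fn(*[var.value for var in read_vars])
        except Exception as exc:
            # Record the failure on the output var rather than raising here.
            write_var.exception = exc

    def throw(self, var):
        # Re-throw the exception associated with the var, if there is one.
        if var.exception is not None:
            raise var.exception

    def wait_to_read(self, var):
        # Reading a var is the point where a recorded exception surfaces.
        self.throw(var)
        return var.value


def failing_custom_op(x):
    # Stands in for a custom op whose forward pass raises.
    raise ValueError("error in custom op forward")


engine = ToyEngine()
a, b, c, d = Var(1.0), Var(), Var(), Var()

engine.push(failing_custom_op, [a], b)   # fails, exception stored on b
engine.push(lambda x: x + 1, [b], c)     # not run, exception copied to c
engine.push(lambda x: x + 1, [a], d)     # independent of the failure

caught = None
try:
    engine.wait_to_read(c)
except ValueError as err:
    caught = str(err)
```

In this model the failure surfaces only when `c` is read, while `d`, which does not depend on the failing op, can still be read normally.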