PR9495 fixes one operator whose backward pass did not obey the "req" argument properly (or, in fact, at all) and therefore ignored the "kAddTo" directive, which instructs the operator to add the gradients to the output tensor(s) rather than simply assign them. This leads to wrong gradients whenever an operator fans out its output to more than one other operator.
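To make the contract concrete, here is a minimal sketch in NumPy (not MXNet's actual C++ kernel code; the helper name `assign_grad` is made up for illustration) of what a backward implementation must do for each "req" value: "write" may overwrite the gradient buffer, while "add" (kAddTo) must accumulate into it, because under fan-out the buffer already holds the gradient contributed through another consumer of the output.

```python
import numpy as np

# Hypothetical helper illustrating the contract; MXNet's real kernels do the
# equivalent in C++ via the OpReqType enum (kNullOp/kWriteTo/kWriteInplace/kAddTo).
def assign_grad(in_grad, computed_grad, req):
    if req == 'null':                    # kNullOp: gradient not requested
        return
    elif req in ('write', 'inplace'):    # kWriteTo / kWriteInplace: overwrite is allowed
        in_grad[:] = computed_grad
    elif req == 'add':                   # kAddTo: must accumulate, never assign,
        in_grad += computed_grad         # since the buffer already holds the
                                         # gradient from another consumer

g = np.zeros(3)
assign_grad(g, np.ones(3), 'write')      # g is now [1, 1, 1]
assign_grad(g, np.ones(3), 'add')        # g is now [2, 2, 2]; a backward pass that
                                         # ignores req would leave it at [1, 1, 1]
```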
We should examine all operators to check whether they properly handle the "req" parameter in the backward pass. There is at least one more that doesn't: svm_output.
Since these problems are basic and easy to fix but hard to detect (they may just slightly derail training over time), we should really prioritize this sanity checking.
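As a starting point for such a check, here is a sketch of a test that catches this bug class, assuming the MXNet 1.x Gluon autograd API: attaching the gradient with grad_req='add' makes the engine call the backward of the operator consuming x with kAddTo, so running backward twice must exactly double the gradient. relu is just a stand-in; substitute the operator under test (e.g. svm_output, mentioned above).

```python
import mxnet as mx
from mxnet import autograd

x = mx.nd.random.uniform(shape=(4, 3))
x.attach_grad(grad_req='add')       # gradient buffer accumulates across backward
                                    # calls, exercising the kAddTo path in the
                                    # backward of the operator consuming x
x.grad[:] = 0

def one_pass():
    with autograd.record():
        y = mx.nd.relu(x)           # stand-in: substitute the operator under test
        loss = y.sum()
    loss.backward()

one_pass()
first = x.grad.copy()
one_pass()                          # second pass must add to, not overwrite, x.grad

# The accumulated gradient must be exactly twice the single-pass gradient;
# a backward kernel that ignores kAddTo fails this assertion.
assert mx.nd.abs(x.grad - 2 * first).max().asscalar() < 1e-6
```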