fix bug to support dropout eval grad computing. #37305
Conversation
LGTM
LGTM
dX.device(place) = static_cast<T>(0) * dY;
if (is_test) {
  if (dropout_implementation == "upscale_in_train") {
    dX.device(place) = static_cast<T>(1) * dY;
Would memcpy be better here?
Yes, but this PR only restores the changes from #35122; it can be optimized further in a follow-up.
* fix bug to support dropout eval grad computing.
* Remove useless code.
PR types
Bug fixes
PR changes
OPs
Describe
Question:
#35122 added support for dropout in eval mode (Paddle 2.0 has decoupled eval from no_grad, so computing gradients in eval mode is reasonable, and competing frameworks also support this behavior). However, #35621 removed those modifications from #35122.
In this PR, we restore the modifications from #35122.
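For reference, a minimal NumPy sketch (not the actual Paddle C++ kernel; the helper name is illustrative) of the eval-mode gradient being restored: with `upscale_in_train`, dropout is the identity at eval time, so the gradient passes through unchanged, while `downgrade_in_infer` scales the inference output by (1 - p), so the gradient is scaled the same way.

```python
import numpy as np

def dropout_eval_grad(d_out, dropout_implementation, p):
    # Hypothetical helper mirroring the kernel logic in eval (is_test) mode:
    # "upscale_in_train": eval forward is the identity, so dX = 1 * dY.
    # "downgrade_in_infer": eval forward multiplies by (1 - p), so dX = (1 - p) * dY.
    if dropout_implementation == "upscale_in_train":
        return d_out.copy()
    return (1.0 - p) * d_out

d_out = np.array([1.0, 2.0, 3.0])
print(dropout_eval_grad(d_out, "upscale_in_train", 0.5))   # [1. 2. 3.]
print(dropout_eval_grad(d_out, "downgrade_in_infer", 0.5)) # [0.5 1.  1.5]
```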
test code:
before this PR:
after this PR: