-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
flush denormals to zero, test=develop #29924
Conversation
#ifdef DENORM_USE_INTRINSICS | ||
#ifdef PADDLE_WITH_SSE3 | ||
// Restore flags | ||
_MM_SET_FLUSH_ZERO_MODE(flush_zero_mode ? _MM_FLUSH_ZERO_ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
加注释,以及相应的引用链接,讲明这里目的是啥
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
namespace paddle { | ||
namespace platform { | ||
|
||
class ScopedRestoreFlushDenormalState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
加注释,说明,这个功能是啥
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
01a0190
to
52ddf63
Compare
2811d96
to
6e7ae0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
建议同步QA同学测试其他模型的精度是否有差异。
好的。 |
* flush denormals to zero, test=develop * add comments, test=develop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR types
Bug fixes
PR changes
Others
Describe
将 SSE3 计算中非正规值置零,以提高性能。
参考链接:
https://stackoverflow.com/questions/9314534/why-does-changing-0-1f-to-0-slow-down-performance-by-10x