-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delete rank switch in broadcast_function.h for compile #42645
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
你的PR提交成功,感谢你对开源项目的贡献! |
bmb0537
previously approved these changes
May 13, 2022
AnnaTrainingG
changed the title
compile kp
delete rank switch in broadcast_function.h for compile
May 16, 2022
bmb0537
approved these changes
May 16, 2022
ZzSean
approved these changes
May 16, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for OP-Benchmark
xingfeng01
approved these changes
May 16, 2022
lanxianghit
pushed a commit
that referenced
this pull request
Jun 6, 2022
删除Broadcast function中rank例化以及Elementwise调用,降低编译时间。 从develop分支中的#42645 PR修改而来,由于develop分支与release分支相差较大,无法实现cherry-pick,因此针对release2.3重新提交PR. Broadcast中关于rank的例化会导致底层模板展开较多,造成reduce_sum_grad_kernel.cu.o文件体积过大,修改后可以降低.o体积及编译时间
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
PR types
Others
PR changes
Others
Describe
Broadcast中关于rank的例化会导致底层模板展开较多,造成reduce_sum_grad_kernel.cu.o文件体积过大,修改后可以降低.o体积及编译时间
OP-benchmark失败原因:
本次修改,将影响所有调用broadcast调用的OP:
3.p_norm_kernel_2 forward中调用的为elementwiseKernel
4.p_norm_kernel_2 backward调用的为Eigen,均与本次修改无关
本地测试如下: