split reduce op into multiple libraries, accelerate the compiling #11029
The reduce op will have performance problems because it uses a lot of
I think we should rewrite this function so that it relies less on template generation. Splitting it into files is just a temporary solution.
Yes, I couldn't agree more. As I have pointed out, the templates need a cleanup and there are the broadcast issues, so we need to rewrite the reduce op. Let's leave that job for when we have more spare time.
f59326f to 7e0fc47
Cool
fix #10874 #9306
What's more, separating the functors into different source files can reduce the amount of compiled template code by an exponential factor.
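To illustrate the idea, here is a minimal sketch of one way to give each functor its own translation unit. It is not the code in this PR: `Reduce`, `SumFunctor`, `MaxFunctor`, and the file names in the comments are hypothetical, and the three "files" are shown in one block only so that it compiles on its own. The header declares the instantiations with `extern template`, so files that include it do not instantiate them again, and each `.cc` file provides exactly one explicit instantiation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// ---- reduce_op.h (hypothetical shared header) ----
// Generic reduction over a non-empty vector; Functor supplies the combine step.
template <typename T, typename Functor>
T Reduce(const std::vector<T>& xs) {
  T acc = xs[0];
  for (std::size_t i = 1; i < xs.size(); ++i) {
    acc = Functor()(acc, xs[i]);
  }
  return acc;
}

// Hypothetical functors, one per reduce flavour.
struct SumFunctor {
  template <typename T>
  T operator()(T a, T b) const { return a + b; }
};

struct MaxFunctor {
  template <typename T>
  T operator()(T a, T b) const { return std::max(a, b); }
};

// Explicit instantiation declarations: includers of the header rely on the
// definitions below instead of instantiating these specializations themselves.
extern template float Reduce<float, SumFunctor>(const std::vector<float>&);
extern template float Reduce<float, MaxFunctor>(const std::vector<float>&);

// ---- reduce_sum.cc (hypothetical): only the sum instantiation ----
template float Reduce<float, SumFunctor>(const std::vector<float>&);

// ---- reduce_max.cc (hypothetical): only the max instantiation ----
template float Reduce<float, MaxFunctor>(const std::vector<float>&);
```

With this layout each source file compiles a single functor's instantiations, so the files can be built in parallel and a change to one functor does not force the others to be recompiled.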
why: [Speed up compiling]: reduce the NVCC compiling (some .cu operators can be compiled by G++) #5491
Before splitting the reduce op into multiple operators:
After: