-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Consolidate SUM reductions across raft #2366
Labels
feature request
New feature or request
Comments
rapids-bot bot
pushed a commit
that referenced
this issue
Jul 24, 2024
This PR consists of multiple parts: 1. redirect custom reduction kernels within `stats `namespace to `linalg::reduce` 2. Specialize reduction kernels for addition utilizing the _Kahan-Babushka-Neumaier-Sum_ [link](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) 3. Slightly adjust kernel heuristics for coalesced reductions This should address #2366 and #2205. With the kernel heuristics adjusted the maximum performance drop is 4%. FYI, @tfeher Authors: - Malte Förster (https://github.com/mfoerste4) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) URL: #2381
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Is your feature request related to a problem? Please describe.
Brought up by #2205 the sum kernel has been adapted to account for underflow issues when adding large amounts of values. However, there are several other locations within the code where summations are done naively like stdev/var, and the more general reduction abstraction.
Describe the solution you'd like
I would prefer a more robust implementation (e.g. KahanBabushkaNeumaierSum as in the sum kernel ) to be placed as a special implementation for the add-operator within the general reduction implementation (coalesced/strided). We already have a specialization for add reduction operator in the strided reduction. APIs like
sum
andstdev
should delegate to these abstractions.Additional context
See #2205 for the initial bug report.
The text was updated successfully, but these errors were encountered: