[FEA] min/max segmented reduce for string and decimal #10417
Comments
@revans2 I discussed adding decimal support with @isVoid and we're going to work on this soon! It may or may not make the 22.04 release but we have a game plan. Can you elaborate a bit on what you expect for string reductions? I didn't find much in the code about existing reductions for string types -- can you list (or find in the code) the reduction types you would expect to see? Thanks! 👍
Pretty much all other reductions support min/max on strings and just use the Typically we have to add a level of indirection and actually compute |
How about decimal types? Each fixed point column is supposed to have a uniform scale. However, since the length of each segment can be different, the product of these segments may possess different scales. One way I can think of this is since |
I care about min and max. I am not concerned with product right now, and min/max do not change the scale. All of these problems have been solved by groupby aggregations. If the groupby aggregation is not supported for this type, then I do not see any reason to support it for a segmented reduction. If it is supported, then I would expect the result to be the same as what groupby does. Being inconsistent is a bigger problem.
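To illustrate the point about scale, here is a minimal pure-Python sketch (not libcudf code). It assumes a fixed-point column is stored as unscaled integers plus one shared `scale`, so that each value is `unscaled * 10**scale`. The helper `segmented_reduce`, the sample data, and the choice to return `None` for an empty segment are all hypothetical and for illustration only.

```python
from functools import reduce


def segmented_reduce(unscaled, offsets, op):
    """Reduce each segment [offsets[i], offsets[i+1]) of `unscaled` with `op`.

    Empty segments yield None (standing in for a null result).
    """
    out = []
    for lo, hi in zip(offsets, offsets[1:]):
        segment = unscaled[lo:hi]
        out.append(reduce(op, segment) if segment else None)
    return out


scale = -2                          # value = unscaled * 10**scale
unscaled = [150, 275, 120, 999, 5]  # 1.50, 2.75, 1.20, 9.99, 0.05
offsets = [0, 2, 2, 5]              # segments: [0, 2), [2, 2), [2, 5)

# min/max pick an existing element, so every result keeps the column's scale.
mins = segmented_reduce(unscaled, offsets, min)
maxs = segmented_reduce(unscaled, offsets, max)

# product multiplies n values of scale s, so the result has scale n * s,
# which differs per segment whenever segment lengths differ.
products = segmented_reduce(unscaled, offsets, lambda a, b: a * b)
product_scales = [scale * (hi - lo) for lo, hi in zip(offsets, offsets[1:])]
```

In this sketch the min and max results can share a single output scale, while the product results would each need their own scale, which is the difficulty raised above for product.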
This issue has been labeled |
A status update since this was marked as stale: #10447 will address the string min/max case. We will need another PR to address decimal types. I would probably limit the scope to min/max aggregations for now if that is enough to resolve this issue.
This PR adds `min/max` segmented reduction to string type. Part of #10417

Authors:
- Michael Wang (https://github.com/isVoid)
- Bradley Dice (https://github.com/bdice)

Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- Bradley Dice (https://github.com/bdice)

URL: #10447
This PR adds support for `min/max` segmented reduction on fixed-point types. Together with #10447, this PR closes #10417

In addition, this PR refactors `segmented_reduce` to accept output iterators instead of allocating the result column internally.

Authors:
- Michael Wang (https://github.com/isVoid)
- Bradley Dice (https://github.com/bdice)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)
- David Wendt (https://github.com/davidwendt)

URL: #10794
Is your feature request related to a problem? Please describe.
This is a follow on request for #9621.
It would be really great if we could also support String and DECIMAL_32, DECIMAL_64, and DECIMAL_128 with this same API for min and max.
In the short term I am going to use the old code I had for String/DECIMAL_128 and will try to bit cast the DECIMAL_32 and DECIMAL_64 values, before and after.
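A rough sketch of why the bit-cast workaround can give correct min/max results, written in plain Python rather than against any cudf API. It assumes every value in the column shares one scale, so comparing the underlying integers orders the values the same way as comparing the decimals. The helper `to_decimal` and the sample data are hypothetical.

```python
from decimal import Decimal


def to_decimal(unscaled, scale):
    """Reattach the scale: value = unscaled * 10**scale."""
    return Decimal(unscaled).scaleb(scale)


scale = -3
unscaled = [-1500, 20, 7, -1]  # -1.500, 0.020, 0.007, -0.001

# "Before": treat the column as plain integers and reduce those.
int_min = min(unscaled)
int_max = max(unscaled)

# "After": reinterpret the integer results as decimals with the original scale.
dec_min = to_decimal(int_min, scale)
dec_max = to_decimal(int_max, scale)

# Reference: reduce the decimal values directly.
as_decimals = [to_decimal(u, scale) for u in unscaled]
```

Under that shared-scale assumption, `dec_min` and `dec_max` match `min(as_decimals)` and `max(as_decimals)`. The same trick would not carry over to reductions that change the scale, such as product.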