RowConverter::convert that targets an existing Rows #4479
Comments
I'm not very familiar with the Instruments profile, but it seems off to me that alloc_zeroed is under update_group_state and not convert_columns. If the allocation were occurring as part of row conversion, I would have expected it to appear under convert_columns. Is it possible there are other allocations taking place here?
I couldn't find anything else that does allocations per group, but I agree the profile is somewhat ambiguous.
(BTW this is about 2% of the total time, so more of a nice-to-have feature than anything critical or really important.)
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I am trying to make group by really fast in DataFusion apache/datafusion#6800
The grouping code uses the Arrow Row format 👍 and calls RowConverter::convert_columns.
My traces imply that a non-trivial amount of time is spent zeroing out newly allocated memory for Rows.
Describe the solution you'd like
I would like a method like RowConverter::convert_columns_in_place that writes to a pre-existing Rows (clearing it out first).
Perhaps something like https://docs.rs/arrow-row/43.0.0/src/arrow_row/lib.rs.html#712
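To make the request concrete, here is a minimal std-only sketch of the buffer-reuse idea. It does not use the arrow-row crate: ToyRows and ToyConverter are hypothetical stand-ins invented for illustration, the fixed-width big-endian u32 encoding is an assumption, and convert_columns_in_place is the name proposed above rather than an existing method. The point is only that clearing and refilling a caller-owned buffer avoids a fresh allocation on every call.

```rust
/// Hypothetical stand-in for a row container: one contiguous byte buffer
/// plus row offsets (offsets[i]..offsets[i + 1] is row i).
#[derive(Debug)]
pub struct ToyRows {
    buffer: Vec<u8>,
    offsets: Vec<usize>,
}

impl ToyRows {
    pub fn new() -> Self {
        Self { buffer: Vec::new(), offsets: vec![0] }
    }

    /// Drop all rows but keep the allocated capacity for reuse.
    pub fn clear(&mut self) {
        self.buffer.clear();
        self.offsets.clear();
        self.offsets.push(0);
    }

    pub fn num_rows(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn row(&self, i: usize) -> &[u8] {
        &self.buffer[self.offsets[i]..self.offsets[i + 1]]
    }

    pub fn capacity(&self) -> usize {
        self.buffer.capacity()
    }
}

/// Hypothetical converter for columns of u32 values.
pub struct ToyConverter;

impl ToyConverter {
    /// Allocating variant: builds a new ToyRows on every call.
    pub fn convert_columns(&self, columns: &[Vec<u32>]) -> ToyRows {
        let mut rows = ToyRows::new();
        self.convert_columns_in_place(columns, &mut rows);
        rows
    }

    /// In-place variant: clears `rows`, then writes the encoded rows into
    /// its existing buffers, growing them only if they are too small.
    pub fn convert_columns_in_place(&self, columns: &[Vec<u32>], rows: &mut ToyRows) {
        rows.clear();
        let num_rows = columns.first().map_or(0, |c| c.len());
        assert!(
            columns.iter().all(|c| c.len() == num_rows),
            "all columns must have the same length"
        );
        // Each value is encoded as 4 big-endian bytes.
        rows.buffer.reserve(num_rows * columns.len() * 4);
        for i in 0..num_rows {
            for col in columns {
                rows.buffer.extend_from_slice(&col[i].to_be_bytes());
            }
            rows.offsets.push(rows.buffer.len());
        }
    }
}
```

With something shaped like this, the grouping code could hold one rows value across batches and call the in-place variant per batch, so the buffer is only reallocated when a batch needs more space than any earlier one.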
Describe alternatives you've considered
Additional context