Optimize Accumulator size function performance (fix regression on clickbench) #5325
Comments
Thank you @comphead. I think this would be relatively straightforward to implement for the common case (the one in this code) of fixed size values. Basically, instead of looping over all scalar values to count their sizes, add some code that checks "if the type is ScalarType::Int8, UInt8, etc., then size = size[0] * vec.len()". This would be a good first issue I think -- a good result and a straightforward implementation.
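For illustration, here is a minimal, self-contained sketch of the fixed-width fast path being described. It uses a hypothetical cut-down Scalar enum rather than DataFusion's actual ScalarValue type, so the names and the fallback branch are assumptions, not the real implementation:

```rust
use std::mem::size_of;

// Hypothetical, much-reduced stand-in for the real scalar value type,
// only to show the shape of the fast path.
enum Scalar {
    Int8(Option<i8>),
    UInt8(Option<u8>),
    Int64(Option<i64>),
    Utf8(Option<String>),
}

// Total in-memory size of `values`: constant-time for fixed-width types,
// per-value walk only for variable-length ones.
fn values_size(values: &[Scalar]) -> usize {
    let per_value = match values.first() {
        // Fixed-width: every element occupies the same number of bytes,
        // so the total is just `per_value * values.len()`.
        Some(Scalar::Int8(_)) | Some(Scalar::UInt8(_)) | Some(Scalar::Int64(_)) => {
            Some(size_of::<Scalar>())
        }
        _ => None,
    };
    match per_value {
        Some(sz) => sz * values.len(),
        // Variable-length data (e.g. strings) still needs the expensive loop,
        // because each element owns a different amount of heap memory.
        None => values
            .iter()
            .map(|v| {
                size_of::<Scalar>()
                    + match v {
                        Scalar::Utf8(Some(s)) => s.capacity(),
                        _ => 0,
                    }
            })
            .sum(),
    }
}

fn main() {
    let ints = vec![Scalar::Int8(Some(1)), Scalar::Int8(Some(2))];
    let strs = vec![Scalar::Utf8(Some("hello".to_string()))];
    println!("ints: {} bytes, strs: {} bytes", values_size(&ints), values_size(&strs));
}
```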
@alamb I'll take this ticket if there are no other volunteers. I also want to experiment a bit with whether we need a super accurate size; we can probably compute an approximate size that reports the structure size with minor inaccuracy but is faster.
Thanks @comphead
Maybe the size can be accumulated as well during updates. |
This is an excellent idea that might work very well for variable length structures (like strings). Though in general the distinct accumulators are going to be fairly poor at any high cardinality use case, as they store all values as individual ScalarValues.
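As a rough illustration of the "accumulate the size during updates" idea for variable length data, here is a self-contained sketch; the accumulator and its fields are hypothetical, not DataFusion's actual code:

```rust
use std::collections::HashSet;
use std::mem::size_of;

// Hypothetical accumulator that keeps a running byte count, updated on
// every insert, so that size() becomes a constant-time read.
struct DistinctStringAccumulator {
    values: HashSet<String>,
    data_bytes: usize, // heap bytes of the stored strings, maintained incrementally
}

impl DistinctStringAccumulator {
    fn new() -> Self {
        Self { values: HashSet::new(), data_bytes: 0 }
    }

    fn update(&mut self, v: &str) {
        // Only add the bytes when this is a new distinct value.
        if !self.values.contains(v) {
            self.data_bytes += v.len();
            self.values.insert(v.to_string());
        }
    }

    // O(1): no walk over the stored values at size() time.
    fn size(&self) -> usize {
        size_of::<Self>() + self.values.len() * size_of::<String>() + self.data_bytes
    }
}

fn main() {
    let mut acc = DistinctStringAccumulator::new();
    for v in ["a", "bb", "a", "ccc"] {
        acc.update(v);
    }
    println!("distinct: {}, approx size: {} bytes", acc.values.len(), acc.size());
}
```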
That was not as trivial as I expected, so I ran some experiments. @alamb @Dandandan, let me know your thoughts.
Is the problem the size function?
Thank you for looking into this @comphead. I think we should use this approach for fixed length (non variable length) data -- it will solve the performance regression we saw for clickbench. In terms of handling variable length data more efficiently, I am not sure it is worth spending a lot of time optimizing the current accumulators; I think a separate project to handle that case would be better. Thus I propose we apply this approach to fixed length data in this ticket and track the variable length case separately.
For the further improvement @alamb mentioned, we could rewrite an aggregate query with distinct aggregations into an expanded double aggregation. This would also eliminate the need for the distinct accumulators (see the before/after sketch below).
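For illustration, here is a sketch of the kind of rewrite being described, shown for a single distinct aggregate over a hypothetical table t; the actual rewrite rule (and how it generalizes to multiple distinct aggregates) is not spelled out here:

```sql
-- Before rewrite: the distinct aggregation relies on a distinct accumulator.
SELECT date, COUNT(DISTINCT x) FROM t GROUP BY date;

-- After rewrite: an expanded double aggregation. The inner GROUP BY removes
-- duplicate (date, x) pairs, so the outer COUNT no longer needs DISTINCT.
SELECT date, COUNT(x)
FROM (SELECT date, x FROM t GROUP BY date, x)
GROUP BY date;
```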
Thanks -- this is a neat idea @yjshen. One challenge I have seen with this approach in the past is that it results in a "diamond shaped" plan, where the same input stream is split into two output streams (to the different aggregates) and then brought back together. In general, this approach may require unbounded buffering if using sort based aggregation. But I think it would definitely be worth considering.
Another approach might be speeding up the accumulator. It might be worth looking at the dictionary and parquet interner implementations from @tustvold in arrow-rs, and using a similar approach here, which should have a non-trivial performance impact.
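As a rough sketch of the interning idea (this is not the arrow-rs API, just an illustration of mapping each distinct value to a small integer id so the per-value bookkeeping stays cheap):

```rust
use std::collections::HashMap;

// Toy string interner: each distinct value is stored once and referred to
// by a small integer id.
struct Interner {
    ids: HashMap<String, u32>,
    values: Vec<String>,
}

impl Interner {
    fn new() -> Self {
        Self { ids: HashMap::new(), values: Vec::new() }
    }

    // Returns the id for `v`, inserting it if it has not been seen before.
    fn intern(&mut self, v: &str) -> u32 {
        if let Some(&id) = self.ids.get(v) {
            return id;
        }
        let id = self.values.len() as u32;
        self.ids.insert(v.to_string(), id);
        self.values.push(v.to_string());
        id
    }

    // Number of distinct values seen so far.
    fn distinct_count(&self) -> usize {
        self.values.len()
    }
}

fn main() {
    let mut interner = Interner::new();
    for v in ["a", "b", "a", "c", "b"] {
        interner.intern(v);
    }
    println!("distinct values: {}", interner.distinct_count()); // 3
}
```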
Thanks @Dandandan, if you could provide a link that would be great!
If people are serious about wanting to improve the performance of the aggregators in general, I think we should consider combining our efforts, as there are several people who seem interested and I think the work will collide if not done carefully. Here is one ticket that tracks ideas: #4973
Could you please elaborate more on 1. why two output streams are generated and 2. why it requires unbounded buffering? Thanks!
Hi @yjshen, here was my understanding of what you were proposing, which shows the diamond I am referring to. I may be misunderstanding your proposal:

SELECT date,
       COUNT(DISTINCT x),
       COUNT(DISTINCT y)
FROM t
GROUP BY date;
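One way to picture the diamond shape being described (a hypothetical sketch, not actual plan output):

```text
                  Scan t
                 /      \
 Aggregate by (date, x)   Aggregate by (date, y)
                 \      /
            join/merge on date
```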
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
During regression benchmarks it was found that DISTINCT queries have a performance drop. The analysis showed that the size function implementation for DistinctCountAccumulator is inefficient.

Describe the solution you'd like
Need to improve the size function, or the way the number of bytes is collected.
Describe alternatives you've considered
None
Additional context
Analysis details #5313
Original benchmarks ticket #5276