Avoiding spilling in TopK queries by reinserting the to-spill data to memory buffer #3579

Dandandan opened this issue Sep 21, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@Dandandan (Contributor)

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We recently added optimizations for ORDER BY expr LIMIT by pushing limits down to individual operations (saving memory and CPU time, and limiting output rows) and by executing sorts in parallel.

The disk-spill path in SortExec currently still assumes the to-spill data doesn't fit in memory.
However, after sorting we only need to keep the batch(es) containing the top `fetch` rows and store those, which will usually avoid spilling to disk.

Describe the solution you'd like
We can detect that the to-spill data fits in memory after it has been merged/sorted, and avoid spilling to disk in that case.
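The idea can be sketched in plain Rust. This is not DataFusion's actual API: `Batch` below is a hypothetical stand-in for Arrow's `RecordBatch`, and `mem_size` approximates what a real implementation would get from the batch's memory accounting.

```rust
/// Hypothetical stand-in for a RecordBatch with a single i64 sort column.
struct Batch {
    rows: Vec<i64>,
}

impl Batch {
    /// Rough in-memory size of the batch, in bytes.
    fn mem_size(&self) -> usize {
        self.rows.len() * std::mem::size_of::<i64>()
    }
}

/// Merge all buffered batches, sort them, and keep only the top `fetch`
/// rows -- mirroring a sort with a pushed-down LIMIT.
fn sort_and_truncate(batches: Vec<Batch>, fetch: usize) -> Batch {
    let mut rows: Vec<i64> = batches.into_iter().flat_map(|b| b.rows).collect();
    rows.sort_unstable();
    rows.truncate(fetch); // only the top-`fetch` rows survive
    Batch { rows }
}

/// If the truncated result fits within `budget` bytes, the caller can
/// re-insert it into the memory buffer instead of spilling to disk.
fn fits_in_memory(batch: &Batch, budget: usize) -> bool {
    batch.mem_size() <= budget
}

fn main() {
    // 10 buffered batches of 100 rows each, but only the top 50 are needed.
    let batches: Vec<Batch> = (0..10i64)
        .map(|i| Batch { rows: (i * 100..(i + 1) * 100).collect() })
        .collect();
    let top = sort_and_truncate(batches, 50);
    assert_eq!(top.rows.len(), 50);
    // 50 * 8 = 400 bytes, far below the original 1000 rows' footprint,
    // so it can stay in memory instead of spilling.
    assert!(fits_in_memory(&top, 1024));
}
```

The key point is that the fit check runs on the post-sort, post-truncate size, not on the raw buffered size.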


@isidentical (Contributor)

I've been looking into this and noticed that there are currently two places where this needs to be handled (and, from what I understand, they are actually two separate optimizations).

The first place is what you mentioned in this issue: during the spill process, we sort everything we have and limit the actual output. For example, if we have 10 in-memory batches under a limit of 50, and we can easily handle 100 records, we won't need to spill after sorting everything in place, since the resulting batch would hold at most 50 rows:
https://github.com/apache/arrow-datafusion/blob/add10a67c8e16aca0a683957ddbea29a2a3a4156/datafusion/core/src/physical_plan/sorts/sort.rs#L281-L282
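The spill-path decision above can be sketched as a small function. The names and numbers here are illustrative (they mirror the worked example of 10 batches, limit 50, capacity 100), not DataFusion's actual code:

```rust
#[derive(Debug, PartialEq)]
enum SpillDecision {
    KeepInMemory,
    SpillToDisk,
}

/// After sorting everything in place, the output holds at most `fetch`
/// rows, so compare that -- not the raw buffered row count -- against
/// the remaining capacity.
fn decide(buffered_rows: usize, fetch: Option<usize>, capacity_rows: usize) -> SpillDecision {
    let rows_after_sort = match fetch {
        Some(limit) => buffered_rows.min(limit),
        None => buffered_rows,
    };
    if rows_after_sort <= capacity_rows {
        SpillDecision::KeepInMemory
    } else {
        SpillDecision::SpillToDisk
    }
}

fn main() {
    // 10 in-memory batches of 100 rows, LIMIT 50, room for 100 rows:
    // after the in-place sort only 50 rows remain, so no spill is needed.
    assert_eq!(decide(10 * 100, Some(50), 100), SpillDecision::KeepInMemory);
    // Without a fetch limit, all 1000 rows still exceed capacity.
    assert_eq!(decide(10 * 100, None, 100), SpillDecision::SpillToDisk);
}
```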

The second place is where we make the first allocation. We calculate the size of the incoming batch, and then assume further down that the size of the batch produced by the sort is the same as the size of the input batch.
https://github.com/apache/arrow-datafusion/blob/add10a67c8e16aca0a683957ddbea29a2a3a4156/datafusion/core/src/physical_plan/sorts/sort.rs#L119-L121

But thanks to the TopK PR, this is no longer the case. The sort may shrink the batch while we are still inserting it, so we could in theory free all the redundant memory immediately. That would help a lot, since the next insert_batch would then have much more free space to work with. In a rough benchmark (on a very small subset of the data from the ClickHouse benchmark), just adjusting the tracked memory size correctly saved around 320 spills (it goes from 320 spills to no spills).
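A minimal sketch of the accounting fix, assuming a simple byte-counting tracker. This is not DataFusion's memory-manager API; `MemoryTracker`, `try_grow`, and `shrink` are hypothetical names used only to show how releasing the freed bytes after a TopK sort avoids unnecessary spills:

```rust
struct MemoryTracker {
    used: usize,
    budget: usize,
}

impl MemoryTracker {
    fn new(budget: usize) -> Self {
        Self { used: 0, budget }
    }

    /// Try to reserve `bytes`; returns false when the caller would
    /// have to spill instead.
    fn try_grow(&mut self, bytes: usize) -> bool {
        if self.used + bytes <= self.budget {
            self.used += bytes;
            true
        } else {
            false
        }
    }

    /// Release bytes after a sort-with-limit shrank a batch, so the
    /// tracker reflects the batch's real (reduced) size.
    fn shrink(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }
}

fn main() {
    let mut tracker = MemoryTracker::new(1_000);

    // Reserve 900 bytes for an incoming batch. Before the fix, the
    // tracker keeps charging 900 bytes even after the TopK sort
    // reduces the batch.
    assert!(tracker.try_grow(900));

    // Suppose the sort keeps only the top rows and the batch shrinks
    // to 100 bytes. Releasing the 800 freed bytes lets the next
    // insert proceed in memory instead of forcing a spill.
    tracker.shrink(800);
    assert!(tracker.try_grow(400)); // 100 + 400 <= 1000
}
```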

@Dandandan (Contributor, Author)

Great find and analysis @isidentical!

I assumed it was already tracking the correct size of the reduced batch. Could you create a new issue for this "memory tracking bug"?

@isidentical (Contributor)

@Dandandan definitely! I've created #3596 for it.
