Push down limit to sort #3530
Conversation
```diff
 ) -> ArrowResult<BatchWithSortArray> {
     // TODO: pushup the limit expression to sort
     let sort_columns = expr
         .iter()
         .map(|e| e.evaluate_to_sort_column(&batch))
         .collect::<Result<Vec<SortColumn>>>()?;

-    let indices = lexsort_to_indices(&sort_columns, None)?;
+    let indices = lexsort_to_indices(&sort_columns, fetch)?;
```
The key optimization: this returns only n indices after the change.
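To illustrate the effect of that change, here is a minimal plain-Rust sketch of what passing a limit to a sort-to-indices kernel does (a hypothetical stand-in for `lexsort_to_indices`, not the actual arrow kernel):

```rust
// Hypothetical sketch: produce the indices that would sort `values`,
// truncated to `fetch` entries when a limit is given. A real kernel can
// use a partial sort here; full sort + truncate gives the same result.
fn sort_to_indices(values: &[i64], fetch: Option<usize>) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..values.len()).collect();
    indices.sort_by_key(|&i| values[i]);
    if let Some(n) = fetch {
        indices.truncate(n);
    }
    indices
}

fn main() {
    let batch = vec![30i64, 10, 20, 40];
    // Without a limit: indices for the full sorted order.
    assert_eq!(sort_to_indices(&batch, None), vec![1, 2, 0, 3]);
    // With fetch = Some(2): only 2 indices come back, so the
    // subsequent `take` copies just 2 rows instead of the whole batch.
    assert_eq!(sort_to_indices(&batch, Some(2)), vec![1, 2]);
}
```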
Codecov Report

```
@@           Coverage Diff           @@
##           master    #3530   +/-  ##
=======================================
  Coverage   85.80%   85.81%
=======================================
  Files         300      300
  Lines       55382    55424    +42
=======================================
+ Hits        47520    47561    +41
- Misses       7862     7863     +1
```
```rust
.as_any()
// SortExec preserve_partitioning=False, fetch=Some(n))
// -> SortPreservingMergeExec (SortExec preserve_partitioning=True, fetch=Some(n))
let parallel_sort = plan_any.downcast_ref::<SortExec>().is_some()
```
As we now have the pushdown, we can use `fetch` and support more than just a limit directly after sort.
- Support skip, fix test
- Fmt
- Add limit directly after sort
- Update comment
- Simplify parallel sort by using new pushdown
- Clippy
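The parallel-sort rewrite mentioned in the review comment above (sort each partition with the `fetch` limit, then merge the sorted partitions under the same limit) can be sketched in plain Rust. This is a hypothetical illustration, not the DataFusion operators themselves:

```rust
// Per-partition step: corresponds to SortExec(preserve_partitioning=true,
// fetch=Some(n)) -- sort the partition and keep at most `fetch` rows.
fn sorted_with_fetch(mut part: Vec<i64>, fetch: usize) -> Vec<i64> {
    part.sort();
    part.truncate(fetch);
    part
}

// Merge step: corresponds to SortPreservingMergeExec over the sorted
// partitions. A streaming merge would use a heap; flatten + sort is
// enough for a sketch, with the same `fetch` applied at the end.
fn merge_with_fetch(parts: Vec<Vec<i64>>, fetch: usize) -> Vec<i64> {
    let mut all: Vec<i64> = parts.into_iter().flatten().collect();
    all.sort();
    all.truncate(fetch);
    all
}

fn main() {
    let partitions = vec![vec![5i64, 1, 9], vec![4, 2, 8], vec![7, 3, 6]];
    let fetch = 2;
    let locally_sorted: Vec<Vec<i64>> = partitions
        .into_iter()
        .map(|p| sorted_with_fetch(p, fetch))
        .collect();
    // The global top-2 rows survive both stages.
    assert_eq!(merge_with_fetch(locally_sorted, fetch), vec![1, 2]);
}
```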
This is a really neat idea @Dandandan -- very beautiful implementation
```diff
@@ -657,15 +660,18 @@ pub struct SortExec {
     metrics_set: CompositeMetricsSet,
     /// Preserve partitions of input plan
     preserve_partitioning: bool,
+    /// Fetch highest/lowest n results
```
I see -- it seems like this now has the information plumbed through to `SortExec` to implement "TopK" within the physical operator's implementation. 👍
Very cool
```diff
 let sort_columns = expr
     .iter()
     .map(|e| e.evaluate_to_sort_column(&batch))
     .collect::<Result<Vec<SortColumn>>>()?;

-let indices = lexsort_to_indices(&sort_columns, None)?;
+let indices = lexsort_to_indices(&sort_columns, fetch)?;
```
nice
I wonder if this will effectively get us much of the benefit of a special TopK operator, as we don't have to copy the entire input -- we only copy the `fetch` limit, if specified.

Although I suppose `SortExec` still buffers all of its input, where a TopK could buffer them
In fact, I wonder if you could also apply the limit here, as part of sorting each batch -- rather than keeping the entire input batch, we only need to keep at most `fetch` rows from each batch.
`lexsort_to_indices` already returns only `fetch` indices per batch; this is used by `take` to keep that number of rows per batch, throwing away the rest.

The remaining optimization, I think, is tweaking `SortPreservingMergeStream` to maintain only `fetch` records in the heap, instead of all `fetch` top records for each batch in the partition, as mentioned in #3516 (comment). After this I think we have a full TopK implementation that only needs to keep n rows in memory (per partition).

I would like to do this in a separate PR.
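The heap idea described above can be sketched with the standard-library `BinaryHeap` (a hypothetical illustration, not the actual `SortPreservingMergeStream` code): while streaming rows, hold at most `fetch` of the best values seen so far.

```rust
use std::collections::BinaryHeap;

// Hypothetical sketch: keep only the `fetch` smallest values from a stream
// in a max-heap, instead of buffering `fetch` rows per input batch.
fn top_k_smallest(stream: impl Iterator<Item = i64>, fetch: usize) -> Vec<i64> {
    // Max-heap of the current best `fetch` candidates; the root is the
    // largest, i.e. the first value to evict when something smaller arrives.
    let mut heap: BinaryHeap<i64> = BinaryHeap::new();
    for v in stream {
        if heap.len() < fetch {
            heap.push(v);
        } else if let Some(&max) = heap.peek() {
            if v < max {
                heap.pop();
                heap.push(v);
            }
        }
    }
    heap.into_sorted_vec() // ascending order
}

fn main() {
    let stream = vec![9i64, 1, 8, 2, 7, 3].into_iter();
    // Only 3 values are ever held in memory, yet we get the global top 3.
    assert_eq!(top_k_smallest(stream, 3), vec![1, 2, 3]);
}
```

Memory stays bounded by `fetch` regardless of how many rows the stream produces, which is the property the comment above is after.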
A separate PR is a great idea 👍
> lexsort_to_indices already returns only fetch indices per batch, this is used to take that nr. of indices per batch, throwing away the rest of the rows.

Right, the point I was trying to make is that there are 2 calls to `lexsort_to_indices` in sort.rs. I think this PR only pushed `fetch` to one of them. The second is https://github.com/apache/arrow-datafusion/blob/3a9e0d0/datafusion/core/src/physical_plan/sorts/sort.rs#L826 and I think it is correct to push `fetch` there too.

I was thinking that if we applied `fetch` to the second call, we could get close to the same effect without changing `SortPreservingMergeStream`.
- After this PR, sort buffers `num_input_batches * input_batch_size` rows.
- Adding `fetch` to the other call to `lexsort_to_indices` would buffer `num_input_batches * limit` rows.
- Extending `SortPreservingMergeStream` would allow us to buffer only `limit` rows.

So clearly extending `SortPreservingMergeStream` is optimal in terms of rows buffered, but it likely requires a bit more effort.
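The three buffering levels in that list can be made concrete with some illustrative numbers (the batch count, batch size, and limit below are made up for the example, not measured):

```rust
fn main() {
    // Hypothetical plan parameters, chosen only to illustrate the scale.
    let num_input_batches = 100usize;
    let input_batch_size = 8192usize;
    let limit = 10usize;

    // 1. Sort buffers every row of every batch.
    assert_eq!(num_input_batches * input_batch_size, 819_200);
    // 2. With `fetch` applied per batch, only `limit` rows per batch survive.
    assert_eq!(num_input_batches * limit, 1_000);
    // 3. A limit-aware merge stream would hold only `limit` rows in total.
    assert_eq!(limit, 10);
}
```

The gap between level 1 and level 3 (hundreds of thousands of rows versus tens) is why extending the merge stream is worth a follow-up even though the per-batch `fetch` already removes most of the cost.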
Ah, I didn't look too much at the rest of the implementation; I think you're right that providing `fetch` to the other `lexsort_to_indices` call would be beneficial as well. I will create an issue for this and issue a PR later.
I think the current change already buffers `num_input_batches * limit` rows by the way, as the limit is applied before adding batches to the buffer. As far as I can see, adding `fetch` to the second `lexsort_to_indices` call will mainly reduce the output of the individual sorts to `fetch` rows -- which is of course beneficial too, as that reduces the time to sort and limits the input to `take` and to `SortPreservingMergeExec` again.
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Benchmark runs are scheduled for baseline = c7f3a70 and contender = 81b5794. 81b5794 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes: #3528
Rationale for this change
vs after #3527
What changes are included in this PR?
Are there any user-facing changes?