perf: Fix NLJ slow join with condition array_has (#18161)
Conversation
alamb left a comment
I verified that with this fix the reproducer from @ianthetechie in this issue is fixed.

However, I am not sure this code has test coverage. I checked via:

```shell
nice cargo llvm-cov --html test --test sqllogictests
nice cargo llvm-cov --html test -p datafusion -p datafusion-physical-plan
```

I will look into adding some coverage.
Now the performance is similar; I suspect most of the time is spent evaluating the expensive `array_has`, so the optimization in #16996 can't help much.
Yes, I looked at the `array_has` implementation and it is doing a lot of work. I will file a follow-on ticket.

Also, it seems to me that the fix / improvement for `ScalarValue::to_array_of_size()` is more general than just NLJ, so I will file a ticket about that as well.
```rust
scalar_value.to_array_of_size(filtered_probe_batch.num_rows())?
// Avoid using `ScalarValue::to_array_of_size()` for `List(Utf8View)` to avoid
// deep copies for buffers inside `Utf8View` array. See below for details.
// https://github.com/apache/datafusion/issues/18159
```
The root cause is tracked in #18159.
```rust
DataType::List(field) | DataType::LargeList(field)
    if field.data_type() == &DataType::Utf8View =>
{
    let indices_iter = std::iter::repeat_n(
        build_side_index as u64,
        filtered_probe_batch.num_rows(),
    );
    let indices_array = UInt64Array::from_iter_values(indices_iter);
    take(original_left_array.as_ref(), &indices_array, None)?
}
```
I think this approach could be ported into `ScalarValue::to_array_of_size` itself rather than special-cased here, which would improve performance in other places as well:
`datafusion/common/src/scalar/mod.rs`, lines 3238 to 3245 in `556eb9b`:
```rust
fn list_to_array_of_size(arr: &dyn Array, size: usize) -> Result<ArrayRef> {
    let arrays = repeat_n(arr, size).collect::<Vec<_>>();
    let ret = match !arrays.is_empty() {
        true => arrow::compute::concat(arrays.as_slice())?,
        false => arr.slice(0, 0),
    };
    Ok(ret)
}
```
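To make the cost difference concrete, here is a hedged, std-only toy model (illustrative names, not arrow-rs types) contrasting concat-style repetition, which copies the payload bytes, with take-style repetition, which only repeats the small view entries and shares the data buffer:

```rust
use std::sync::Arc;

// Toy model of a Utf8View-style array: small "views" plus a shared buffer.
struct ToyUtf8View {
    views: Vec<(usize, usize)>, // (offset, len) into `data`
    data: Arc<Vec<u8>>,         // shared payload buffer
}

// `take`-style repetition: clones only the small views, shares the payload.
fn repeat_by_take(a: &ToyUtf8View, n: usize) -> ToyUtf8View {
    ToyUtf8View {
        views: a.views.iter().cycle().take(a.views.len() * n).cloned().collect(),
        data: Arc::clone(&a.data), // no byte copy
    }
}

// `concat`-style repetition: rebuilds the payload, copying the bytes n times.
fn repeat_by_concat(a: &ToyUtf8View, n: usize) -> ToyUtf8View {
    let (mut data, mut views) = (Vec::new(), Vec::new());
    for _ in 0..n {
        for &(off, len) in &a.views {
            views.push((data.len(), len));
            data.extend_from_slice(&a.data[off..off + len]);
        }
    }
    ToyUtf8View { views, data: Arc::new(data) }
}

fn main() {
    let a = ToyUtf8View {
        views: vec![(0, 5)],
        data: Arc::new(b"hello".to_vec()),
    };
    let taken = repeat_by_take(&a, 3);
    let concatenated = repeat_by_concat(&a, 3);
    assert!(Arc::ptr_eq(&taken.data, &a.data)); // take shares the buffer
    assert_eq!(taken.views.len(), 3);
    assert_eq!(concatenated.data.len(), 15); // concat copied the payload 3x
    println!("shared: {}", Arc::ptr_eq(&taken.data, &a.data));
}
```

This is only a model of the trade-off discussed in the review, not the actual arrow-rs buffer machinery.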
That being said, I think this is a nice point fix that we can safely backport to the datafusion 50 branch, so I think we should merge this PR and backport it, and I will file a follow-on PR to further improve the code.
THANK YOU very much for this fix and diagnosis @2010YOUY01
Thank you for the review. The feedback makes sense to me, but I can only address it tomorrow. If you're waiting on this patch to be included in the release, feel free to push changes directly to the PR.
Thank you @2010YOUY01. I just pushed a test that adds coverage for this case; I verified it is covered using `cargo llvm-cov`.
```diff
@@ -0,0 +1,63 @@
+# Licensed to the Apache Software Foundation (ASF) under one
```
new test added here
alamb left a comment
Thank you @2010YOUY01
## Which issue does this PR close?

- Closes apache#18070

## Rationale for this change

See the above issue and its comment apache#18070 (comment)

## What changes are included in this PR?

In nested loop join, when the join column includes `List(Utf8View)`, use `take()` instead of `to_array_of_size()` to avoid deep copying the utf8 buffers inside the `Utf8View` array.

This is the quick fix; avoiding the deep copy inside `to_array_of_size()` is a bit tricky. Here is `ListArray`'s physical layout: https://arrow.apache.org/rust/arrow/array/struct.GenericListArray.html

If multiple elements point to the same list range, the underlying payload can't be reused. So a potential fix in `to_array_of_size` can only avoid copying the inner-inner utf8view array buffers; it can't avoid copying the inner array (i.e. views are still copied), and deep copying for other primitive types also can't be avoided. This seems better solved once the `ListView` type is ready 🤔

### Benchmark

I tried query 1 in apache#18070, but only used 3 randomly sampled `places` parquet files.

- 49.0.0: 4s
- 50.0.0: stuck > 1 minute
- PR: 4s

Now the performance is similar; I suspect most of the time is spent evaluating the expensive `array_has`, so the optimization in apache#16996 can't help much.

## Are these changes tested?

Existing tests

## Are there any user-facing changes?

No

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Filed a ticket to track improving `ScalarValue::to_array_of_size()`.
… (#18179)

## Which issue does this PR close?

- Related to #18070
- Part of #18072

## Rationale for this change

Fix performance regression in DataFusion 50.

## What changes are included in this PR?

Backport #18161 to `branch-50`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

Fix performance regression.

Co-authored-by: Yongting You <2010youy01@gmail.com>