-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat: Enhance array_slice functionality to support ListView and LargeListView types
#18432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @brancz |
array_slice functionality to support ListView and LargeListView types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice refactor! The SlicePlan and extraction of functions makes the rest of the code a lot more readable.
| let field = match array.data_type() { | ||
| ListView(field) | LargeListView(field) => Arc::clone(field), | ||
| other => { | ||
| return Err(internal_datafusion_err!( | ||
| "array_slice got unexpected data type: {}", | ||
| other | ||
| )); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest moving this to before the slice to avoid unnecessary work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
| ); | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests! This PR for reverse of FixedSizeList also added sqllogictest cases https://github.com/apache/datafusion/pull/16423/files#diff-317c67cc9ce87268e4ccec1cb75316eed82f99ae2ffc226874e5897913ffa4c8
It doesn't look like that is currently possible for ListView on arrow-rs 57, as it hits this path, which did not have a branch for ListView at the time of release
https://github.com/apache/arrow-rs/blob/062d766a9c3070d191d1a1fd0baca01b9d13994f/arrow-schema/src/datatype_parse.rs#L70-L96 (but it was added recently: apache/arrow-rs#8649)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can add to SLTs and expect it to fail on the cast (leaving the expected results as comments) so when we bump to arrow with that fix we will automatically know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the introduction of SlicePlan
| } | ||
| } | ||
|
|
||
| fn adjusted_from_index<O: OffsetSizeTrait>(index: i64, len: O) -> Result<Option<O>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this + adjusted_to_index are pulled out from their inner function general_array_slice verbatim without changes 👍
(Just a note for myself so I know there isn't an actual diff on the function)
| let field = match array.data_type() { | ||
| ListView(field) | LargeListView(field) => Arc::clone(field), | ||
| other => { | ||
| return internal_err!("array_slice got unexpected data type: {}", other); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let field = match array.data_type() { | |
| ListView(field) | LargeListView(field) => Arc::clone(field), | |
| other => { | |
| return internal_err!("array_slice got unexpected data type: {}", other); | |
| } | |
| }; | |
| let field = match array.data_type() { | |
| ListView(field) | LargeListView(field) => Arc::clone(field), | |
| _ => unreachable!() | |
| }; |
Given array is most definitely a GenericListViewArray
Though I wonder why we handle this differently to how its done in general_array_slice 🤔
defabd2 to
9dee50f
Compare
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good to go after fixing the CI checks
59175d6 to
8ff4008
Compare
…LargeListView` types (apache#18432) ## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes apache#123` indicates that this PR will close issue apache#123. --> - Closes apache#18351 ## Rationale for this change `array_slice` accepts `ListView` / `LargeListView` inputs. <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> ## What changes are included in this PR? - Extend array_slice_inner to handle `ListView`/`LargeListView` arrays directly. - Share the stride/bounds logic between list and list‑view implementations via a new `SlicePlan`. <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? Yes <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> Yes. `array_slice` now accepts `ListView` and `LargeListView` arrays without requiring an explicit cast.
…LargeListView` types (apache#18432) ## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes apache#123` indicates that this PR will close issue apache#123. --> - Closes apache#18351 ## Rationale for this change `array_slice` accepts `ListView` / `LargeListView` inputs. <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> ## What changes are included in this PR? - Extend array_slice_inner to handle `ListView`/`LargeListView` arrays directly. - Share the stride/bounds logic between list and list‑view implementations via a new `SlicePlan`. <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? Yes <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> Yes. `array_slice` now accepts `ListView` and `LargeListView` arrays without requiring an explicit cast.
Which issue does this PR close?
Rationale for this change
array_sliceacceptsListView/LargeListViewinputs.What changes are included in this PR?
ListView/LargeListViewarrays directly.SlicePlan.Are these changes tested?
Yes
Are there any user-facing changes?
Yes.
array_slicenow acceptsListViewandLargeListViewarrays without requiring an explicit cast.