Fix validation for offsets of StructArrays #942
Merged
Which issue does this PR close?
Resolves #940
Rationale for this change
It appears the arrow-rs slice logic for structs introduced in #389 by @nevi-me effectively slices the child data when a struct array is sliced.
The validation code in validate.cc (and perhaps the C++ code more broadly) assumes that the children are not sliced, so the logic I ported over from it is not correct for sliced structs. You can see from the comment that I was somewhat confused about the need for the offset even when it was originally introduced.
What changes are included in this PR?
In pictures
In pictures, here is what the test case looks like:
In Rust, when we do slice(1,3), the offset is applied to both the ArrayData and its children:
However, in the C++ validation logic, the assumption is that the children have no offsets (and the offset of the parent is applied):
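The difference between the two assumptions can be sketched with a toy model. This is only an illustration, not the arrow-rs implementation: `ToyData`, its fields, and both `valid_*` methods are hypothetical names made up here, and the 5-row parent with two children is an assumed example rather than the actual test case.

```rust
/// Toy stand-in for array metadata. NOT the real arrow-rs `ArrayData`.
#[derive(Debug, Clone)]
struct ToyData {
    offset: usize,
    len: usize,
    children: Vec<ToyData>,
}

impl ToyData {
    /// Slice as described above for Rust structs: the offset is applied
    /// to the parent and to every child.
    fn slice(&self, offset: usize, len: usize) -> ToyData {
        assert!(offset + len <= self.len, "slice out of bounds");
        ToyData {
            offset: self.offset + offset,
            len,
            children: self.children.iter().map(|c| c.slice(offset, len)).collect(),
        }
    }

    /// Check that assumes children are NOT sliced: each child must be
    /// long enough to cover the parent's offset plus its length.
    fn valid_assuming_unsliced_children(&self) -> bool {
        self.children.iter().all(|c| c.len >= self.offset + self.len)
    }

    /// Check that assumes children were sliced together with the parent:
    /// each child only needs to cover the parent's length.
    fn valid_assuming_sliced_children(&self) -> bool {
        self.children.iter().all(|c| c.len >= self.len)
    }
}

fn main() {
    // Assumed example: a 5-row struct with two 5-row children.
    let child = ToyData { offset: 0, len: 5, children: vec![] };
    let parent = ToyData { offset: 0, len: 5, children: vec![child.clone(), child] };

    let sliced = parent.slice(1, 3);
    println!("parent: offset={} len={}", sliced.offset, sliced.len);
    for c in &sliced.children {
        println!("child:  offset={} len={}", c.offset, c.len);
    }
    println!("unsliced-children check passes: {}", sliced.valid_assuming_unsliced_children());
    println!("sliced-children check passes:   {}", sliced.valid_assuming_sliced_children());
}
```

In this sketch the sliced children have length 3, so a check that expects them to cover offset + length (1 + 3 = 4) rejects a perfectly good sliced struct, while a check against the parent's length alone accepts it.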
Are there any user-facing changes?
Sliced StructArrays can be created without validation errors (or using unsafe).