Add arrow feature to re_chunk and conversions to RecordBatch #7355
What
Basic type conversions from TransportChunk to RecordBatch and back.
Adding the round-trip test turned up an interesting issue.
The TransportChunk <-> RecordBatch conversion fails to round-trip because we lose the ExtensionType encapsulation that arrow2 used to encode.
On the surface this isn't immediately problematic, since we don't care about ExtensionTypes. However, the discussion linked below indicates that there will be very real pain points when writing semantic data processing engines with arrow-rs: the metadata is attached to the Field, not the DataType, and there are many processing contexts in which the field itself is no longer available.
apache/arrow-rs#4472
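To illustrate the Field-versus-DataType point, here is a minimal sketch using only the standard library. The `DataType`, `Field`, and `Array` types and the helper functions are simplified, hypothetical stand-ins rather than the arrow-rs API, and `example.Radius` is a made-up extension name. Only the `ARROW:extension:name` metadata key follows the Arrow convention. It shows that once a step hands on a bare array, a rebuilt field can recover the data type but not the metadata.

```rust
use std::collections::HashMap;

/// Toy stand-in for an Arrow data type (hypothetical, not the arrow-rs enum).
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Float32,
}

/// Toy stand-in for an Arrow field: name, data type, and key/value metadata.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    metadata: HashMap<String, String>,
}

/// Toy stand-in for an Arrow array: it knows its data type, but not its field.
#[derive(Clone, Debug, PartialEq)]
struct Array {
    data_type: DataType,
    values: Vec<f32>,
}

/// Builds a Float32 field tagged with an extension name in its metadata.
fn tagged_field(name: &str, extension_name: &str) -> Field {
    let mut metadata = HashMap::new();
    metadata.insert(
        "ARROW:extension:name".to_owned(),
        extension_name.to_owned(),
    );
    Field {
        name: name.to_owned(),
        data_type: DataType::Float32,
        metadata,
    }
}

/// Rebuilds a field from a bare array, as a processing step might after
/// operating on the array alone. Only the data type can be recovered.
fn field_from_array(name: &str, array: &Array) -> Field {
    Field {
        name: name.to_owned(),
        data_type: array.data_type.clone(),
        metadata: HashMap::new(),
    }
}

fn main() {
    let original = tagged_field("radius", "example.Radius");
    let array = Array {
        data_type: original.data_type.clone(),
        values: vec![1.0, 2.0],
    };

    let rebuilt = field_from_array(&original.name, &array);

    // The data type survives, the field-level metadata does not.
    assert_eq!(rebuilt.data_type, original.data_type);
    assert!(rebuilt.metadata.is_empty());
    assert_ne!(rebuilt, original);

    println!("array has {} values", array.values.len());
    println!("original metadata: {:?}", original.metadata);
    println!("rebuilt metadata:  {:?}", rebuilt.metadata);
}
```

In this toy model the only way to keep the tag is to carry the `Field` (or its metadata) alongside the array through every step, which is the kind of bookkeeping the linked discussion is concerned with.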
Checklist
main build: rerun.io/viewer
nightly build: rerun.io/viewer
CHANGELOG.md and the migration guide
To run all checks from main, comment on the PR with @rerun-bot full-check.