Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't treat Vector{UInt8} as Arrow Binary type #439

Merged
merged 2 commits into from
May 19, 2023

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented May 18, 2023

Fixes #411. Alternative to #419.

This PR should be compatible with or without the ArrowTypes changes. I think it's fine to do compat things in Arrow like this as long as they don't get out of hand and we can eventually remove them as we bump required ArrowTypes versions and such.

The PR consists of not treating Vector{UInt8} as the Arrow Binary type, which is meant for "binary string"s. Julia has a pretty good match for that in Base.CodeUnits, so instead, we use that to write Binary and Vector{UInt8} is treated as a regular List of Primitive UInt8 type.

Fixes #411. Alternative to #419.

This PR should be compatible with or without the ArrowTypes changes.
I think it's fine to do compat things in Arrow like this as long as
they don't get out of hand and we can eventually remove them as we
bump required ArrowTypes versions and such.

The PR consists of not treating `Vector{UInt8}` as the Arrow Binary
type, which is meant for "binary string"s. Julia has a pretty good
match for that in `Base.CodeUnits`, so instead, we use that to write
Binary and `Vector{UInt8}` is treated as a regular List of Primitive UInt8
type.
@quinnj
Copy link
Member Author

quinnj commented May 18, 2023

Shoot, I forgot we run tests both ways; I'll add some more compat here

@codecov-commenter
Copy link

codecov-commenter commented May 18, 2023

Codecov Report

Attention: Patch coverage is 96.87500% with 1 line in your changes missing coverage. Please review.

Project coverage is 87.06%. Comparing base (e893c32) to head (a08bb51).
Report is 47 commits behind head on main.

Files with missing lines Patch % Lines
src/arraytypes/list.jl 96.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #439      +/-   ##
==========================================
+ Coverage   87.02%   87.06%   +0.03%     
==========================================
  Files          26       26              
  Lines        3261     3277      +16     
==========================================
+ Hits         2838     2853      +15     
- Misses        423      424       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@quinnj
Copy link
Member Author

quinnj commented May 19, 2023

This should be ready to review/merge; cc: @Moelf @baumgold

@quinnj quinnj merged commit 899ecb0 into main May 19, 2023
@quinnj quinnj deleted the jq-byte-vector-not-binary branch May 19, 2023 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Vector{UInt8} mis-represented when writing to disk
3 participants