This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Accept decoding parquet's i64 into u32 written by pyarrow #1090

Merged
merged 1 commit into from
Jun 22, 2022

Conversation

jorgecarleitao
Owner

Although the parquet specification states that uint32 values are encoded in the i32 physical type, pyarrow writes UInt32 columns in v1 files using parquet's i64 physical type, with no logical annotation but with an arrow schema that declares the column as UInt32.

This PR makes us accept this configuration when reading parquet (from i64 to u32).
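
For illustration, the narrowing from i64 to u32 could look roughly like the sketch below. It is a minimal, standalone example and not the code in this PR: `decode_i64_as_u32` is a hypothetical helper, not part of this crate's API, and rejecting out-of-range values with an error is an assumption made for the example.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (an assumption for illustration, not this crate's API):
/// narrows optional values read from parquet's i64 physical type to u32.
/// Nulls are preserved; values outside 0..=u32::MAX produce an error.
fn decode_i64_as_u32(values: &[Option<i64>]) -> Result<Vec<Option<u32>>, String> {
    values
        .iter()
        .map(|value| match value {
            // Checked conversion: fails for negative values and values above u32::MAX.
            Some(v) => u32::try_from(*v)
                .map(Some)
                .map_err(|_| format!("value {} does not fit in u32", v)),
            // Null slots stay null.
            None => Ok(None),
        })
        .collect()
}
```

A checked conversion is used here so that unexpected values surface as errors; a plain `as u32` cast would instead truncate silently.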

Thanks @mhtrinhLIC for reporting this at pola-rs/polars#3754

@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Jun 21, 2022
@jorgecarleitao jorgecarleitao changed the title from "Accept decoding u32 from parquet's i64 written by pyarrow" to "Accept decoding parquet's i64 into u32 written by pyarrow" Jun 21, 2022
@codecov

codecov bot commented Jun 21, 2022

Codecov Report

Merging #1090 (6c69c6b) into main (d1ab4ef) will increase coverage by 0.04%.
The diff coverage is 34.78%.

@@            Coverage Diff             @@
##             main    #1090      +/-   ##
==========================================
+ Coverage   81.07%   81.12%   +0.04%     
==========================================
  Files         367      367              
  Lines       35309    35503     +194     
==========================================
+ Hits        28628    28802     +174     
- Misses       6681     6701      +20     
Impacted Files Coverage Δ
src/io/parquet/read/deserialize/mod.rs 53.95% <0.00%> (-2.33%) ⬇️
src/io/parquet/read/statistics/mod.rs 92.33% <37.50%> (-1.35%) ⬇️
src/io/parquet/read/deserialize/simple.rs 54.91% <72.22%> (+0.10%) ⬆️
src/bitmap/mutable.rs 98.17% <0.00%> (+0.65%) ⬆️
src/io/ipc/read/reader.rs 96.18% <0.00%> (+0.76%) ⬆️
src/bitmap/immutable.rs 85.71% <0.00%> (+2.69%) ⬆️
src/chunk.rs 90.47% <0.00%> (+7.14%) ⬆️


@jorgecarleitao jorgecarleitao merged commit 5569595 into main Jun 22, 2022
@jorgecarleitao jorgecarleitao deleted the fix_error branch June 22, 2022 04:06