This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Simplified inferring arrow schema from a parquet schema #819

Merged
merged 2 commits into from
Feb 6, 2022


jorgecarleitao (Owner) commented Feb 6, 2022

With this change, inferring the schema no longer returns an error when it encounters unknown logical or converted types; it instead falls back to the natural type (e.g. an int64 parquet type with an unknown logical type is read as DataType::Int64).

Backward incompatible

  • The function that does this was renamed from io::parquet::read::get_schema to io::parquet::read::infer_schema
  • The function no longer errors on unknown parquet logical types; it errors only on an unreadable arrow schema declared in the ARROW:schema metadata
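
The fallback behaviour described above can be sketched in isolation. This is a minimal, self-contained model and not the code from this PR: the enums `PhysicalType`, `LogicalType` and `DataType` and the functions `natural_type` and `infer_data_type` are hypothetical, simplified stand-ins for the real types, and only a handful of variants are shown.

```rust
// Hypothetical, simplified stand-ins; these are NOT the library's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PhysicalType {
    Int32,
    Int64,
    Double,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum LogicalType {
    Date,
    TimestampMillis,
    // Stands for any annotation the reader does not recognise.
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Float64,
    Date32,
    TimestampMs,
}

// The "natural" arrow type of a parquet type, ignoring any annotation.
fn natural_type(physical: PhysicalType) -> DataType {
    match physical {
        PhysicalType::Int32 => DataType::Int32,
        PhysicalType::Int64 => DataType::Int64,
        PhysicalType::Double => DataType::Float64,
    }
}

// Infallible inference: known (physical, logical) pairs map to a richer
// type; everything else falls back to the natural type instead of erroring.
fn infer_data_type(physical: PhysicalType, logical: Option<LogicalType>) -> DataType {
    match (physical, logical) {
        (PhysicalType::Int32, Some(LogicalType::Date)) => DataType::Date32,
        (PhysicalType::Int64, Some(LogicalType::TimestampMillis)) => DataType::TimestampMs,
        // Unknown or mismatched annotation, or no annotation at all.
        _ => natural_type(physical),
    }
}
```

In this sketch, `infer_data_type(PhysicalType::Int64, Some(LogicalType::Unknown))` evaluates to `DataType::Int64`, mirroring the int64 example in the description. Returning a plain value rather than a `Result` is what makes the per-column step unable to fail; in the PR, the remaining error path is the unreadable ARROW:schema case.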

codecov bot commented Feb 6, 2022

Codecov Report

Merging #819 (52dfa5b) into main (f02da8a) will decrease coverage by 0.01%.
The diff coverage is 74.54%.


@@            Coverage Diff             @@
##             main     #819      +/-   ##
==========================================
- Coverage   71.28%   71.26%   -0.02%     
==========================================
  Files         327      327              
  Lines       17571    17544      -27     
==========================================
- Hits        12525    12503      -22     
+ Misses       5046     5041       -5     
Impacted Files                            Coverage Δ
src/io/parquet/read/mod.rs                81.10% <ø> (ø)
src/io/parquet/read/statistics/fixlen.rs  48.48% <ø> (-1.52%) ⬇️
src/io/parquet/read/schema/convert.rs     74.34% <71.13%> (-0.52%) ⬇️
src/io/parquet/read/file.rs               69.38% <100.00%> (ø)
src/io/parquet/read/schema/metadata.rs    85.00% <100.00%> (+1.00%) ⬆️
src/io/parquet/read/schema/mod.rs         100.00% <100.00%> (ø)
src/io/parquet/read/statistics/mod.rs     78.94% <100.00%> (ø)
src/bitmap/utils/slice_iterator.rs        91.04% <0.00%> (-1.50%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao jorgecarleitao merged commit 4fbbd90 into main Feb 6, 2022
@jorgecarleitao jorgecarleitao deleted the parquet_simpler branch February 6, 2022 10:33