This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Removed some panics reading invalid parquet files #1106

Merged: 1 commit into main from bump_parquet, Jun 27, 2022


jorgecarleitao
Owner

@jorgecarleitao jorgecarleitao commented Jun 26, 2022

This PR removes common panics that occur when reading invalid parquet files.

It does not yet shield us from all panics: it covers only those caused by invalid metadata, negative lengths, and similar problems. Deserializing pages can still panic.
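As an illustration of the kind of change described above, here is a minimal sketch of turning a negative or out-of-bounds length into an error instead of a panic. The names `ReadError`, `checked_len`, and `checked_slice` are hypothetical and invented for this example; they are not the functions touched by this PR.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch (not the crate's real error type).
#[derive(Debug, PartialEq)]
enum ReadError {
    OutOfSpec(String),
}

/// Converts a signed length or offset, as stored in file metadata, into a
/// `usize`. A negative value yields an error rather than a panic or a wrap.
fn checked_len(raw: i64) -> Result<usize, ReadError> {
    usize::try_from(raw)
        .map_err(|_| ReadError::OutOfSpec(format!("negative length or offset: {}", raw)))
}

/// Returns `len` bytes starting at `offset`, validating both values first.
/// `slice::get` is used instead of indexing so that an out-of-bounds range
/// becomes an error instead of a panic.
fn checked_slice(buffer: &[u8], offset: i64, len: i64) -> Result<&[u8], ReadError> {
    let offset = checked_len(offset)?;
    let len = checked_len(len)?;
    // Guard against overflow when computing the end of the range.
    let end = offset
        .checked_add(len)
        .ok_or_else(|| ReadError::OutOfSpec("offset + length overflows".to_string()))?;
    buffer.get(offset..end).ok_or_else(|| {
        ReadError::OutOfSpec(format!(
            "range {}..{} exceeds buffer of {} bytes",
            offset,
            end,
            buffer.len()
        ))
    })
}
```

With this shape, a caller reading a file with corrupt metadata receives an `Err` it can report or skip, rather than aborting the process.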

@jorgecarleitao jorgecarleitao added the "enhancement" label (An improvement to an existing feature) Jun 26, 2022
@codecov

codecov bot commented Jun 26, 2022

Codecov Report

Merging #1106 (3415cd2) into main (b942a84) will decrease coverage by 0.00%.
The diff coverage is 86.86%.

@@            Coverage Diff             @@
##             main    #1106      +/-   ##
==========================================
- Coverage   81.33%   81.32%   -0.01%     
==========================================
  Files         367      367              
  Lines       35458    35414      -44     
==========================================
- Hits        28840    28801      -39     
+ Misses       6618     6613       -5     
Impacted Files Coverage Δ
src/io/parquet/read/deserialize/boolean/nested.rs 67.05% <28.57%> (ø)
src/io/parquet/read/deserialize/binary/nested.rs 75.42% <62.50%> (-1.70%) ⬇️
src/io/parquet/read/deserialize/binary/basic.rs 75.59% <74.07%> (+0.39%) ⬆️
src/io/parquet/read/deserialize/dictionary.rs 81.75% <85.71%> (+0.58%) ⬆️
src/io/parquet/read/deserialize/nested_utils.rs 83.63% <90.90%> (-0.31%) ⬇️
src/io/parquet/read/deserialize/boolean/basic.rs 94.21% <100.00%> (-0.28%) ⬇️
...arquet/read/deserialize/fixed_size_binary/basic.rs 95.53% <100.00%> (-0.31%) ⬇️
src/io/parquet/read/deserialize/primitive/basic.rs 95.60% <100.00%> (-0.17%) ⬇️
...rc/io/parquet/read/deserialize/primitive/nested.rs 88.23% <100.00%> (ø)
src/io/parquet/read/deserialize/utils.rs 78.46% <100.00%> (-0.31%) ⬇️
... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao jorgecarleitao changed the title from "Bumped parquet (against git) - removes some panics in reading" to "Bumped parquet (against git) - removes some panics reading invalid files" Jun 27, 2022
@jorgecarleitao jorgecarleitao marked this pull request as ready for review June 27, 2022 17:30
@jorgecarleitao jorgecarleitao changed the title from "Bumped parquet (against git) - removes some panics reading invalid files" to "Removed some panics reading invalid parquet files" Jun 27, 2022
@jorgecarleitao jorgecarleitao merged commit b14cd61 into main Jun 27, 2022
@jorgecarleitao jorgecarleitao deleted the bump_parquet branch June 27, 2022 20:36