Parquet Preserve BitMask #1037
tustvold added the enhancement label Dec 13, 2021
This was referenced Dec 13, 2021 (Closed)
tustvold added a commit to tustvold/arrow-rs that referenced this issue Dec 17, 2021
tustvold added a commit to tustvold/arrow-rs that referenced this issue Jan 11, 2022
tustvold added a commit to tustvold/arrow-rs that referenced this issue Jan 11, 2022
tustvold added a commit to tustvold/arrow-rs that referenced this issue Jan 11, 2022
alamb pushed a commit that referenced this issue Jan 13, 2022
alamb removed the enhancement label Jan 20, 2022
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently the parquet reader decodes definition levels into `[i16]` and then constructs a bitmask from them. When the max definition level is 1, the data is likely already encoded as bit-packed with a bit width of 1. It should be possible to use this encoded representation as is, without decoding it to `[i16]` and then re-encoding it as a bitmask.

FWIW parquet2 performs this optimisation.
Describe the solution you'd like
Reuse the already encoded bitmask directly
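
To illustrate the idea, here is a rough sketch in plain Rust of the two paths. It is not the reader's actual code: `bitmask_from_levels` and `bitmask_from_packed` are hypothetical names, and the sketch assumes the level data is a single bit-packed run of bit width 1 with the run header already stripped. RLE runs in the hybrid encoding would still need to be expanded, and that is not shown here. It relies on both Parquet bit-packed runs and Arrow validity bitmaps storing bits least-significant-bit first.

```rust
/// Roughly the current approach: the levels have already been decoded to
/// `i16`, and the validity bitmask is rebuilt from them one value at a time.
/// (Hypothetical helper, for illustration only.)
fn bitmask_from_levels(def_levels: &[i16], max_def_level: i16) -> Vec<u8> {
    let mut mask = vec![0u8; (def_levels.len() + 7) / 8];
    for (i, &level) in def_levels.iter().enumerate() {
        if level == max_def_level {
            // Set bit `i`, least-significant bit first within each byte.
            mask[i / 8] |= 1 << (i % 8);
        }
    }
    mask
}

/// Roughly the proposed approach: with a max definition level of 1 and a
/// bit width of 1, the bit-packed bytes already have the bitmask layout,
/// so they can be copied (or ideally shared) instead of decoded.
/// Assumes `packed` holds only bit-packed values, with no run headers,
/// and is at least `ceil(num_values / 8)` bytes long (panics otherwise).
/// (Hypothetical helper, for illustration only.)
fn bitmask_from_packed(packed: &[u8], num_values: usize) -> Vec<u8> {
    let num_bytes = (num_values + 7) / 8;
    let mut mask = packed[..num_bytes].to_vec();
    // Clear any padding bits beyond `num_values` in the final byte.
    let rem = num_values % 8;
    if rem != 0 {
        if let Some(last) = mask.last_mut() {
            *last &= (1u8 << rem) - 1;
        }
    }
    mask
}
```

For the levels `[1, 0, 1, 1, 0, 0, 1, 0, 1, 1]`, the bit-packed bytes would be `0b0100_1101, 0b0000_0011`, and both functions produce the same mask. The second one never materialises the intermediate `[i16]`.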