Parquet: Support Encoding::BYTE_STREAM_SPLIT #4102

Closed
simonvandel opened this issue Apr 20, 2023 · 3 comments · Fixed by #5293
Labels
enhancement (any new improvement worthy of an entry in the changelog), help wanted

Comments

@simonvandel
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to evaluate whether the BYTE_STREAM_SPLIT encoding helps a Float64 column compress better. However, it does not seem to be supported yet:

e => return Err(nyi_err!("Encoding {} is not supported", e)),

Describe the solution you'd like
An implementation of the encoding. Even a naive, non-optimized version would resolve this issue. The implementation can be improved iteratively.
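
For reference, a naive version of the kind described above could look like the following sketch. It is not the code that eventually closed this issue. The function names `byte_stream_split_encode` and `byte_stream_split_decode` are hypothetical, and the sketch uses only the standard library rather than the parquet crate's encoder traits. It assumes the layout from the Parquet format description: for a K-byte type, byte k of every value goes into stream k, and the K streams are concatenated. It handles f64 only (K = 8) and uses little-endian byte order.

```rust
/// Naive BYTE_STREAM_SPLIT encoder for f64 values (K = 8 byte streams).
/// Hypothetical helper for illustration; not part of the parquet crate.
fn byte_stream_split_encode(values: &[f64]) -> Vec<u8> {
    const K: usize = std::mem::size_of::<f64>();
    let n = values.len();
    let mut out = vec![0u8; n * K];
    for (i, v) in values.iter().enumerate() {
        // Take the little-endian bytes of the value.
        for (k, b) in v.to_le_bytes().iter().enumerate() {
            // Byte k of value i lands at position i of stream k.
            out[k * n + i] = *b;
        }
    }
    out
}

/// Inverse of the encoder above. Returns None if the input length
/// is not a multiple of 8, since it cannot hold whole f64 values.
fn byte_stream_split_decode(data: &[u8]) -> Option<Vec<f64>> {
    const K: usize = std::mem::size_of::<f64>();
    if data.len() % K != 0 {
        return None;
    }
    let n = data.len() / K;
    let mut values = Vec::with_capacity(n);
    for i in 0..n {
        // Gather byte i from each of the K streams to rebuild one value.
        let mut bytes = [0u8; K];
        for k in 0..K {
            bytes[k] = data[k * n + i];
        }
        values.push(f64::from_le_bytes(bytes));
    }
    Some(values)
}
```

The encoding does not shrink the data by itself. It only groups bytes of similar significance together, which may help a general-purpose compressor applied afterwards. Whether it helps a given Float64 column still has to be measured.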

Describe alternatives you've considered
PyArrow seems to support it, but I would really like to stay within the Rust world.

Additional context

simonvandel added the enhancement label on Apr 20, 2023
@simonvandel
Contributor Author

I'll give it a go myself, if possible.

@Weijun-H
Member

Hello @simonvandel, I was wondering if you're currently working on this task. If not, I would be happy to take on the project.

@simonvandel
Contributor Author

Hi @Weijun-H, I have a working version that I can clean up a bit. I'll update you here.
