What is the highest compression level in gzip? #6282
Comments
It seems flate2 documentation is wrong.

    /// Returns an integer representing the compression level, typically on a
    /// scale of 0-9
    pub fn level(&self) -> u32 {
        self.0
    }

But internally, inside |
@JakkuSakura After discussing with For consistency with zlib (which supports up to 9), the documentation states the compression range as I've opened a pull request in flate2 to explicitly mention this in the docs, so that there's no confusion around this behavior. I hope this resolves your query? |
One more thing to add here. Parquet (and the entire arrow project) uses The PR I created in @alamb @tustvold what do you think? And should we close this? |
Maybe we can add a note to the arrow documentation with a link to flate2 and close this issue? |
Sounds like a good idea. I'll make this part of the #37 exercise itself. |
@alamb @tustvold have a look please before I add anything to the docs. rust-lang/flate2-rs#427 (comment) and rust-lang/flate2-rs#429 |
I recommend linking to the docs added in rust-lang/flate2-rs#430 -- they are pretty clear to me. Basically we can say the max is 10 but offer the caveat that nothing else will be able to read the parquet files |
- add pragma `#![warn(missing_docs)]` to the following
  - `arrow-flight`
  - `arrow-ipc`
  - `arrow-integration-test`
  - `arrow-integration-testing`
  - `object_store`
- also document the caveat with using level 10 GZIP compression in parquet. See apache#6282.
Thanks for clarifying everything here. After the docs changes, nobody should be confused by this. |
Summarizing for future visitors:
Parquet (and Arrow) uses Users should not use level 10 compression with parquet if they intend to read the file with other readers as well. References: rust-lang/flate2-rs#427 (comment) |
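For anyone who wants to guard against this in their own code, here is a minimal sketch of the rule as discussed in this thread: levels 0 through 9 are the zlib-compatible range, and level 10 is accepted but other readers may not be able to read the result. The names `check_gzip_level`, `LevelCheck`, and both constants are hypothetical helpers made up for illustration; they are not part of flate2, parquet, or arrow.

```rust
/// Highest level in the zlib-compatible range, per the discussion above.
/// (Hypothetical constant, not a flate2 or parquet item.)
const ZLIB_MAX_LEVEL: u32 = 9;

/// Highest level this thread says is accepted at all.
/// (Hypothetical constant, not a flate2 or parquet item.)
const ACCEPTED_MAX_LEVEL: u32 = 10;

/// Outcome of checking a requested gzip level against the two limits.
#[derive(Debug, PartialEq, Eq)]
enum LevelCheck {
    /// Within 0-9: files should be readable by other gzip readers.
    Portable(u32),
    /// Level 10: accepted, but other readers may fail on the file.
    NonPortable(u32),
    /// Above 10: outside the range discussed in this thread.
    Invalid(u32),
}

/// Classify a requested level. This only encodes the rule from the thread;
/// it does not call into any compression library.
fn check_gzip_level(level: u32) -> LevelCheck {
    if level <= ZLIB_MAX_LEVEL {
        LevelCheck::Portable(level)
    } else if level <= ACCEPTED_MAX_LEVEL {
        LevelCheck::NonPortable(level)
    } else {
        LevelCheck::Invalid(level)
    }
}
```

A caller that writes parquet files for other tools could refuse anything that is not `LevelCheck::Portable` before building its writer configuration, and treat `NonPortable` as an explicit opt-in.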
* chore: add docs, part of #37
  - add pragma `#![warn(missing_docs)]` to the following
    - `arrow-flight`
    - `arrow-ipc`
    - `arrow-integration-test`
    - `arrow-integration-testing`
    - `object_store`
  - also document the caveat with using level 10 GZIP compression in parquet. See #6282.
* chore: resolve PR comments from #6453
Which part is this question about
What is the highest compression level in gzip?
Describe your question
I see from other sources, including flate2, that the highest compression level for gzip is 9 instead of 10. If we pass 10, it should be accepted by parquet but rejected by flate2. Am I misunderstanding something?