[opt] minor compression ratio improvement #2983

Merged: Cyan4973 merged 3 commits into dev from minLitPricev2 on Jan 21, 2022

Cyan4973 commented Jan 7, 2022

Following @terrelln's diagnosis in #2980,
this is a small follow-up
which enforces a minimum cost of 1 bit per literal in the cost evaluation of the optimal parser stage.
This is consistent with the hypothesis of a later Huffman compression stage, since a Huffman code cannot spend less than 1 bit on a symbol.
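
To illustrate the idea, here is a minimal sketch of a clamped literal price, not the actual zstd code. All names (`log2_fp`, `literal_price_raw`, `literal_price`, `BITCOST_MULTIPLIER`) and the fixed-point scale of 256 units per bit are assumptions made for this example. The price of a literal is estimated as log2(total / freq), then raised to 1 bit whenever the estimate falls below that.

```c
#include <assert.h>
#include <stdint.h>

/* Costs are in fixed point: 256 units == 1 bit (illustrative choice). */
#define BITCOST_ACCURACY   8
#define BITCOST_MULTIPLIER (1u << BITCOST_ACCURACY)

/* floor(log2(v)) for v > 0. */
static unsigned highbit32(uint32_t v)
{
    unsigned n = 0;
    assert(v > 0);
    while (v >>= 1) n++;
    return n;
}

/* Piecewise-linear approximation of log2(v), in 1/256th of a bit.
 * Exact when v is a power of two. */
static uint32_t log2_fp(uint32_t v)
{
    unsigned const hb = highbit32(v);
    /* mantissa lies in [256, 512): v scaled so its top bit sits at bit 8 */
    uint32_t const mantissa =
        (uint32_t)(((uint64_t)v << BITCOST_ACCURACY) >> hb);
    return hb * BITCOST_MULTIPLIER + (mantissa - BITCOST_MULTIPLIER);
}

/* Unclamped estimate: log2(total) - log2(freq).
 * Approaches 0 when one literal dominates the statistics. */
static uint32_t literal_price_raw(uint32_t freq, uint32_t total)
{
    assert(freq > 0 && freq <= total);
    return log2_fp(total) - log2_fp(freq);
}

/* Clamped estimate: never below 1 bit, matching what a Huffman
 * code can actually achieve per symbol. */
static uint32_t literal_price(uint32_t freq, uint32_t total)
{
    uint32_t const raw = literal_price_raw(freq, total);
    return raw < BITCOST_MULTIPLIER ? BITCOST_MULTIPLIER : raw;
}
```

With these definitions, a literal seen 255 times out of 256 has a raw estimate of 2 units (2/256 of a bit), while `literal_price(255, 256)` returns 256 units, i.e. 1 bit. A literal seen once out of 4 costs 2 bits either way, so the clamp only matters when a literal is dominant.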

As can be expected, enforcing this minimum cost generally has no impact,
since in the vast majority of cases, no literal is dominant.
As explained in #2980, this change is rather meant to take care of some special corner cases.
Nonetheless, I could find 2 files from public corpora for which the new policy results in a small compression difference,
favorable in both cases:

| file        | v1.5.1  | this PR |
| ----------- | ------- | ------- |
| calgary/pic | 43534   | 43492   |
| silesia/mr  | 3114356 | 3114063 |

The benefits also extend to some of the regression tests.

Cyan4973 merged commit 71921e5 into dev on Jan 21, 2022
Cyan4973 deleted the minLitPricev2 branch on January 13, 2023 04:28
Cyan4973 mentioned this pull request on Feb 9, 2023