[opt] minor compression ratio improvement #2983
Merged
Following @terrelln's diagnosis in #2980,
this is a small follow-up
which enforces a minimum cost of 1 bit per literal in the cost evaluation of the optimal parser stage.
This is consistent with the assumption of a later Huffman compression stage, since a Huffman code cannot spend less than 1 bit on a symbol.
As can be guessed, enforcing this minimum cost generally has no impact,
since in the vast majority of cases no literal is dominant.
As explained in #2980, this change is rather meant to take care of some special corner cases.
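To make the idea concrete, here is a minimal sketch of such a floor. It is an illustration under stated assumptions, not the code from this patch: `COST_ONE_BIT`, `highbit32`, `approx_log2_scaled` and `literal_cost` are hypothetical names, and the log2 approximation is deliberately coarse (whole bits only).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point scale: 256 cost units represent 1 bit. */
#define COST_ONE_BIT 256u

/* Index of the highest set bit, i.e. floor(log2(v)) for v > 0. */
static unsigned highbit32(uint32_t v)
{
    unsigned n = 0;
    while (v >>= 1) n++;
    return n;
}

/* Coarse log2(v), scaled by COST_ONE_BIT and rounded down to a whole bit. */
static uint32_t approx_log2_scaled(uint32_t v)
{
    return highbit32(v) * COST_ONE_BIT;
}

/* Estimated cost of one literal seen `freq` times out of `total` literals.
 * The statistical estimate is roughly log2(total / freq); it drops toward 0
 * when one literal dominates. The result is floored at 1 bit, because a
 * Huffman stage cannot code a symbol in less than 1 bit. */
static uint32_t literal_cost(uint32_t freq, uint32_t total)
{
    assert(freq > 0 && freq <= total);
    uint32_t const base = approx_log2_scaled(total);
    uint32_t const sym  = approx_log2_scaled(freq);
    uint32_t const cost = (base > sym) ? base - sym : 0;
    return (cost < COST_ONE_BIT) ? COST_ONE_BIT : cost;
}
```

With these definitions, a fully dominant literal such as `literal_cost(1000, 1000)` is priced at `COST_ONE_BIT` instead of 0, while a rare literal such as `literal_cost(1, 256)` keeps its unclamped price of `8 * COST_ONE_BIT`.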
Nonetheless, I could find 2 files from public corpora for which the new policy results in a small compression difference,
favorable in both cases:
The benefits also extend to some of the regression tests.