-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zstd: Improve best compression's match selection #705
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The best encoder selects matches based on the criterion a.est+(a.s-b.s)*bitsPerByte>>10 < b.est+(b.s-a.s)*bitsPerByte>>10 If this were computed on the reals, it would be equivalent to a.est < b.est, so the added terms only capture round-off error (this is also why CSE doesn't eliminate them). Changing the formula to a.est-b.est+(a.s-b.s)*bitsPerByte>>10 < 0 captures the intention better, I think, and improves compression: enwik9 260989017 259699309 -0.4942% silesia/dickens 3233958 3222189 -0.3639% silesia/mozilla 16980973 16912341 -0.4042% silesia/mr 3505223 3505553 0.0094% silesia/nci 2313702 2289871 -1.0300% silesia/ooffice 2915199 2896410 -0.6445% silesia/osdb 3364752 3390871 0.7763% silesia/reymont 1658404 1656006 -0.1446% silesia/samba 4330660 4326783 -0.0895% silesia/sao 5399736 5416932 0.3185% silesia/webster 9987784 9966351 -0.2146% silesia/xml 542081 538378 -0.6831% silesia/x-ray 5756210 5733061 -0.4022% ... as well as throughput: name old speed new speed delta Encoder_EncodeAllSimple/best-8 12.1MB/s ± 1% 12.2MB/s ± 1% +1.17% (p=0.000 n=18+20) Encoder_EncodeAllSimple4K/best-8 10.4MB/s ± 1% 10.5MB/s ± 1% +0.82% (p=0.000 n=20+20)
klauspost
approved these changes
Dec 2, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏼
kodiakhq bot
referenced
this pull request
in cloudquery/cloudquery
Feb 1, 2023
…7575) This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | patch | `v1.15.11` -> `v1.15.15` | --- ### Release Notes <details> <summary>klauspost/compress</summary> ### [`v1.15.15`](https://togithub.com/klauspost/compress/releases/tag/v1.15.15) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.14...v1.15.15) ##### What's Changed - zstd: Add delta encoding support by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/728](https://togithub.com/klauspost/compress/pull/728) - huff0: Reduce bounds checking by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/734](https://togithub.com/klauspost/compress/pull/734) - huff0: Assembler improvements by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/736](https://togithub.com/klauspost/compress/pull/736) - deflate: Improve level 7-9 by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/739](https://togithub.com/klauspost/compress/pull/739) - gzhttp: Add SuffixETag() and DropETag() options to prevent ETag collisions on compressed responses by [@​willbicks](https://togithub.com/willbicks) in [https://github.com/klauspost/compress/pull/740](https://togithub.com/klauspost/compress/pull/740) - zstd: Don't allocate dataStorage when using byteBuf by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/741](https://togithub.com/klauspost/compress/pull/741) - huff0: Speed up compression of short blocks by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/744](https://togithub.com/klauspost/compress/pull/744) - zstd: Handle dicts by pointer, always by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/743](https://togithub.com/klauspost/compress/pull/743) - fse: Optimize compression by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/745](https://togithub.com/klauspost/compress/pull/745) - Retract v1.14.1-v.1.14.3 by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/742](https://togithub.com/klauspost/compress/pull/742) ##### New Contributors - [@​willbicks](https://togithub.com/willbicks) made their first contribution in [https://github.com/klauspost/compress/pull/740](https://togithub.com/klauspost/compress/pull/740) **Full Changelog**: klauspost/compress@v1.15.14...v1.15.15 ### [`v1.15.14`](https://togithub.com/klauspost/compress/releases/tag/v1.15.14) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.13...v1.15.14) #### What's Changed - flate: Improve speed in big stateless blocks. by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/718](https://togithub.com/klauspost/compress/pull/718) - zstd: Trigger BCE by switching on lengths by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/716](https://togithub.com/klauspost/compress/pull/716) - zstd: Shave some instructions off the amd64 asm by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/720](https://togithub.com/klauspost/compress/pull/720) - export NoGzipResponseWriter for custom ResponseWriter wrappers by [@​harshavardhana](https://togithub.com/harshavardhana) in [https://github.com/klauspost/compress/pull/722](https://togithub.com/klauspost/compress/pull/722) - s2: Add example for indexing and existing stream by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/723](https://togithub.com/klauspost/compress/pull/723) - tests: Tweak fuzz tests by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/719](https://togithub.com/klauspost/compress/pull/719) #### New Contributors - [@​harshavardhana](https://togithub.com/harshavardhana) made their first contribution in [https://github.com/klauspost/compress/pull/722](https://togithub.com/klauspost/compress/pull/722) **Full Changelog**: klauspost/compress@v1.15.13...v1.15.14 ### [`v1.15.13`](https://togithub.com/klauspost/compress/releases/tag/v1.15.13) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.12...v1.15.13) #### What's Changed - zstd: Add MaxEncodedSize to encoder by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/691](https://togithub.com/klauspost/compress/pull/691) - zstd: Improve "best" end search by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/693](https://togithub.com/klauspost/compress/pull/693) - zstd: Replace bytes.Equal with smaller comparisons by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/695](https://togithub.com/klauspost/compress/pull/695) - zstd: Faster CRC checking/skipping by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/696](https://togithub.com/klauspost/compress/pull/696) - zstd: Rewrite matchLen to make it inlineable by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/701](https://togithub.com/klauspost/compress/pull/701) - zstd: Write table clearing in a way that the compiler recognizes by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/702](https://togithub.com/klauspost/compress/pull/702) - zstd: Use individual reset threshold by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/703](https://togithub.com/klauspost/compress/pull/703) - huff0: Check for zeros earlier in Scratch.countSimple by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/704](https://togithub.com/klauspost/compress/pull/704) - zstd: Improve best compression's match selection by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/705](https://togithub.com/klauspost/compress/pull/705) - zstd: Select best match using selection trees by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/706](https://togithub.com/klauspost/compress/pull/706) - zstd: sync xxhash with final accepted patch upstream by [@​lizthegrey](https://togithub.com/lizthegrey) in [https://github.com/klauspost/compress/pull/707](https://togithub.com/klauspost/compress/pull/707) - zstd: Import xxhash v2.2.0 by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/708](https://togithub.com/klauspost/compress/pull/708) **Full Changelog**: klauspost/compress@v1.15.12...v1.15.13 ### [`v1.15.12`](https://togithub.com/klauspost/compress/releases/tag/v1.15.12) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.11...v1.15.12) ##### What's Changed - zstd: Tweak decoder allocs. by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/680](https://togithub.com/klauspost/compress/pull/680) - gzhttp: Always delete `HeaderNoCompression` by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/683](https://togithub.com/klauspost/compress/pull/683) **Full Changelog**: klauspost/compress@v1.15.11...v1.15.12 </details> --- ### Configuration 📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNC4xMDkuMSIsInVwZGF0ZWRJblZlciI6IjM0LjEwOS4xIn0=-->
kodiakhq bot
referenced
this pull request
in cloudquery/filetypes
Mar 1, 2023
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | minor | `v1.15.11` -> `v1.16.0` | --- ### ⚠ Dependency Lookup Warnings ⚠ Warnings were logged while processing this repo. Please check the Dependency Dashboard for more information. --- ### Release Notes <details> <summary>klauspost/compress</summary> ### [`v1.16.0`](https://togithub.com/klauspost/compress/releases/tag/v1.16.0) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.15...v1.16.0) #### What's Changed - s2: Add Dictionary support by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/685](https://togithub.com/klauspost/compress/pull/685) - s2: Add Compression Size Estimate by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/752](https://togithub.com/klauspost/compress/pull/752) - s2: Add support for custom stream encoder by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/755](https://togithub.com/klauspost/compress/pull/755) - s2: Add LZ4 block converter by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/748](https://togithub.com/klauspost/compress/pull/748) - s2: Support io.ReaderAt in ReadSeeker by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/747](https://togithub.com/klauspost/compress/pull/747) - s2c/s2sx: Use concurrent decoding by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/746](https://togithub.com/klauspost/compress/pull/746) - tests: Upgrade to Go 1.20 by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/749](https://togithub.com/klauspost/compress/pull/749) - Update all (command) dependencies by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/758](https://togithub.com/klauspost/compress/pull/758) **Full Changelog**: klauspost/compress@v1.15.15...v1.16.0 ### [`v1.15.15`](https://togithub.com/klauspost/compress/releases/tag/v1.15.15) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.14...v1.15.15) #### What's Changed - zstd: Add delta encoding support by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/728](https://togithub.com/klauspost/compress/pull/728) - huff0: Reduce bounds checking by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/734](https://togithub.com/klauspost/compress/pull/734) - huff0: Assembler improvements by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/736](https://togithub.com/klauspost/compress/pull/736) - deflate: Improve level 7-9 by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/739](https://togithub.com/klauspost/compress/pull/739) - gzhttp: Add SuffixETag() and DropETag() options to prevent ETag collisions on compressed responses by [@​willbicks](https://togithub.com/willbicks) in [https://github.com/klauspost/compress/pull/740](https://togithub.com/klauspost/compress/pull/740) - zstd: Don't allocate dataStorage when using byteBuf by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/741](https://togithub.com/klauspost/compress/pull/741) - huff0: Speed up compression of short blocks by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/744](https://togithub.com/klauspost/compress/pull/744) - zstd: Handle dicts by pointer, always by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/743](https://togithub.com/klauspost/compress/pull/743) - fse: Optimize compression by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/745](https://togithub.com/klauspost/compress/pull/745) - Retract v1.14.1-v.1.14.3 by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/742](https://togithub.com/klauspost/compress/pull/742) #### New Contributors - [@​willbicks](https://togithub.com/willbicks) made their first contribution in [https://github.com/klauspost/compress/pull/740](https://togithub.com/klauspost/compress/pull/740) **Full Changelog**: klauspost/compress@v1.15.14...v1.15.15 ### [`v1.15.14`](https://togithub.com/klauspost/compress/releases/tag/v1.15.14) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.13...v1.15.14) #### What's Changed - flate: Improve speed in big stateless blocks. by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/718](https://togithub.com/klauspost/compress/pull/718) - zstd: Trigger BCE by switching on lengths by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/716](https://togithub.com/klauspost/compress/pull/716) - zstd: Shave some instructions off the amd64 asm by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/720](https://togithub.com/klauspost/compress/pull/720) - export NoGzipResponseWriter for custom ResponseWriter wrappers by [@​harshavardhana](https://togithub.com/harshavardhana) in [https://github.com/klauspost/compress/pull/722](https://togithub.com/klauspost/compress/pull/722) - s2: Add example for indexing and existing stream by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/723](https://togithub.com/klauspost/compress/pull/723) - tests: Tweak fuzz tests by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/719](https://togithub.com/klauspost/compress/pull/719) #### New Contributors - [@​harshavardhana](https://togithub.com/harshavardhana) made their first contribution in [https://github.com/klauspost/compress/pull/722](https://togithub.com/klauspost/compress/pull/722) **Full Changelog**: klauspost/compress@v1.15.13...v1.15.14 ### [`v1.15.13`](https://togithub.com/klauspost/compress/releases/tag/v1.15.13) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.12...v1.15.13) #### What's Changed - zstd: Add MaxEncodedSize to encoder by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/691](https://togithub.com/klauspost/compress/pull/691) - zstd: Improve "best" end search by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/693](https://togithub.com/klauspost/compress/pull/693) - zstd: Replace bytes.Equal with smaller comparisons by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/695](https://togithub.com/klauspost/compress/pull/695) - zstd: Faster CRC checking/skipping by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/696](https://togithub.com/klauspost/compress/pull/696) - zstd: Rewrite matchLen to make it inlineable by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/701](https://togithub.com/klauspost/compress/pull/701) - zstd: Write table clearing in a way that the compiler recognizes by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/702](https://togithub.com/klauspost/compress/pull/702) - zstd: Use individual reset threshold by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/703](https://togithub.com/klauspost/compress/pull/703) - huff0: Check for zeros earlier in Scratch.countSimple by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/704](https://togithub.com/klauspost/compress/pull/704) - zstd: Improve best compression's match selection by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/705](https://togithub.com/klauspost/compress/pull/705) - zstd: Select best match using selection trees by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/706](https://togithub.com/klauspost/compress/pull/706) - zstd: sync xxhash with final accepted patch upstream by [@​lizthegrey](https://togithub.com/lizthegrey) in [https://github.com/klauspost/compress/pull/707](https://togithub.com/klauspost/compress/pull/707) - zstd: Import xxhash v2.2.0 by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/708](https://togithub.com/klauspost/compress/pull/708) **Full Changelog**: klauspost/compress@v1.15.12...v1.15.13 ### [`v1.15.12`](https://togithub.com/klauspost/compress/releases/tag/v1.15.12) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.11...v1.15.12) #### What's Changed - zstd: Tweak decoder allocs. by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/680](https://togithub.com/klauspost/compress/pull/680) - gzhttp: Always delete `HeaderNoCompression` by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/683](https://togithub.com/klauspost/compress/pull/683) **Full Changelog**: klauspost/compress@v1.15.11...v1.15.12 </details> --- ### Configuration 📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNC4xMDkuMSIsInVwZGF0ZWRJblZlciI6IjM0LjE1NC4wIn0=-->
greatroar
added a commit
to greatroar/compress
that referenced
this pull request
Mar 12, 2023
name old speed new speed delta Encoder_EncodeAllSimple/best-8 14.8MB/s ± 3% 20.7MB/s ± 3% +39.53% (p=0.000 n=17+19) Encoder_EncodeAllSimple4K/best-8 11.8MB/s ± 1% 19.2MB/s ± 6% +62.17% (p=0.000 n=20+20) name old alloc/op new alloc/op delta Encoder_EncodeAllSimple/best-8 14.0B ± 0% 10.2B ± 8% -27.07% (p=0.000 n=16+19) Encoder_EncodeAllSimple4K/best-8 1.00B ± 0% 0.00B -100.00% (p=0.000 n=20+19) Also, compressing enwik9 takes 6.375% less wall clock time. Output from silesia corpus and enwik9 is about .05% bigger, due to the different order in which comparisons are done: dickens 3222189 3220994 (× 0.99963) enwik9 259699309 259846164 (× 1.00057) mozilla 16912341 16912437 (× 1.00001) mr 3505553 3502823 (× 0.99922) nci 2289871 2306320 (× 1.00718) ooffice 2896410 2896907 (× 1.00017) osdb 3390871 3390548 (× 0.99990) reymont 1656006 1657380 (× 1.00083) samba 4326783 4329898 (× 1.00072) sao 5416932 5416648 (× 0.99995) webster 9966351 9972808 (× 1.00065) xml 538378 542277 (× 1.00724) x-ray 5733061 5733121 (× 1.00001) total 319554055 319728325 (× 1.00055) This is still smaller than before klauspost#705.
klauspost
pushed a commit
that referenced
this pull request
Mar 13, 2023
name old speed new speed delta Encoder_EncodeAllSimple/best-8 14.8MB/s ± 3% 20.7MB/s ± 3% +39.53% (p=0.000 n=17+19) Encoder_EncodeAllSimple4K/best-8 11.8MB/s ± 1% 19.2MB/s ± 6% +62.17% (p=0.000 n=20+20) name old alloc/op new alloc/op delta Encoder_EncodeAllSimple/best-8 14.0B ± 0% 10.2B ± 8% -27.07% (p=0.000 n=16+19) Encoder_EncodeAllSimple4K/best-8 1.00B ± 0% 0.00B -100.00% (p=0.000 n=20+19) Also, compressing enwik9 takes 6.375% less wall clock time. Output from silesia corpus and enwik9 is about .05% bigger, due to the different order in which comparisons are done: dickens 3222189 3220994 (× 0.99963) enwik9 259699309 259846164 (× 1.00057) mozilla 16912341 16912437 (× 1.00001) mr 3505553 3502823 (× 0.99922) nci 2289871 2306320 (× 1.00718) ooffice 2896410 2896907 (× 1.00017) osdb 3390871 3390548 (× 0.99990) reymont 1656006 1657380 (× 1.00083) samba 4326783 4329898 (× 1.00072) sao 5416932 5416648 (× 0.99995) webster 9966351 9972808 (× 1.00065) xml 538378 542277 (× 1.00724) x-ray 5733061 5733121 (× 1.00001) total 319554055 319728325 (× 1.00055) This is still smaller than before #705.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The best encoder selects matches based on the criterion
If this were computed on the reals, it would be equivalent to a.est < b.est, so the added terms only capture round-off error (this is also why CSE doesn't eliminate them).
Changing the formula to
captures the intention better, I think, and improves compression:
... as well as throughput: