zstd: Big speedup on small dictionary encodes #345
According to my benchmarks, the best results are achieved with tableShardSize = 64 (that would mean tableShardCnt = 1 << (tableBits - 6) in enc_fast.go). Other than that, it seems your version is just my patch on steroids :), so I still get the same results in the benchmarks (after adjusting the shard size, that is). RE: code beauty
I will make it a tweakable constant and benchmark the numbers. It will mainly be a tradeoff between the cost of branching vs the cost of copying memory. So a system with less memory bandwidth will prefer smaller shards.
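To make the shard arithmetic concrete, here is a minimal sketch of how the constants relate. The value `tableBits = 15` and the names `shardBits` and `shardOf` are assumptions for illustration only, not taken from enc_fast.go.

```go
package main

import "fmt"

const (
	tableBits      = 15 // assumed: hash table with 1<<15 entries
	shardBits      = 6  // hypothetical name: log2 of the shard size
	tableSize      = 1 << tableBits
	tableShardSize = 1 << shardBits               // 64 entries per shard
	tableShardCnt  = 1 << (tableBits - shardBits) // shards covering the table
)

// shardOf returns the index of the shard that contains hash table slot h.
// (Hypothetical helper, for illustration.)
func shardOf(h uint32) uint32 {
	return h / tableShardSize
}

func main() {
	// The shards tile the table exactly.
	fmt.Println(tableShardSize, tableShardCnt, tableShardSize*tableShardCnt == tableSize)
	// Slots 0..63 fall in shard 0, slot 64 starts shard 1.
	fmt.Println(shardOf(0), shardOf(63), shardOf(64))
}
```

Smaller shards mean less memory copied per dirty shard but more shards to check on reset, which is the branching versus copying tradeoff mentioned above.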
I looked into that. The main issue is that the Less important, but also important is the
I would have liked to be able to do the same for all of them, as in "fast", where it falls back to the regular encode when exceeding a certain size:
... but I couldn't find a neat way to do that.
Very much appreciated!
There have been a few zstd improvements since v1.11.6:
* zstd: Big speedup on small dictionary encodes klauspost/compress#345
* zstd: Add WithLowerEncoderMem klauspost/compress#336
* Faster "compression" of incompressible data klauspost/compress#314
All credit goes to @tony2001
As shown in #344, the speed of small dictionary compression tasks (< 32K) can be improved significantly by keeping track of the state of the hash table.
This effectively implements #344 but avoids a penalty for non-dictionary encodes and extends the functionality to the "better" compression mode as well.
This change will also make it easier to remove the copy of the literal dictionary every time an encode starts and have specialized code to deal with this.
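The idea of keeping track of the hash table state can be sketched roughly as follows. This is not the code from this PR: the type `shardedTable`, its fields, the small `tableBits` value, and the "more than half the shards dirty" fallback threshold are all assumptions chosen to keep the example short.

```go
package main

import "fmt"

const (
	tableBits      = 10 // assumed small table for the example
	tableSize      = 1 << tableBits
	tableShardSize = 64
	tableShardCnt  = tableSize / tableShardSize
)

// shardedTable is a hypothetical hash table that remembers which shards
// were modified since it was last reset to the dictionary state.
type shardedTable struct {
	table      [tableSize]uint32
	dictTable  [tableSize]uint32 // pristine state, computed once from the dictionary
	dirty      [tableShardCnt]bool
	dirtyCount int
}

// set writes an entry and marks its shard as dirty.
func (s *shardedTable) set(h, v uint32) {
	s.table[h] = v
	shard := h / tableShardSize
	if !s.dirty[shard] {
		s.dirty[shard] = true
		s.dirtyCount++
	}
}

// reset restores the dictionary state and returns how many entries were copied.
func (s *shardedTable) reset() int {
	if s.dirtyCount == 0 {
		return 0 // nothing was touched, nothing to copy
	}
	if s.dirtyCount > tableShardCnt/2 {
		// Assumed threshold: with many dirty shards, one full copy is simpler.
		s.table = s.dictTable
		s.dirty = [tableShardCnt]bool{}
		s.dirtyCount = 0
		return tableSize
	}
	copied := 0
	for i := range s.dirty {
		if !s.dirty[i] {
			continue
		}
		lo := i * tableShardSize
		copy(s.table[lo:lo+tableShardSize], s.dictTable[lo:lo+tableShardSize])
		s.dirty[i] = false
		copied += tableShardSize
	}
	s.dirtyCount = 0
	return copied
}

func main() {
	var s shardedTable
	for i := range s.dictTable {
		s.dictTable[i] = uint32(i)
	}
	s.table = s.dictTable

	// A small encode touches only two shards.
	s.set(5, 99)
	s.set(70, 99)
	fmt.Println("entries copied on reset:", s.reset())
	fmt.Println("restored:", s.table[5] == 5 && s.table[70] == 70)
}
```

For a small input only a few shards get dirty, so the reset copies a fraction of the table instead of the whole thing. That is where the gain for small dictionary encodes would come from under these assumptions.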