-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When to recycle ZSTD_CCtx objects? #1796
Comments
Hey! Great write-up! We have the same pattern in a few places at facebook (e.g.: https://github.com/facebook/hhvm/blob/master/hphp/util/compression-ctx-pool.h), with exactly the same problem. My changesets #1712 and #1780 address this scenario. When possible (i.e., when there's enough room in the existing workspace buffer), the CCtx avoids the I hope that's useful. (In fact, if you tried your use case on the |
Yeah, this is exactly the kind of topic we want to address for our next release, feedback will be highly appreciated. |
Funny: https://github.com/facebook/hhvm/blob/master/hphp/util/compression-ctx-pool.h was submitted one day after https://github.com/google/riegeli/blob/master/riegeli/base/recycling_pool.h I verified with the |
The plan after #1780 is clear: always recycle, ignoring compression options. I am closing the issue. Thank you! (Now I only have to wait for a zstd release, and then for us to update the version of our zstdlib dependency.) |
Experiments indicate that recycling
ZSTD_CCtx
objects between separate streaming compression sessions (e.g. withZSTD_CCtx_reset
), instead of creating them anew, is much faster: compression takes microseconds instead of milliseconds.It looks like this is because existing entries in a hash table can be cleared much faster than it takes to fill a newly allocated table to all zeros. Most expensive is a
memset
inZSTD_reset_matchState
, its size can be over ten megabytes.But recycling
ZSTD_CCtx
objects is helpful only if compression options match. If compression options differ,ZSTD_CCtx
performs a costly reset of internal tables, and recycling them is as expensive as creating them anew, while losing the opportunity to reuse the object with matching compression options later.Our library does the recycling automatically, maintaining a pool of idle objects, and looking for an existing object with matching options instead of creating it anew: https://github.com/google/riegeli/blob/master/riegeli/base/recycling_pool.h. It is also used for other kinds of objects, but
ZSTD_CCtx
is one where proper recycling matters the most.The problem is that determining what does it mean that compression options match, for the purpose of recycling
ZSTD_CCtx
objects, is not trivial. For example:compressionLevel
andwindowLog
participate in this;checksumFlag
does not;estimatedSrcSize
(orpledgedSrcSize
if set) partially participate:tableID
computation inZSTD_getCParams
and split them into 4 classes of equivalence: https://github.com/google/riegeli/blob/8090178002b7f4e7fe5eba7cb756c526261921f0/riegeli/bytes/zstd_writer.cc#L63-L73Extracting the key for comparing options is done here: https://github.com/google/riegeli/blob/8090178002b7f4e7fe5eba7cb756c526261921f0/riegeli/bytes/zstd_writer.h#L213-L233; https://github.com/google/riegeli/blob/8090178002b7f4e7fe5eba7cb756c526261921f0/riegeli/bytes/zstd_writer.cc#L95-L97.
This is fragile.
Recently I discovered that if the first
ZSTD_compressStream2
call is withZSTD_e_end
, then the input size counts as specifyingpledgedSrcSize
for the purpose of recyclingZSTD_CCtx
objects. This prompted a change to create the compressor lazily, so that when it is known whether the first compressed fragment is also the last one, parameters for looking up an idleZSTD_CCtx
object to recycle can be appropriately adjusted: google/riegeli@8090178Can this be done better somehow?
I am not sure how this should be designed properly. Here are two possibilities:
Maybe zstd can somehow make that
memset
not as costly, to make the whole problem less important. Maybe there is an OS-specific way to get a zeroed memory more efficiently. Maybe zstd could make an internal pool of these tables.If we keep the existing basic design (i.e. recycling
ZSTD_CCtx
objects is the responsibility of a higher level library than zstd), what I need is to extract a “key” of compression parameters which matter for this purpose, a key which has equality and hashing defined. Maybe zstd could offer a more reliable way for this.The text was updated successfully, but these errors were encountered: