-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Support references to multiple DDicts #2446
Conversation
lib/decompress/zstd_decompress.c
Outdated
* Returns an index between [0, hashSet->ddictPtrTableSize] | ||
*/ | ||
static size_t ZSTD_DDictHashSet_getIndex(const ZSTD_DDictHashSet* hashSet, U32 dictID) { | ||
U32 hash = XXH64(&dictID, sizeof(U32), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I can't quite figure out is why on certain builds/CI tests, zstd_decompress.c
can't see XXH64
, but can see XXH64_digest(), etc.
? As far as I can tell, xxhash.h
is within the graph of dependencies, and locally, it seems fine. Is there an additional macro I need to invoke?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this related to the linux kernel test ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I do see a line about Replacing XXH64 prefix with xxh64
in the logs, and I assume it might be related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should just need to change this line:
zstd/contrib/freestanding_lib/freestanding.py
Line 579 in bc0a1e4
(re.compile(r"([^\w]|^)(?P<orig>XXH64)_"), self._xxh64_prefix) |
Something like: re.compile(r"([^\w]|^)(?P<orig>XXH64)[\(_]")
1b7c515
to
6705993
Compare
lib/decompress/zstd_decompress.c
Outdated
return 0; | ||
} | ||
if (idx == hashSet->ddictPtrTableSize - 1) { | ||
idx = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor (potential suggestion) : could also use a mask if ddictPtrTableSize
is a clean power of 2.
3544d64
to
5f5d100
Compare
5f5d100
to
9ae0dd9
Compare
This PR adds a new experimental param
ZSTD_d_refMultipleDDicts
which controls the behavior ofZSTD_refDDict()
.If enabled, then multiple calls to
ZSTD_refDDict()
will persist the multiple referencedZSTD_DDict*
in memory rather than overwriting the last reference. Under the hood, we use a hash set w/ linear probing to keep track of theZSTD_DDict*
. Example usage is in the included unit test infuzzer.c
, but is basically as simple as:The goal is no changes to the existing API, leaving existing behavior/usages of
ZSTD_refDDict()
unchanged.Only when we call
setParameter()
with the new experimental param can behavior change.There are still issues with this PR, but I wanted to put this up to get some initial comments to see if this was roughly the direction we wanted to head in.
Open questions/followups:
refDDict()
best?)