Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lib] Add ZSTD_c_deterministicRefPrefix #2616

Merged
merged 1 commit into from
May 6, 2021

Conversation

terrelln
Copy link
Contributor

@terrelln terrelln commented May 5, 2021

This flag forces zstd to always load the prefix in ext-dict mode, even
if it happens to be contiguous, to force determinism. It also applies to
dictionaries that are re-processed.

A determinism test case is also added, which fails without
ZSTD_c_deterministicRefPrefix and passes with it set.

Question: Should this be the default behavior? It isn't in this PR.

@terrelln terrelln force-pushed the deterministic-dict branch 2 times, most recently from efea18d to 1a30e62 Compare May 5, 2021 21:02
@Cyan4973
Copy link
Contributor

Cyan4973 commented May 5, 2021

We have an internal use case currently,
which is enough to warrant the existence of this flag,
but I don't see a strong need to make the new behavior the default.
We can always do that later if it happens to be preferable.

@@ -559,6 +559,11 @@ ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param)
bounds.upperBound = (int)ZSTD_urm_enableRowMatchFinder;
return bounds;

case ZSTD_c_deterministicRefPrefix:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened question :
flag name : ZSTD_c_deterministicRefPrefix
the name describes pretty well the objective, not the how.

In general, this is a good rule to select a name.

However, in this case, I was wondering :
in which case would anyone select "no" to this property ?
"Do you want a deterministic output ?" "No, I'm fine with unpredictable randomness"

I guess, one only answers "no" if there is something in exchange.
In this case, the only answer I could think of would probably be more speed
(and I don't even know how much speed we are talking about, it could be negligible).

Even then, making the effort to benefit from this speed only makes sense if the user consistently organizes layout so that the prefix always stands just before the destination buffer. In which case ... the output is deterministic.
So it feels strange to have to answer "no" to ZSTD_c_deterministicRefPrefix in order to benefit from better speed by concatenating output buffer right after dictionary prefix.

So, I was wondering : presuming someone would be interested in better speed, by ensuring that dictionary prefix and output buffer are contiguous, what would be the "right" parameter name that would feel "correct"?

I could imagine a 3-stages parameter : always separated, always contiguous (in which case it must fail if buffers are not contiguous), automatic (hence non deterministic).

Thoughts ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be fine to add this, it would be simple to do. But, it does add more complexity to the API, so there should be some gain, like speed.

I guess that means I should measure the speed difference between ext-dict & prefix dictionaries... I will do that, and if there is a significant gain, I can extend the API.

Copy link
Contributor

@senhuang42 senhuang42 May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall measuring something like a 5% speed gain on a subset of the httparchive corpus with a 16KB dict level 1, for contiguous vs. non-contiguous.

Copy link
Contributor Author

@terrelln terrelln May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix-mode is ~2.5% faster at level 1, 1.3% faster at level 3, and 2.5% faster at level 5.

I think that loss is not worth the complexity, especially because the user would probably have to copy their buffers to ensure they're contiguous. I will update the comments to explain the tradeoff a bit more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with the outcome.

I'm surprised though that prefix mode (which I understand as "contiguous buffers") ends up being slower at level 5. The expectation was that it would be at least as fast, if only by a tiny margin, but not slower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised though that prefix mode (which I understand as "contiguous buffers") ends up being slower at level 5. The expectation was that it would be at least as fast, if only by a tiny margin, but not slower.

That was a typo, it is 2.5% faster.

This flag forces zstd to always load the prefix in ext-dict mode, even
if it happens to be contiguous, to force determinism. It also applies to
dictionaries that are re-processed.

A determinism test case is also added, which fails without
`ZSTD_c_deterministicRefPrefix` and passes with it set.

Question: Should this be the default behavior? It isn't in this PR.
@terrelln terrelln merged commit 207e33b into facebook:dev May 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants