-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple performance improvements for ldm #2464
Conversation
lib/compress/zstd_ldm.c
Outdated
unsigned const offset = *pOffset; | ||
|
||
*(ZSTD_ldm_getBucket(ldmState, hash, ldmParams) + offset) = entry; | ||
*pOffset = (offset + 1) & (((U32)1 << ldmParams.bucketSizeLog) - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I believe the right hand side needs an explicit cast to appease the warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint. This should be fixed by my latest commit.
Thanks for the improvements, this generally looks pretty good to me! The circle CI failure can probably get fixed with a rebase on top of the latest |
I've measured a 2.5% - 4.5% speed improvement on several different files, which is in line with the measurements you've provided. There are remaining calls to |
lib/compress/zstd_ldm.c
Outdated
ldmEntry_t entry; | ||
|
||
entry.offset = curr; | ||
entry.checksum = checksum; | ||
ZSTD_ldm_insertEntry(ldmState, hash, entry, *params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also insert when we find an entry right? Could we use ZSTD_ldm_insertEntry()
in that case too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought it'd not bring much of a win but for the sake of consistency I updated it.
Thanks for your comments! I rebased on top of a recent dev branch and added one commit to address the feedback.
In |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I just have one minor nit about a renaming and it looks good to me.
lib/compress/zstd_ldm.c
Outdated
@@ -299,10 +295,13 @@ static size_t ZSTD_ldm_generateSequences_internal( | |||
U64 rollingHash = 0; | |||
|
|||
while (ip <= ilimit) { | |||
U32 const currentOffset = (U32)(ip - base); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is not an offset it is an "index". Either keep curr
or use currentIndex
(note: current
cannot be used because it conflicts with a macro in the Linux kernel).
I know that it is called offset
in the ldmEntry_t
struct, but that is a bad name. It should be renamed to index
. If you want to include that renaming in this diff or a followup that would be great.
I renamed the variable to
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for the PR! Looking forward to the next one :)
Thanks @mpu ! I tested your PR on a (fairly stable) desktop-class workstation, |
Description
This PR contains some simple performance improvements to the ldm matcher, together with some other cosmetic changes that I made "en passant" (variable renames, typo fixes, ...).
The changes to the ldm matcher only affect performance and the new code should produce precisely the same output as the baseline. Two changes helped improve the performance relatively uniformly:
ZSTD_ldm_generateSequences
we now compute a tag mask once before entering the loop and use it to check if the rolling hash satisfies the "tag condition". The tag mask is computed in such a way that the new tag condition is equivalent to the original one.ZSTD_ldm_insertEntry
instead ofZSTD_ldm_makeEntryAndInsertByTag
which performs some redundant checks.Benchmarks
The following files were used to assess the impact of the change:
hhvm-rt.tar
(5.9G) - an image of production version of HHVMl1m.tar
(2.0G) - two snapshots of the linux kernel source code taken a month apartl1y.tar
(1.9G) - two snapshots of the linux kernel source code taken a year apartl5.tar
(544M) - the first 544M of an archive containing the linux kernel source codeThe results are compiled in the table below. All times are computed as the best over a set of 5 runs.
--long=27 -1
--long=27 -1
--long=27 -1
--long=27 -1
--long=30 -1
--long=30 -1
--long=30 -1
--long=30 -1
--long=27 -8
--long=27 -8
--long=27 -8
--long=27 -8
--long=30 -8
--long=30 -8
--long=30 -8
--long=30 -8
--long=27 -16
--long=27 -16
--long=27 -16
--long=27 -16
--long=30 -16
--long=30 -16
--long=30 -16
--long=30 -16
--long=27 -19
--long=27 -19
--long=27 -19
--long=27 -19
--long=30 -19
--long=30 -19
--long=30 -19
--long=30 -19