
Reduce RowHash's tag space size by x2 #3543

Merged
merged 1 commit on Mar 10, 2023

Conversation

yoniko
Contributor

@yoniko yoniko commented Mar 10, 2023

Allocate half the memory for the tag space, which means we lose one slot per row for an actual tag (it is needed to store the next position index).
The result is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).

Update:
CR comments fixed.
Benchmark in spreadsheet.

Update 2:
Another interesting benchmark is fullbench -b41 -l9 enwik8.1k, which repeatedly uses ZSTD_compressStream to recompress 1024 bytes out of enwik8. Since #3528 hasn't been deployed yet, we can observe that this PR makes the benchmark twice as fast for small data sizes, as the majority of the time is spent in the tag space initialization.

Contributor

@terrelln terrelln left a comment


Generally looks good!

I did a quick benchmark and see just about neutral performance, ± a few percent depending on the compiler. Also a small ratio loss, but it seems reasonable. I'll let you do the full benchmarks; I just wanted to verify that the speed looked good on my machine too.

@@ -804,9 +803,10 @@ U16 ZSTD_rotateRight_U16(U16 const value, U32 count) {
* value to reflect the update. Essentially cycles backwards from [0, {entries per row})
Contributor


You should update this comment

Contributor Author


I thought I did, thank you for noticing!

lib/compress/zstd_lazy.c (resolved)
@@ -888,7 +888,7 @@ FORCE_INLINE_TEMPLATE void ZSTD_row_update_internalImpl(ZSTD_matchState_t* ms,
U32 const rowMask, U32 const useCache)
{
U32* const hashTable = ms->hashTable;
U16* const tagTable = ms->tagTable;
U8* const tagTable = ms->tagTable;
Contributor


You should use BYTE here and elsewhere

@@ -903,7 +903,7 @@ FORCE_INLINE_TEMPLATE void ZSTD_row_update_internalImpl(ZSTD_matchState_t* ms,
U32 const pos = ZSTD_row_nextIndex(tagRow, rowMask);

assert(hash == ZSTD_hashPtr(base + updateStartIdx, hashLog + ZSTD_ROW_HASH_TAG_BITS, mls));
((BYTE*)tagRow)[pos + ZSTD_ROW_HASH_TAG_OFFSET] = hash & ZSTD_ROW_HASH_TAG_MASK;
((BYTE*)tagRow)[pos] = hash & ZSTD_ROW_HASH_TAG_MASK;
Contributor


Can get rid of the BYTE* cast now.

@yoniko
Contributor Author

yoniko commented Mar 10, 2023

@terrelln - did you benchmark on a mac or on an x86 server?

@yoniko yoniko force-pushed the reduce-tag-space-footprint branch 2 times, most recently from cc96945 to faf5f15 Compare March 10, 2023 02:11
@yoniko yoniko marked this pull request as ready for review March 10, 2023 02:21
@yoniko
Contributor Author

yoniko commented Mar 10, 2023

Fixed CR comments and added a spreadsheet for benchmarks.
Compression ratio changes are less than 0.01% on average. There's a regression of up to 5% for levels 5-7 on Skylake, but a performance increase for higher levels.

I've tried reducing this regression, but I don't have a good solution at the moment; unless it's critical, I won't pursue it further.

Allocate half the memory for the tag space, which means we lose one slot per row for an actual tag (it is needed to store the next position index).
In turn, we cut the memory usage for a slightly worse compression ratio, or a better ratio if we spend the same memory on a higher hashLog.
@yoniko yoniko force-pushed the reduce-tag-space-footprint branch from faf5f15 to 69be424 Compare March 10, 2023 06:23
@terrelln
Contributor

> @terrelln - did you benchmark on a mac or on an x86 server?

x86

Contributor

@terrelln terrelln left a comment


Awesome!

@yoniko yoniko merged commit 33e3909 into facebook:dev Mar 10, 2023
yoniko added a commit to yoniko/zstd that referenced this pull request Mar 11, 2023
…ys invalid

facebook#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for the head position instead of a tag.
Although position 0 is no longer a valid match, it still appeared in the mask calculation, which could cause the match loops to terminate earlier than they should.
The fix skips position 0 to solve this problem.
yoniko added a commit that referenced this pull request Mar 13, 2023
#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for the head position instead of a tag.
Although position 0 is no longer a valid match, it still appeared in the mask calculation, which could cause the match loops to terminate earlier than they should. The fix skips position 0 to solve this problem.
@shubhamchandak94

shubhamchandak94 commented Jun 30, 2023

Hi, can this change be reflected in the comment at https://github.com/facebook/zstd/blame/25822342be59d831bad65426ae51f5cc22157b09/lib/compress/zstd_lazy.c#L1116-L1118?

Thanks!
