zstd: Tweak DecodeAll standard allocs #295

Merged: 1 commit, Nov 15, 2020
10 changes: 7 additions & 3 deletions zstd/decoder.go
@@ -323,19 +323,23 @@ func (d *Decoder) DecodeAll(input, dst []byte) ([]byte, error) {
 		}
 		if frame.FrameContentSize > 0 && frame.FrameContentSize < 1<<30 {
 			// Never preallocate moe than 1 GB up front.
-			if uint64(cap(dst)) < frame.FrameContentSize {
+			if cap(dst)-len(dst) < int(frame.FrameContentSize) {
 				dst2 := make([]byte, len(dst), len(dst)+int(frame.FrameContentSize))
 				copy(dst2, dst)
 				dst = dst2
 			}
 		}
 		if cap(dst) == 0 {
-			// Allocate window size * 2 by default if nothing is provided and we didn't get frame content size.
-			size := frame.WindowSize * 2
+			// Allocate len(input) * 2 by default if nothing is provided
+			// and we didn't get frame content size.
+			size := len(input) * 2

@larry-cdn77 larry-cdn77 Dec 4, 2020

With a compression ratio over 2, does this lead to the sequence decoder reallocating to the 2MB maxBlockSize? Could that mean that, with this change, the library allocates 2MB rather than 1MB (the previous cap) for small frames?

Owner Author

@larry-cdn77 The 1MB cap check is still there; see the next lines.

Owner Author

@larry-cdn77 And you can always allocate a slice of any custom size and pass that as the destination.
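
As a rough illustration of that suggestion, the sketch below allocates the destination with a caller-chosen capacity before calling a function shaped like `DecodeAll`. To keep it runnable without the zstd package, `stubDecode` is a hypothetical stand-in for a real decoder, and `decodeWithCap` is a helper name invented here.

```go
package main

import "fmt"

// decodeAllFunc has the same shape as the DecodeAll method in the diff:
// output is appended to dst and the extended slice is returned.
type decodeAllFunc func(input, dst []byte) ([]byte, error)

// stubDecode is a hypothetical stand-in for a real decoder so the sketch
// runs on its own; it "decodes" by appending the input twice.
func stubDecode(input, dst []byte) ([]byte, error) {
	dst = append(dst, input...)
	return append(dst, input...), nil
}

// decodeWithCap allocates the destination with a caller-chosen capacity.
// Because cap(dst) is non-zero, the cap(dst) == 0 fallback in the diff
// would be skipped for this call.
func decodeWithCap(decode decodeAllFunc, input []byte, capacity int) ([]byte, error) {
	dst := make([]byte, 0, capacity)
	return decode(input, dst)
}

func main() {
	out, err := decodeWithCap(stubDecode, []byte("abc"), 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out), cap(out)) // abcabc 64
}
```

With the real package, `decode` would be the decoder's `DecodeAll` method value; the output still grows past the chosen capacity if the decoded data does not fit.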

@larry-cdn77 larry-cdn77 Dec 4, 2020

Certainly. Thank you for responding.

I wonder if I am right to observe that in certain situations one has to hit a sweet spot in the size of the custom slice to pass in. I have a high-throughput application with message sizes of less than 1KB. The custom slice passed in has to be relatively small to keep garbage collection from consuming a lot of CPU time (I've seen a 50% throughput penalty). And it has to be big enough to stay clear of the sequence decoder replacing it with a new 2MB one, which again degrades performance.

Currently the size of my slice is a multiple of the compressed size, and experimentally a multiplier of 32 appears to be the sweet spot (I didn't fully understand whether this reflects how much compression my encoder has achieved, or whether other factors are behind that number).
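
The sizing approach described here might look like the sketch below: the destination capacity is a multiple of the compressed length, clamped to a range, and one buffer is reused across messages to limit garbage. The helper names, the bounds, and the multiplier of 32 are illustrative; the right values depend on the workload.

```go
package main

import "fmt"

// dstCapacity sizes a destination as a multiple of the compressed length,
// clamped to [minCap, maxCap]. All names and bounds are illustrative.
func dstCapacity(compressedLen, multiplier, minCap, maxCap int) int {
	n := compressedLen * multiplier
	if n < minCap {
		n = minCap
	}
	if n > maxCap {
		n = maxCap
	}
	return n
}

// reuse returns buf truncated to length zero when it is already large
// enough, otherwise a fresh buffer. It is only safe when the caller no
// longer needs the previous contents of buf.
func reuse(buf []byte, want int) []byte {
	if cap(buf) >= want {
		return buf[:0]
	}
	return make([]byte, 0, want)
}

func main() {
	var buf []byte
	for _, n := range []int{100, 300, 200} {
		want := dstCapacity(n, 32, 1<<10, 1<<20)
		buf = reuse(buf, want)
		fmt.Println(n, want, cap(buf))
	}
}
```

In this run the third message reuses the buffer allocated for the second, since its requested capacity is smaller.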

Related to this, would it make sense to adjust the library's maxBlockSize, given that my data is encoded with the zstd library, which seems to use a 128KB maximum block size?

Owner Author

@larry-cdn77 If you are using this package to compress, it doesn't matter: the frame will carry FrameContentSize, which allows an exact allocation, so this will never be relevant.
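
The exact-allocation path can be sketched outside the decoder as follows. `ensureRoom` is a name invented for this example; its body follows the corrected check from the diff, which compares the free space in `dst` against the declared content size instead of the total capacity.

```go
package main

import "fmt"

// ensureRoom follows the corrected check from the diff: it compares the
// free space in dst (cap minus len) against the frame's declared content
// size and reallocates once if there is not enough room.
// The helper name is illustrative, not part of the package.
func ensureRoom(dst []byte, frameContentSize uint64) []byte {
	if frameContentSize == 0 || frameContentSize >= 1<<30 {
		return dst // size unknown, or too large to preallocate up front
	}
	if cap(dst)-len(dst) < int(frameContentSize) {
		dst2 := make([]byte, len(dst), len(dst)+int(frameContentSize))
		copy(dst2, dst)
		dst = dst2
	}
	return dst
}

func main() {
	// dst already holds 900 bytes in a 1000-byte buffer; the frame
	// declares 500 more bytes of content.
	dst := make([]byte, 900, 1000)
	// The old check, uint64(cap(dst)) < 500, is false here, so it would
	// not have preallocated even though only 100 bytes are free.
	dst = ensureRoom(dst, 500)
	fmt.Println(len(dst), cap(dst)-len(dst) >= 500) // 900 true
}
```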

> a multiplier of 32

So you have a 32:1 compression ratio? That is way above what most will see, so I can't use that metric. With an expected factor of 2 we should be able to decode most content with 1 or 2 allocations.

maxBlockSize is defined by the format, so it cannot be changed. This is the maximum size of an uncompressed block.

We could take a look at #253 which could be made to decode the frame and first block headers and return the size if there is only one block.
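
For context on what reading the frame header involves, here is a minimal sketch that extracts Frame_Content_Size from the start of a frame, following the header layout in RFC 8878. It is not the library's code and not what #253 proposes in full: it ignores skippable frames, does not validate reserved bits, and does not look at block headers. The function name is an assumption made for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameMagic is the zstd frame magic number (stored little-endian).
const frameMagic = 0xFD2FB528

// frameContentSize reads Frame_Content_Size from the start of a zstd
// frame. ok is false when the frame does not declare its size.
func frameContentSize(b []byte) (size uint64, ok bool, err error) {
	if len(b) < 5 {
		return 0, false, errors.New("short header")
	}
	if binary.LittleEndian.Uint32(b) != frameMagic {
		return 0, false, errors.New("not a zstd frame")
	}
	desc := b[4]                      // Frame_Header_Descriptor
	fcsFlag := desc >> 6              // bits 7-6
	singleSegment := desc&(1<<5) != 0 // bit 5
	didFlag := desc & 3               // bits 1-0

	pos := 5
	if !singleSegment {
		pos++ // Window_Descriptor is present
	}
	pos += []int{0, 1, 2, 4}[didFlag] // Dictionary_ID field size

	fcsLen := []int{0, 2, 4, 8}[fcsFlag]
	if fcsFlag == 0 && singleSegment {
		fcsLen = 1
	}
	if fcsLen == 0 {
		return 0, false, nil
	}
	if len(b) < pos+fcsLen {
		return 0, false, errors.New("short header")
	}
	f := b[pos : pos+fcsLen]
	switch fcsLen {
	case 1:
		size = uint64(f[0])
	case 2:
		// The 2-byte form stores the value minus 256.
		size = uint64(binary.LittleEndian.Uint16(f)) + 256
	case 4:
		size = uint64(binary.LittleEndian.Uint32(f))
	case 8:
		size = binary.LittleEndian.Uint64(f)
	}
	return size, true, nil
}

func main() {
	// Single-segment frame header declaring 10 bytes of content.
	hdr := []byte{0x28, 0xB5, 0x2F, 0xFD, 0x20, 0x0A}
	fmt.Println(frameContentSize(hdr)) // 10 true <nil>
}
```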

 			// Cap to 1 MB.
 			if size > 1<<20 {
 				size = 1 << 20
 			}
+			if uint64(size) > d.o.maxDecodedSize {
+				size = int(d.o.maxDecodedSize)
+			}
 			dst = make([]byte, 0, size)
 		}
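
The fallback sizing in this hunk can be read as a small pure function. The sketch below restates it outside the decoder for illustration; `defaultDstCap` and its parameters are names chosen here, with `maxDecodedSize` standing in for `d.o.maxDecodedSize`.

```go
package main

import "fmt"

// defaultDstCap mirrors the fallback sizing in the hunk: twice the
// compressed input length, capped at 1 MB and at the decoder's limit.
// The function name and parameters are illustrative.
func defaultDstCap(inputLen int, maxDecodedSize uint64) int {
	size := inputLen * 2
	// Cap to 1 MB.
	if size > 1<<20 {
		size = 1 << 20
	}
	if uint64(size) > maxDecodedSize {
		size = int(maxDecodedSize)
	}
	return size
}

func main() {
	fmt.Println(defaultDstCap(500, 1<<30))   // 1000
	fmt.Println(defaultDstCap(4<<20, 1<<30)) // 1048576
	fmt.Println(defaultDstCap(500, 256))     // 256
}
```

For the sub-1KB messages discussed in the review thread, this gives an initial capacity of under 2KB, so any payload with a compression ratio above 2 needs at least one further allocation.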
