Question: Is this package faster than DataDog/zstd now? #463

Closed
bduffany opened this issue Jan 8, 2022 · 4 comments
bduffany commented Jan 8, 2022

I ran some benchmarks using a dataset consisting of a mixture of source code files and compiled binary files that have a compression ratio of ~0.33 on average.

I found that DataDog/zstd outperformed this library in only a few cases. In particular, I saw slightly lower decompression throughput from kp/compress for small blobs when using the streaming APIs from each package (a few percent difference), but for larger files, kp/compress had about 30% better decompression throughput. For compression, kp/compress nearly always had higher throughput when using the streaming API: around 4X higher for files under 2MB, and around 10X higher for larger files (tested with files 24MB and up).

Is this expected? My experience using this lib does not match the benchmarks in the godoc, which show zstd as being faster across the board. Have improvements been made to this lib that aren't reflected in the godoc, or is it more likely that my methodology for benchmarking DataDog/zstd is not resulting in a fair comparison?

My methodology is the following:

Compression

  • For kp/compress, reusing encoders via sync.Pool and calling encoder.ReadFrom on the read end of a pipe. Data is then written to the write end of the pipe.
  • For datadog/zstd, using the same pipe approach, but creating a NewWriter each time and using io.Copy to copy from the read end of the pipe to the writer.

Decompression

  • For kp/compress, creating a new decoder for each file (reuse is not possible IIUC?) and using decoder.WriteTo to get bytes out of it.
  • Same for datadog, creating a NewReader every time and using io.Copy to get bytes out of the reader.

I can share all the data/code I'm using if these results aren't expected; it will just take a bit of work to clean up and make easily runnable. I figured I'd send out an initial probe to see whether these results are surprising or not.

klauspost (Owner) commented:

I wouldn't expect it to be faster. The C library is extremely optimized by a lot of dedicated people, but the wrapper may have its downsides. To be honest I don't really worry about it, since CGO is so undesirable to most.

For streams I'd say it is mostly about 0.7x the speed of the C library, which is reasonable for most. For smaller objects (EncodeAll), it varies a bit more, but is often close enough.

My main goal is to have it "best in class" for Go algorithms for most workloads, which I think is reasonable to claim. Once I've finished up this rather large feature for s2 I will probably return to zstd.

I would like to add a "single-threaded" stream decoder, and maybe improve the multithreaded decoder, along with a fully multithreaded encoder that can utilize all cores. Time permitting, of course.


bduffany commented Jan 8, 2022

Thank you for the reply! Is there any chance you could share the benchmark source code you're using to test compression speed? I just tried compressing a 114 MB file using the streaming API, and your library appears 10X faster (300 MB/s for your library vs. 30 MB/s for datadog). I'd like to compare against the code I'm using to see if I'm doing something wrong, although that seems unlikely, since I'm just doing w := datadog_zstd.NewWriter(...), io.Copy(...), w.Close().

klauspost (Owner) commented:

I have this rather clunky test application:

compress.go.gz

I use GOPATH for it, so I can test in-branch changes of my old stuff.

Here are results for different inputs: https://docs.google.com/spreadsheets/d/1nuNE2nPfuINCZJRMt6wFWhKpToF95I47XjSsc-1rbPQ/edit?usp=sharing

This is one of the test scripts I use (a Windows batch script):

SET GO111MODULE=off

SET OPTS=-mem -out=*
SET GZOPTS=-mem -out=*

go build compress.go
compress -in=%1 %OPTS% -stats -header=false -w="gzkp" -l=0
SET LEVEL=1


compress -in=%1 %GZOPTS% -stats -header=true -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

SET LEVEL=2
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

SET LEVEL=3
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

SET LEVEL=4
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

:five

SET LEVEL=5
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

:six

SET LEVEL=6
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

:level7
SET LEVEL=7
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

SET LEVEL=8
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

SET LEVEL=9
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zstd" -l=%LEVEL% >>results.txt

:extras
SET LEVEL=-2
echo.>>results.txt

compress -in=%1 %GZOPTS% -stats -header=false -w="gzstd" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="gzkp" -l=-3 >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pgzip" -l=%LEVEL% >>results.txt
compress -in=%1 %GZOPTS% -stats -header=false -w="pargzip" -l=0 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="snappy" -l=0 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2" -l=1 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2" -l=2 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2" -l=3 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zskp" -l=1 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zskp" -l=2 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zskp" -l=3 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="zskp" -l=4 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="lz4" -l=0 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2s" -l=1 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2s" -l=2 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2s" -l=3 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2zs" -l=1 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2zs" -l=2 >>results.txt
compress -in=%1 %OPTS% -stats -header=false -w="s2zs" -l=3 >>results.txt

:END
echo.>>results.txt


bduffany commented Jan 8, 2022

Ah, I think I made a basic mistake: I am using bazel to build and run my benchmarks, and I forgot to run with -c opt, which apparently is key to making DataDog/zstd run fast. After enabling -c opt, I am seeing much higher performance from DataDog/zstd. (The default for bazel is -c fastbuild, which I guess might be passing flags that disable optimizations.)
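For anyone hitting the same thing, the fix is just the compilation mode flag on the bazel invocation (the target label below is hypothetical):

```shell
# bazel's default is -c fastbuild, which builds C/C++ dependencies
# (such as DataDog/zstd's bundled libzstd) without optimizations.
# -c opt enables optimized builds for the whole dependency tree.
bazel run -c opt //benchmarks:zstd_benchmark
```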

Thanks! I would not have caught this without seeing go build in your script, which made me think about how I was building the benchmark.

bduffany closed this as completed Jan 8, 2022