[reproducibility] Zstd varies compressed output based on the presence of SSE / NEON in some cases #4099

Open
terrelln opened this issue Jul 19, 2024 · 5 comments

Comments

@terrelln
Contributor

#if defined(ZSTD_ARCH_X86_SSE2) || defined(ZSTD_ARCH_ARM_NEON)
    int const kHasSIMD128 = 1;
#else
    int const kHasSIMD128 = 0;
#endif
    if (mode != ZSTD_ps_auto) return mode; /* if requested enabled, but no SIMD, we still will use row matchfinder */
    mode = ZSTD_ps_disable;
    if (!ZSTD_rowMatchFinderSupported(cParams->strategy)) return mode;
    if (kHasSIMD128) {
        if (cParams->windowLog > 14) mode = ZSTD_ps_enable;
    } else {
        if (cParams->windowLog > 17) mode = ZSTD_ps_enable;

We haven't historically given a strong guarantee of reproducibility across compilations on different systems. However, we've been moving in this direction. We should consider removing this source of difference, either by default or by opting into a reproducible mode via a flag. I'm leaning towards by default, because I think this is one of the few places left where the compressed output differs based on the platform.

@Cyan4973
Contributor

Agreed

@felixhandte
Contributor

IIUC, the diff in possible output is specific to the range 14 < windowLog <= 17. And you're proposing to close that gap?
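To make that range concrete, here is a minimal sketch that replays the quoted branch for both build flavours and reports whether the chosen mode differs. The names `row_mode`, `resolve_row_mode`, and `mode_depends_on_simd` are hypothetical stand-ins, not zstd's API, and the sketch assumes the strategy already supports the row matchfinder.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the library's ZSTD_ps_* values (not zstd's enum). */
typedef enum { MODE_AUTO, MODE_ENABLE, MODE_DISABLE } row_mode;

/* Mirrors the quoted branch; has_simd128 plays the role of kHasSIMD128.
 * Assumption: the strategy supports the row matchfinder. */
static row_mode resolve_row_mode(row_mode requested, unsigned window_log,
                                 bool has_simd128)
{
    /* An explicit request is honoured regardless of SIMD availability. */
    if (requested != MODE_AUTO) return requested;

    unsigned const threshold = has_simd128 ? 14 : 17;
    return (window_log > threshold) ? MODE_ENABLE : MODE_DISABLE;
}

/* True when a SIMD build and a non-SIMD build would pick different modes
 * for the same windowLog under the automatic setting. */
static bool mode_depends_on_simd(unsigned window_log)
{
    return resolve_row_mode(MODE_AUTO, window_log, true)
        != resolve_row_mode(MODE_AUTO, window_log, false);
}
```

Under these assumptions, only `window_log` values 15, 16, and 17 make `mode_depends_on_simd` return true, and only when the mode is left on automatic.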

@lulcat

lulcat commented Aug 12, 2024

Actually, I am a bit miffed at the moment.. I am getting different output using the examples bundled in examples/

I tried vanilla on a laptop (same OS though) and a desktop, and I am seeing this across the board. Also in some bindings. I am rather shocked/confused, whatever... How to nail this down? Is it my computers? Is it Arch-like distros? Is it the examples?
using the threads_pool example say... lvl 3,
// a find example for files in curdir only. */ **(-D.) depending on shells and so on if desired.

find . -maxdepth 1 -type f -exec zstd -3 {} \;
ls *.zst | wc -l
74

// I save to .pst just to not overwrite here
find . -maxdepth 1 -type f -exec ./streaming_compression_threads_pool {} \;
for f in *.zst; do diff "$f" "${f%.zst}.pst"; done

Binary files a.zst and a.pst differ
Binary files b.zst and b.pst differ
Binary files c.zst and c.pst differ
Binary files struct_return.zst and struct_return.pst differ
Binary files class_array.zst and class_array.pst differ
Binary files hello.zst and hello.pst differ
Binary files poolz.zst and poolz.pst differ
Binary files usrbins.zst and usrbins.pst differ

... what gives?

I am just wondering, as of course ideally I would want the shared binaries people have on Linux systems to create the same deterministic bytes in a file as long as the same compression level is set (I am disregarding any dicts here).

I am OK with this determinism being platform specific, unlike the original topic (so sorry about the slight digression); however, on the same platform and the same compression level, I would expect libs, examples and upstream binaries to all produce the same output.

Something is off here and I need to figure out if it's this 'distro' (currently arch linux) or a hardware issue.

Despite getting 4-13 (or something) byte differences in size, hexdiff can give quite a huge patchset internally for these files, yet zstd --test (from the repo) is fine with them. With zstd -l archive.zst, the main binary from here also lists the compression ratio and other details for its own files that it does not list for the examples' output, so maybe it's just me, or the examples are a bit short?

@Cyan4973
Contributor

An exactly identical binary representation is only an objective under the following conditions:

  • same compression level
  • same library version
  • same parameters

The "same parameters" one can be a little bit difficult to nail down. There are so many options that can be set differently when invoking the library directly.
For example, the zstd CLI typically enables the checksum by default, and tries to bundle the source size into the header if it has access to this information. It lets the library decide the block sizes, which are generally full sized. It triggers the multithreading mode even when there is only 1 thread.
An example program is likely to make different choices, such as not bundling the checksum, or flushing at arbitrary positions, creating different block sizes, or using the compression in --single-thread mode, which produces an output slightly different from the multithreaded mode.

So generally speaking, for reproducibility, we only compare the output of the zstd CLI, where the set of parameters is relatively well controlled, given just a version and a compression level.
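One way to check whether two outputs differ in the header choices mentioned above (checksum, bundled source size) is to decode the frame header descriptor byte of each file. The sketch below follows the Zstandard frame format (RFC 8878): a 4-byte magic number followed by one descriptor byte. The `frame_flags` struct and `read_frame_flags` function are hypothetical names for illustration; the sketch only inspects the first frame and does not handle skippable frames.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: flags decoded from the Frame_Header_Descriptor. */
typedef struct {
    int fcs_flag;           /* Frame_Content_Size_flag, bits 7-6 (0..3) */
    int single_segment;     /* Single_Segment_flag, bit 5 */
    int has_checksum;       /* Content_Checksum_flag, bit 2 */
    int dict_id_flag;       /* Dictionary_ID_flag, bits 1-0 (0..3) */
    int content_size_known; /* 1 if a Frame_Content_Size field is present */
} frame_flags;

/* Decodes the descriptor of the first frame in buf.
 * Returns 0 on success, -1 if buf is too short or lacks the zstd magic. */
static int read_frame_flags(const uint8_t *buf, size_t len, frame_flags *out)
{
    /* Magic number 0xFD2FB528, stored little-endian. */
    static const uint8_t magic[4] = { 0x28, 0xB5, 0x2F, 0xFD };
    size_t i;

    if (buf == NULL || out == NULL || len < 5) return -1;
    for (i = 0; i < 4; i++) {
        if (buf[i] != magic[i]) return -1;
    }

    uint8_t const d = buf[4];
    out->fcs_flag       = (d >> 6) & 3;
    out->single_segment = (d >> 5) & 1;
    out->has_checksum   = (d >> 2) & 1;
    out->dict_id_flag   = d & 3;
    /* The size field is omitted only when fcs_flag is 0 and the frame
     * is not single-segment. */
    out->content_size_known = (out->fcs_flag != 0) || out->single_segment;
    return 0;
}
```

Reading the first five bytes of a `.zst` and a `.pst` file and comparing the resulting flags would show whether the two programs made different header choices. Differences in block sizes or flush points sit deeper in the frame and are not visible in this byte.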

@lulcat

lulcat commented Aug 12, 2024

OK thank you Yann. I am ok with further testing the library and I am also fine with the 'standard zstd binary' :p NOT giving same output as a static library in my project.. as long as the library I use will always produce the same output with same parameters. (cLevel, lib version, etc). In other words as long as myProg -> reproduces , I can live with zstd only reproducing for itself. I just had to double check if I am 'tripping' :D because I have in the past had something like this, thinking it was an issue and turned out to be a gcc -O3 bug say.. (as an example only).

Anyway, cheers! (and for those pointers, on what the library might be outputting differently).
I will look a bit in the zstdcli binary then to see what I can do/want to do to possibly replicate.

Have a nice one.

P.S. (To answer, for others, the likely cause of my own discrepancies: I think I have updated the git repo and built a newer libzstd.a vs. the libzstd.a I built on my system a little while ago.)
