[reproducibility] Zstd varies compressed output based on the presence of SSE / NEON in some cases #4099

terrelln · 2024-07-19T14:53:46Z

Lines 238 to 249 in 0ff651d

#if defined(ZSTD_ARCH_X86_SSE2) || defined(ZSTD_ARCH_ARM_NEON)
    int const kHasSIMD128 = 1;
#else
    int const kHasSIMD128 = 0;
#endif
    if (mode != ZSTD_ps_auto) return mode; /* if requested enabled, but no SIMD, we still will use row matchfinder */
    mode = ZSTD_ps_disable;
    if (!ZSTD_rowMatchFinderSupported(cParams->strategy)) return mode;
    if (kHasSIMD128) {
        if (cParams->windowLog > 14) mode = ZSTD_ps_enable;
    } else {
        if (cParams->windowLog > 17) mode = ZSTD_ps_enable;

We haven't historically strongly guaranteed reproducibility across compilations on different systems. However, we've been moving in this direction. We should consider removing this source of difference, either by default or by opting into a reproducible mode via a flag. I'm leaning towards by default, because I think this is one of the few places left where the compressed output differs based on the platform.

Cyan4973 · 2024-07-19T16:36:31Z

Agreed

felixhandte · 2024-07-19T16:44:10Z

IIUC, the diff in possible output is specific to the range 14 < windowLog <= 17. And you're proposing to close that gap?
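
A minimal sketch of that gap, assuming only the two thresholds quoted in the snippet above and treating the strategy check as already passed. The function names `row_mode_enabled`, `modes_differ`, and `count_divergent_window_logs` are made up for illustration and are not part of zstd.

```c
/* Mirrors only the threshold logic quoted in the issue (auto mode).
 * Returns 1 when the row match finder would be enabled.
 * Assumption: the strategy supports the row match finder. */
int row_mode_enabled(int has_simd128, unsigned window_log)
{
    return has_simd128 ? (window_log > 14) : (window_log > 17);
}

/* Returns 1 when a SIMD build and a non-SIMD build would choose
 * different modes for the same windowLog. */
int modes_differ(unsigned window_log)
{
    return row_mode_enabled(1, window_log) != row_mode_enabled(0, window_log);
}

/* Counts the windowLog values in [lo, hi] where the two builds disagree. */
int count_divergent_window_logs(unsigned lo, unsigned hi)
{
    int count = 0;
    unsigned wl;
    for (wl = lo; wl <= hi; wl++) {
        if (modes_differ(wl)) count++;
    }
    return count;
}
```

Under these assumptions the two builds disagree only for windowLog values 15, 16 and 17, which matches the range 14 < windowLog <= 17.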

lulcat · 2024-08-12T17:58:51Z

Actually, I am a bit miffed at the moment.. I am getting different output using the examples bundled in examples/

I tried vanilla on a laptop (same OS though) and a desktop, and I am seeing this across the board. Also in some bindings. I am rather shocked/confused, whatever... How do I nail this down? Is it my computers? Is it Arch-like distros? Is it the examples?
Using the threads_pool example, say... lvl 3,
// a find example for files in curdir only. */ **(-D.) depending on shells and so on if desired.

find . -maxdepth 1 -type f -exec zstd -3 {} \;
ls *.zst | wc -l
74

// I save to .pst just to not overwrite here
find . -maxdepth 1 -type f -exec ./streaming_compression_threads_pool {} \;
for f in *.zst; do diff "$f" "${f%.zst}.pst"; done

Binary files a.zst and a.pst differ
Binary files b.zst and b.pst differ
Binary files c.zst and c.pst differ
Binary files struct_return.zst and struct_return.pst differ
Binary files class_array.zst and class_array.pst differ
Binary files hello.zst and hello.pst differ
Binary files poolz.zst and poolz.pst differ
Binary files usrbins.zst and usrbins.pst differ

... what gives?

I am just wondering, as of course ideally I would want the shared binaries people have on Linux systems to create the same deterministic bytes in a file as long as the same compression level is set (I am disregarding any dicts here).

I am OK with this determinism being platform specific, unlike the original topic (so sorry about the slight digression); however, on the same platform and with the same compression level, I would expect libs, examples and upstream binaries to all produce the same output.

Something is off here and I need to figure out if it's this 'distro' (currently arch linux) or a hardware issue.

Despite the files differing by only 4-13 bytes or so in size, a hex diff can show quite a large set of internal differences between them, yet zstd --test (from the repo) is fine with these. With zstd -l archive.zst, files from the main binary here also list the compression ratio and other details that files from the examples do not, so maybe it's just me, or the examples' output is a bit short?

Cyan4973 · 2024-08-12T18:44:40Z

The exact same binary representation is only an objective under the following conditions:

same compression level
same library version
same parameters

The "same parameters" one can be a little bit difficult to nail down. There are so many options that can be set differently when invoking the library directly.
For example, the zstd CLI typically enables the checksum by default, and tries to bundle the source size into the header if it has access to this information. It lets the library decide the block sizes, which are generally full sized. It triggers the multithreading mode even when there is only 1 thread.
An example program is likely to make different choices, such as not bundling the checksum, flushing at arbitrary positions and thereby creating different block sizes, or compressing in --single-thread mode, which produces an output slightly different from the multithreaded mode.

So generally speaking, for reproducibility, we only compare the output of the zstd CLI, where the set of parameters is relatively well controlled, given just a version and a compression level.

lulcat · 2024-08-12T19:12:41Z

OK, thank you Yann. I am OK with testing the library further, and I am also fine with the 'standard zstd binary' :p NOT giving the same output as a static library in my project, as long as the library I use always produces the same output with the same parameters (cLevel, lib version, etc.). In other words, as long as myProg reproduces its own output, I can live with zstd only reproducing for itself. I just had to double-check whether I am 'tripping' :D because I have in the past had something like this, thinking it was an issue, and it turned out to be, say, a gcc -O3 bug (as an example only).

Anyway, cheers! (And thanks for those pointers on what the library might be outputting differently.)
I will look a bit into the zstd CLI binary then, to see what I can do or want to do to possibly replicate it.

Have a nice one.

P.S. (To answer, for others, the likely cause of my own discrepancies: I think I updated the git repo and built a newer libzstd.a than the libzstd.a I built on my system a little while ago.)

robimarko mentioned this issue Aug 22, 2024

trace-cmd: update to 3.3 openwrt/openwrt#16219

Merged
