Non-deterministic results on different platforms #60
Comments
Based on a quick look you could try this: Split the loop on line 1634 into two. The first one adds to |
Is unordered map + sort definitely faster than an ordered map? |
Usually, in insert or lookup-heavy loads. In this case? I don't know. Try them and find out. |
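For anyone who wants to try the unordered map + sort variant, here is a minimal sketch. It is not the encoder's code: Vec6, Vec6Hash and count_unique_sorted are hypothetical names, the FNV-1a hasher is only a stand-in for a byte-based hasher, and the inputs are assumed to contain no NaN or negative-zero components (otherwise byte hashing and float equality disagree).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 6-component float training vector.
using Vec6 = std::array<float, 6>;

// Illustrative byte-based FNV-1a hasher (assumption: not the library's hasher).
struct Vec6Hash {
    std::size_t operator()(const Vec6& v) const {
        unsigned char bytes[6 * sizeof(float)];
        std::memcpy(bytes, v.data(), sizeof(bytes));
        std::uint64_t h = 14695981039346656037ull;
        for (unsigned char b : bytes) {
            h ^= b;
            h *= 1099511628211ull;
        }
        return static_cast<std::size_t>(h);
    }
};

using CountedVec = std::pair<Vec6, std::uint64_t>;

// Count duplicates in a hash map (fast inserts), then copy the unique entries
// into a vector and sort them by key. Keys are unique after counting, so the
// sorted order does not depend on the hash map's iteration order.
std::vector<CountedVec> count_unique_sorted(const std::vector<Vec6>& training) {
    std::unordered_map<Vec6, std::uint64_t, Vec6Hash> counts;
    for (const Vec6& v : training) {
        ++counts[v];
    }
    std::vector<CountedVec> out(counts.begin(), counts.end());
    std::sort(out.begin(), out.end(),
              [](const CountedVec& a, const CountedVec& b) { return a.first < b.first; });
    return out;
}
```

The sort touches only the unique vectors, so its cost depends on how many duplicates the training set has. Timing both variants on real inputs is the only way to know which one wins here.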
Okay, this is good work, thank you! I am finishing up the next milestone (ASTC, PVRTC alpha, etc.), and once it is done in 2-3 weeks I will work on this problem. I may just switch to my own fully deterministic containers. |
Here, each key is a vector with 6 float components. In the worst case, the comparison (line 112 in 6002320) and equality (line 111 in 6002320) operators involve FP operations for all 6 components. For the unordered map, a simple byte-based hasher is used (lines 71 to 80 in 6002320). Maybe one more option is to use these hash values as comparison values for the ordered map to avoid FP operations. |
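A rough sketch of that last idea, under stated assumptions: HashedKey, HashFirstLess, hash_bytes and count_unique_hash_ordered are hypothetical names, and FNV-1a stands in for the byte-based hasher. The hash alone cannot be the ordering, because two different vectors that collide would be treated as the same key, so the comparator falls back to a byte-wise compare on ties. No FP comparison is involved either way.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-in for a 6-component float training vector.
using Vec6 = std::array<float, 6>;
constexpr std::size_t kVec6Bytes = 6 * sizeof(float);

// Illustrative FNV-1a over the raw bytes (assumption: not the library's hasher).
inline std::uint64_t hash_bytes(const Vec6& v) {
    unsigned char bytes[kVec6Bytes];
    std::memcpy(bytes, v.data(), kVec6Bytes);
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char b : bytes) {
        h ^= b;
        h *= 1099511628211ull;
    }
    return h;
}

// Key that carries its precomputed hash so comparisons do not rehash.
struct HashedKey {
    std::uint64_t hash;
    Vec6 vec;
};

inline HashedKey make_key(const Vec6& v) { return HashedKey{hash_bytes(v), v}; }

// Order by hash first; break ties byte-wise so colliding keys stay distinct.
struct HashFirstLess {
    bool operator()(const HashedKey& a, const HashedKey& b) const {
        if (a.hash != b.hash) return a.hash < b.hash;
        return std::memcmp(a.vec.data(), b.vec.data(), kVec6Bytes) < 0;
    }
};

using HashOrderedCounts = std::map<HashedKey, std::uint64_t, HashFirstLess>;

// Count unique vectors; iteration order depends only on the key bytes.
inline HashOrderedCounts count_unique_hash_ordered(const std::vector<Vec6>& training) {
    HashOrderedCounts counts;
    for (const Vec6& v : training) {
        ++counts[make_key(v)];
    }
    return counts;
}
```

Two caveats: byte-wise equality treats 0.0f and -0.0f as different keys, unlike operator==, and the resulting order is only the same across platforms if they share float representation and byte order.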
Also note that I have not touched the encoder at all for this milestone, just the transcoders, so any work you do will be easily merged in. |
I'm still getting non-deterministic results across systems with version 1.13 even though some custom containers have been added. All but one of my tests of ETC1S/BasisLZ encoding that pass on macOS fail on Windows and Linux. (I don't know if the Windows and Linux results are different from each other.) In one example the length of the compressed data is 1 byte less on Windows than macOS (47 vs. 48 bytes). The global dictionaries (selectors, endpoints, & tables) are the same size. Curiously in the one case that succeeds on all platforms, the input file is a .pgm file. All the other files are .png. Also it only has one component. |
The root cause seems to be still in place: basis_universal/encoder/basisu_enc.h Lines 1843 to 1846 in 041ad47
|
We now have custom containers for vectors and hash tables/sets. However, I believe there's still a std hash table in there in one place. Still leaving this up for more investigation. |
Yes there must be. I just ran the KTX-Software tests after integrating 1.16.3. ETC1S-encoded images on Windows and Linux are different from those on macOS. There were 2 fewer images different from macOS on Linux than on Windows so likely the Windows and Linux images are different from each other. |
In the generate_hierarchical_codebook_threaded function, an std::unordered_map is used for counting unique training vectors (basis_universal/basisu_enc.h, lines 1598 to 1608 in 6002320).
The iteration order of an unordered map is implementation-dependent, so this code builds group_quant differently on MSVC and GCC/Clang (basis_universal/basisu_enc.h, lines 1634 to 1638 in 6002320).
As a result, the encoder produces different selectors and codebooks on different platforms.
After switching group_hash to be an ordered map, files produced on different platforms are the same (likely with some performance cost).
Maybe there is a better way to achieve deterministic results.
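To illustrate the ordered-map change in isolation, here is a minimal sketch. It is not the actual patch: TrainingVec and build_groups_ordered are hypothetical names, and the keys are assumed to contain no NaN components, since NaN breaks the strict weak ordering std::map requires.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 6-component float training vector.
using TrainingVec = std::array<float, 6>;
using Group = std::pair<TrainingVec, std::uint64_t>;

// Count unique training vectors in an ordered map. std::map iterates in key
// order (lexicographic operator< on the array), so the output sequence is the
// same on every standard library, whatever order the inputs arrived in.
std::vector<Group> build_groups_ordered(const std::vector<TrainingVec>& training) {
    std::map<TrainingVec, std::uint64_t> group_counts;
    for (const TrainingVec& v : training) {
        ++group_counts[v];
    }
    return std::vector<Group>(group_counts.begin(), group_counts.end());
}
```

Each insert costs O(log n) key comparisons instead of an average O(1) hash lookup, which is where the performance cost would come from.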