fix: Crash during PThread initialization on big-endian machine #23700

slavek-kucera · 2025-02-19T08:53:05Z

Notes:

There are other places using Atomic.something that likely have the same issue.
I tried keeping the nativeByteOrder32 function in libpthread.js and using the dependency mechanism, but it was always optimized into x => x.
I looked into the MEMORY64 option, but it does not seem to work at all in Node (v23 with the appropriate experimental flag).
A simple hello-world program was built using the instructions from the issue and tested on an s390x z/OS machine.

sbc100 · 2025-02-19T15:42:54Z

src/runtime_shared.js

+    var h16 = new Int16Array(1);
+    var h8 = new Int8Array(h16.buffer);
+    h16[0] = 42;
+    return h8[0] === 42 && h8[1] === 0; // little endian ordering


Surely we must already have helpers in emscripten that do this kind of thing, since all the non-atomic reads/writes already work?

Other accesses use the DataView methods.

A similar test is performed in runtime_debug.js, but it is only included with the ASSERTIONS option.
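For context, here is a minimal, self-contained sketch of the two approaches mentioned in this thread: a runtime byte-order probe like the one in the diff, and DataView accesses, which take an explicit little-endian flag and so give the same byte layout on any host. The function name isHostLittleEndian is hypothetical and is not taken from the PR.

```javascript
// Probe the host byte order by viewing a 16-bit value as individual bytes.
// (Hypothetical helper name; the logic mirrors the diff above.)
function isHostLittleEndian() {
  const h16 = new Int16Array(1);
  const h8 = new Int8Array(h16.buffer);
  h16[0] = 42;
  return h8[0] === 42 && h8[1] === 0; // little endian ordering
}

// DataView methods take an explicit endianness flag, so the byte layout
// they produce does not depend on the host.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
view.setInt32(0, 0x11223344, true); // true = little endian
const bytes = Array.from(new Uint8Array(buf)); // [0x44, 0x33, 0x22, 0x11]

// A typed array read of the same bytes uses the host byte order instead.
const nativeValue = new Int32Array(buf)[0];
const expectedNative = isHostLittleEndian() ? 0x11223344 : 0x44332211;
```

Atomics operations only accept typed arrays, not DataView, which is why they see host byte order and need separate handling.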

kripken · 2025-02-19T21:28:32Z

The general mechanism we use for this is

emscripten/tools/acorn-optimizer.mjs

Lines 1103 to 1105 in bb7fd25

    
    // Replaces each HEAP access with function call that uses DataView to enforce
    // LE byte order for HEAP buffer
    function littleEndianHeap(ast) {

Perhaps the best thing is to update that pass so it handles Atomic operations?

slavek-kucera · 2025-02-19T21:47:30Z

@kripken

The runtime test will need to be performed somewhere.
The generic transformation might get quite complicated for e.g. waitAsync - wouldn't it be better at that point to introduce wrappers around these functions?

kripken · 2025-02-20T00:53:16Z

The runtime test will need to be performed somewhere.

I agree, yes, that is necessary. I am just saying that doing it in the processing pass has advantages. Specifically it will find all atomic operations, guaranteeing that we don't forget any (as you wrote above, "There are other places using Atomic.something that likely have the same issue").

The generic transformation might get quite complicated for e.g. waitAsync - wouldn't it be better at that point to introduce wrappers around these functions?

Oh, definitely, yes, we want to use wrapper functions. The processing pass should just add calls to those wrappers. That is what it does today: it replaces e.g. HEAP16[x] = a with LE_HEAP_STORE_I16(x * 2, a). All the real work happens in the called function. For more examples, see

test/js_optimizer/LittleEndianHeap.js
test/js_optimizer/LittleEndianHeap-output.js

And LE_HEAP_STORE etc. is not defined in the pass, but in the JS, see

src/lib/liblittle_endian_heap.js

I am suggesting that Atomics be implemented in a similar way.
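As a rough illustration of the wrapper approach described above, the sketch below pairs a DataView-based store/load wrapper with a byte-swapping wrapper around Atomics. It is a sketch under assumptions, not the code from liblittle_endian_heap.js or from this PR: the buffer is a stand-in for wasm memory, HEAP_DATA_VIEW is an assumed name, and leAtomicsStore32 / leAtomicsLoad32 are hypothetical helpers limited to 32-bit values.

```javascript
// Stand-in for wasm memory (assumption: a small shared buffer for the demo).
const buffer = new SharedArrayBuffer(16);
const HEAP_DATA_VIEW = new DataView(buffer); // assumed name
const HEAP32 = new Int32Array(buffer);

// Non-atomic wrappers: the pass rewrites HEAP16[x] = a into a call like
// LE_HEAP_STORE_I16(x * 2, a); the wrapper does the real work.
const LE_HEAP_STORE_I16 = (byteOffset, value) =>
  HEAP_DATA_VIEW.setInt16(byteOffset, value, true);
const LE_HEAP_LOAD_I16 = (byteOffset) =>
  HEAP_DATA_VIEW.getInt16(byteOffset, true);

// Runtime probe of the host byte order.
const hostIsLittleEndian =
  new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;

// Reverse the four bytes of a 32-bit integer.
const bswap32 = (x) =>
  ((x << 24) | ((x & 0xff00) << 8) | ((x >>> 8) & 0xff00) | (x >>> 24)) | 0;

// Identity on little-endian hosts, byte swap on big-endian hosts.
const toFromLE32 = hostIsLittleEndian ? (x) => x : bswap32;

// Hypothetical atomic wrappers: Atomics has no endianness flag, so the
// value is swapped before the store and after the load.
const leAtomicsStore32 = (heap, index, value) =>
  toFromLE32(Atomics.store(heap, index, toFromLE32(value)));
const leAtomicsLoad32 = (heap, index) =>
  toFromLE32(Atomics.load(heap, index));

LE_HEAP_STORE_I16(0, 0x1234);
leAtomicsStore32(HEAP32, 1, 0x11223344);
// Bytes 4..7 of the buffer, which hold the little-endian encoding.
const leBytes = Array.from(new Uint8Array(buffer, 4, 4));
```

Operations such as add or sub would need a compare-exchange loop or similar on a big-endian host, since swapping operands does not commute with arithmetic; that part is left out here.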

slavek-kucera · 2025-02-20T16:40:54Z

@kripken @sbc100 Is this more in line with the direction you had in mind?

sbc100 · 2025-02-20T17:36:50Z

src/runtime_shared.js

 }

+#if SUPPORT_BIG_ENDIAN
+var nativeByteOrder = (() => {


I think this can live in liblittle_endian_heap.js too as a library function.

I'll try to move as much as possible.

sbc100 · 2025-02-20T17:38:46Z

src/runtime_shared.js

+   HEAPU32.unsigned = (x => x >>> 0);
+#if WASM_BIGINT
+   HEAPU64.unsigned = (x => x >= 0 ? x : BigInt(2**64) + x);
+#endif


Ideally we could avoid adding this here and keep all the code changes in liblittle_endian_heap.js.

Ideally we could also avoid adding extra methods to these heaps, but maybe that's not easy.

I didn't really find a better way to keep the extra methods out while not introducing per-type functions (and associated transformations).
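For reference, the unsigned helpers in the diff above map a signed read back to the unsigned value of the same bit pattern. A minimal standalone sketch of that arithmetic follows; the names toUnsigned32, toUnsigned64 and toSigned64 are hypothetical, and the BigInt.asUintN / BigInt.asIntN comparisons are only a cross-check, not something the PR uses.

```javascript
// 32-bit: an unsigned right shift by zero reinterprets the value as unsigned.
const toUnsigned32 = (x) => x >>> 0;

// 64-bit: add 2^64 to negative values (same idea as the diff above).
const toUnsigned64 = (x) => (x >= 0n ? x : 2n ** 64n + x);

// The inverse direction: values at or above 2^63 wrap to negative.
const toSigned64 = (x) => (x < 2n ** 63n ? x : x - 2n ** 64n);

const u32 = toUnsigned32(-1);    // 4294967295
const u64 = toUnsigned64(-1n);   // 18446744073709551615n
const s64 = toSigned64(u64);     // -1n

// Cross-check against the built-in wrapping helpers.
const sameAsBuiltin =
  u64 === BigInt.asUintN(64, -1n) && s64 === BigInt.asIntN(64, u64);
```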

Can you just use the uppercase version here? i.e. why do we need the new lower case versions?

I wonder if we can also move these lines to liblittle_endian_heap.js?

Perhaps they could be part of the __postset too?

Re the new variables: if the uppercase variables are used, they get replaced by e.g. the growable heap wrappers.

One option would be to rename updateMemoryViews to _updateMemoryViews, assign _updateMemoryViews to updateMemoryViews, and use the __postset to override the new updateMemoryViews variable.

If you prefer this option, I can look into that on Monday.

sbc100 · 2025-02-20T17:40:05Z

tools/link.py

+      '$LE_ATOMICS_SUB',
+      '$LE_ATOMICS_WAIT',
+      '$LE_ATOMICS_WAITASYNC',
+      '$LE_ATOMICS_XOR',


While I agree it would be good to add all of these, I also think we might want to only add the ones that we actually use (and also test).

I thought that the whole point of this approach was to actually implement comprehensive support like was done to the non-atomic memory accesses done from JS code.

Yes, I suppose you're right, it seems reasonable to leave them all in then.

But let's skip 64-bit atomics and just fail in that case, to keep things simple. I don't think anyone should be doing that.

I've tested it on the following program:

    #include <atomic>
    #include <print>
    #include <emscripten.h>

    int main() {
      std::atomic<unsigned long long> x = 0x0123456789ABCDEF;
      std::println("x: {:X}", x.load());
      EM_ASM({
        Atomics.compareExchange(HEAP64, ($0)/8, BigInt("0x0123456789ABCDEF"), BigInt("0xFEDCBA9876543210"));
      }, &x);
      std::println("x: {:X}", x.load());
    }

built with

em++ -o main.js -s USE_PTHREADS=1 -s PROXY_TO_PTHREAD=1 -s PTHREAD_POOL_SIZE=8 -s SUPPORT_BIG_ENDIAN=1 -pthread -s EXIT_RUNTIME=1 -s ALLOW_MEMORY_GROWTH=1 -std=c++23 main.cpp

And I do get the expected result on both LE and BE:

    x: 123456789ABCDEF
    x: FEDCBA9876543210
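The JS half of this test can be tried in isolation. The sketch below runs the same compare-exchange on a plain SharedArrayBuffer rather than wasm memory (an assumption made so the snippet is self-contained), and shows why the replacement value prints as FEDCBA9876543210 even though a BigInt64Array holds it as a negative number. Because both the write and the read go through typed arrays on the same host, it does not exercise the byte-order handling itself.

```javascript
// Stand-in for the wasm heap: one 64-bit slot, viewed signed and unsigned.
const sab = new SharedArrayBuffer(8);
const heap64 = new BigInt64Array(sab);
const heapU64 = new BigUint64Array(sab);

heap64[0] = 0x0123456789ABCDEFn;

// The replacement does not fit in a signed 64-bit value, so wrap it
// explicitly to the signed value with the same bit pattern.
const replacement = BigInt.asIntN(64, 0xFEDCBA9876543210n);

// Returns the old value; the swap happens because the expected value matches.
const oldValue = Atomics.compareExchange(
  heap64, 0, 0x0123456789ABCDEFn, replacement);

const isNegativeWhenSigned = heap64[0] < 0n;
const afterHex = heapU64[0].toString(16).toUpperCase();
```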

sbc100 · 2025-02-20T17:53:19Z

src/runtime_shared.js

+      else
+        return res - BigInt(2**64);
+    }
+    ),


Do we need to support more than 4-byte access? Maybe let's just skip them completely? If anyone tries to do an atomic 64-bit access they would see a crash accessing nativeByteOrder OOB.

See the post above... I think that the 8B atomics will be needed eventually for the MEMORY64.

I don't think we want to be adding complexity to support APIs that we don't use in emscripten.

That was just an example; a more realistic case would be accessing an 8-byte atomic, or perhaps a 2x4-byte pointer+counter pair.

Sure, but until we get a request from somebody who actually wants to do that from JS, I don't think we should support it. Better to simply error out for now, maybe?
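One way to "simply error out" would be an explicit width check in the wrapper, so a 64-bit atomic access fails with a clear message instead of an out-of-bounds lookup. This is a hypothetical sketch, not the PR's code: assertSupportedAtomicWidth and checkedAtomicsLoad are made-up names, and the byte swap a real big-endian wrapper would need is omitted to keep the guard in focus.

```javascript
// Reject typed arrays whose elements are wider than 4 bytes.
function assertSupportedAtomicWidth(heap) {
  if (heap.BYTES_PER_ELEMENT > 4) {
    throw new Error(
      `64-bit atomics are not supported here (got ${heap.constructor.name})`);
  }
}

// Hypothetical wrapper: validate the view, then defer to Atomics.
function checkedAtomicsLoad(heap, index) {
  assertSupportedAtomicWidth(heap);
  return Atomics.load(heap, index);
}

const shared = new SharedArrayBuffer(8);
const i32 = new Int32Array(shared);
const i64 = new BigInt64Array(shared);

Atomics.store(i32, 0, 7);
const ok = checkedAtomicsLoad(i32, 0); // 32-bit access passes the check

let rejected = false;
try {
  checkedAtomicsLoad(i64, 0); // 64-bit access is refused
} catch (e) {
  rejected = true;
}
```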

slavek-kucera force-pushed the fix_bigendian_pthread_init branch from 2dae9cd to 13d97b3 Compare February 19, 2025 08:55

sbc100 reviewed Feb 19, 2025

slavek-kucera force-pushed the fix_bigendian_pthread_init branch 2 times, most recently from 76f6d46 to 6d93671 Compare February 20, 2025 12:08

sbc100 reviewed Feb 20, 2025

fix: Crash during PThread initialization on big-endian machine

3a3138f

slavek-kucera force-pushed the fix_bigendian_pthread_init branch from 6d93671 to 3a3138f Compare February 21, 2025 11:04
