PAL thread handling crashes on latest Arch Linux #6613
What JS code does this reproduce on? Even if it started to happen after some changes in Arch, we might still be on the hook if something wrong is passed down to the system library. |
It's not specific to any particular JS. Running the test suite, it happened with over half the dynapogo tests. Running a simple console.log("hello") JS file, it happened intermittently. |
Arch seems to be available in WSL, I will give it a try. |
I'm the friend mentioned in the OP, the one who reported the issue to @rhuanjl. I ran into the issue in Arch on my desktop, laptop, and a virtual machine, and just to be absolutely sure that I hadn't overlooked something, I tried it in a fresh Arch WSL installation and got an almost identical stack trace.
for (let i = 0; i < 32; i++) {
    console.log(`#${i}`);
} |
The crash occurs when it tries to launch a helper thread. I'm guessing, but the trigger is likely the first attempt to run the JIT or the garbage collector (both of which normally operate off-thread). |
This is reproducible for me on Manjaro; the stack trace looks the same. When building inside an Ubuntu 20.04 container, everything works fine. I'll post an update if I'm able to reproduce this on a later version of Ubuntu. |
@nic11 Use Windows |
Fails on Ubuntu 21.10 (with a different stack trace, though). It's probably caused by libicu, as it appears to be the only required library that has a different major version across these three environments:
|
@nic11 can we fall back to the old ICU and check? |
I had a feeling this would eventually happen. To my knowledge, the Arch devs only make changes when absolutely necessary, so if the issue isn't resolved soon, it'll likely start breaking with other distros as well. |
I'm 99% confident that the failure on Arch was to do with PAL. Relevant info
ICU |
Would it be possible at all to replace PAL with a more natural system without having to rewrite the majority of the codebase for CC as a whole? |
I've tried building with |
I'd like to investigate this. The big piece to look into is thread handling: exactly which calls are used to create and manage threads, and how spread out across the codebase they are. It shouldn't actually be that invasive a change, as only a few things can spin up a thread. |
Could you use a debug or test build and post a stack trace with symbols so we can see where it's going wrong? |
Actually, on my main system it's just the same as in the issue description. Or what exactly do you mean? To be honest, I didn't do a standalone build; I built ChakraCore as a vcpkg dependency, but the stack trace matches the description in this issue. And yeah, sorry, but it may take me a bit of time, as I'm busy currently. I guess I'll try a libicu 66 build; maybe it'll help. By the way, if I create an Arch or Ubuntu 21.10 based Docker image or Dockerfile where it can be reproduced, would it be simpler for you to debug? |
https://sourceware.org/git/?p=glibc.git;a=commit;h=2e39f65b5ef11647beb4980c4244bac8af192c14 |
Also I've found some more info https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2161469.html |
Isn't it about SIGBUS? I thought it was a regular segfault here. |
Just read the whole thread to get some info on why the futex error is thrown. This info is needed to understand why the glibc commit mentioned above can lead to these terminations. |
@Wedmer this looks like a potential culprit. PAL might be trying to lock on an unaligned address; it has its own approach to things at times. |
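To make the unaligned-address suspicion concrete: on Linux a futex word is a 32-bit value that must sit on a four-byte boundary, so a lock object placed at a misaligned address can fail at the futex layer. The sketch below shows one way to test an address for that property. The helper names (`is_aligned`, `is_valid_futex_address`) are hypothetical and are not taken from PAL or glibc; this is only an illustration of the check, not a diagnosis of this crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment Linux requires of a futex word (four-byte integers, see futex(2)).
constexpr std::size_t kFutexAlignment = 4;

// Returns true when `p` lies on a multiple of `alignment` bytes.
// Hypothetical helper for illustration only.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Returns true when `p` could legally be used as a futex word address.
// Hypothetical helper; PAL does not necessarily perform such a check.
inline bool is_valid_futex_address(const void* p) {
    return is_aligned(p, kFutexAlignment);
}
```

For example, given `alignas(8) unsigned char buffer[16];`, the address `buffer` passes the check while `buffer + 1` does not. A debug build could apply a check like this to the address of each synchronization object before locking it.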
I've seen. Part of PAL is truly cross-platform and has implementation for different synchronization objects and APIs, but another part is heavily built around pthread. |
Using the latest commit as of this message (2af598f), I'm unable to build it (with build.sh).
|
@Eggbertx It works using clang14.
Install
sudo apt install clang-14
Build
Following the instructions from the build pipeline (ChakraCore/azure-pipelines.yml, line 79 in 2af598f):
mkdir build
cd build
# prepare
cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++-14 -DCMAKE_C_COMPILER=clang-14 ..
# build
ninja |
I was able to build it with clang14, and it doesn't appear to be crashing now. |
Do you know what compiler you were using? |
🎉 Thank you for checking; sorry we took far too long on this - we had several failed attempts at finding a fix and had pretty much parked working on CC - hoping we may be resuming again but will see... |
The version of clang in my PATH is 17.0.6, assuming running build.sh with no command line arguments uses clang and not gcc (which I'm guessing CMake would use by default). |
Hmm, I think I'll close this for now as the issue it was raised for is resolved. Thanks for the help. We can explore problems with Clang 17 if it affects anyone on an ongoing basis - though the snippet you gave suggests there may be a one line fix there if it really was a Clang 17 problem. (I was wondering if the script had failed to find Clang and somehow defaulted to gcc - which has never been supported) |
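Since the open question above is which compiler the build script actually picked up, one low-effort way to settle it is to have the code report its own compiler through the predefined macros. The sketch below is an assumption-laden illustration: `compiler_description` is a hypothetical helper, not something that exists in ChakraCore or build.sh.

```cpp
#include <string>

// Hypothetical helper: describes the compiler that built this translation unit.
// Clang is tested first because it also defines __GNUC__ for compatibility.
inline std::string compiler_description() {
#if defined(__clang__)
    return "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__);
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__);
#elif defined(_MSC_VER)
    return "msvc " + std::to_string(_MSC_VER);
#else
    return "unknown";
#endif
}
```

Printing this string from a test binary built with the same toolchain settings would show whether a build fell back to gcc or used a particular Clang major version.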
On Arch Linux, CC crashes when starting a helper thread with the stack trace below. This crash was confirmed with both 1.11 and master, and was likely introduced by changes in Arch Linux rather than changes in CC:
(Issue reported by a friend. I don't have a Linux setup to verify with myself.)