Segmentation fault when thread using dynamically loaded Rust library exits #91979
Comments
Upon further research, I'm pretty sure the reason it "works" with newer glibc versions is that the library is actually never fully unloaded if there are TLS destructors registered via And in dlclose, this is checked: https://github.com/bminor/glibc/blob/90b37cac8b5a3e1548c29d91e3e0bff1014d2e5c/elf/dl-close.c#L186 This can be verified by running my reproduction program above with the |
Brooooooklyn/canvas#377 may be related to this |
@rustbot label T-libs A-thread-locals |
It may be useful/informative to read some of the prior discussion of dynamic libraries & thread local storage in this & linked issues: #28794 FWIW based on my experience the only "reliable" approach has been to simply never allow dylibs to be unloaded: follower/foreigner@3845586 Edit: Especially nagisa/rust_libloading#41 & also https://sourceware.org/glibc/wiki/Destructor%20support%20for%20thread_local%20variables (I first encountered this issue when using |
See also this upstream glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21032 |
A bit more context about how this affects real world code. This currently happens to all native Node.js modules that include thread locals, when loaded within Node's worker threads. Node controls when One potential solution is to register a destructor function in the |
Hmm, is this possibly caused by #88737 ? (edit: no, but it is related). |
Apparently, this was made significantly worse in v1.83.0, which has started to segfault for me when using native addons written in Rust in Node.js.
Scenario: I have a Rust cdylib, which is loaded by a C program via `dlopen`. The C program creates a thread, and loads the Rust module inside it. It proceeds to call one of the Rust functions, and closes the library via `dlclose`. Then the thread exits. The Rust program has a thread local variable with a struct that implements `Drop`, which it modifies in the function called from C.
Full reproduction here: https://github.com/devongovett/rust-threadlocal-bug
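For readers who do not want to open the reproduction, the Rust side of the scenario above might look roughly like the sketch below. This is not the code from the linked repository: the names `Guard`, `GUARD`, and `touch_thread_local` are hypothetical and only illustrate the shape of a cdylib with a thread local whose type implements `Drop`.

```rust
use std::cell::RefCell;

// Hypothetical type standing in for any thread-local value with a destructor.
struct Guard {
    calls: u32,
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Runs when the owning thread exits. In the scenario above this is
        // after dlclose, so the code for this destructor may already be unmapped.
        eprintln!("Guard dropped after {} call(s)", self.calls);
    }
}

thread_local! {
    // The first access on a thread initializes the value and registers its destructor.
    static GUARD: RefCell<Guard> = RefCell::new(Guard { calls: 0 });
}

/// Hypothetical function exported to C; it modifies the thread local and
/// returns how many times it has been called on the current thread.
/// (The attribute is spelled `#[no_mangle]` on toolchains older than Rust 1.82.)
#[unsafe(no_mangle)]
pub extern "C" fn touch_thread_local() -> u32 {
    GUARD.with(|g| {
        let mut guard = g.borrow_mut();
        guard.calls += 1;
        guard.calls
    })
}
```

The C program would look up `touch_thread_local` after `dlopen`, call it once in the worker thread, and then call `dlclose` before the thread returns.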
On CentOS 7, which uses glibc 2.17, it segfaults at `__nptl_deallocate_tsd()` inside pthread_create.c. With later versions of glibc, there is no crash. I believe the crash occurs because Rust creates a thread local key with `pthread_key_create` but never calls `pthread_key_delete` (the call in the destructor is commented out):
rust/library/std/src/sys_common/thread_local_key.rs
Lines 231 to 237 in 673d0db
When the thread exits, glibc tries to call the destructor for the key, but because the dynamic library has already been unloaded via `dlclose` at this point, the function no longer exists and we get a crash.
My theory is that this only occurs with glibc 2.17 and not with later versions because `__cxa_thread_atexit_impl` does not exist in these older versions. This function is used when available to register destructors, otherwise a fallback implementation is used:
rust/library/std/src/sys/unix/thread_local_dtor.rs
Lines 30 to 42 in 71965ab
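To check which path a given machine would take, one option is to ask the dynamic linker whether the symbol exists at run time. The sketch below uses `dlsym` for this, which is a different technique from the weak-symbol lookup in std; it is only meant as a diagnostic. The helper name `symbol_is_available` is hypothetical, and treating a null handle as `RTLD_DEFAULT` is an assumption that holds on Linux with glibc.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_void};

unsafe extern "C" {
    // POSIX dlsym, declared by hand so the sketch needs no external crate.
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

/// Hypothetical helper: true if `name` resolves in the global symbol scope
/// of the running process.
pub fn symbol_is_available(name: &CStr) -> bool {
    // Assumption: RTLD_DEFAULT is a null handle, as on Linux/glibc.
    let rtld_default: *mut c_void = std::ptr::null_mut();
    let sym = unsafe { dlsym(rtld_default, name.as_ptr()) };
    !sym.is_null()
}

/// True if the C library exports `__cxa_thread_atexit_impl`.
pub fn has_cxa_thread_atexit_impl() -> bool {
    let name = CStr::from_bytes_with_nul(b"__cxa_thread_atexit_impl\0").unwrap();
    symbol_is_available(name)
}
```

If the theory above is right, `has_cxa_thread_atexit_impl()` would return false on CentOS 7 with glibc 2.17 and true on newer glibc versions.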
I have not tested, but I think the bug could potentially be fixed if the commented-out destructor linked above were actually called. The comment indicates something about Windows not supporting this, so maybe it could be called conditionally?
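The pthread behaviour this proposal relies on can be seen in isolation, without any dynamic library involved. The sketch below is not std's implementation: `dtor`, `DTOR_RUNS`, and `destructor_runs` are hypothetical names, and declaring `pthread_key_t` as an unsigned int is an assumption that matches Linux. It creates a key with a destructor, sets a value in a fresh thread, and counts whether the destructor runs at thread exit, with and without a prior `pthread_key_delete`.

```rust
use std::os::raw::{c_int, c_uint, c_void};
use std::sync::atomic::{AtomicUsize, Ordering};

// Assumption: pthread_key_t is an unsigned int, as on Linux.
type PthreadKey = c_uint;

unsafe extern "C" {
    fn pthread_key_create(
        key: *mut PthreadKey,
        dtor: Option<unsafe extern "C" fn(*mut c_void)>,
    ) -> c_int;
    fn pthread_setspecific(key: PthreadKey, value: *const c_void) -> c_int;
    fn pthread_key_delete(key: PthreadKey) -> c_int;
}

static DTOR_RUNS: AtomicUsize = AtomicUsize::new(0);

// Called by the C library at thread exit for every live key with a non-null value.
unsafe extern "C" fn dtor(_value: *mut c_void) {
    DTOR_RUNS.fetch_add(1, Ordering::SeqCst);
}

/// Returns how many times `dtor` ran for one short-lived thread.
pub fn destructor_runs(delete_key_first: bool) -> usize {
    let before = DTOR_RUNS.load(Ordering::SeqCst);
    let mut key: PthreadKey = 0;
    let rc = unsafe { pthread_key_create(&mut key, Some(dtor)) };
    assert_eq!(rc, 0, "pthread_key_create failed");

    std::thread::spawn(move || unsafe {
        // Any non-null value makes the key eligible for its destructor.
        let value = &DTOR_RUNS as *const AtomicUsize as *const c_void;
        pthread_setspecific(key, value);
        if delete_key_first {
            // After this call the destructor is no longer invoked at thread exit.
            pthread_key_delete(key);
        }
    })
    .join()
    .unwrap();

    if !delete_key_first {
        // Clean up the key once the thread (and its destructor call) is done.
        unsafe { pthread_key_delete(key) };
    }
    DTOR_RUNS.load(Ordering::SeqCst) - before
}
```

`destructor_runs(false)` should report one destructor call and `destructor_runs(true)` none. In the scenario from this issue, the call that is skipped after `pthread_key_delete` is the one that would otherwise jump into unmapped library code. Whether std can safely delete its keys on library unload is a separate question that this sketch does not answer.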
glibc 2.17 is indeed pretty old; however, it is the version used by CentOS 7, which is not EOL until 2024, so I do think this bug should be fixed.
Meta
`rustc --version --verbose`: