thread_local! dtor registration can pass wrong __dso_handle if dynamically linked #88737
Comments
@rustbot modify labels: +A-thread-locals +T-libs
@thomcc I believe I am facing this behaviour with a cdylib loaded from Deno. It only happens in Deno tests with a dev build (release is fine). The issue does not arise on macOS (arm) but can be reproduced on Debian and on Debian under WSL2. However, I am having a really hard time confirming that this is indeed caused by your scenario. Are there any tests I can run to dig further (I am not familiar with Rust internals)? Is there a known workaround?
The best workaround is for you to not unload dynamic libraries, if at all possible. Honestly, we don't really guarantee that it's supported for libstd (it should be for libcore/liballoc), and it's extremely dangerous to unload a dynamic library that doesn't support being unloaded.
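As a rough illustration of "don't unload", here is a minimal sketch of a wrapper that suppresses a library handle's Drop. `FakeLibrary`, `NeverUnload` and `UNLOADS` are hypothetical names made up for this example; `FakeLibrary` merely stands in for a handle type whose Drop would call dlclose (such as a `libloading::Library`), and no real library is loaded.

```rust
use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts simulated unloads so the effect of the wrapper can be observed.
pub static UNLOADS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a library handle whose Drop would call dlclose.
pub struct FakeLibrary;

impl Drop for FakeLibrary {
    fn drop(&mut self) {
        // A real handle would call dlclose here.
        UNLOADS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Keeps the wrapped handle usable but never runs its Drop,
/// so the library stays loaded for the rest of the process.
pub struct NeverUnload<L>(ManuallyDrop<L>);

impl<L> NeverUnload<L> {
    pub fn new(lib: L) -> Self {
        NeverUnload(ManuallyDrop::new(lib))
    }
}

impl<L> Deref for NeverUnload<L> {
    type Target = L;

    fn deref(&self) -> &L {
        &self.0
    }
}
```

Calling `std::mem::forget` on the handle has the same effect when the handle is not needed afterwards; the wrapper only makes the intent visible in the type.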
Woah. I can’t believe it is only now that I’m seeing this. This sounds like something we should definitely look into fixing and seeing just how many of the issues this would end up resolving on the relevant platforms.
I believe that this sort of scenario is something that will come up as platform implementations approach some level of maturity. macOS, for example does not use the
So this is actually much more broken than this issue lets on. I've had doing a writeup on it on my TODO for a bit now, so I guess it's time. The TLDR is:
The longer version follows:
Partial unloading is broken and nonsensical.
So, the partial-unload case seems really too busted to be supported at all. That is, the case in this issue, where "librustcode.so gets unloaded but the libstd.so stays loaded, and the former passed functions to the latter, such as TLS dtors".
Like, there's not much we can do about the fact
The saving grace here is that if we pass the
Full unloading is broken too, but trickier
Sadly, for the case where std and user code are statically linked, we do what we can in some cases, but it still leaves many platforms where we actually hit UB when libstd is unloaded.
This means that on many platforms (probably most), Rust
For example, this is true on any platform where we're using pthread keys for TLS (it's probably the same story on any non-static TLS platform), as we pass a function pointer to
That said, in some cases we can do a better thing here, but don't, and you're mistaken that macOS doesn't have a solution for this. On macOS, the
FWIW, I believe Windows has a relevant story here too. I think it has similar issues for dynamic TLS, and different ones for static TLS? But I may be mis-remembering; @ChrisDenton would know more, as a while ago I reached out to ask about the situation on Windows, and we discussed this in some detail.
Overall
So, this basically sucks, as many platforms don't offer a way to solve this. In practice most of what we get is the ability to stop the unloading from happening (totally fine -- nobody should call dlclose anyway), but only sometimes.
We can improve things slightly for both cases described, and should, even if this still is a "please never do this" thing, but it still leaves many targets in a bad way. Maybe some of them have a
Alternatively, the main reason we want to replace what we have now with
Anyway, while I did a bunch of research on this at one point,
P.S. One last bug in the
To be clear, I think this is good behavior, and why we should fix code here to pass in the right
Anyway,
The Windows side of things is something I have been meaning to go into more when I have some time to really grapple with it. However, I feel like Windows is different enough that I would need to provide quite a bit of background information to avoid misunderstandings. For now I've quickly dashed off an overview, though it doesn't really go into all the details. It may also contain errors, but I hope not.
Another issue that I thought I mentioned here but don't seem to have is that
On Linux and some other platforms, libstd uses __cxa_thread_atexit_impl to register destructors for thread locals.
rust/library/std/src/sys/unix/thread_local_dtor.rs, line 39 in fdf6505
The last argument to this is &__dso_handle, but this is only correct for thread locals which are inside libstd, or are in code that is statically linked with libstd.
__dso_handle is a magic symbol which has a value that is unique to whatever DSO references it. That is, if libfoo.so and libbar.so both look at __dso_handle (or &__dso_handle, which is really the value you use), it will have a different value in each.
Yes, I know this isn't how things usually work; this is the entire point of __dso_handle (technically, it behaves as if a symbol with "hidden" visibility named __dso_handle were declared inside each DSO automatically, see https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor-runtime-api, although this is clearly a hack).
It behaves very slightly differently depending on a number of moving pieces (libdl, libc, libcxxabi, the linker, the runtime loader, ...; collectively I'm going to call these "the runtime"), and shows up in a couple of different APIs, but here it's being used to remember that that DSO has a pending thread-local dtor, which prevents the DSO from being unloaded until after the said dtors are all run (when all the threads in question are closed).
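For readers who have not seen this API, the following is a minimal sketch of what such a registration looks like from Rust. It is not the code in libstd. It assumes a glibc-based Linux target (elsewhere the function simply returns None), the parameter names are mine, and `mark_ran`, `register_for_current_thread` and `dtor_ran_on_thread_exit` are hypothetical helpers for this example. The `unsafe extern` syntax needs a reasonably recent compiler.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;
use std::ptr;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the destructor below so the caller can observe that it ran.
static DTOR_RAN: AtomicBool = AtomicBool::new(false);

#[cfg(all(target_os = "linux", target_env = "gnu"))]
mod glibc {
    use super::*;

    unsafe extern "C" {
        // Only the address matters. The name resolves to a symbol private
        // to whichever DSO this code is linked into.
        static __dso_handle: u8;

        // Registers `dtor(obj)` to run at thread exit, attributed to the DSO
        // identified by `dso_symbol`.
        fn __cxa_thread_atexit_impl(
            dtor: unsafe extern "C" fn(*mut c_void),
            obj: *mut c_void,
            dso_symbol: *mut c_void,
        ) -> c_int;
    }

    unsafe extern "C" fn mark_ran(_obj: *mut c_void) {
        DTOR_RAN.store(true, Ordering::SeqCst);
    }

    // Registers `mark_ran` for the calling thread using *this* DSO's handle.
    pub fn register_for_current_thread() -> c_int {
        unsafe {
            let dso = ptr::addr_of!(__dso_handle) as *mut c_void;
            __cxa_thread_atexit_impl(mark_ran, ptr::null_mut(), dso)
        }
    }
}

/// Spawns a thread that registers a dtor, joins it, and reports whether the
/// dtor ran. Returns None where the sketch does not apply or registration failed.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
pub fn dtor_ran_on_thread_exit() -> Option<bool> {
    let rc = std::thread::spawn(glibc::register_for_current_thread).join().ok()?;
    if rc != 0 {
        return None;
    }
    Some(DTOR_RAN.load(Ordering::SeqCst))
}

#[cfg(not(all(target_os = "linux", target_env = "gnu")))]
pub fn dtor_ran_on_thread_exit() -> Option<bool> {
    None
}
```

Because the `extern` declaration lives in the crate that makes the call, `__dso_handle` here refers to that crate's DSO, which is the property the issue is about.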
So, to the point: libstd always registers this using a __dso_handle which is linked from inside itself. This defeats the point of the symbol, as now "the runtime" believes that the DSO containing libstd is the one responsible for the dtor. This can cause problems in scenarios where libstd is dynamically linked, and dlopen/dlclose is used to dynamically load Rust code. (See "Memory unsafe scenario" for why.)
I believe the ideal fix here is to have thread_local! expand to contain the extern for __dso_handle on these systems. Then, &__dso_handle would be passed in as an argument to the call to unix::thread_local_dtor::register_dtor. I don't know how this interacts with weak symbols, but I'm sure this can be made to work.
(This is... inconvenient, but it's not that surprising: if library code could be the source for this value, there'd be no need for it to get passed in.)
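The shape of that fix can be modelled without any FFI. The sketch below is a toy, not a patch: modules stand in for DSOs, a per-module `DSO_HANDLE` static stands in for `__dso_handle`, and `caller_dso_handle!`, `register_dtor`, `fake_libstd` and `fake_mycrate` are all hypothetical names. It relies on the fact that macro_rules resolves item names at the expansion site, which is the property the real thread_local! expansion would use.

```rust
use std::cell::RefCell;

// Expands to the address of whichever `DSO_HANDLE` is in scope at the
// expansion site, mimicking how `&__dso_handle` differs per DSO.
macro_rules! caller_dso_handle {
    () => {
        &DSO_HANDLE as *const u8 as usize
    };
}

thread_local! {
    // Hypothetical record of (label, handle the dtor was attributed to).
    static REGISTERED: RefCell<Vec<(&'static str, usize)>> = RefCell::new(Vec::new());
}

// Hypothetical registration function that receives the handle as an argument.
pub fn register_dtor(label: &'static str, dso_handle: usize) {
    REGISTERED.with(|r| r.borrow_mut().push((label, dso_handle)));
}

pub fn attributed_handles() -> Vec<(&'static str, usize)> {
    REGISTERED.with(|r| r.borrow().clone())
}

pub mod fake_libstd {
    pub static DSO_HANDLE: u8 = 1;

    pub fn handle() -> usize {
        caller_dso_handle!()
    }

    // Behaviour described in the issue: the handle is always libstd's own.
    pub fn register_dtor_current(label: &'static str) {
        super::register_dtor(label, caller_dso_handle!());
    }
}

pub mod fake_mycrate {
    pub static DSO_HANDLE: u8 = 2;

    pub fn handle() -> usize {
        caller_dso_handle!()
    }

    // Goes through libstd, so libstd's handle is recorded.
    pub fn touch_thread_local_current() {
        super::fake_libstd::register_dtor_current("current");
    }

    // Proposed: the expansion in the user crate supplies the handle.
    pub fn touch_thread_local_proposed() {
        super::register_dtor("proposed", caller_dso_handle!());
    }
}
```

In this model the "current" path records the handle of `fake_libstd`, while the "proposed" path records the handle of `fake_mycrate`, even though both originate from the same crate.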
Memory unsafe scenario
Concretely, I think this can lead to a concerning memory unsafety problem in the following scenario:
1. libstd is dynamically linked into a program.
2. Some Rust library (which also dynamically links libstd) is loaded via dlopen. Let's call this libmycrate.so for concreteness.
3. libmycrate.so contains a thread_local! (mycrate::THE_THREAD_LOCAL) that needs its dtor to be registered.
4. A thread T0 is spawned, and T0 calls some function in libmycrate.so.
5. This function references mycrate::THE_THREAD_LOCAL, which causes the destructor for mycrate::THE_THREAD_LOCAL to be registered via __cxa_thread_atexit_impl (inside std::sys::unix::register_dtor).
6. The library libmycrate.so is unloaded via dlclose. This is prior to T0 ending, and it is not the last Rust crate to be unloaded.
7. Later T0 is joined, which runs the thread-specific destructors. This includes mycrate::THE_THREAD_LOCAL's dtor, despite the fact that it has been unloaded.
Note: between 6 and 7, some time may have to pass; dlclose is often performed in the background. Also, I'm assuming in this situation that libmycrate.so hasn't done anything else to prevent being unloaded. Finally, often the memory from the library is pushed onto a free list for later use, rather than actually being unmapped.
Anyway, this is concerning because:
On the other hand, this isn't reachable from purely safe stdlib APIs: someone had to unsafely call dlclose (perhaps by Dropping a libloading::Library), so it's on them.
While I don't find this style of argument compelling, it unfortunately has to be the answer to some extent. We can't fix this everywhere, as only some platforms allow defending against this by accepting an equivalent to &__dso_handle.
That said, this is clearly an example of us passing the wrong value, and I suspect there aren't really great arguments against fixing it. I think this is actually quite a bit of a footgun on platforms where it can't be addressed, but probably the solution is to somehow let people know that dlclose (and equivalent) are extremely spooky.
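The steps under "Memory unsafe scenario" can be replayed in a small toy model. This is only a sketch of the bookkeeping described above, assuming "the runtime" refuses to unmap a DSO while thread-local dtors are attributed to it; `ToyRuntime`, `Dso`, `DtorOutcome` and `run_scenario` are hypothetical names, and nothing here loads or unloads real libraries.

```rust
use std::collections::{HashMap, HashSet};

/// Identifies a DSO in the model (stands in for the value of `&__dso_handle`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Dso {
    LibStd,
    LibMyCrate,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DtorOutcome {
    /// The dtor's code was still mapped when the thread exited.
    Ran,
    /// The dtor's code had already been unmapped (the unsafe case).
    CodeUnmapped,
}

#[derive(Default)]
pub struct ToyRuntime {
    pending: HashMap<Dso, usize>, // pending dtors attributed to each DSO
    dtors: Vec<(Dso, Dso)>,       // (DSO containing the dtor, DSO it was attributed to)
    unmapped: HashSet<Dso>,
}

impl ToyRuntime {
    pub fn register_dtor(&mut self, code_in: Dso, attributed_to: Dso) {
        *self.pending.entry(attributed_to).or_insert(0) += 1;
        self.dtors.push((code_in, attributed_to));
    }

    /// Returns true if the DSO was actually unmapped by this call.
    pub fn dlclose(&mut self, dso: Dso) -> bool {
        if self.pending.get(&dso).copied().unwrap_or(0) > 0 {
            return false; // pinned by a pending thread-local dtor
        }
        self.unmapped.insert(dso)
    }

    /// Runs the thread's dtors and reports what each one would have hit.
    pub fn thread_exit(&mut self) -> Vec<DtorOutcome> {
        let dtors = std::mem::take(&mut self.dtors);
        let mut outcomes = Vec::new();
        for (code_in, attributed_to) in dtors {
            if let Some(n) = self.pending.get_mut(&attributed_to) {
                *n -= 1;
            }
            if self.unmapped.contains(&code_in) {
                outcomes.push(DtorOutcome::CodeUnmapped);
            } else {
                outcomes.push(DtorOutcome::Ran);
            }
        }
        outcomes
    }
}

/// Replays steps 5 to 7 with the dtor attributed to the given DSO.
pub fn run_scenario(attributed_to: Dso) -> (bool, Vec<DtorOutcome>) {
    let mut rt = ToyRuntime::default();
    rt.register_dtor(Dso::LibMyCrate, attributed_to); // step 5
    let unmapped = rt.dlclose(Dso::LibMyCrate); // step 6
    (unmapped, rt.thread_exit()) // step 7
}
```

With the dtor attributed to `Dso::LibStd`, the model unmaps libmycrate.so at step 6 and the dtor at step 7 points at unmapped code. With it attributed to `Dso::LibMyCrate`, the dlclose is held off and the dtor runs normally.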