-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std::sync::Once can block forever in forked process #43448
Comments
How does |
More generally though, trying to continue on to do basically anything after a multithreaded fork is a dicy proposition. Every other thread disappearing in the middle of whatever it was doing does not produce not a particularly safe environment. |
@ezrosent found a StackOverflow post whose answers suggest that the solution is just to give up. This commit message implies that |
I think it would be a real shame to decide that Rust is going to be similarly hamstrung. "No async-signal-unsafe code before exec" is a pretty draconian requirement for, e.g., programs that rely on fork w/o exec to inherit certain state to continue operating on. I will note, however, that if we go the route of trying to support this behavior, we'll have to be really careful that we document the behavior well. The behavior we want is probably not the "obvious" behavior - e.g., having a Mutex magically unlock itself is not obvious, even if it's probably what you want. |
It's unclear to me what the path forward is overall with this. The SO post linked above indicates that there may not be a standards-compliant way to "fix" mutexes that were locked in the parent during a However, this issue is more circumscribed: do we want |
Related: #39575 |
Having a Mutex automatically unlock it self is not what you want - that would make it practically impossible to write correct unsafe abstractions. If a thread has a Mutex locked as it's half way through adjusting a bunch of pointers and then suddenly disappears, there is no real way to recover from that. You can't write code that properly checkpoints at every instruction. Notably, malloc is an example of a thing like that. Deadlock is preferable to hoping you don't run off the end of a busted linked list post fork. The same goes for "fixing" the state of an in-progress |
Ah, that's a very good point, and a general one at that (applies to pretty much all non-atomic synchronization). Alternative proposal then: Add an extra variant of being poisoned. Using essentially the same mechanism, we can detect whether the holder of a mutex/person executing a once/etc is a thread in our process or an ancestor process, and if it's the latter, return that the synchronization object has been poisoned. It could be the same poison value (probably the most backwards-compatible, especially since most of these data structures are stable) or a distinct poison value so callers can distinguish between "poisoned by a panic" and "poisoned by forking." |
@Stebalien good call; thanks! |
What would be done in response to this kind of poisoning? |
Well, I figure the two kinds of poisoning are pretty similar, actually, in the state that they leave things in. In both cases, all you know is that some part of an operation (in the case of a The one distinction is that it's technically possible for an application to know precisely where a function/critical section might panic, while a fork in thread A could case the memory being operated on by thread B to be snapshotted at virtually any point in the execution of the function/critical section in B. Thus, in theory, there could be cases in which it'd be possible to recover from a panic-induced poison but not from a fork-induced poison. However, I suspect those cases are exceedingly rare, and in any case, trying to reason about exactly where your code can panic (and to make assumptions about the state of the world on that basis) is very very sketchy, and probably not advised. |
Prototype implementation of the poisoning approach in the |
I'm still a bit confused as to when that extra information is actually actionable. Could you give a more concrete example of a thing that would need that information and what it would do in response? |
To clarify, my primary suggestion is to have forking poison a synchronization primitive in the same way that panicking does. I'm not wedded to the idea that the user needs to be able to distinguish between the two (whether poisoning was caused by panicking or forking). My vision is that code would deal with both types of poisoning in the same way, and thus wouldn't need to know the root cause. Does that answer your question, or were you asking about something else? |
Triage: it's unclear to me what would cause this ticket to be resolved. |
I think poisoning on panic is much more reasonable, as there you know that destructors and other things have run; in particular, unsafe code must be panic-safe to be sound. On the other hand, I suspect that the majority of unsafe code is not safe in the presence of fork - generally that's a pretty high bar to ask for, and we don't ask for it. With that in mind, I don't think that the standard library is well suited to trying to make guarantees around forking and what happens afterwards. I think if we wanted to introduce poisoning on forking to Once (and presumably other primitives) I would want to see an RFC laying out the rationale, and a summary of platform support (e.g. can we register at fork handlers on Windows? What about even more "esoteric" platforms?) -- panicking/unwinding has the advantage of being relatively well understood as to platform support... So, I'm going to close this, but if someone wants this, I would pursue the RFC process (starting with an internals thread to gather feedback). |
There's a bug in
std::sync::Once
that makes it so that, under certain conditions, a call tocall_once
in a process which was forked from another Rust process can block forever.How
Once
worksSimplifying a bit (ignoring poisoning), the algorithm employed by
Once
works as follows: TheOnce
can be in one of three states:INCOMPLETE
,COMPLETE
, andRUNNING
.Once
starts off in theINCOMPLETE
state.call_once
begins, theOnce
might be in any of the three states:Once
is in theINCOMPLETE
state, then it is transitioned to theRUNNING
state, and the function begins executing.Once
is in theRUNNING
state, then some other call tocall_once
is executing the function, so this call puts itself on a list of waiters and goes to sleep. It will be woken back up once the function is done executing in whatever thread is executing it.Once
is in theCOMPLETE
state, then the function has already been executed, socall_once
returns immediately without doing anything.Finally, when the function's execution completes, the thread doing the execution transitions the
Once
into theCOMPLETE
state, and wakes up any waiters that accumulated while it was executing the function.The issue
This algorithm is broken when forking. In particular, if a
Once
is in theRUNNING
state at the point that the process forks, when the child's memory space (which, by default, is a copy-on-write copy of the parent's) is created, theOnce
will still be in theRUNNING
state. However, in the child process, calls tocall_once
will fail for two reasons:Once
's state fromRUNNING
toCOMPLETE
will not be reflected in the child's memory space. Thus, a future call tocall_once
will spuriously find theOnce
still in theRUNNING
state even though it isn't really in that state anymore.These two problems can be seen in action in two proofs of concept that I wrote: This one demonstrates the first issue, while this one demonstrates the second.
A proposed fix
Joint credit for this proposal goes to @ezrosent.
The idea behind this fix is to record the process' PID when transitioning a
Once
fromINCOMPLETE
toRUNNING
, and having future accesses that find theOnce
in theRUNNING
state verify that it wasn't transitioned by a parent process. Unfortunately, this doesn't quite work because PIDs can be re-used, so if process A spawns process B, then process A quits, then process B spawns process C, it's possible for A and C to have the same PID.Instead, we introduce the idea of an "MPID" - a monotonically-increasing PID-like counter that is maintained by the process (e.g. . We increment it every time a process forks, and use it in the
Once
objects to record which process transitioned an object fromINCOMPLETE
toRUNNING
.More concretely, here are the components of the proposed solution:
usize
oru64
) that is initialized to 0 and is incremented immediately afterfork
. Note that this does not guarantee that no two processes anywhere in the tree of processes forked from a particular process have the same MPID. In fact, all processes forked by a given process will all have the same MPID. However, it does guarantee that a process will not share an MPID with any of its ancestors, and that is all we need.Once
object is modified to have anothermpid
field that is initialized to 0.mpid
field that is initialized to the MPID of the current process when the object is created.call_once
looks roughly like this:COMPLETE
, return.INCOMPLETE
, do then loadmpid
and:mpid
is equal to the current MPID, then try to CAS the state fromINCOMPLETE
toRUNNING
. If it fails, retry the entire loop. If it succeeds, you're responsible for running the function, so do the original algorithm.mpid
is not equal to the current MPID, then try to CAS it from its current value to the current MPID. If this succeeds, go to the previous step (wherempid
is equal to the current MPID), and if it fails, retry the entire loop.RUNNING
, then loadmpid
and:mpid
is equal to the current MPID, then the thread that transitioned theOnce
into theRUNNING
state is in the current process, so do the normal algorithm: wait for it to be done (recording the current MPID in the waiter object).mpid
is not equal to the current MPID, then the thread that transitioned theOnce
into theRUNNING
state is in an ancestor process. Thus, attempt to CASmpid
to the current MPID. If it fails, repeat the entire loop. If it succeeds, then it is your responsibility to run the function, so continue as if you had transitioned theOnce
into theRUNNING
state, with one exception: when waking up waiters, you need to check that they are not waiters in an ancestor process; do this by checking the waiter object'smpid
field, and only waking waiters with anmpid
field equal to the current MPID.One thing to note: It is safe to try to CAS
mpid
and then separately to transition intoRUNNING
even though the value ofmpid
needs to reflect the MPID of the thread that transitioned intoRUNNING
- a thread that successfully transitions aOnce
into theRUNNING
state will have previously verified thatmpid
is correct, and thus it will not change forever in the future (at least, not in this process) since the MPID of a process never changes.Open question
One open question is how to ensure that code is run just after
fork
(to increment the global MPID variable) and, critically, before any other code runs (especially code that usesOnce
).pthread
provides thepthread_atfork
function to register callbacks that run before and afterfork
calls, but obviously this doesn't address Windows, and I also don't know if there's a good way to ensure that the necessary call topthread_atfork
is made at process initialization time.Prior art
There's some prior art here. In particular, jemalloc has acknowledged a similar issue, and has a partial fix for it.
The text was updated successfully, but these errors were encountered: