-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the park/unpark mechanism #528
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two implementations in tokio-threadpool
and tokio-executor
are literally the same - copy/paste and return Ok(())
from park
in tokio-executor
.
tokio-executor/src/park.rs
Outdated
NOTIFY => return Ok(()), | ||
IDLE => {}, | ||
_ => unreachable!(), | ||
if self.state.compare_exchange(NOTIFY, IDLE, SeqCst, SeqCst).is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio-executor/src/park.rs
Outdated
} | ||
|
||
// If the duration is zero, then there is no need to actually block | ||
if let Some(ref dur) = timeout { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is new in tokio-executor
. @carllerche, any idea why the previous implementation didn't have it? Maybe just an oversight?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oversight... it would be nice to unify the two impls instead of duplicating code.
tokio-executor/src/park.rs
Outdated
} | ||
|
||
// The state is currently idle, so obtain the lock and then try to | ||
// transition to a sleeping state. | ||
let mut m = self.mutex.lock().unwrap(); | ||
|
||
// Transition to sleeping | ||
match self.state.compare_and_swap(IDLE, SLEEP, Ordering::SeqCst) { | ||
NOTIFY => { | ||
match self.state.compare_exchange(IDLE, SLEEP, SeqCst, SeqCst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio-executor/src/park.rs
Outdated
} | ||
} | ||
None => { | ||
loop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio-executor/src/park.rs
Outdated
None => self.condvar.wait(m).unwrap(), | ||
match timeout { | ||
Some(timeout) => { | ||
m = self.condvar.wait_timeout(m, timeout).unwrap().0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to unifying the implementations. I'm a little hesitant to test this out though because we don't have a consistent reproducer, nor a good theory about why this change would help with the bug. I'm concerned about this turning into a "changing things until the problem goes away" exercise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a thought inline.
tokio-executor/src/park.rs
Outdated
// Notified before we could sleep, consume the notification and | ||
// exit | ||
self.state.store(IDLE, Ordering::SeqCst); | ||
self.state.store(IDLE, SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems problematic for the same reason stated in the issue.
Given that this is intended to acquire all memory from unpark
threads, it needs to establish a "happens-before" relationship with
tokio/tokio-executor/src/park.rs
Line 303 in e433282
match self.state.compare_exchange(IDLE, NOTIFY, SeqCst, SeqCst) { |
I do not think that this is done (though, again, MFENCE on x86 will work).
@jsgf Well, does reverting back to tokio-threadpool 0.1.4 fix the problem? |
@carllerche I've been holding off until we're sure we can get a clear signal. It repros relatively rarely, and it might have been happening for a while at some rate. I'm considering the possibility that the bug is actually elsewhere, and the changes to park/unpark are just changing the timing or something else so it presents differently. I'm currently re-learning TLA+ so I can model park/unpark and see if there's something subtle we're all missing - with luck that will turn something up and so we can move forward with some confidence. If it doesn't then either the problem is somewhere else or the model is bad (but model+inspection gives more confidence than inspection alone). |
@jsgf I agree that I am skeptical the bug is w/ park / unpark. |
@stjepang This looks good to me, but I am a bit hesitant to merge w/o a repro or at least an understanding as to what the bug is. Do you have thoughts on ways we might be able to repro? |
So far I haven't been able to find a problem with either the 0.1.4 or 0.1.5 park/unpark implementations. |
tokio-executor/src/park.rs
Outdated
// Notified before we could sleep, consume the notification and | ||
// exit | ||
self.state.store(IDLE, Ordering::SeqCst); | ||
self.state.swap(IDLE, SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using swap as a store seems like overkill since the value is discarded. On x86 this will generate an unnecessary locked instruction. (ie, the std version should use store.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that the version in std is incorrect. The swap
is required to establish ordering with this line in unpark. Without doing a "read" operation here, visibility cannot be acquired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. I didn't check std, but I assume the swap
here came from there.
Also, my understanding is that SeqCst
should establish total ordering with other SeqCst
loads and stores, so this can be a plain store
and still be ordered with respect to compare_exchange
. (The std::atomic::Ordering
docs are irritatingly imprecise here though, so I don't know if I'm actually right - it seems to imply that SeqCst
isn't meaningful for store
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A store operation with Release
ordering (or stronger) writes to memory and allows other future load
operations to synchronize with it.
However, here we also want to perform a load operation at the same time (hence swap
) in order to synchronize with whoever last wrote to this location (i.e. acquire their writes to memory).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't SeqCst
already guarantee that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsgf, from an x86 codegen perspective, I don’t think a SeqCst
store or swap is going to matter - it’s going to be some instruction that entails LOCK like behavior for both. In fact, I see xchg reg, mem
on the playground for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without doing a "read" operation here, visibility cannot be acquired.
Isn't the read at line 257 sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the read at line 257 sufficient?
I was about to ask the same. The read of NOTIFY
with SeqCst
is sufficient to establish synchronization with the thread that notified.
Of course release/acquire would be sufficient as well but I am not going to try playing that game again ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because a third thread can come along and unpark
while the park
fn is between 257 and 262. The bug is specifically that the park
thread must acquire the memory from a third thread that calls unpark before the park thread hits line 262.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Continuing in rust-lang/rust#53366)
Did this ever get anywhere once it ended up becoming a rust-lang discussion? Is the erroneous behavior that led us to believe this was the bug still present? |
In #525, a deadlock was reported where #459 seems like the only meaningful change. Since the issue seems to be so difficult to track down, I think we should merge this PR and go with the same park/unpark implementation that has already been battle-tested in I really wonder if this will fix deadlocks. @carllerche, what do you think? |
I think unpark() needs to be changed as well. See rust-lang/rust#53366 and rust-lang/rust#54174. |
@stjepang What do you think about closing this since it seems unlikely that the bug is in our code. Also the std version has evolved since this PR. |
Or at least updating to track std's latest changes. |
@carllerche I'm going to update to reflect the std's latest implementation. A question regarding
Originally I thought of putting an instance of |
@stjepang right, well the one that is stored in the thread local is to avoid having to recreate all the state each time when calling |
Right. However, |
@stjepang it probably should be in a separate crate to be honest... maybe |
235: Add Parker for thread parking r=stjepang a=stjepang This is just an extracted copy of the current implementation of `thread::park()` and `thread::unpark()`. `Parker` is a low-level thread synchronization primitive useful for building others (like locks etc.). It would be useful in tokio-rs/tokio#528 and might remove some cruft and unnecessary TLS access from `context.rs` in `crossbeam-channel`. I've also added a fast-path check for `timeout == Duration::from_secs(0)`, which Tokio relies on. An interesting peculiarity of `Parker` is that it's not split into `Parker` and `Unparker`, which means any thread can call `park()` at any time. However, if multiple threads call `park()` at the same time, expect deadlocks or panics - that is the user's problem. Splitting the primitive into an owned `Parker` and shared `Unparker` would require us to wrap the inner structure into an `Arc`, which comes at a cost of allocation and indirection. Since this is a very low-level primitive, I chose not to do that. The reason why this is in `crossbeam-utils` is because it's a really simple primitive with no dependencies. Also, `tokio-executor` and `tokio-threadpool` really don't want to pull in the whole `crossbeam` in order to use this. cc @carllerche Co-authored-by: Stjepan Glavina <stjepang@gmail.com>
I've updated the PR to use |
@carllerche Just a ping for review :) |
The
park
/unpark
mechanism now has identical implementation tostd::thread::{park,park_timeout,unpark}
.Our previous implementation was cutting corners in a few places and while it seemed obviously correct, we decided to avoid playing with fire and just copy what
std::thread
does line by line.