@@ -123,7 +123,7 @@ impl Worker {
 impl Drop for Worker {
     fn drop(&mut self) {
         trace!(target: "shutdown", "[IoWorker] Closing...");
-        let _ = self.wait_mutex.lock();
+        let _ = self.wait_mutex.lock().unwrap();
         self.deleting.store(true, AtomicOrdering::Release);
         self.wait.notify_all();
I wonder if the issue is that this call happens before the atomic store due to compiler or CPU instruction reordering. System condvars can have spurious wakeups, but parking_lot ones can't. I don't have a Windows machine right now, but could you test using parking_lot and adding a call to `atomic::fence(AtomicOrdering::SeqCst)` between the store to `deleting` and the condvar notification?
It has nothing to do with `deleting`, as it does not change when the deadlock happens.
If `deleting` hasn't changed, then the deadlock is in the mutex, isn't it?
It seems to be related to the mutex being unlocked on wait. It is hard to tell because the logic inside is quite complex. I'll see if I can create an example and file an issue.
Also, we should avoid closing #1726 until we are 100% sure that the issue is indeed fixed...
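For readers following the thread, here is a minimal sketch of the shutdown handshake under discussion, written against `std::sync` only. It is not the project's code: the `Worker` layout, the `spawn_and_drop` helper, and the decision to join the thread in `drop` are assumptions made for illustration. One Rust detail it makes explicit is that `let _ = mutex.lock()` drops the guard immediately, whereas a named binding such as `_guard` holds the lock until the end of the scope.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for the worker in the diff; field names mirror it loosely.
pub struct Worker {
    deleting: Arc<AtomicBool>,
    wait: Arc<Condvar>,
    wait_mutex: Arc<Mutex<()>>,
    thread: Option<JoinHandle<()>>,
}

impl Worker {
    pub fn new() -> Worker {
        let deleting = Arc::new(AtomicBool::new(false));
        let wait = Arc::new(Condvar::new());
        let wait_mutex = Arc::new(Mutex::new(()));
        let (d, w, m) = (deleting.clone(), wait.clone(), wait_mutex.clone());
        let thread = thread::spawn(move || {
            let mut guard = m.lock().unwrap();
            // Re-check the flag after every wakeup: std condvars may wake spuriously.
            while !d.load(Ordering::Acquire) {
                guard = w.wait(guard).unwrap();
            }
        });
        Worker { deleting, wait, wait_mutex, thread: Some(thread) }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        {
            // A named guard keeps the mutex locked for this whole block, so the
            // store cannot land between the worker's flag check and its wait().
            let _guard = self.wait_mutex.lock().unwrap();
            self.deleting.store(true, Ordering::Release);
        }
        self.wait.notify_all();
        if let Some(handle) = self.thread.take() {
            handle.join().unwrap();
        }
    }
}

/// Creates a worker, drops it, and reports whether the flag was set.
pub fn spawn_and_drop() -> bool {
    let worker = Worker::new();
    let flag = worker.deleting.clone();
    drop(worker); // blocks until the worker thread has exited
    flag.load(Ordering::Acquire)
}
```

Because the flag is written while the mutex is held, the worker either sees it on its next check or is already parked in `wait()` when the notification arrives. This says nothing about the Windows hang itself, which the thread attributes to parking_lot.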
while !deleting.load(AtomicOrdering::Acquire) {
    {
        let mut unverified = verification.unverified.lock();
        let mut more_to_verify = verification.more_to_verify.lock().unwrap();
There are also some logical changes here (and in `flush`). I'm just making sure it's intended, is it?
I've used a different mutex for the `Condvar` so that the verification structures continue to use the parking_lot mutex.
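The arrangement described above, one mutex guarding the data and a separate mutex used only with the condvar, could look roughly like the sketch below. Everything here is hypothetical: `Queue`, `signal`, `push`, `shutdown`, `consume` and `demo` are illustrative names, and a `std::sync::Mutex` stands in for the parking_lot mutex so that the example has no external dependencies.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Hypothetical queue; names are illustrative, not the project's.
pub struct Queue {
    // Stand-in for the parking_lot-protected verification data.
    unverified: Mutex<VecDeque<u32>>,
    // Dedicated std mutex whose only job is to pair with the condvar.
    more_to_verify: Mutex<()>,
    signal: Condvar,
    deleting: AtomicBool,
}

impl Queue {
    pub fn new() -> Queue {
        Queue {
            unverified: Mutex::new(VecDeque::new()),
            more_to_verify: Mutex::new(()),
            signal: Condvar::new(),
            deleting: AtomicBool::new(false),
        }
    }

    pub fn push(&self, item: u32) {
        // The data lock is released at the end of this statement.
        self.unverified.lock().unwrap().push_back(item);
        // Notify while holding the condvar's mutex to avoid a lost wakeup.
        let _guard = self.more_to_verify.lock().unwrap();
        self.signal.notify_all();
    }

    pub fn shutdown(&self) {
        let _guard = self.more_to_verify.lock().unwrap();
        self.deleting.store(true, Ordering::Release);
        self.signal.notify_all();
    }

    /// Drains items until shutdown; returns the sum of everything seen.
    pub fn consume(&self) -> u32 {
        let mut sum = 0;
        let mut guard = self.more_to_verify.lock().unwrap();
        loop {
            // Pop in its own statement so the data lock is dropped right away.
            let next = self.unverified.lock().unwrap().pop_front();
            match next {
                Some(item) => sum += item,
                None if self.deleting.load(Ordering::Acquire) => return sum,
                None => guard = self.signal.wait(guard).unwrap(),
            }
        }
    }
}

/// Pushes 1..=4 from the calling thread and returns what the consumer summed.
pub fn demo() -> u32 {
    let queue = Arc::new(Queue::new());
    let consumer = {
        let q = queue.clone();
        thread::spawn(move || q.consume())
    };
    for i in 1..=4 {
        queue.push(i);
    }
    queue.shutdown();
    consumer.join().unwrap()
}
```

The consumer checks the queue and the flag while holding `more_to_verify`, and producers take the same mutex before notifying, so a notification cannot slip in between the check and the call to `wait`. The cost is that the two locks must always be taken in a consistent order.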
This reverts commit c65ee93.
Apparently there's a bug in parking_lot that leaves the thread stuck forever in `Condvar::wait` on Windows. Reverted to `std::sync::Condvar` for now.
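Since `std::sync::Condvar` permits spurious wakeups, code reverted to it generally needs to re-check its condition after each wakeup. A small sketch of one way to do that with `Condvar::wait_while` follows; the `wait_for_flag` function and the boolean flag are assumptions for illustration, not part of this change.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Waits until another thread sets a shared flag, tolerating spurious wakeups.
pub fn wait_for_flag() -> bool {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));

    let notifier = {
        let pair = Arc::clone(&pair);
        thread::spawn(move || {
            let (lock, cvar) = &*pair;
            // Set the flag under the mutex, then wake the waiter.
            *lock.lock().unwrap() = true;
            cvar.notify_one();
        })
    };

    let (lock, cvar) = &*pair;
    // wait_while re-evaluates the closure after every wakeup and only
    // returns once it yields false, i.e. once the flag is set.
    let ready = cvar
        .wait_while(lock.lock().unwrap(), |ready| !*ready)
        .unwrap();
    let result = *ready;
    drop(ready);

    notifier.join().unwrap();
    result
}
```

If the notifier runs first, the closure already sees `true` and `wait_while` returns without blocking, so the order in which the two threads start does not matter.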