-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avoid deadlock when reporting ice #111352
Conversation
r? @TaKO8Ki (rustbot has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
let mut i = 0; | ||
|
||
#[cfg(not(parallel_compiler))] | ||
let query_map = qcx.try_collect_active_jobs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can try_collect_active_jobs
be removed from the QueryContext
trait, and be made an inherent impl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to keep it, since rustc_query_system
needs it to handle cycle errors:
https://github.com/rust-lang/rust/blob/master/compiler/rustc_query_system/src/query/plumbing.rs#L269
let query_map = qcx.try_collect_active_jobs(); | ||
|
||
#[cfg(parallel_compiler)] | ||
let query_map = rustc_middle::ty::tls::with_context(|context| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment explaining why we spawn a thread here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
@@ -788,3 +788,64 @@ macro_rules! define_queries { | |||
} | |||
} | |||
} | |||
|
|||
pub fn print_query_stack<'tcx>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an inherent method on QueryCtxt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done : )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: I changed to use rayon_core::join
now so no need to use with/enter_context
. Reverted the change to lift up print_query_stack
.
|
||
s.spawn(move || { | ||
rustc_middle::ty::tls::enter_context(context, || { | ||
let query_map = qcx.try_collect_active_jobs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling try_collect_active_jobs
on an outside thread can run into issues with WorkerLocal
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think reporting ice is an action outside the thread pool, so this is not a regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used rayon_core::join instead of spawn so that there is no problem with WorkerLocal.
But I still have a bit doubts—— if reporting ice is running in rayon's thread pool, why the deadlock handler is not performed
7e0aadd
to
ad67dbe
Compare
This comment has been minimized.
This comment has been minimized.
How does the actual deadlock occur? I think running the deadlock handler on a Rayon thread as a Rayon job may be an improvement, but I'm not sure that would help for the actual deadlock you ran into. |
Not sure yet. |
I think @Zoxc's request for a description of the actual deadlock is important here. Its really important to understand the root causes of problems like that, if you can isolate them, in order to confirm that the proposed fix is actually a real fix and not just masking it. retagging as waiting-on-author to account for the need for an answer to that question. @rustbot label: +S-waiting-on-author -S-waiting-on-review |
OK, I gonna figure out the real cause of the deadlock. |
It looks like the problem is with I experimented like this in
And it printed:
|
☔ The latest upstream changes (presumably #108714) made this pull request unmergeable. Please resolve the merge conflicts. |
close this as #112708 has solved the issue |
Fixes the deadlock issue in #110284
When using the parallel compiler, we have added deadlock headler via rayon's thread pool, but outside the thread pool (printing the query stack when reporting ice) it is still possible to stuck into deadlocks. So I added a timeout to
print_query_stack
to get away from deadlocks.The impl of
print_query_stack
is transferred fromrustc_query_system
torustc_query_impl
because we need to usewith_context
to enable the sub thread to accessTLV
cc @Zoxc