metrics: add worker_park_unpark_count #6696
This counts the number of times a worker was parked and unparked. Thus it is odd if the worker is parked and even if the worker is unparked.
tokio/tests/rt_unstable_metrics.rs
let rt = current_thread();
let metrics = rt.metrics();
rt.block_on(async {
    time::sleep(Duration::from_millis(1)).await;
Does it work if you use yield_now instead of sleeping here? We generally try to avoid adding sleeps in tests, as they make the test suite take much longer.
For the current-thread runtime it only works with sleep. For the multi-threaded runtime I changed it to yield_now.
What!? That is very surprising to me. If yield_now works on the multi-thread runtime, then it probably also works if you get rid of the entire block_on call. In the multi-thread runtime case, the yield_now doesn't interact with the worker threads at all because it's in a block_on and not spawned.
I'm guessing that just spawning the threaded() runtime is enough for both threads to get to a park/unpark count of two. To actually test that our task changes the count, I think we can do something along these lines:
First wait for the count to reach 2 on both workers using a loop like this one. Then, update the code to spawn tasks instead so that the yield_now actually runs on the worker threads:
rt.block_on(rt.spawn(async {}));
drop(rt);
assert!(4 <= metrics.worker_park_unpark_count(0) || 4 <= metrics.worker_park_unpark_count(1));
Here, the count already reached two before spawning, so spawning should result in a worker waking up to process the task, and the worker then goes back to sleep. Hence, one worker should now be at 4.
You probably need to use rt.spawn with the current-thread runtime too, but I don't think you need the loop in that case as there is no threading involved.
In the multithreaded case after launch the park/unpark count is 1 for both workers. This makes sense since they get parked after startup, because there is no work to do.
After spawning the task, the park/unpark count can be 1/3 or 3/3. I don't know why the scheduler sometimes unparks both threads.
After shutdown the count is 4/4 or 2/4. This makes sense because the threads are both unparked for shutdown.
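The counts reported above can be reproduced with a small stand-alone model. This is only a sketch of the arithmetic, not Tokio's scheduler: `WorkerModel` and `replay` are hypothetical names, and the `both_woken` flag simply stands in for the observed (unexplained) case where the scheduler unparks both workers for a single task.

```rust
/// Hypothetical stand-in for one worker's counter; not a Tokio type.
#[derive(Debug, Default, Clone, Copy)]
struct WorkerModel {
    park_unpark_count: u64,
}

impl WorkerModel {
    /// Odd count means parked, even count means unparked.
    fn is_parked(&self) -> bool {
        self.park_unpark_count % 2 == 1
    }

    /// Parking adds one to the counter (even -> odd).
    fn park(&mut self) {
        debug_assert!(!self.is_parked());
        self.park_unpark_count += 1;
    }

    /// Unparking adds one to the counter (odd -> even).
    fn unpark(&mut self) {
        debug_assert!(self.is_parked());
        self.park_unpark_count += 1;
    }
}

/// Replays startup, one spawned task, and shutdown for two workers.
/// Returns the pair of counts after each of the three phases.
fn replay(both_woken: bool) -> [(u64, u64); 3] {
    let mut a = WorkerModel::default();
    let mut b = WorkerModel::default();

    // Startup: no work to do, so both workers park.
    a.park();
    b.park();
    let after_start = (a.park_unpark_count, b.park_unpark_count);

    // Spawn: worker b wakes, runs the task, and parks again.
    b.unpark();
    b.park();
    if both_woken {
        // Assumption: sometimes the other worker is woken as well.
        a.unpark();
        a.park();
    }
    let after_task = (a.park_unpark_count, b.park_unpark_count);

    // Shutdown: both workers are unparked so they can exit.
    a.unpark();
    b.unpark();
    let after_shutdown = (a.park_unpark_count, b.park_unpark_count);

    [after_start, after_task, after_shutdown]
}

fn main() {
    println!("one worker woken:   {:?}", replay(false));
    println!("both workers woken: {:?}", replay(true));
}
```

In this model the final counts are always even, which matches the observation that both threads are unparked for shutdown.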
Thanks. It makes more sense now.
Thanks.
on_thread_park() and on_thread_unpark() callbacks (for stuck worker watchdog) #6353
This counts the number of times a worker was parked and unparked. Thus it is odd if the worker is parked and even if the worker is unparked.
Motivation
See #6353 and discussion in #6370.
Solution
A lightweight watchdog can watch the worker statistics and determine that a worker is stuck if its park/unpark count is even (thus it is active) and its poll count does not increase.
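A minimal sketch of the check such a watchdog might perform, following the rule stated above. `WorkerSnapshot` and `looks_stuck` are hypothetical names introduced for illustration. In a real watchdog the two fields would presumably be sampled periodically from the runtime metrics (for example `worker_park_unpark_count` and a per-worker poll count), and a production version would likely want a sampling interval or threshold that this sketch does not model.

```rust
/// Hypothetical sample of two per-worker metrics taken by a watchdog.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WorkerSnapshot {
    park_unpark_count: u64,
    poll_count: u64,
}

/// Applies the rule from the description: a worker looks stuck if its
/// park/unpark count is even (it is unparked, so active) and its poll
/// count has not increased since the previous sample.
fn looks_stuck(prev: WorkerSnapshot, curr: WorkerSnapshot) -> bool {
    let active = curr.park_unpark_count % 2 == 0;
    let no_progress = curr.poll_count <= prev.poll_count;
    active && no_progress
}

fn main() {
    let prev = WorkerSnapshot { park_unpark_count: 2, poll_count: 10 };

    // Still unparked and no new polls: flagged as possibly stuck.
    let stalled = WorkerSnapshot { park_unpark_count: 2, poll_count: 10 };
    println!("stalled:  {}", looks_stuck(prev, stalled));

    // Still unparked but polling tasks: healthy.
    let busy = WorkerSnapshot { park_unpark_count: 2, poll_count: 25 };
    println!("busy:     {}", looks_stuck(prev, busy));

    // Parked (odd count): idle rather than stuck.
    let idle = WorkerSnapshot { park_unpark_count: 3, poll_count: 10 };
    println!("idle:     {}", looks_stuck(prev, idle));
}
```

The parity test is what separates an idle worker from a stuck one: both show a flat poll count, but only the stuck worker is unparked.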