Tasks getting starved when trying to create HashMaps #11102
I can reproduce this but don't understand yet why there would be a difference in behavior based on the data structures constructed. I can imagine scenarios where the spinning would starve other tasks, but can't picture how the rng could be affected by a task spinning.
Running with just one thread produces the following:
This indicates that the hashmap tasks get descheduled at some point in favor of the spinning task, which suggests to me that thieves are failing to find the hashmap tasks. I remember thinking recently that the stealing code was a little racy and that that was OK since it doesn't impact actual correctness (the task does run eventually), but maybe we can tighten it up so that thieves always find work when it's available. I'd like to know why the task rng causes tasks to be descheduled, though...
The task rng deschedules because it seeds via I/O, of course.
Right now, if a scheduler fails to steal, it pushes itself onto the sleeper list and waits for another scheduler to notify it that work is available. I'm guessing the window between those two events is where the hashmap tasks are descheduling. Closing that window is a little tricky, but I can imagine we might: steal, push to the sleeper list, steal again, then sleep.
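As a rough illustration of that ordering, here's a minimal sketch in today's Rust, using a `Mutex`/`Condvar` pair as stand-ins for the work queue and the sleeper list (the names and types here are illustrative, not the runtime's actual scheduler code):

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Shared {
    // Stand-in for the work queues a thief would scan.
    queue: Mutex<VecDeque<&'static str>>,
    // Stand-in for the sleeper list: idle schedulers block here.
    wakeup: Condvar,
}

fn steal_or_sleep(shared: &Shared) -> &'static str {
    // 1. Steal (fast path).
    if let Some(task) = shared.queue.lock().unwrap().pop_front() {
        return task;
    }
    // 2. "Push to the sleeper list": take the lock we will sleep on, so a
    //    producer's notification can no longer be missed.
    let mut queue = shared.queue.lock().unwrap();
    loop {
        // 3. Steal again: work may have arrived between steps 1 and 2.
        if let Some(task) = queue.pop_front() {
            return task;
        }
        // 4. Sleep until a producer pushes work and notifies us.
        queue = shared.wakeup.wait(queue).unwrap();
    }
}

fn main() {
    let shared = Arc::new(Shared {
        queue: Mutex::new(VecDeque::new()),
        wakeup: Condvar::new(),
    });

    let thief = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || println!("stole: {}", steal_or_sleep(&shared)))
    };

    // A scheduler with work makes a task available and notifies sleepers.
    shared.queue.lock().unwrap().push_back("hashmap task");
    shared.wakeup.notify_one();
    thief.join().unwrap();
}
```

The point of the second steal is that once the thief is registered as a sleeper, a notification can no longer be lost, so work pushed during the original window is still found before going to sleep.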
Oh, my previous guess may be wrong. If these tasks are descheduling to do I/O, there may be nothing we can do for them. They can't run again until the local scheduler becomes available and the I/O event loop resumes.
I think that @brson's analysis of the problem is correct. What happens is that one I/O loop has lots of "ready events", but the current task on that I/O loop is not yielding control back to the scheduler. I'm unsure whether there's much we can do about this: we'd have to interrupt the task to allow the I/O loop to wake up and do its business, but currently preemption is not possible (and it would also break lots of code today), so this is definitely a tricky problem.
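To make that concrete, here's a toy model (hypothetical names, not the real runtime) of the cooperative structure being described: the event loop is only polled between task slices, so a slice that never returns keeps every pending callback from firing, including the one that would deliver the rng seed:

```rust
fn main() {
    // Stand-in for pending libuv-style callbacks, e.g. the read that seeds
    // the task rng so the HashMap can be built.
    let mut pending_callbacks = vec!["deliver rng seed to HashMap task"];

    // A cooperative task slice: the scheduler only proceeds when it returns.
    // The problematic case in this issue is a slice that spins forever and
    // therefore never hands control back.
    let run_task_slice = || println!("busy task ran one cooperative slice");

    // Schematic scheduler loop: there is no preemption, so reaching the
    // event-loop poll depends entirely on the running task yielding.
    for _ in 0..2 {
        run_task_slice();
        if let Some(cb) = pending_callbacks.pop() {
            println!("event loop fired: {}", cb);
        }
    }
}
```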
Is it not possible to transfer tasks sleeping on I/O to a different scheduler? This failure pattern is not going to be super easy to reason about.
Sadly, no: the problem here is that the libuv event loop needs to run in order to execute the callbacks of the pending I/O handles. The callbacks are what will reawaken the tasks and allow them to be stolen by other schedulers, but there's no way to have a separate event loop run the callbacks of another event loop.
I could have sworn there was talk about migrating descriptors across event loops in the distant past. I figured we just hadn't gotten around to it and such a thing would eventually address this problem.
#17325 means this is no longer relevant to the standard libraries.
It's pretty easy to starve a task that wants to create a HashMap by having a busy task somewhere else. This appears to be caused by HashMap's need for some random numbers.
Slapping #[no_uv] on the crate makes it better, but it's not a great solution at this point, given the other resulting limitations.
Example code w/ some explanation at https://gist.github.com/jfager/8072694.
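For readers who don't follow the link, the reproduction has roughly this shape (an illustrative approximation in today's syntax, not the gist's code; modern Rust schedules OS threads preemptively, so this no longer exhibits the starvation):

```rust
use std::collections::HashMap;
use std::thread;

fn main() {
    // A busy task that never yields control.
    let _busy = thread::spawn(|| loop {
        std::hint::spin_loop();
    });

    // A task that just wants to build a HashMap; at the time, this
    // descheduled to read a random seed through the scheduler's I/O loop.
    let builder = thread::spawn(|| {
        let mut map = HashMap::new();
        map.insert("key", 1);
        println!("built the map: {:?}", map);
    });

    builder.join().unwrap();
}
```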
I'm on OSX 10.8.5, using rustc built from b3cee62.
cc @alexcrichton