
Tasks getting starved when trying to create HashMaps #11102


Closed
jfager opened this issue Dec 21, 2013 · 10 comments
Labels: A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows


jfager (Contributor) commented Dec 21, 2013

It's pretty easy to starve a task that wants to create a HashMap by having a busy task somewhere else. This appears to be caused by HashMap's need for some random numbers.

Slapping #[no_uv] on the crate makes it better, but it's not a great solution at this point due to the other resulting limitations.

Example code w/ some explanation at https://gist.github.com/jfager/8072694.
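For reference, here's a rough sketch of the kind of program the gist describes (the actual code is at the link above; this reconstruction uses 2013-era APIs like `do spawn` and `std::hashmap`, and the loop bound is made up):

```rust
use std::hashmap::HashMap;

fn main() {
    // A task that spins without ever yielding back to its scheduler.
    do spawn {
        let mut i = 0u64;
        while i < 5_000_000_000 { i += 1; }
        println!("Finished tightloop!");
    }

    // Tasks that each want a HashMap. HashMap::new() asks the task rng
    // for its hash keys, and seeding that rng performs I/O.
    for n in range(1, 4) {
        do spawn {
            let m: HashMap<int, int> = HashMap::new();
            assert!(m.is_empty());
            println!("HashMap{} created!", n);
        }
    }
}
```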

I'm on OS X 10.8.5, using rustc built from b3cee62.

cc @alexcrichton

brson (Contributor) commented Dec 22, 2013

I can reproduce this but don't understand yet why there would be a difference in behavior based on the data structures constructed.

I can imagine scenarios where the spinning would starve other tasks, but can't picture how the rng could be affected by a task spinning.

brson (Contributor) commented Dec 22, 2013

Running with just one thread produces the following:

```
brian@brian-X1:~/dev/rust3/build$ RUST_THREADS=1 ./starvingmap
Finished tightloop!
HashMap1 created!
HashMap2 created!
HashMap3 created!
```

This indicates that the hashmap tasks get descheduled at some point in favor of the spinning task, which suggests that thieves are failing to find the hashmap tasks. I remember thinking recently that the stealing code was a little racy, and that that was OK since it doesn't affect actual correctness (the task does run eventually), but maybe we can tighten it up so that thieves always find work when it's available.

I'd like to know why the task rng causes tasks to be descheduled, though...

brson (Contributor) commented Dec 22, 2013

The task rng deschedules because it seeds via I/O, of course.
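(To spell that out: 2013-era HashMap construction looked roughly like the following paraphrase, not the exact source. The first use of the task rng in a task seeds it from OS randomness, and that read goes through the scheduler's libuv event loop, which deschedules the task.)

```rust
// Paraphrase of 2013-era HashMap construction, not the exact source:
pub fn with_capacity(capacity: uint) -> HashMap<K, V> {
    let mut r = rand::task_rng(); // lazily seeded from the OS (I/O!)
    let k0 = r.gen();
    let k1 = r.gen();
    // Constructor name is a stand-in for whatever the real one was:
    HashMap::with_capacity_and_keys(k0, k1, capacity)
}
```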

brson (Contributor) commented Dec 22, 2013

Right now, if a scheduler fails to steal, it pushes itself onto the sleeper list and waits for another scheduler to notify it that there is work available. I'm guessing the window between those two events is where the hashmap tasks get stranded. Closing that window is a little tricky, but I can imagine we might: steal, push to the sleeper list, steal again, then sleep.
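Here's a self-contained sketch of that double-check idea, using plain mutex/condvar primitives and illustrative names (`Shared`, `try_steal`) rather than the runtime's actual deque and sleeper list:

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

type Task = Box<dyn FnOnce() + Send>;

struct Shared {
    deque: Mutex<VecDeque<Task>>, // stand-in for the work-stealing deque
    sleepers: Mutex<usize>,       // stand-in for the sleeper list
    wakeup: Condvar,
}

fn try_steal(s: &Shared) -> Option<Task> {
    s.deque.lock().unwrap().pop_front()
}

fn push(s: &Shared, t: Task) {
    s.deque.lock().unwrap().push_back(t);
    // Notify while holding the sleeper lock, so a registered sleeper
    // can't slip into its wait after we checked.
    let sleepers = s.sleepers.lock().unwrap();
    if *sleepers > 0 {
        s.wakeup.notify_one();
    }
}

fn idle_loop(s: &Shared) {
    loop {
        if let Some(t) = try_steal(s) {
            t();
            continue;
        }
        // Steal failed: register as a sleeper, then steal *again* before
        // blocking. A task pushed between the failed steal above and the
        // registration is found here instead of being stranded.
        let mut sleepers = s.sleepers.lock().unwrap();
        *sleepers += 1;
        if let Some(t) = try_steal(s) {
            *sleepers -= 1;
            drop(sleepers);
            t();
            continue;
        }
        let mut sleepers = s.wakeup.wait(sleepers).unwrap();
        *sleepers -= 1;
    }
}
```

The key property is that the pusher notifies under the same lock the sleeper holds across its second steal and its transition into the wait, so a wakeup can't fall into the gap.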

brson (Contributor) commented Dec 22, 2013

Oh, my previous guess may be wrong.

If these tasks are descheduling to do I/O, there may be nothing we can do for them: they can't run again until the local scheduler becomes available and the I/O event loop resumes.

alexcrichton (Member) commented

I think that @brson's analysis of the problem is correct. What happens is that one I/O loop has lots of "ready events", but the current task on that I/O loop is not yielding control back to the scheduler. I'm unsure there's much we can do about this: we'd have to interrupt the task to allow the I/O loop to wake up and do its business, but preemption is not currently possible (and would also break lots of code today), so this is definitely a tricky problem.
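A hypothetical workaround sketch in the 2013-era API (assuming `std::task::deschedule()` exists at this rustc revision): since tasks aren't preempted, the tight loop has to yield cooperatively so the local scheduler, and with it the I/O loop, gets a chance to run.

```rust
// Hypothetical workaround: yield periodically from the tight loop so the
// local scheduler (and its libuv event loop) can run.
do spawn {
    let mut i = 0u64;
    while i < 5_000_000_000 {
        i += 1;
        if i % 1_000_000 == 0 {
            std::task::deschedule(); // cooperative yield to the scheduler
        }
    }
    println!("Finished tightloop!");
}
```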

metajack (Contributor) commented

Is it not possible to transfer tasks sleeping on I/O to a different scheduler? This failure pattern is not going to be super easy to reason about.

alexcrichton (Member) commented

Sadly, no. The problem here is that the libuv event loop needs to run in order to execute the callbacks of the pending I/O handles. The callbacks are what will reawaken the tasks and allow them to get stolen by other schedulers, but there's no way to have a separate event loop run the callbacks of another event loop.

metajack (Contributor) commented

I could have sworn there was talk about migrating descriptors across event loops in the distant past. I figured we just hadn't gotten around to it and such a thing would eventually address this problem.

thestinger (Contributor) commented

#17325 means this is no longer relevant to the standard libraries.
