Validate stateless co-assignment algorithm #7298
We're deciding not to move ahead with this because:
Summary
The Algorithm

After a couple of tweaks, The Algorithm does an OK job at co-assignment. This looks pretty good, for example (first number is priority, second is cogroup, color is cogroup):

(image omitted)

But sometimes, it puts more things in one co-group than necessary. Here's the grouping of one such graph (image omitted). Notice how some groups are small and include just one part of a tree reduction, but others encompass multiple branches of a tree reduction. The co-assignment here isn't bad, but it's clearly imbalanced. And not all tasks in the big co-groups should run at the same time.

Using cogroups on the scheduler

Traditionally on the scheduler, we think about our scheduling problem as decide_worker: given a task, which worker should run it? With queuing, we need the inverse. When a task completes on a worker, it opens up a thread. We can pick any queued task to run on that worker. Which task is best to pick? This is effectively decide_worker in reverse.

Currently, we pick a task by just taking the highest-priority task off a global queue. This is so simple, we don't really think of it as a decision at all.

To make co-assignment work alongside queuing, we need a more sophisticated way of picking the next task. That's not too hard, naively (implemented in #7394). Assume you have some basic cogroup data structure where, given a task, you can look up the other tasks in its co-group.

If you have the global queue of tasks, maybe you just pick the next task off the front. But what if that task belongs to a co-group where other tasks are already running on a different worker? We effectively need a new state for these tasks: they're not quite queued, and not quite processing either. You need some way to quickly find tasks whose co-groups aren't associated with any worker. For speed, you probably need a global queue of co-groups, not just of tasks. This then opens up all the typical state-management problems.
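To make that bookkeeping concrete, here is a minimal sketch of a queue that pins each co-group to the first worker that pops one of its tasks. This is a hypothetical illustration, not the #7394 implementation: `CogroupQueue` and `cogroup_of` are invented names, and a real version would need a faster way to skip pinned groups than the linear scan shown here.

```python
import heapq


class CogroupQueue:
    """Toy priority queue of tasks that pins each co-group to one worker.

    cogroup_of: maps task name -> co-group id (hypothetical structure).
    """

    def __init__(self, cogroup_of):
        self.cogroup_of = cogroup_of
        self.heap = []          # (priority, task); lower number = higher priority
        self.group_worker = {}  # co-group id -> worker it is pinned to

    def add(self, priority, task):
        heapq.heappush(self.heap, (priority, task))

    def pop(self, worker):
        """Pop the highest-priority task whose co-group is unpinned,
        or already pinned to this worker."""
        skipped = []
        result = None
        while self.heap:
            priority, task = heapq.heappop(self.heap)
            group = self.cogroup_of[task]
            pinned = self.group_worker.get(group)
            if pinned is None or pinned == worker:
                self.group_worker[group] = worker
                result = task
                break
            # This task is in the awkward "not quite queued, not quite
            # processing" state: its co-group belongs to another worker.
            skipped.append((priority, task))
        for item in skipped:
            heapq.heappush(self.heap, item)
        return result
```

For example, if tasks "a" and "b" share a co-group, a worker that pops "a" also claims "b"; a second worker asking for work skips past "b" to the next unclaimed group.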
In a sense, we'd be re-creating much of the existing task-state machinery, one level up. This feels like too much infrastructure to add! Isn't there some clever shortcut?

Yeah: what if we just put all the tasks in a co-group onto a worker at once? You can implement this easily: no new state, smallish diff. And it works sometimes (image omitted). But remember how sometimes, the cogroups algorithm makes groups that are too big? If you submit those all at once, then you get root task overproduction, just what queuing was meant to prevent. Here's a memory timeseries of one such workload (image omitted). Notice how if we queue, but submit all tasks in a co-group at once (oversaturating), it's exactly the same as on main when we just turn queuing off! And similarly, if we don't oversaturate a co-group, we get zero benefit: co-assignment effectively doesn't happen, because we have no good way to keep the rest of the group on the same worker.

Conclusion

So it comes back around to the co-group algorithm not being reliable enough. If the co-group algorithm could offer a stronger guarantee, namely that all tasks in a co-group must be in memory at the same time, on the same worker, to compute a common dependency, then the scheduler side would be much simpler. Because if all tasks in a co-group have to be in memory at once anyway, then it's fine to submit them all to a worker at once; they can't oversaturate. (This was the core observation in #7076.) Then, we could just submit whole co-groups as units.

But if the co-group algorithm doesn't work that well, we have to add lots more infrastructure on the scheduler to hedge against fully committing to its recommendations.
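The "submit the whole co-group at once" shortcut discussed above can be sketched in a few lines. This is a toy model, not distributed's scheduler code: `cogroups` and `processing` are hypothetical stand-ins for the real data structures, and the point is simply that the whole group goes to one worker regardless of its thread count (which is exactly how oversaturation happens when groups are too big).

```python
def assign_cogroup(task, worker, cogroups, processing):
    """Toy sketch: when `task` is scheduled on `worker`, send every other
    task in its co-group there too, ignoring the worker's thread limit
    (i.e., oversaturating it).

    cogroups: task -> list of all tasks in its co-group (hypothetical).
    processing: task -> worker it was sent to (mutated in place).
    """
    sent = []
    for t in cogroups[task]:
        if t not in processing:  # skip tasks already running somewhere
            processing[t] = worker
            sent.append(t)
    return sent
```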
Something to add to a blog?
@mrocklin sure, I'd be happy to copy it over to the dask or coiled blog if it might be interesting to a more general audience |
My sense is that it might be. If we've already done most of the effort in describing the issue and our final thoughts on it, then I think that it might also be cheap to push out there. I think that this is a prime example where we could "work in the open" a bit more. I'm hopeful that a small amount of work (an hour?) could be enough to take what's already written here and push it out to the public. I think that people like reading more about what's under the hood than we currently give them.

You in particular are quite good at writing up summary issue comments. I think that in many cases we could shift these over to the public.
Recent changes to the scheduling heuristics that queue tasks on the scheduler removed the ability to co-assign root tasks. This feature helped reduce network load by assigning tasks with likely shared dependents to the same worker ahead of time.
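The network-load argument can be made concrete with a toy model (illustrative only, not distributed's actual transfer accounting): count how many dependency edges cross workers under a given placement. Co-assigning sibling root tasks drives this count toward zero.

```python
def cross_worker_transfers(deps, placement):
    """Count dependency edges whose producer and consumer live on
    different workers, i.e. results that must cross the network.

    deps: task -> set of its dependencies.
    placement: task -> worker it runs on.
    """
    return sum(
        1
        for task, ds in deps.items()
        for d in ds
        if placement[d] != placement[task]
    )
```

With two roots "a" and "b" feeding a common dependent "c", co-assigning "a" and "b" with "c" costs zero transfers, while splitting them across workers forces at least one.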
The previous implementation of this was a stateful heuristic that cannot easily be ported to the queuing system and has known shortcomings for non-trivial topologies; see #6597.
An early attempt to formulate a stateless algorithm to identify groups of tasks that should be co-assigned (called "families" in that PR) can be reviewed in #7076, but it failed to deliver on our expectations. This approach was ultimately rejected due to complexity, runtime concerns, and lack of classification power.
An alternative approach to identifying "co-assignment groups" / "cogroups" has been proposed in #7141, with an alternative implementation of the same/similar algorithm in gjoseph92#6.
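For readers unfamiliar with the idea, here is a toy illustration of what a stateless co-grouping computation looks like. This is NOT the #7141 algorithm (which works on priority order over the full graph); it only captures the simplest case, using a small union-find to group root tasks that feed a common dependent. The function name and inputs are invented for this sketch.

```python
from collections import defaultdict


def cogroup_roots(roots, dependents):
    """Toy co-grouping: put two root tasks in the same group whenever
    they feed a common downstream task.

    roots: iterable of root task names.
    dependents: task -> set of downstream tasks that consume it.
    """
    parent = {r: r for r in roots}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    owner = {}  # dependent task -> representative root seen so far
    for r in roots:
        for d in dependents[r]:
            if d in owner:
                parent[find(r)] = find(owner[d])  # merge the two groups
            else:
                owner[d] = r

    groups = defaultdict(list)
    for r in roots:
        groups[find(r)].append(r)
    return sorted(sorted(g) for g in groups.values())
```

For a graph where "a" and "b" feed "x" while "c" and "d" feed "y", this yields two co-groups, one per reduction branch. Being purely a function of the graph, it is stateless in the sense the issue cares about.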
Goals
We want to validate the algorithm in #7141 and perform the necessary benchmarks to inform future work or to make a decision about enabling co-assignment.
Technical changes required for proper validation
Benchmarking / Perf validation
dask_worker_transfer_incoming_count_total / dask_worker_transfer_outgoing_count_total, plus their nbytes equivalents

Nice to have but not required
Non-goals