
Cluster should support many worker types #2118

Open
mrocklin opened this issue Jul 15, 2018 · 16 comments
Labels
discussion Discussing a topic with no specific actions yet

Comments

@mrocklin (Member)

The various Cluster objects often allow the user to provide a specification for a worker (cores, memory, software environment, ...) and then provide mechanisms for increasing and decreasing the number of workers.

However, sometimes a dask deployment has a few different kinds of workers: for example, machines with GPUs or high memory, or machines from a queue that is more or less expensive or reliable in some way.

This suggests that maybe the Cluster object should accept a list of worker pools, and provide common functionality around them.

Things like the widget are easy to scale to multiple pools. Adaptivity is a bit weirder.
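As a purely hypothetical sketch (none of this API exists today; SomeCluster, the worker_pools= keyword, and the pool= argument to scale() are invented for illustration), the interface might look something like:

# Hypothetical multi-pool Cluster API; not an existing distributed interface.
cluster = SomeCluster(
    worker_pools={
        "cpu": {"cores": 8, "memory": "16 GB"},
        "gpu": {"cores": 8, "memory": "64 GB", "resources": {"GPU": 1}},
    }
)
cluster.scale(20, pool="cpu")  # scale each pool independently
cluster.scale(2, pool="gpu")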

cc @lesteve @jhamman (dask-jobqueue) @jcrist (dask-yarn) @jacobtomlinson (dask-kubernetes)

Credit for this thought goes to @lesteve

@lesteve (Member) commented Jul 15, 2018

When I mentioned this I was thinking about it in a dask-jobqueue context, but it would be nice to have a feature like this in distributed for sure!

Just for completeness, a few remarks about the dask-jobqueue context I was thinking of:

  • the main use case I had in mind was CPU vs GPU with a workflow where some of the tasks need a GPU but others don't. On our cluster this means there is a queue for CPU nodes and a queue for GPU nodes (GPU nodes are a scarcer resource, as you can imagine, so you want to do as much work as possible on CPU nodes if you can).
  • my initial attempt would have been to hack something together quickly in dask-jobqueue (essentially I was hoping that you could just pass arguments to create_worker and override the parameters that were passed to the JobQueueCluster constructor). In this context, adaptivity would be nice to have but not crucial, I reckon.

@jhamman (Member) commented Jul 15, 2018

I'll just say that I generally like this idea. Currently, in jobqueue, it seems like it might be easier to create multiple Clusters with different configurations, but if there were a nice way to orchestrate this sort of functionality within a single Cluster, that would be cool.
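For reference, the multiple-clusters workaround might look roughly like this with dask-jobqueue (the queue names, sizes, and worker counts below are made up for illustration):

from dask.distributed import Client
from dask_jobqueue import PBSCluster  # SLURMCluster etc. work similarly

# One cluster per worker type; each gets its own scheduler and dashboard.
cpu_cluster = PBSCluster(queue="cpu", cores=24, memory="100 GB")
cpu_cluster.scale(20)

gpu_cluster = PBSCluster(queue="gpu", cores=6, memory="60 GB")
gpu_cluster.scale(2)

cpu_client = Client(cpu_cluster)
gpu_client = Client(gpu_cluster)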

@sjperkins (Member)

Another use case is creating separate workers for I/O and compute (see e.g. http://baddotrobot.com/blog/2013/06/01/optimum-number-of-threads/).

  • Compute threads are usually equal in number to the cores and have their affinity set to a particular core to prevent thread migration. They also generally don't block.
  • I/O threads (disk access/network reads) are usually some multiple of the number of cores and block while waiting for data.

The ability to distinguish between the two would be useful for ensuring that "the cores are always fed".

Given the description above, would it be possible to create workers with separate I/O and compute threads? In the current distributed paradigm, one might specify these workers like so:

dask-worker scheduler:8786 --resources "io=32; compute=8" --nprocs=1 --nthreads=40

However, manually specifying io/compute resources for each task is somewhat laborious and error-prone.
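For illustration, that manual bookkeeping looks roughly like this today (fetch, process, and urls are placeholders): every submit call has to carry its own resources= argument, matching whatever was declared on the dask-worker command line.

from dask.distributed import Client

client = Client("scheduler:8786")

# Each call must repeat the resource constraint by hand.
pages = [client.submit(fetch, url, resources={"io": 1}) for url in urls]
results = [client.submit(process, page, resources={"compute": 1}) for page in pages]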

@vincentschut

Another use case is creating separate workers to download data from third-party servers that allow only a limited number of connections. To make things more interesting, the limit could be a maximum number of connections per IP, and thus per worker node/machine/instance. So when running on, say, a single 32-core node, one would have 32 CPU-bound workers and, for example, 8 download workers for a server with a maximum of 8 connections per IP. When adding a new node to the cluster, the number of CPU workers would scale up with the number of extra CPUs, but the number of download workers would always increment by 8.
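One way to approximate this today is to give each worker a fixed "download" resource budget and constrain download tasks to it; the numbers and the fetch/urls names below are illustrative only, and the cap is per worker process rather than truly per IP.

dask-worker scheduler:8786 --nthreads 32 --resources "download=8"

from dask.distributed import Client

client = Client("scheduler:8786")
# At most 8 download tasks run concurrently per worker, regardless of thread count.
futures = [client.submit(fetch, url, resources={"download": 1}) for url in urls]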

@nicolls1 (Contributor) commented Jul 16, 2018

A couple things that would be nice for my use case:

  1. I also have CPU and GPU workers and would like a less strict resource mapping, as my tasks can still run on CPU machines, only more slowly. In Kubernetes there is the concept of affinity, where a pod can prefer to be scheduled on a certain node but can still run on other nodes if that one doesn't exist. It would be nice to have an analogous concept for tasks (see the sketch below for the closest existing mechanism).

  2. I would also like the ability to create a node or worker pool that is reserved for time-sensitive tasks. My cluster needs to process some jobs as fast as possible, and I would like to be able to reserve node(s) and possibly monitor them separately from the rest of the workload. Going back to Kubernetes, there is the concept of taints, which prevent pods from being placed on a node. Again, it would be nice to have an analogous concept for tasks.

Curious how these multiple worker types could or could not support this.
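For what it's worth, the closest existing mechanism to the "soft affinity" in (1) is probably the workers=/allow_other_workers= arguments to submit, which express a preference rather than a hard requirement (train_model, data, and the worker name below are placeholders):

from dask.distributed import Client

client = Client("scheduler:8786")
# Prefer the GPU worker, but allow the task to run elsewhere if needed.
future = client.submit(train_model, data,
                       workers=["gpu-worker-1"],
                       allow_other_workers=True)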

@sjperkins (Member)

I think this issue is related to #2127, where @kkraus14 wants to run certain tasks on CPU/GPU workers. I've also wanted to run tasks on specific workers, or to require that resources be exclusive to certain tasks.

Currently, these task dependencies must be specified as additional arguments to compute/persist etc. rather than at the point of graph construction -- embedding resource/worker dependencies in the graph is not currently possible.

To support this, how about adding a TaskAnnotation type? This could be a namedtuple, itself containing nested tuples representing key-value pairs, e.g.

annot = TaskAnnotation(an=(('resource', ('GPU', '1')), ('worker', 'alice')))

dask array graphs tend to have the following structure:

dsk = {
    (tsk_name, 0) : (fn, arg1, arg2, ..., argn),
    (tsk_name, 1) : (fn, arg1, arg2, ..., argn),
}

How about embedding annotations within value tuples?

dsk = {
    (tsk_name, 0) : (fn, arg1, arg2, ..., argn, annotation1),
    (tsk_name, 1) : (fn, arg1, arg2, ..., argn, annotation2),
}

If the scheduler discovers an annotation in the tuple, it could remove it from the argument list and attempt to satisfy the requested constraints. In the above example, annotations are placed at the end of the tuple, but the location could be arbitrary and multiple annotations are possible. Alternatively, it might be better to put them at the start.
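As a rough sketch of that idea (TaskAnnotation is hypothetical and this is not how the scheduler currently works), stripping annotations out of a task tuple could look like:

from collections import namedtuple

# Hypothetical type from the proposal above; not part of dask or distributed.
TaskAnnotation = namedtuple("TaskAnnotation", ["an"])

def strip_annotations(task):
    # Split a task tuple into its real arguments and any embedded annotations.
    args = tuple(a for a in task if not isinstance(a, TaskAnnotation))
    annots = [a for a in task if isinstance(a, TaskAnnotation)]
    return args, annots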

I realise the above example is somewhat specific to dask arrays (I'm not too familiar with the dataframe and bag collections) so there may be issues I'm not seeing.

One problem I can immediately identify would be modifying existing graph construction functions to support the above annotations (atop/top support is probably the first place to look).

@mrocklin (Member, Author)

The cluster object is about starting and stopping workers, not about assigning workers to tasks. I don't think that this is related.

@sjperkins (Member) commented Jul 19, 2018

@mrocklin Ah sorry. If you think the above is useful, should the discussion be moved to #2127 or a new issue be opened?

EDIT: Created issue in dask/dask#3783

@mrocklin (Member, Author) commented Jul 19, 2018 via email

@Winand commented Feb 4, 2019

I cannot figure out how one deals with network-bound vs compute-bound tasks now. I use a ThreadPoolExecutor for network tasks and then feed the results to ProcessPoolExecutor workers for computation. If I understand correctly, I would need to start two LocalClusters to do the same thing in Dask.
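For concreteness, the two-cluster version would look roughly like this (worker counts are illustrative, and fetch_page/parse_page/urls are placeholders):

from dask.distributed import Client, LocalCluster

io_cluster = LocalCluster(n_workers=1, threads_per_worker=10, processes=False)
cpu_cluster = LocalCluster(n_workers=5, threads_per_worker=1)
io_client = Client(io_cluster)
cpu_client = Client(cpu_cluster)

pages = io_client.gather(io_client.map(fetch_page, urls))  # network-bound, threads
results = cpu_client.map(parse_page, pages)                # CPU-bound, processes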

@mrocklin (Member, Author) commented Feb 4, 2019 via email

@Winand commented Feb 5, 2019

@mrocklin well, currently I load HTML pages using 10 parallel threads, then parse and process the results with BeautifulSoup using 5 worker processes. Of course the number of workers can be tuned, but in general the first stage mostly waits and the second one mostly computes.

As I understand it, I have to use different clusters for this workflow (and therefore separate dashboards, if I want to monitor the whole process somehow).

I recommend using the defaults until difficulties arise

Do you mean just use one cluster, tune its settings, and look at the overall execution time?

@mrocklin (Member, Author) commented Feb 5, 2019 via email

@arpit1997 (Contributor)

@mrocklin Worker pools, similar to Airflow's implementation, could certainly be useful.

The use case I encountered was:

  • We have some very long-running tasks (using the dask core library), on the order of a day, and some very short tasks on the same cluster.
  • Pools could provide a friendly way to group workers and assign them dedicated resources.

@therc commented Aug 29, 2020

Another scenario: you have a worker size that accommodates 95% of tasks or more, but the occasional pathological case uses 2x the memory. I'd love to retry tasks that were killed by the nanny, giving them X% more memory each time. This assumes dask-kubernetes and some kind of autoscaling to bring up beefier machines on demand.

@alexandervaneck

Hello 👋 I'm interested in this topic since the project I'm working on needs different types of resources per worker.

@mrocklin @lesteve @jhamman @jacobtomlinson Is there still interest in this? And if so, could someone point me in the direction of where improvements/implementations could be made?

@GenevieveBuckley GenevieveBuckley added the discussion Discussing a topic with no specific actions yet label Oct 18, 2021