Add support for configuring Dask distributed #2040

Closed
bouweandela opened this issue May 19, 2023 · 6 comments · Fixed by #2049 or #2616
Labels
dask (related to improvements using Dask), enhancement (New feature or request)

Comments

@bouweandela
Member

bouweandela commented May 19, 2023

Since iris 3.6, it is possible to use Dask distributed with iris. This is a great new feature that will allow for better memory handling and distributed computing. See #1714 for an example implementation. However, it does require some extra configuration.

My proposal would be to allow users to specify the arguments to distributed.Client and to the associated cluster class, e.g. distributed.LocalCluster or dask_jobqueue.SLURMCluster, in the ESMValTool configuration. These settings could be added either under a new key in config-user.yml or in a new configuration file in the ~/.esmvaltool directory:

Add to the user configuration file

We could add these new options to config-user.yml under a new key dask, e.g.

Example config-user.yml settings for running locally using a LocalCluster:

dask:
  cluster:
    type: distributed.LocalCluster

Example settings for using an externally managed cluster (e.g. one set up from a Jupyter notebook):

dask:
  client:
    address: tcp://127.0.0.1:45695

Example settings for running on Levante:

dask:
  client: {}
  cluster:
    type: dask_jobqueue.SLURMCluster
    queue: interactive
    account: bk1088
    cores: 8
    memory: 16GiB
    local_directory: "/work/bd0854/b381141/dask-tmp"
    n_workers: 2

New configuration file

Or, we could add the new configuration in a separate file, e.g. called ~/.esmvaltool/dask.yml or ~/.esmvaltool/dask-distributed.yml.

Example settings for running locally using a LocalCluster:

cluster:
  type: distributed.LocalCluster

Example settings for using an externally managed cluster (e.g. one set up from a Jupyter notebook):

client:
  address: tcp://127.0.0.1:45695

Example settings for running on Levante:

client: {}
cluster:
  type: dask_jobqueue.SLURMCluster
  queue: interactive
  account: bk1088
  cores: 8
  memory: 16GiB
  local_directory: "/work/bd0854/b381141/dask-tmp"
  n_workers: 2
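
Reading such a separate file could look roughly like the sketch below. The function name `load_dask_config` is hypothetical, the file name `~/.esmvaltool/dask.yml` is just one of the names suggested above, and treating a missing file as "no distributed settings" is an assumption made for this example. It relies on the third-party PyYAML package.

```python
from pathlib import Path


def load_dask_config(path=None):
    """Read Dask settings from a YAML file, or return {} if the file is absent."""
    if path is None:
        # File name taken from the proposal; the final name is undecided.
        path = Path.home() / ".esmvaltool" / "dask.yml"
    path = Path(path)
    if not path.exists():
        # Assumption: no file means that no distributed settings are applied.
        return {}
    # PyYAML is imported lazily because it is only needed when a file exists.
    import yaml

    with path.open(encoding="utf-8") as file:
        # An empty file parses to None, so fall back to an empty mapping.
        return yaml.safe_load(file) or {}
```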

@ESMValGroup/esmvaltool-coreteam Does anyone have an opinion on what the best approach is here? A new file or add to config-user.yml?

@bouweandela bouweandela added the enhancement New feature or request label May 19, 2023
@bouweandela bouweandela added this to the v2.9.0 milestone May 19, 2023
@bouweandela bouweandela changed the title Configuring Dask distributed Add support for configuring Dask distributed May 19, 2023
@bouweandela
Member Author

Another question: would we like to be able to configure Dask distributed from the command line? Or at least pass in the scheduler address if we already have a Dask cluster running, e.g. started from a Jupyter notebook?
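
To make the question concrete, here is a minimal sketch of how a scheduler address given on the command line could take precedence over the configuration. The option name `--dask-scheduler-address` and both function names are hypothetical, and the sketch uses `argparse` only for illustration; it does not reflect how the `esmvaltool` command is actually implemented.

```python
import argparse


def parse_args(argv=None):
    """Parse a recipe name and an optional scheduler address."""
    parser = argparse.ArgumentParser(prog="esmvaltool-sketch")
    parser.add_argument("recipe")
    parser.add_argument(
        "--dask-scheduler-address",
        default=None,
        help="Address of an already running Dask scheduler (hypothetical option).",
    )
    return parser.parse_args(argv)


def apply_cli_overrides(dask_config, args):
    """Return a copy of the config in which the CLI address takes precedence."""
    merged = {key: dict(value or {}) for key, value in dask_config.items()}
    if args.dask_scheduler_address is not None:
        # An external scheduler is given, so do not start a cluster ourselves.
        merged.pop("cluster", None)
        merged.setdefault("client", {})["address"] = args.dask_scheduler_address
    return merged
```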

@valeriupredoi
Contributor

cheers @bouweandela - sorry I slacked at this - I'll come back with a deeper and more meaningful analysis (yeh, beware 🤣 ) but before I do that, here's two quick comments:

@remi-kazeroni
Contributor

Thanks a lot for your work @bouweandela! I would also suggest not putting Dask-related settings in config-user.yml. I think the Dask configuration topic is too advanced for many of our users and should remain transparent for them. Also, if we were to modify config-user.yml, that would need to be reflected in the Tool docs, tutorial, our training activities, ... I would prefer to have that in separate files as you suggest, e.g. .esmvaltool/dask.yml or .esmvaltool/distributed.yml.

> Another question: would we like to be able to configure Dask distributed from the command line?

It could be nice to have the possibility to use something like esmvaltool config get_config_dask from the command line. But if you think that's not helpful or too much extra work, let's not worry too much about that.
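
As a rough idea of what such a command could do, the sketch below writes a default `dask.yml` into a target directory. Everything here is hypothetical: the function name simply mirrors the command name suggested above, and the template content and the `overwrite` flag are assumptions made for this example.

```python
from pathlib import Path

# Hypothetical default content; a real template would be maintained elsewhere.
DEFAULT_DASK_CONFIG = """\
# Settings passed to distributed.Client and to the cluster class.
cluster:
  type: distributed.LocalCluster
"""


def get_config_dask(target_dir, overwrite=False):
    """Write a default dask.yml into target_dir and return its path."""
    target = Path(target_dir) / "dask.yml"
    if target.exists() and not overwrite:
        # Refuse to silently discard settings the user may have edited.
        raise FileExistsError(f"{target} already exists; pass overwrite=True")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(DEFAULT_DASK_CONFIG, encoding="utf-8")
    return target
```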

> we should think about HPC-wide configurations, in the case of central installations of the Tool

Yes, that's a good point. I'm just worried that this could make things more complicated for us, the developers: time to get answers from HPC admins, updates in the software stack, number of machines supported, ... Perhaps we could simply link to specific documentation on Dask usage if HPC centers provide that (here is an example for DKRZ).

@remi-kazeroni remi-kazeroni added the dask related to improvements using Dask label Jun 1, 2023
@valeriupredoi
Contributor

This is still an ongoing discussion so needs reopening

@valeriupredoi valeriupredoi reopened this Jun 1, 2023
@bouweandela
Member Author

Suggestion by @sloosvel:

Regarding the configuration, is it possible to have multiple configurations in dask.yml? Not every recipe will require the same type of resources.
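
One possible shape for this, sketched under assumptions: the file holds several named configurations under a `profiles` key plus a `default_profile` key, and a recipe or the user picks one by name. Both key names and the function `select_dask_profile` are hypothetical and only illustrate the idea.

```python
def select_dask_profile(dask_config, name=None):
    """Return the client/cluster settings of the requested profile."""
    profiles = dask_config.get("profiles", {})
    if name is None:
        # Fall back to the profile marked as default, if there is one.
        name = dask_config.get("default_profile")
    if name is None:
        # Nothing requested and no default: no distributed settings.
        return {}
    try:
        return profiles[name]
    except KeyError:
        available = ", ".join(sorted(profiles)) or "none"
        raise ValueError(
            f"Unknown Dask profile {name!r}; available: {available}"
        ) from None
```

A recipe that needs more resources could then request, for example, a profile that uses `dask_jobqueue.SLURMCluster`, while other recipes keep using a local cluster.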

@bouweandela bouweandela modified the milestones: v2.9.0, v2.10.0 Jun 1, 2023
@bouweandela
Member Author

At the workshop at SMHI, agreement was reached that a new configuration file format would be acceptable. I will make a proposal, but this will not be implemented in time for v2.10.

3 participants