Example of running PyDDA in HPC? #56
Comments
It can be run on an HPC cluster using Dask distributed! My best strategy has been to dedicate an entire node to one worker, since the optimizer will use the cores available when doing the calculation. From there you could then use multiple nodes.
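A minimal sketch of that one-worker-per-node strategy, assuming a SLURM scheduler and using dask-jobqueue; the partition name, core count, memory, and walltime below are placeholders for your site:

```python
# One Dask worker per node: ask SLURM for whole nodes, but start only a
# single worker process on each, leaving all cores to the optimizer.
# Queue, cores, memory, and walltime are site-specific placeholders.
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    queue="batch",        # placeholder partition name
    cores=36,             # all cores of one node...
    processes=1,          # ...but only one Dask worker per node
    memory="128GB",
    walltime="02:00:00",
)
cluster.scale(jobs=4)     # request 4 nodes -> 4 workers

client = Client(cluster)  # pass this client to the nested retrieval
```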
Thanks! Another thing I would like to know: is there a recommended setting for 'num_split'? I think a larger value will split the whole grid into more subgrids, so we can use more nodes for the calculation, but more subgrids will cost much more time on processing the subgrid temp files and on I/O between calculation nodes. Is that true? If true, is there a best 'num_split' that can balance this?
I have used num_split=3 to get a good balance, but play around with numbers in the range from 2 to 5.
Bobby
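A hedged sketch of how one might time a few num_split values to find that balance. The exact nested-retrieval call below (and the grids/initializations it takes) is an assumption; check the PyDDA documentation for the current signature:

```python
# Benchmark a few num_split values (2-5, per the advice above).
# `grids`, `u_init`, `v_init`, `w_init`, and `client` are assumed to be
# prepared already (see the SLURM cluster sketch earlier in the thread);
# the get_dd_wind_field_nested signature here is an assumption as well.
import time
import pydda

for num_split in range(2, 6):
    start = time.time()
    winds = pydda.retrieval.get_dd_wind_field_nested(
        grids, u_init, v_init, w_init,
        client=client, num_split=num_split,
    )
    print(f"num_split={num_split}: {time.time() - start:.1f} s")
```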
The example of nested wind retrieval in the docs is based on LocalCluster. Is PyDDA designed to be run on HPC systems such as Summit or TianHe-2? If so, I think it would be really helpful to have an example of the best strategy for splitting the grid and distributing the computations to workers under Dask, with an eye toward maximizing CPU usage and balancing I/O time, including the settings for the number of jobs, n_workers, processes, etc.
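For reference, a minimal LocalCluster setup with the n_workers/processes knobs the question refers to; the values are illustrative, not a recommendation:

```python
# LocalCluster baseline: mirrors the one-worker-per-node HPC strategy on
# a single machine. Values are illustrative only.
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(
    n_workers=2,           # one worker per "node" you want to emulate
    threads_per_worker=1,  # leave the cores to the optimizer, not Dask
    processes=True,        # separate worker processes, as on a cluster
)
client = Client(cluster)
```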