Example of running PyDDA in HPC? #56
Comments
It can be run on an HPC cluster using Dask distributed! My best strategy has been to dedicate an entire node to one worker, since the optimizer will use the cores available when doing the calculation. From there you could then use multiple nodes.
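A minimal sketch of that one-worker-per-node strategy, assuming a SLURM scheduler and using dask-jobqueue; the partition name, core count, memory, and walltime below are placeholders for your site:

```python
# One Dask worker per node: ask SLURM for whole nodes, but start only a
# single worker process on each, leaving all cores to the optimizer.
# Queue, cores, memory, and walltime are site-specific placeholders.
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    queue="batch",        # placeholder partition name
    cores=36,             # all cores of one node...
    processes=1,          # ...but only one Dask worker per node
    memory="128GB",
    walltime="02:00:00",
)
cluster.scale(jobs=4)     # request 4 nodes -> 4 workers

client = Client(cluster)  # pass this client to the nested retrieval
```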
Thanks! Another thing I would like to know: is there a recommended setting for 'num_split'? I think a larger value will split the whole grid into more subgrids, so we can use more nodes for the calculation, but more subgrids will cost much more time on processing the subgrid temp files and on I/O between calculation nodes. Is that true? If true, is there a best 'num_split' that can balance this?
I have used num_split=3 to get a good balance, but play around with numbers in the range from 2 to 5.
Bobby
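A hedged sketch of how one might time a few num_split values to find that balance. The exact nested-retrieval call below (and the grids/initializations it takes) is an assumption; check the PyDDA documentation for the current signature:

```python
# Benchmark a few num_split values (2-5, per the advice above).
# `grids`, `u_init`, `v_init`, `w_init`, and `client` are assumed to be
# prepared already (see the SLURM cluster sketch earlier in the thread);
# the get_dd_wind_field_nested signature here is an assumption as well.
import time
import pydda

for num_split in range(2, 6):
    start = time.time()
    winds = pydda.retrieval.get_dd_wind_field_nested(
        grids, u_init, v_init, w_init,
        client=client, num_split=num_split,
    )
    print(f"num_split={num_split}: {time.time() - start:.1f} s")
```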
The example of nested wind retrieval in the docs is based on LocalCluster. Is PyDDA designed to be run on HPC systems such as Summit or TianHe-2? If so, I think it would be really helpful to have an example of the best strategy for splitting the grid and distributing the computations to workers under Dask, with an eye toward maximizing CPU usage and balancing I/O time, including the settings for the number of jobs, n_workers, processes, etc.
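For reference, a minimal LocalCluster setup with the n_workers/processes knobs the question refers to; the values are illustrative, not a recommendation:

```python
# LocalCluster baseline: mirrors the one-worker-per-node HPC strategy on
# a single machine. Values are illustrative only.
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(
    n_workers=2,           # one worker per "node" you want to emulate
    threads_per_worker=1,  # leave the cores to the optimizer, not Dask
    processes=True,        # separate worker processes, as on a cluster
)
client = Client(cluster)
```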