Introduce a parallel loop to do esmpy regridding for many levels #773
OK, found a bug: I was printing a list built from a generator, and that was exhausting the generator (no wonder it was so fast, hahah). Here's a dilemma: even if this is faster by a factor of 5-6 for e.g. 8 processes, this means we can't use
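For anyone following along, the generator pitfall can be reproduced without esmpy. This is a minimal sketch, not the PR code: `build_regridders` is a hypothetical stand-in that yields placeholder strings instead of real regridders, and it assumes the bug was of the "consume the generator while printing it" kind described above.

```python
def build_regridders(levels):
    """Lazily yield one placeholder 'regridder' per level (hypothetical stand-in)."""
    for level in levels:
        # The real code would do the expensive per-level build here.
        yield f"regridder-{level}"


regridders = build_regridders(range(3))

# Turning the generator into a list (e.g. inside a print call) consumes it.
first_pass = list(regridders)

# Any later loop over the same generator finds nothing left to do,
# which makes that loop look suspiciously fast.
second_pass = list(regridders)

print(first_pass)
print(second_pass)
```

Materializing the generator once into a list and reusing that list avoids the problem, at the cost of holding all results in memory.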
OK, so the reason this is still a Draft is that it doesn't actually work: the parallel loop stalls, and I've managed to identify where it stalls:

    def build_regridder_2d(iter_pack, regrid_method, mask_threshold):
        """Build regridder for 2d regridding."""
        src_rep, dst_rep = iter_pack
        dst_field = cube_to_empty_field(dst_rep)
        src_field = cube_to_empty_field(src_rep)
        regridding_arguments = {
            'srcfield': src_field,
            'dstfield': dst_field,
            'regrid_method': regrid_method,
            'unmapped_action': ESMF.UnmappedAction.IGNORE,
            'ignore_degenerate': True,
        }

-> this block runs fine in parallel!

        if np.ma.is_masked(src_rep.data):
            src_field.data[...] = ~src_rep.data.mask.T
            src_mask = src_field.grid.get_item(ESMF.GridItem.MASK,
                                               ESMF.StaggerLoc.CENTER)
            src_mask[...] = src_rep.data.mask.T
            center_mask = dst_field.grid.get_item(ESMF.GridItem.MASK,
                                                  ESMF.StaggerLoc.CENTER)
            center_mask[...] = 0
            mask_regridder = ESMF.Regrid(
                src_mask_values=MASK_REGRIDDING_MASK_VALUE[regrid_method],
                dst_mask_values=np.array([]),
                **regridding_arguments)
            regr_field = mask_regridder(src_field, dst_field)
            dst_mask = regr_field.data[...].T < mask_threshold
            center_mask[...] = dst_mask.T
        else:
            dst_mask = False

-> there is something in this block that triggers a wait/acquire call that stalls all the processes; if you comment it out (well, of course it's silly, but for prototyping purposes) it goes on and builds the regridder, though with reducing (pickling) issues. This is expected, since it's very hard to pickle a function within a function; if that is converted to a normal execution block rather than a function, it all goes through nicely in parallel. Thing is, I am not confident I can make the necessary code changes since I am not that familiar with the esmpy regridding, which is why @bouweandela, @zklaus and I should probably have a chat about it 🍺
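The pickling problem with a function defined inside a function can be reproduced without esmpy. The sketch below is not the ESMValCore code: `build_closure_regridder`, `Regridder2D` and `is_picklable` are hypothetical names, and plain Python lists stand in for the ESMF fields. It contrasts a nested function, which `pickle` refuses to serialize, with a callable class defined at module level that keeps its state in attributes.

```python
import pickle


def build_closure_regridder(dst_mask):
    """Return a nested function, mirroring the function-in-a-function pattern."""
    def regridder(src):
        # Placeholder "regridding": blank out masked destination points.
        return [None if masked else value
                for value, masked in zip(src, dst_mask)]
    return regridder


class Regridder2D:
    """Hypothetical module-level callable that stores its state as attributes."""

    def __init__(self, dst_mask):
        self.dst_mask = dst_mask

    def __call__(self, src):
        return [None if masked else value
                for value, masked in zip(src, self.dst_mask)]


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (AttributeError, TypeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    mask = [False, True, False]
    # Nested function: pickle cannot look it up by name, so this fails.
    print(is_picklable(build_closure_regridder(mask)))
    # Instance of a class defined at the top level of an importable module.
    print(is_picklable(Regridder2D(mask)))
```

Both variants compute the same result when called; only the class-based one can be sent between processes by `multiprocessing`. Whether the underlying ESMF objects themselves survive pickling is a separate question that this sketch does not address.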
OK, I've dug even deeper into this - things like 2dim
This is not working and is a bit of a dead end, so let's keep the discussion going in #775.
Closes #724
This PR introduces a parallel loop that assembles the esmpy regridders in multiple processes. This is needed because, with e.g. 75 levels, one needs to wait forever and a half. It speeds up the regridder assembly from 400s to 1s 🍺
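As a rough illustration of the kind of loop meant here, the sketch below spreads a per-level build over a process pool. It is not the code in this PR: `build_regridder_for_level` and `build_all` are hypothetical names, the worker only returns a placeholder tuple instead of constructing an esmpy regridder, and the level labels are made up.

```python
from multiprocessing import Pool


def build_regridder_for_level(level_pair):
    """Hypothetical stand-in for the per-level regridder build."""
    src_level, dst_level = level_pair
    # Placeholder: the real worker would build an esmpy regridder here.
    return (src_level, dst_level)


def build_all(level_pairs, processes=None):
    """Build one result per level; run serially if processes is None or 1."""
    if not processes or processes == 1:
        return [build_regridder_for_level(pair) for pair in level_pairs]
    with Pool(processes=processes) as pool:
        # pool.map pickles the worker's arguments and return values, so both
        # must be picklable; results come back in input order.
        return pool.map(build_regridder_for_level, level_pairs)


if __name__ == "__main__":
    pairs = [(f"src-{k}", f"dst-{k}") for k in range(75)]
    results = build_all(pairs, processes=4)
    print(len(results))
```

The worker is a top-level function on purpose, so that the pool can pickle a reference to it; the serial fallback makes it easy to compare timings and results against the parallel path.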