Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use unbound regridding method for dask wrapping #39

Merged

Conversation

aulemahal
Copy link
Collaborator

This fixes JiaweiZhuang/xESMF#71 and #38.

There are currently no tests in the test suite that use distributed dask client, so all I can say is that solved the problem on my machine with dask 2.17.

From what I understand the problem of using a distributed client is that it must serialize everything it sends to its workers. In the Regridder object, we keep the ESMpy Grid/Mesh/Locstream objects, which are not serializable by dask. Thus the error and the workaround that dumped overwrote _grid_in and _grid_out to None (see JiaweiZhuang/xESMF#71). I figured out that those two objects were passed to dask only because we were wrapping Regridder.regrid_numpy which is a bound method, meaning the whole Regridder instance has to passed together with the function to the workers.

This PR creates the Regridder._regrid_array static method and passes the needed arguments through a dictionary that removes any internal reference to the instance.

Copy link
Contributor

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean.
Please update the CHANGES file.

@huard huard added this to the v0.5 milestone Oct 16, 2020
@huard huard requested a review from andersy005 October 20, 2020 17:42
@huard
Copy link
Contributor

huard commented Oct 20, 2020

@andersy005 I'm not familiar with this aspect of dask. Could you take a quick look and confirm this is reasonable?

Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks great to me 👍

There are currently no tests in the test suite that use distributed dask client, so all I can say is that solved the problem on my machine with dask 2.17.

I'm okay with merging this PR as is for the time being. We should add some dask tests to the test suites to make sure that things are working properly. I'll work on this some time this week.

@huard
Copy link
Contributor

huard commented Oct 20, 2020

Thanks, will open new issue for dask testing.

@huard huard merged commit 80fdfd4 into pangeo-data:master Oct 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support parallel regridding with dask-distributed
3 participants