-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support parallel regridding with dask-distributed #71
Comments
In pangeo-data/pangeo#334 (comment), I got Dask-distributed working on a synthetic regridding problem (just a sparse matrix multiply) by using To make Dask-distributed work, there needs to be an explicit call to broadcast the weights to all workers: I haven't tested dask-distributed on xarray DataArray/Dataset yet, and your error might be due to a different issue associated with xarray metadata other than dask. A quick way to test this is to pass the raw data instead, i.e. |
Hmm, with
|
Aha, problem solved. Just set this before applying the regridder to data: regridder._grid_in = None
regridder._grid_out = None
|
The explicit broadcasting of regridding weights is still TBD, though. In pangeo-data/pangeo#334 (comment), I was thinking about adding an explicit Or maybe there is a cleverer way to hide this explicit call from users. Any suggestions & PRs are extremely welcome! |
@JiaweiZhuang in your example notebook that you include above, what versions of xESMF and dask distributed are you using? I am still getting the |
fix broken link in doc notebook masking/extrapolation
Previous issues have discussed supporting dask enabled parallel regridding (e.g. #3). This seems to be working for the threaded scheduler but not for the distributed scheduler. It seems like this should be doable at this point with some work to solve some serialization problems.
Current behavior
If you run the current dask regridding example in this repo's binder setup with dask-distributed, you get a bunch of serialization errors:
From what I can tell, it seems like there is some object that dask is trying to serialize that can't be pickled. Has anyone looked into this to diagnose why this is happening?
The text was updated successfully, but these errors were encountered: