Complete dask conversion of XArrayResamplerBilinear #148

Closed
pnuu opened this issue Nov 26, 2018 · 5 comments

pnuu commented Nov 26, 2018

The xarray/dask version of bilinear resampling works, but much of the data are computed too early and in-memory. The "daskification" of the class methods should be completed.

@pnuu pnuu added enhancement in progress PCW Pytroll Contributors' Week labels Nov 26, 2018
@pnuu pnuu self-assigned this Nov 26, 2018
pnuu commented Nov 27, 2018

I've been taking a deep look into this, and my conclusions are:

import dask.array as da

def _tile(arr, repeat):
    """Stack repeat[0] lazy copies of arr along a new axis 1."""
    return da.stack([arr] * repeat[0], axis=1)
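A minimal sketch of how a helper like this might be exercised, assuming a small 1-D dask array as input. The input array, its chunking, and the repeat count are made up for illustration and are not taken from pyresample. It shows that the stacked result stays a lazy dask array until `compute()` is called, and that the values match the equivalent NumPy stack.

```python
import numpy as np
import dask.array as da


def _tile(arr, repeat):
    """Stack repeat[0] lazy copies of arr along a new axis 1."""
    return da.stack([arr] * repeat[0], axis=1)


# Hypothetical input: six values split into two chunks.
source = da.arange(6, chunks=3)
tiled = _tile(source, (4,))

# At this point only a task graph exists; the shape is known without computing.
print(type(tiled).__name__, tiled.shape)

# Computing the graph gives the same values as stacking with NumPy.
result = tiled.compute()
expected = np.stack([np.arange(6)] * 4, axis=1)
print(np.array_equal(result, expected))
```

Because `da.stack` only adds tasks to the graph, nothing is loaded into memory until the result is actually needed.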

Dask profiling with SatPy shows that, when on-disk caching is used for the resampling LUTs, only the first run (where the LUTs are calculated) is slow; after that, both memory usage and overall processing time are very low. In my test (SEVIRI, one RGB composite, target area of 1959 x 1934 pixels), memory use dropped from 8 GB to less than 2 GB, and processing time went down from 52 s to 6.5 s.
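The caching pattern described above can be sketched roughly as follows. This is only an illustration of "compute once, reuse from disk": the function names, the `.npz` file layout, and the stand-in LUT contents are all hypothetical and do not reflect the actual cache format used by SatPy or pyresample.

```python
import os
import tempfile

import numpy as np


def compute_luts(shape):
    """Stand-in for the expensive LUT computation (hypothetical contents)."""
    rows, cols = np.indices(shape)
    return {
        "index": (rows * shape[1] + cols).astype(np.int64),
        "weight": np.full(shape, 0.25),
    }


def load_or_compute_luts(cache_file, shape):
    """Return (luts, from_cache); compute and save the LUTs on a cache miss."""
    if os.path.exists(cache_file):
        with np.load(cache_file) as data:
            return {key: data[key] for key in data.files}, True
    luts = compute_luts(shape)
    np.savez(cache_file, **luts)
    return luts, False


cache_dir = tempfile.mkdtemp()
cache_file = os.path.join(cache_dir, "bilinear_luts.npz")

# First call computes and writes the LUTs; the second call reads them back.
first, first_cached = load_or_compute_luts(cache_file, (4, 5))
second, second_cached = load_or_compute_luts(cache_file, (4, 5))
print(first_cached, second_cached)
```

In a real setup the cache key would also have to depend on the source and target area definitions, so that a stale LUT is never reused for a different geometry.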

@rabernat

Is the conclusion of this discussion that pyresample will not support lazy resampling of dask arrays?

djhoese commented Aug 13, 2019

@rabernat No, support exists. Much of pyresample is dask/xarray friendly but not easily accessible from pyresample's traditional interfaces. We currently have a terrible spread of resampling functionality across satpy and pyresample. It all uses xarray and dask. We have plans to make a pyresample 2.0 with better interfaces for both numpy-based workflows and dask workflows (probably an xarray accessor). This issue is specifically about bilinear resampling, which is relatively new and hasn't been profiled as aggressively as the nearest neighbor resampling.

Additionally we have plans for a gradient search resampling (see #191) that would be faster than the current nearest neighbor and also use dask, but may be using dask in unintended ways. @mraspaud could provide more information if you're curious.

If you have something specific you'd like to do, let me know and we can discuss it here or on the pangeo gitter.

@rabernat

Thanks for the reply @djhoese. I opened a new issue about my use case.

@pnuu pnuu added PCW Pytroll Contributors' Week and removed PCW Pytroll Contributors' Week labels Oct 8, 2019
@pnuu pnuu closed this as completed Oct 8, 2019
pnuu commented Oct 8, 2019

Superseded by #215
