`interp` performance with chunked dimensions #6799
Comments
Yeah, I think this is right. You could check if it was better before #4155 (if it worked at all, that is).
You are right about the behavior of the code. I don't see any way to improve that in the general case. Maybe, in your case, rechunking before interpolating might be a good idea.
The chunking structure on disk is pretty instrumental to my application, which requires fast retrieval of full slices in the time dimension. The loop option in my first post only takes about 10 seconds with …
The current code also has the unfortunate side effect of merging all chunks too. I think we should instead think of generating a dask array of weights and then using …
Don't really know what I'm talking about here, but it looks to me like the current dask-interpolation routine uses …, where I would have expected interpolation to use …. FYI, fixing this would probably be a big deal to geospatial people: then you could do array reprojection without GDAL! Unfortunately not something I have time to work on right now, but perhaps someone else would be interested?
The challenge is that you could be interpolating to an unordered set of locations. So perhaps we can sort the input locations, do the interp with map_overlap, then argsort the result back to the expected order.
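A minimal NumPy sketch of the sort / interpolate / unsort bookkeeping described above. The data is made up, and `np.interp` stands in for a chunk-aware `map_overlap` kernel; this is just the ordering idea, not a dask implementation:

```python
import numpy as np

source_x = np.linspace(0.0, 10.0, 101)          # monotonic source grid
source_y = np.sin(source_x)                     # values on that grid
targets = np.array([7.3, 0.2, 4.9, 9.8, 2.1])   # unordered interpolation locations

order = np.argsort(targets)                     # sorted targets are contiguous per chunk
sorted_vals = np.interp(targets[order], source_x, source_y)

result = np.empty_like(sorted_vals)
result[order] = sorted_vals                     # scatter back to the original order

assert np.allclose(result, np.interp(targets, source_x, source_y))
```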
Linking the dask issue: dask/dask#6474
Hi all, see a comparison of RAM/execution time here: GlacioHack/xdem#501 (reply in thread). The RAM usage is also checked automatically in our tests and doesn't seem to exceed what we expect 🙂

Using @GenevieveBuckley's very nice blogpost on ragged output (https://blog.dask.org/2021/07/02/ragged-output), we tested both …

Unfortunately the implementation is not generic for Xarray; having a regular or equal grid along the interpolated dimensions is only a specific case here. So I guess the question is: is it common enough that it might be interesting to implement that functionality directly in Xarray when the interpolated dimensions are detected to be regular?
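For readers wondering why a regular grid helps, here is a sketch of the general idea (not the xdem/geoutils implementation, and all names and values are hypothetical): with constant spacing, the source chunk containing each target point can be computed arithmetically, so points can be binned per chunk and each chunk interpolated independently.

```python
import numpy as np

x0, dx = 0.0, 0.25      # regular grid: origin and constant spacing (assumed)
chunk_size = 100        # grid cells per chunk along x (assumed)

targets_x = np.array([3.2, 41.7, 97.05])          # target locations
grid_index = np.floor((targets_x - x0) / dx).astype(int)
chunk_id = grid_index // chunk_size               # which chunk holds each point

for cid in np.unique(chunk_id):
    in_chunk = targets_x[chunk_id == cid]
    # load only chunk `cid` (plus a one-cell halo) and interpolate `in_chunk` there,
    # e.g. as one dask.delayed task per chunk
```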
This comment was marked as outdated.
I'm looking into this and can report back late next week.
#9881 fixes #6799 (comment), so I marked it as "Outdated" to not distract future readers.

For the OP, I'm experimenting with using vectorized indexing and `xr.dot`:

```python
from typing import Hashable

import numpy as np
import xarray as xr
from xarray import DataArray, Variable


def digitize(to, from_, right=True):
    return np.digitize(to, from_, right) - 1


def xr_interp(obj: DataArray | Variable, to: dict[Hashable, Variable]) -> DataArray | Variable:
    from_ = {dim: obj[dim].variable for dim in to}
    weights = []
    indexers = {}
    sum_dims = []
    for dim in to:
        lo_index = digitize(to[dim], from_[dim])
        hi_index = np.minimum(lo_index + 1, obj.sizes[dim] - 1)
        lo_weight = np.abs((to[dim] - from_[dim][lo_index]) / (from_[dim][hi_index] - from_[dim][lo_index]))
        lo_weight[(to[dim] < from_[dim][0]) | (to[dim] > from_[dim][-1])] = np.nan
        hi_weight = 1. - lo_weight
        concat_dim = f"__{dim}__"
        sum_dims.append(concat_dim)
        weights_concat = Variable.concat([hi_weight, lo_weight], dim=concat_dim)
        weights.append(weights_concat)
        indexers[dim] = Variable.concat([Variable(dim, lo_index), Variable(dim, hi_index)], dim=concat_dim)
        # var = hi_weight * var.isel({dim: lo_index}) + lo_weight * var.isel({dim: hi_index})
    result = xr.dot(obj.isel(indexers), *weights, dim=sum_dims, optimize=True)
    result = result.assign_coords(to)
    return result
```

This works nicely for the vectorized interpolation example in the original post, but totally falls apart inside dask for the outer interpolation case: #9907 (code reproduced below; cc @phofl). I noticed that … EDIT: It works a lot better with …
```python
import dask.array
import pandas as pd
import numpy as np
import xarray as xr

arr = xr.DataArray(
    dask.array.random.random((1, 75902, 45910), chunks=(1, "auto", -1)),
    dims=["band", "y", "x"],
    coords={"x": np.linspace(-73.58, -62.11, 45910), "y": np.linspace(-36.08, -55.05, 75902)},
    name="bla",
)
arr2 = xr.DataArray(
    dask.array.random.random((1, 75902, 45910), chunks=(1, "auto", -1)),
    dims=["band", "y", "x"],
    coords={"x": np.linspace(-73.58, -62.11, 45910), "y": np.linspace(-36.08, -55.05, 75902)},
    name="bla",
)

x = arr2.interp(
    x=arr.coords["x"],
    y=arr.coords["y"],
    method="linear",
)
```
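For context on how this "outer" case differs from the OP's: passing plain 1-D coordinates (as above) asks for the full outer product of the new x and y, whereas the OP passes DataArrays sharing a dimension, which samples individual points. A small in-memory illustration with a hypothetical array:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.random((4, 5)),
    dims=["y", "x"],
    coords={"y": np.arange(4.0), "x": np.arange(5.0)},
)

# Outer ("grid to grid") interpolation: result keeps dims (y, x) with the new sizes.
outer = da.interp(x=np.linspace(0, 4, 7), y=np.linspace(0, 3, 6))
assert dict(outer.sizes) == {"y": 6, "x": 7}

# Vectorized ("pointwise") interpolation: targets share a dim, result has only that dim.
pts_x = xr.DataArray([0.5, 2.2, 3.9], dims="z")
pts_y = xr.DataArray([0.1, 1.5, 2.8], dims="z")
pointwise = da.interp(x=pts_x, y=pts_y)
assert dict(pointwise.sizes) == {"z": 3}
```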
What is your issue?
I'm trying to perform 2D interpolation on a large 3D array that is heavily chunked along the interpolation dimensions and not the third dimension. The application could be extracting a timeseries from a reanalysis dataset chunked in space but not time, to compare to observed station data with more precise coordinates.
I use the advanced interpolation method as described in the documentation, with the interpolation coordinates specified by DataArrays that share a dimension, like so:
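A minimal sketch of the pattern being described, with hypothetical names and sizes (`ds`, `xx`, `yy`), not the OP's exact code:

```python
import dask.array
import numpy as np
import xarray as xr

# Reanalysis-style cube: chunked in space, contiguous in time (hypothetical sizes).
ds = xr.Dataset(
    {"temp": (("time", "y", "x"), dask.array.random.random((1000, 500, 500), chunks=(-1, 100, 100)))},
    coords={"x": np.arange(500.0), "y": np.arange(500.0)},
)

# Interpolation targets as DataArrays sharing a "z" dimension -> pointwise sampling.
xx = xr.DataArray(np.random.uniform(0, 499, 10), dims="z")
yy = xr.DataArray(np.random.uniform(0, 499, 10), dims="z")

interped = ds.temp.interp(x=xx, y=yy)  # one time series per target point
```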
With just 10 interpolation points, this example calculation uses about `1.5 * ds.nbytes` of memory, and saturates around `2 * ds.nbytes` by about 100 interpolation points. This could definitely work better, since each interpolated point usually only requires a single chunk of the input dataset, and at most 4 if it sits right on the corner of a chunk. For example, we can instead do it in a loop and get very reasonable memory usage, but this isn't very scalable:
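A sketch of the loop workaround mentioned above, reusing the hypothetical `ds`, `xx`, `yy` from the earlier block (again, not the OP's exact code):

```python
# Interpolate one point at a time and concatenate; memory stays low because each
# step only touches the chunk(s) around that point, but the overhead grows per point.
series = [
    ds.temp.interp(x=float(x_pt), y=float(y_pt)).compute()
    for x_pt, y_pt in zip(xx.values, yy.values)
]
interped_loop = xr.concat(series, dim="z")
```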
I tried adding a `.chunk({'z': 1})` to the interpolation coordinates, but this doesn't help. We can also do `.sel(x=xx, y=yy, method='nearest')` with very good performance.

Any tips to make this calculation work better with existing options, or otherwise ways we might improve the `interp` method to handle this case? Given the performance behavior, I'm guessing we may be doing sequential interpolation over the dimensions, basically an `interp1d` call for all the `xx` points and from there another to the `yy` points, which for even a small number of points would require nearly all chunks to be loaded in. But I haven't explored the code enough yet to understand the details.