Making xarray math lazy #2298
Comments
This sounds interesting.
The main practical difference is that it allows us to reliably guarantee that expressions like [...]
The typical example is spatially referenced imagery, e.g., a 2D satellite photo of the surface of the Earth with 2D latitude/longitude coordinates associated with each point. It would be very expensive to store full latitude and longitude arrays, but fortunately they can usually be computed cheaply from row and column indices. Ideally, this logic would live outside xarray. But it's important enough to some xarray users (especially geoscience + astronomy) and we have enough related functionality (e.g., for lazy and explicit indexing) that it probably makes sense to add it.
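To make the "computed cheaply from row and column indices" point concrete, here is a minimal sketch. It assumes the image is georeferenced by a simple affine transform; the coefficients and the `lonlat_window` helper are hypothetical and are not part of xarray or of this thread.

```python
import numpy as np

# Hypothetical affine coefficients, chosen only for illustration:
#   lon = A + B * col + C * row
#   lat = D + E * col + F * row
A, B, C = -120.0, 0.01, 0.002
D, E, F = 45.0, 0.001, -0.01


def lonlat_window(row_slice, col_slice, shape):
    """Compute 2D lon/lat only for the requested window of the image."""
    # Only 1D index vectors are materialized, never full 2D coordinate arrays.
    rows = np.arange(shape[0])[row_slice][:, np.newaxis]
    cols = np.arange(shape[1])[col_slice][np.newaxis, :]
    lon = A + B * cols + C * rows
    lat = D + E * cols + F * rows
    return lon, lat


# Coordinates for a 2 x 3 window of a 10,000 x 10,000 image.
lon, lat = lonlat_window(slice(0, 2), slice(10, 13), shape=(10_000, 10_000))
```

The full 2D coordinate arrays are never stored here; only the requested window is evaluated, which is the behavior a lazy coordinate array would need to provide.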
Thanks, @shoyer
Agreed.
Yes, I am concerned about this. But in practice, it will likely take a long time to implement the any-array-like support, so it might be a good idea to natively support lazy mathematics for now. Therefore, personally, I'd like to see this lazy math implemented as a lazy array.
This is not a bad idea, but the version of lazy arithmetic that I have been contemplating (see #2302) is not yet complete. For example, it doesn't have any way to represent a lazy aggregation.
Two thoughts:
Indeed, I really like the look of dask/dask#2538 and its implementation in dask/dask#2608. It doesn't solve the indexing optimization yet but that could be pretty straightforward to add -- especially once we add a notion of explicit indexing types (basic vs outer vs vectorized) directly into dask.
Any thoughts on the current status of this?
At SciPy, I had the realization that it would be relatively straightforward to make element-wise math between xarray objects lazy. This would let us support lazy coordinate arrays, a feature that has quite a few use-cases, e.g., for both geoscience and astronomy.
The trick would be to write a lazy array class that holds an element-wise vectorized function and passes indexers on to its arguments. I haven't thought too hard about this yet for vectorized indexing, but it could be quite efficient for outer indexing. I have some prototype code but no tests yet.
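A rough sketch of that idea is below. This is not the prototype code mentioned above: the class name `LazyElementwiseArray` is hypothetical, and it relies on plain NumPy indexing applied to broadcast views, so it does not address the outer vs. vectorized indexing distinction.

```python
import numpy as np


class LazyElementwiseArray:
    """Hypothetical sketch: defer an element-wise function until indexing."""

    def __init__(self, func, *args):
        self.func = func
        # Broadcast the arguments to a common shape as views (no data copied).
        self.args = np.broadcast_arrays(*args)
        self.shape = self.args[0].shape

    def __getitem__(self, key):
        # Pass the indexer on to every argument, then apply the function
        # only to the selected elements.
        return self.func(*(arg[key] for arg in self.args))

    def __array__(self, dtype=None, copy=None):
        # Converting to a NumPy array evaluates the function on everything.
        return np.asarray(self.func(*self.args), dtype=dtype)


x = np.arange(12.0).reshape(3, 4)
y = np.arange(4.0)
lazy_sum = LazyElementwiseArray(np.add, x, y)

# Only the 2 x 2 corner is actually added.
subset = lazy_sum[1:, :2]
```

Because the function is element-wise, indexing the arguments first and applying the function afterwards gives the same values as computing the full result and then indexing it.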
The question is how to hook this into xarray operations. In particular, supposing that the inputs to a function do not hold dask arrays:
- [...] `+` [...] with separate logic from `apply_ufunc`.
- [...] `apply_ufunc()` lazy by default?
- [...] `apply_ufunc()` if you use some special flag, e.g., `apply_ufunc(..., lazy=True)`?
I am leaning towards the last option for now but would welcome other opinions.