Description
TLDR: I think we want A.sel(x=A.x).equals(A) to pass lazily. It doesn't do so currently.
Currently, A.sel(x=A.x) inserts a getitem call into the dask graph, which breaks our lazy array equality optimization. Here's an example:
>>> import numpy as np
>>> import xarray as xr
>>> A = xr.DataArray(np.arange(100), dims="x", coords={"x": np.arange(100)}).chunk({"x": 1})
>>> A.sel(x=A.x).variable.equals(A, equiv=xr.core.duck_array_ops.lazy_array_equiv)
None

Questions:
- Where is the best place to do this? In sel or isel? Both?

  Sticking the following in sel makes the above check return True, which is what we want:

      if self._indexes:
          equals = []
          for index in indexers:
              equals.append(indexers[index].to_index().equals(self._indexes[index]))
          if all(equals):
              return self

  This doesn't handle slice objects, though, which makes me think we'd want to add something similar to isel too.

- What is the behaviour we want? A.sel(x=A.x).equals(A) or A.sel(x=A.x) is A?

  Doing the latter will mean changing _to_temp_dataset and _from_temp_dataset, which suggests the constraint A._from_temp_dataset(A._to_temp_dataset()) is A? But this seems too strong to me. Do we only want to lazily satisfy an equals constraint rather than an identical constraint?

- It seems like we'll want to add such short-circuits in many places (I have not checked all of these): sortby, broadcast, align, reindex (transpose does this now).
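
To make the slice case concrete, here is a minimal toy sketch of the short-circuit idea in plain Python. It is not xarray or dask code: LazyArray, lazy_equiv, is_noop_indexer and the isel function below are all hypothetical stand-ins, and the only assumption carried over from the discussion above is that a lazy array is identified by a graph name that changes whenever an indexing step is added.

```python
from dataclasses import dataclass
from itertools import count

_ids = count()


@dataclass(frozen=True)
class LazyArray:
    """Toy stand-in for a lazy array, identified only by its graph name."""

    name: str
    size: int

    def take(self, positions):
        # Every indexing step adds a graph layer, hence a fresh name.
        return LazyArray(f"getitem-{next(_ids)}", len(positions))


def lazy_equiv(a, b):
    """True/False when cheap to decide, None when data would have to be computed."""
    if a is b or a.name == b.name:
        return True
    if a.size != b.size:
        return False
    return None


def is_noop_indexer(indexer, size):
    """True if the positional indexer selects every element, in order."""
    if isinstance(indexer, slice):
        return indexer.indices(size) == (0, size, 1)
    return list(indexer) == list(range(size))


def isel(arr, indexer):
    """Positional selection that skips the graph entirely for no-op indexers."""
    if is_noop_indexer(indexer, arr.size):
        return arr
    if isinstance(indexer, slice):
        positions = range(*indexer.indices(arr.size))
    else:
        positions = list(indexer)
    return arr.take(positions)


A = LazyArray("arange-0", 100)

naive = A.take(range(100))              # no short-circuit: new graph name
print(lazy_equiv(naive, A))             # None

print(lazy_equiv(isel(A, slice(None)), A))    # True
print(lazy_equiv(isel(A, range(100)), A))     # True
print(lazy_equiv(isel(A, slice(0, 50)), A))   # False (sizes differ)
```

Note that this toy returns the original object for a no-op indexer, which corresponds to the stronger "is A" behaviour; whether that or only a lazy equals is wanted is exactly the open question above.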