Skip to content

Generalize lazy backend indexing a little more #10078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 28, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions doc/whats-new.rst
Original file line number Diff line number Diff line change
@@ -30,6 +30,8 @@ New Features
By `Justus Magin <https://github.com/keewis>`_.
- Added experimental support for coordinate transforms (not ready for public use yet!) (:pull:`9543`)
By `Benoit Bovy <https://github.com/benbovy>`_.
- Support reading to `GPU memory with Zarr <https://zarr.readthedocs.io/en/stable/user-guide/gpu.html>`_ (:pull:`10078`).
By `Deepak Cherian <https://github.com/dcherian>`_.

Breaking changes
~~~~~~~~~~~~~~~~
3 changes: 2 additions & 1 deletion xarray/core/indexes.py
Original file line number Diff line number Diff line change
@@ -442,6 +442,7 @@ def safe_cast_to_index(array: Any) -> pd.Index:
"""
from xarray.core.dataarray import DataArray
from xarray.core.variable import Variable
from xarray.namedarray.pycompat import to_numpy

if isinstance(array, pd.Index):
index = array
@@ -468,7 +469,7 @@ def safe_cast_to_index(array: Any) -> pd.Index:
)
kwargs["dtype"] = "float64"

index = pd.Index(np.asarray(array), **kwargs)
index = pd.Index(to_numpy(array), **kwargs)

return _maybe_cast_to_cftimeindex(index)

4 changes: 2 additions & 2 deletions xarray/core/indexing.py
Original file line number Diff line number Diff line change
@@ -1013,8 +1013,8 @@ def explicit_indexing_adapter(
raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
result = raw_indexing_method(raw_key.tuple)
if numpy_indices.tuple:
# index the loaded np.ndarray
indexable = NumpyIndexingAdapter(result)
# index the loaded duck array
indexable = as_indexable(result)
result = apply_indexer(indexable, numpy_indices)
return result

13 changes: 13 additions & 0 deletions xarray/tests/test_indexing.py
Original file line number Diff line number Diff line change
@@ -19,6 +19,7 @@
raise_if_dask_computes,
requires_dask,
)
from xarray.tests.arrays import DuckArrayWrapper

B = IndexerMaker(indexing.BasicIndexer)

@@ -1052,3 +1053,15 @@ def test_advanced_indexing_dask_array() -> None:
with raise_if_dask_computes():
actual = ds.b.sel(x=ds.a.data)
assert_identical(expected, actual)


def test_backend_indexing_non_numpy() -> None:
"""This model indexing of a Zarr store that reads to GPU memory."""
array = DuckArrayWrapper(np.array([1, 2, 3]))
indexed = indexing.explicit_indexing_adapter(
indexing.BasicIndexer((slice(1),)),
shape=array.shape,
indexing_support=indexing.IndexingSupport.BASIC,
raw_indexing_method=array.__getitem__,
)
np.testing.assert_array_equal(indexed.array, np.array([1]))

Unchanged files with check annotations Beta

attrs: _AttrsLike = None,
):
self._data = data
self._dims = self._parse_dimensions(dims)

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.10 bare-minimum

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.10 bare-minimum

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / macos-latest py3.10

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / macos-latest py3.10

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / macos-latest py3.13

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / macos-latest py3.13

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.10

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.10

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.13

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.

Check warning on line 264 in xarray/namedarray/core.py

GitHub Actions / ubuntu-latest py3.13

Duplicate dimension names present: dimensions {'x'} appear more than once in dims=('x', 'x'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.
self._attrs = dict(attrs) if attrs else None
def __init_subclass__(cls, **kwargs: Any) -> None:
xp = get_array_namespace(data)
if xp == np:
# numpy currently doesn't have a astype:
return data.astype(dtype, **kwargs)

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13 all-but-numba

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13 all-but-numba

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / macos-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / ubuntu-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / windows-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / windows-latest py3.10

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / windows-latest py3.13

invalid value encountered in cast

Check warning on line 232 in xarray/core/duck_array_ops.py

GitHub Actions / windows-latest py3.13

invalid value encountered in cast
return xp.astype(data, dtype, **kwargs)
return data.astype(dtype, **kwargs)
kwargs = {"decode_vlen_strings": True}
h5 = h5netcdf.File(tmp_file, mode="r", **kwargs)

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / macos-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / macos-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / ubuntu-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / ubuntu-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Check warning on line 4188 in xarray/tests/test_backends.py

GitHub Actions / windows-latest py3.10

The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
store = backends.H5NetCDFStore(h5["g"])
with open_dataset(store) as ds:
expected = Dataset({"x": ((), 42)})
# otherwise numpy unsigned ints will silently cast to the signed counterpart
fill_value = fill_value.item()
# passes if provided fill value fits in encoded on-disk type
new_fill = encoded_dtype.type(fill_value)

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).
except OverflowError:
encoded_kind_str = "signed" if encoded_dtype.kind == "i" else "unsigned"
warnings.warn(