Move parallelcompat and chunkmanagers to NamedArray #8319

Merged
merged 72 commits into from
Feb 12, 2024
Changes from 66 commits
70ac10a
move files across
TomNicholas Oct 16, 2023
6b9cd7e
redirect imports from parallelcompat module
TomNicholas Oct 16, 2023
bb89c7e
fix imports
TomNicholas Oct 16, 2023
f3af5b5
move compute/load
TomNicholas Oct 16, 2023
4b4afab
redirect utils imports
TomNicholas Oct 16, 2023
cb8a346
entrypoint should point to namedarray
TomNicholas Oct 16, 2023
dd1e7e3
Merge branch 'main' into namedarray-parallelcompat
TomNicholas Oct 16, 2023
a3b79aa
is_dict_like import
TomNicholas Oct 17, 2023
c554695
fix import for consolidate_dask_from_array_kwargs
TomNicholas Oct 17, 2023
412cbc1
fix test
TomNicholas Oct 17, 2023
8214865
Merge branch 'main' into namedarray-parallelcompat
TomNicholas Oct 17, 2023
74ee3f0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 17, 2023
8e4eb93
Merge branch 'main' into namedarray-parallelcompat
TomNicholas Oct 18, 2023
fe69aca
move is_duck_array
TomNicholas Oct 18, 2023
ba0df3f
passing but with load/compute on variable
TomNicholas Oct 18, 2023
664b4fb
Merge branch 'main' into namedarray-parallelcompat
TomNicholas Oct 24, 2023
1d7052f
Merge branch 'main' into namedarray-parallelcompat
andersy005 Oct 24, 2023
4723f83
fix import
andersy005 Oct 24, 2023
0667b1e
more typing
andersy005 Oct 25, 2023
fd34a7d
Merge branch 'main' into namedarray-parallelcompat
andersy005 Oct 25, 2023
d04fe49
update utils
andersy005 Oct 25, 2023
466bb22
fix imports
andersy005 Oct 25, 2023
f26e259
more imports fixes
andersy005 Oct 25, 2023
8c6e896
replace is_duck_array with _arrayfunction_or_api instead
andersy005 Oct 26, 2023
ed4698c
more typing updates
andersy005 Oct 26, 2023
b6d7756
Merge branch 'main' into namedarray-parallelcompat
andersy005 Oct 26, 2023
309cd4d
more typing
andersy005 Oct 26, 2023
ebf0252
Merge branch 'main' into namedarray-parallelcompat
andersy005 Oct 26, 2023
6aa6b80
fix imports
andersy005 Oct 26, 2023
8a87810
revert get_array_namespace
andersy005 Oct 26, 2023
53906e4
Merge branch 'main' into namedarray-parallelcompat
andersy005 Jan 30, 2024
78dec61
Update module imports
andersy005 Jan 30, 2024
5ce08f0
Fix import statements
andersy005 Jan 30, 2024
905bbef
Merge branch 'main' into namedarray-parallelcompat
andersy005 Jan 30, 2024
ad21f96
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 1, 2024
b3b5a62
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 1, 2024
541049f
Use is_dask_collection function instead of dask.typing.DaskCollection
andersy005 Feb 2, 2024
01c3d24
revert to using is_duck_array: https://github.com/pydata/xarray/issue…
andersy005 Feb 2, 2024
cda1b26
Fix import error in _typing.py
andersy005 Feb 2, 2024
57092ec
Update typing imports and add compatibility for Python 3.11
andersy005 Feb 2, 2024
f84b3ec
Add support for TypeAlias in Python 3.11 and fallback to typing_exten…
andersy005 Feb 2, 2024
f710504
fix typing issues
andersy005 Feb 4, 2024
4b229fe
fix typing
andersy005 Feb 4, 2024
6b16862
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 7, 2024
a34d7a7
Fix type annotations and ignore type errors
andersy005 Feb 7, 2024
128f874
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 7, 2024
caf0599
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 7, 2024
a1d3883
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 9, 2024
4bb66e1
use to_duck_array function to xarray.namedarray.pycompat
andersy005 Feb 9, 2024
ec7821f
Fix type annotations
andersy005 Feb 9, 2024
3459f60
Refactor code to simplify data loading in Variable.compute() and Name…
andersy005 Feb 9, 2024
02382bd
Add to_numpy() function to convert data to numpy array
andersy005 Feb 9, 2024
d3e5e83
move DaskManager import
andersy005 Feb 9, 2024
0829d78
Fix type annotations in DaskManager class
andersy005 Feb 9, 2024
521b319
Update to_numpy and to_duck_array functions
andersy005 Feb 9, 2024
42a63db
Refactor dask-specific kwargs handling in .chunk() method
andersy005 Feb 9, 2024
564ddcf
fix type annotations
andersy005 Feb 9, 2024
e61ed52
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 9, 2024
5f9bcfc
Fix type annotations in DaskManager
andersy005 Feb 9, 2024
9f0ad6e
Update DaskManager's from_array method signature
andersy005 Feb 9, 2024
a7016ee
Update compute method return type
andersy005 Feb 9, 2024
be7109c
Refactor DaskManager's unify_chunks method***
andersy005 Feb 9, 2024
6ac37b3
Fix return value in DaskManager
andersy005 Feb 9, 2024
42210a8
Fix imports and use is_chunked_array instead of _chunkedarrayfunction…
andersy005 Feb 9, 2024
5c78d49
use is_duck_array
andersy005 Feb 9, 2024
30487ff
Fix incorrect variable name in assert_duckarray_equal function
andersy005 Feb 9, 2024
89995b1
Merge branch 'main' into namedarray-parallelcompat
andersy005 Feb 12, 2024
76b7e84
try preserving import structure
andersy005 Feb 12, 2024
f38659d
formatting only
andersy005 Feb 12, 2024
df4920b
more imports restructure
andersy005 Feb 12, 2024
dd442c1
more formatting
andersy005 Feb 12, 2024
ea7feef
update what's new
andersy005 Feb 12, 2024
18 changes: 9 additions & 9 deletions doc/internals/chunked-arrays.rst
Expand Up @@ -35,44 +35,44 @@ The implementation of these functions is specific to the type of arrays passed t
whereas :py:class:`cubed.Array` objects must be processed by :py:func:`cubed.map_blocks`.

In order to use the correct implementation of a core operation for the array type encountered, xarray dispatches to the
corresponding subclass of :py:class:`~xarray.core.parallelcompat.ChunkManagerEntrypoint`,
corresponding subclass of :py:class:`~xarray.namedarray.parallelcompat.ChunkManagerEntrypoint`,
also known as a "Chunk Manager". Therefore **a full list of the operations that need to be defined is set by the
API of the** :py:class:`~xarray.core.parallelcompat.ChunkManagerEntrypoint` **abstract base class**. Note that chunked array
API of the** :py:class:`~xarray.namedarray.parallelcompat.ChunkManagerEntrypoint` **abstract base class**. Note that chunked array
methods are also currently dispatched using this class.

Chunked array creation is also handled by this class. As chunked array objects have a one-to-one correspondence with
in-memory numpy arrays, it should be possible to create a chunked array from a numpy array by passing the desired
chunking pattern to an implementation of :py:meth:`~xarray.core.parallelcompat.ChunkManagerEntrypoint.from_array`.
chunking pattern to an implementation of :py:meth:`~xarray.namedarray.parallelcompat.ChunkManagerEntrypoint.from_array`.

.. note::

The :py:class:`~xarray.core.parallelcompat.ChunkManagerEntrypoint` abstract base class is mostly just acting as a
The :py:class:`~xarray.namedarray.parallelcompat.ChunkManagerEntrypoint` abstract base class is mostly just acting as a
namespace for containing the chunked-aware function primitives. Ideally in the future we would have an API standard
for chunked array types which codified this structure, making the entrypoint system unnecessary.

.. currentmodule:: xarray.core.parallelcompat
.. currentmodule:: xarray.namedarray.parallelcompat

.. autoclass:: xarray.core.parallelcompat.ChunkManagerEntrypoint
.. autoclass:: xarray.namedarray.parallelcompat.ChunkManagerEntrypoint
:members:
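
The dispatch idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ``ToyChunkManager``, ``ListChunks``, ``ListChunkManager`` and ``get_manager_for`` are hypothetical names invented for illustration, and the class is a heavily simplified stand-in, not xarray's actual ``ChunkManagerEntrypoint`` API.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class ToyChunkManager(ABC):
    """Simplified stand-in for a chunk manager (hypothetical, NOT xarray's real ABC)."""

    # Each subclass declares which array type it knows how to handle.
    array_cls: type

    def is_chunked_array(self, data: Any) -> bool:
        return isinstance(data, self.array_cls)

    @abstractmethod
    def from_array(self, data: list, chunks: int) -> Any:
        """Build a chunked array from in-memory data and a chunk size."""

    @abstractmethod
    def map_blocks(self, func: Callable, data: Any) -> Any:
        """Apply ``func`` to every block independently."""


class ListChunks(list):
    """Hypothetical chunked array: a list of blocks, each block a plain list."""


class ListChunkManager(ToyChunkManager):
    array_cls = ListChunks

    def from_array(self, data: list, chunks: int) -> ListChunks:
        return ListChunks(data[i : i + chunks] for i in range(0, len(data), chunks))

    def map_blocks(self, func: Callable, data: ListChunks) -> ListChunks:
        return ListChunks(func(block) for block in data)


# Stand-in for the registry that an entrypoint system would populate.
MANAGERS = [ListChunkManager()]


def get_manager_for(data: Any) -> ToyChunkManager:
    """Pick the manager whose array type matches ``data``."""
    for manager in MANAGERS:
        if manager.is_chunked_array(data):
            return manager
    raise TypeError(f"no chunk manager registered for {type(data).__name__}")


chunked = ListChunkManager().from_array([1, 2, 3, 4, 5], chunks=2)
doubled = get_manager_for(chunked).map_blocks(lambda block: [2 * x for x in block], chunked)
```

The calling code never names a concrete array library; it only asks for the manager that matches the array it was given, which is the property the abstract base class is meant to provide.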

Registering a new ChunkManagerEntrypoint subclass
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rather than hard-coding various chunk managers to deal with specific chunked array implementations, xarray uses an
entrypoint system to allow developers of new chunked array implementations to register their corresponding subclass of
:py:class:`~xarray.core.parallelcompat.ChunkManagerEntrypoint`.
:py:class:`~xarray.namedarray.parallelcompat.ChunkManagerEntrypoint`.


To register a new entrypoint you need to add an entry to the ``setup.cfg`` like this::

[options.entry_points]
xarray.chunkmanagers =
dask = xarray.core.daskmanager:DaskManager
dask = xarray.namedarray.daskmanager:DaskManager

See also `cubed-xarray <https://github.com/xarray-contrib/cubed-xarray>`_ for another example.

To check that the entrypoint has worked correctly, you may find it useful to display the available chunkmanagers using
the internal function :py:func:`~xarray.core.parallelcompat.list_chunkmanagers`.
the internal function :py:func:`~xarray.namedarray.parallelcompat.list_chunkmanagers`.

.. autofunction:: list_chunkmanagers

2 changes: 1 addition & 1 deletion pyproject.toml
Expand Up @@ -54,7 +54,7 @@ issue-tracker = "https://github.com/pydata/xarray/issues"
source-code = "https://github.com/pydata/xarray"

[project.entry-points."xarray.chunkmanagers"]
dask = "xarray.core.daskmanager:DaskManager"
dask = "xarray.namedarray.daskmanager:DaskManager"

[build-system]
build-backend = "setuptools.build_meta"
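
To see what the changed entry-point string means at runtime, the value can be parsed with the standard library. This is a sketch for illustration only: it builds ``EntryPoint`` objects by hand from the two strings in the diff and does not load anything, so xarray does not need to be installed.

```python
from importlib.metadata import EntryPoint

GROUP = "xarray.chunkmanagers"

# Entry-point values copied from the old and new lines of the diff.
old = EntryPoint(name="dask", value="xarray.core.daskmanager:DaskManager", group=GROUP)
new = EntryPoint(name="dask", value="xarray.namedarray.daskmanager:DaskManager", group=GROUP)

# ``module`` is the part before the colon, ``attr`` the part after it.
print(old.module, "->", new.module)
print(new.attr)
```

Only the module path changes; the entry-point name (``dask``) and the class name stay the same, so lookups by name are unaffected by the move.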
4 changes: 2 additions & 2 deletions xarray/backends/api.py
Expand Up @@ -34,13 +34,13 @@
_nested_combine,
combine_by_coords,
)
from xarray.core.daskmanager import DaskManager
from xarray.core.dataarray import DataArray
from xarray.core.dataset import Dataset, _get_chunk, _maybe_chunk
from xarray.core.indexes import Index
from xarray.core.parallelcompat import guess_chunkmanager
from xarray.core.types import ZarrWriteModes
from xarray.core.utils import is_remote_uri
from xarray.namedarray.daskmanager import DaskManager
from xarray.namedarray.parallelcompat import guess_chunkmanager

if TYPE_CHECKING:
try:
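
Downstream code that imported these names from ``xarray.core`` may want to work across both layouts. One possible approach is sketched below, with the caveat that ``import_first`` and ``load_guess_chunkmanager`` are hypothetical helpers and that these modules are internal to xarray, so their paths may change again.

```python
from importlib import import_module
from typing import Any, Sequence


def import_first(candidates: Sequence[str], attr: str) -> Any:
    """Return ``attr`` from the first module in ``candidates`` that provides it."""
    for module_name in candidates:
        try:
            module = import_module(module_name)
        except ImportError:
            # Module (or one of its parent packages) is missing; try the next one.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"none of {list(candidates)} provides {attr!r}")


def load_guess_chunkmanager() -> Any:
    """Look up ``guess_chunkmanager`` in its new location, then the old one."""
    return import_first(
        ["xarray.namedarray.parallelcompat", "xarray.core.parallelcompat"],
        "guess_chunkmanager",
    )
```

Trying the new path first keeps the common case cheap on recent xarray versions, while the fallback covers releases from before the move.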
4 changes: 2 additions & 2 deletions xarray/backends/common.py
Expand Up @@ -12,9 +12,9 @@

from xarray.conventions import cf_encoder
from xarray.core import indexing
from xarray.core.parallelcompat import get_chunked_array_type
from xarray.core.pycompat import is_chunked_array
from xarray.core.utils import FrozenDict, NdimSizeLenMixin, is_remote_uri
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array

if TYPE_CHECKING:
from io import BufferedIOBase
2 changes: 1 addition & 1 deletion xarray/backends/plugins.py
Expand Up @@ -9,7 +9,7 @@
from typing import TYPE_CHECKING, Any, Callable

from xarray.backends.common import BACKEND_ENTRYPOINTS, BackendEntrypoint
from xarray.core.utils import module_available
from xarray.namedarray.utils import module_available

if TYPE_CHECKING:
import os
4 changes: 2 additions & 2 deletions xarray/backends/pydap_.py
Expand Up @@ -14,15 +14,15 @@
)
from xarray.backends.store import StoreBackendEntrypoint
from xarray.core import indexing
from xarray.core.pycompat import integer_types
from xarray.core.utils import (
Frozen,
FrozenDict,
close_on_error,
is_dict_like,
is_remote_uri,
)
from xarray.core.variable import Variable
from xarray.namedarray.pycompat import integer_types
from xarray.namedarray.utils import is_dict_like

if TYPE_CHECKING:
import os
4 changes: 2 additions & 2 deletions xarray/backends/zarr.py
Expand Up @@ -19,15 +19,15 @@
)
from xarray.backends.store import StoreBackendEntrypoint
from xarray.core import indexing
from xarray.core.parallelcompat import guess_chunkmanager
from xarray.core.pycompat import integer_types
from xarray.core.types import ZarrWriteModes
from xarray.core.utils import (
FrozenDict,
HiddenKeyDict,
close_on_error,
)
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import guess_chunkmanager
from xarray.namedarray.pycompat import integer_types

if TYPE_CHECKING:
from io import BufferedIOBase
3 changes: 2 additions & 1 deletion xarray/coding/strings.py
Expand Up @@ -15,8 +15,9 @@
unpack_for_encoding,
)
from xarray.core import indexing
from xarray.core.parallelcompat import get_chunked_array_type, is_chunked_array
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array


def create_vlen_dtype(element_type):
5 changes: 3 additions & 2 deletions xarray/coding/times.py
Expand Up @@ -24,11 +24,12 @@
from xarray.core.common import contains_cftime_datetimes, is_np_datetime_like
from xarray.core.duck_array_ops import asarray
from xarray.core.formatting import first_n_items, format_timestamp, last_item
from xarray.core.parallelcompat import T_ChunkedArray, get_chunked_array_type
from xarray.core.pdcompat import nanosecond_precision_timestamp
from xarray.core.pycompat import is_chunked_array, is_duck_dask_array
from xarray.core.utils import emit_user_level_warning
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import T_ChunkedArray, get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.utils import is_duck_dask_array

try:
import cftime
6 changes: 3 additions & 3 deletions xarray/coding/variables.py
Expand Up @@ -11,9 +11,9 @@
import pandas as pd

from xarray.core import dtypes, duck_array_ops, indexing
from xarray.core.parallelcompat import get_chunked_array_type
from xarray.core.pycompat import is_chunked_array
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array

if TYPE_CHECKING:
T_VarTuple = tuple[tuple[Hashable, ...], Any, dict, dict]
Expand Down Expand Up @@ -163,7 +163,7 @@ def lazy_elemwise_func(array, func: Callable, dtype: np.typing.DTypeLike):
if is_chunked_array(array):
chunkmanager = get_chunked_array_type(array)

return chunkmanager.map_blocks(func, array, dtype=dtype)
return chunkmanager.map_blocks(func, array, dtype=dtype) # type: ignore[arg-type]
else:
return _ElementwiseFunctionArray(array, func, dtype)

2 changes: 1 addition & 1 deletion xarray/conventions.py
Expand Up @@ -14,9 +14,9 @@
_contains_datetime_like_objects,
contains_cftime_datetimes,
)
from xarray.core.pycompat import is_duck_dask_array
from xarray.core.utils import emit_user_level_warning
from xarray.core.variable import IndexVariable, Variable
from xarray.namedarray.utils import is_duck_dask_array

CF_RELATED_DATA = (
"bounds",
2 changes: 1 addition & 1 deletion xarray/convert.py
Expand Up @@ -10,7 +10,7 @@
from xarray.core import duck_array_ops
from xarray.core.dataarray import DataArray
from xarray.core.dtypes import get_fill_value
from xarray.core.pycompat import array_type
from xarray.namedarray.pycompat import array_type

iris_forbidden_keys = {
"standard_name",
3 changes: 2 additions & 1 deletion xarray/core/_aggregations.py
Expand Up @@ -10,7 +10,8 @@
from xarray.core import duck_array_ops
from xarray.core.options import OPTIONS
from xarray.core.types import Dims, Self
from xarray.core.utils import contains_only_chunked_or_numpy, module_available
from xarray.core.utils import contains_only_chunked_or_numpy
from xarray.namedarray.utils import module_available

if TYPE_CHECKING:
from xarray.core.dataarray import DataArray
2 changes: 1 addition & 1 deletion xarray/core/accessor_dt.py
Expand Up @@ -13,9 +13,9 @@
is_np_datetime_like,
is_np_timedelta_like,
)
from xarray.core.pycompat import is_duck_dask_array
from xarray.core.types import T_DataArray
from xarray.core.variable import IndexVariable
from xarray.namedarray.utils import is_duck_dask_array

if TYPE_CHECKING:
from numpy.typing import DTypeLike
3 changes: 2 additions & 1 deletion xarray/core/alignment.py
Expand Up @@ -20,8 +20,9 @@
safe_cast_to_index,
)
from xarray.core.types import T_Alignable
from xarray.core.utils import is_dict_like, is_full_slice
from xarray.core.utils import is_full_slice
from xarray.core.variable import Variable, as_compatible_data, calculate_dimensions
from xarray.namedarray.utils import is_dict_like

if TYPE_CHECKING:
from xarray.core.dataarray import DataArray
7 changes: 2 additions & 5 deletions xarray/core/arithmetic.py
Expand Up @@ -15,12 +15,9 @@
VariableOpsMixin,
)
from xarray.core.common import ImplementsArrayReduce, ImplementsDatasetReduce
from xarray.core.ops import (
IncludeNumpySameMethods,
IncludeReduceMethods,
)
from xarray.core.ops import IncludeNumpySameMethods, IncludeReduceMethods
from xarray.core.options import OPTIONS, _get_keep_attrs
from xarray.core.pycompat import is_duck_array
from xarray.namedarray.utils import is_duck_array


class SupportsArithmetic:
6 changes: 3 additions & 3 deletions xarray/core/common.py
Expand Up @@ -13,15 +13,15 @@
from xarray.core import dtypes, duck_array_ops, formatting, formatting_html, ops
from xarray.core.indexing import BasicIndexer, ExplicitlyIndexed
from xarray.core.options import OPTIONS, _get_keep_attrs
from xarray.core.parallelcompat import get_chunked_array_type, guess_chunkmanager
from xarray.core.pycompat import is_chunked_array
from xarray.core.utils import (
Frozen,
either_dict_or_kwargs,
emit_user_level_warning,
is_scalar,
)
from xarray.namedarray.core import _raise_if_any_duplicate_dimensions
from xarray.namedarray.parallelcompat import get_chunked_array_type, guess_chunkmanager
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.utils import either_dict_or_kwargs

try:
import cftime
7 changes: 4 additions & 3 deletions xarray/core/computation.py
Expand Up @@ -22,11 +22,12 @@
from xarray.core.indexes import Index, filter_indexes_from_coords
from xarray.core.merge import merge_attrs, merge_coordinates_without_align
from xarray.core.options import OPTIONS, _get_keep_attrs
from xarray.core.parallelcompat import get_chunked_array_type
from xarray.core.pycompat import is_chunked_array, is_duck_dask_array
from xarray.core.types import Dims, T_DataArray
from xarray.core.utils import is_dict_like, is_scalar, parse_dims
from xarray.core.utils import is_scalar, parse_dims
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.utils import is_dict_like, is_duck_dask_array
from xarray.util.deprecation_helpers import deprecate_dims

if TYPE_CHECKING:
2 changes: 1 addition & 1 deletion xarray/core/coordinates.py
Expand Up @@ -27,10 +27,10 @@
from xarray.core.utils import (
Frozen,
ReprObject,
either_dict_or_kwargs,
emit_user_level_warning,
)
from xarray.core.variable import Variable, as_variable, calculate_dimensions
from xarray.namedarray.utils import either_dict_or_kwargs

if TYPE_CHECKING:
from xarray.core.common import DataWithCoords