forked from pydata/xarray
[pull] main from pydata:main #582
Merged
Conversation
* Reimplement DataTree aggregations

  They now allow for dimensions that are missing on particular nodes, and use Xarray's standard generate_aggregations machinery, like aggregations for DataArray and Dataset. Fixes #8949, #8963

* add API docs on DataTree aggregations
* remove incorrectly added sel methods
* fix docstring reprs
* mypy fix
* fix self import
* remove unimplemented agg methods
* replace dim_arg_to_dims_set with parse_dims
* add parse_dims_as_set
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* fix mypy errors
* change tests to match slightly different error now thrown

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: TomNicholas <tom@cworthy.org>
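To make the behaviour described above concrete, here is a minimal sketch (not taken from this PR; the tree contents are invented) of aggregating a DataTree whose nodes do not all share the reduced dimension:

```python
import numpy as np
import xarray as xr

# Two children with different dimensions; "time" exists only on /b.
tree = xr.DataTree.from_dict(
    {
        "/a": xr.Dataset({"foo": ("x", np.arange(3.0))}),
        "/b": xr.Dataset({"bar": ("time", np.arange(4.0))}),
    }
)

# With the reimplemented aggregations this no longer raises: nodes that
# lack the "time" dimension are expected to pass through unchanged.
print(tree.mean(dim="time"))
```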
…ee function (#9614) * updating group type annotation for netcdf, hdf5, and zarr open_datatree function * supporting only in group type annotation for netcdf, hdf5, and zarr open_datatree function * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Rename inherited -> inherit in DataTree.to_dataset * fixed one missed instance of kwarg from #9602 --------- Co-authored-by: Tom Nicholas <tom@cworthy.org>
* remove too-long underline
* draft section on data alignment
* fixes
* draft section on coordinate inheritance
* various improvements
* more improvements
* link from other page
* align call include all 3 datasets
* link back to use cases
* clarification
* small improvements
* remove TODO after #9532
* add todo about #9475
* correct xr.align example call
* add links to netCDF4 documentation
* Consistent voice

  Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>

* keep indexes in lat lon selection to dodge #9475
* unpack generator properly

  Co-authored-by: Stephan Hoyer <shoyer@google.com>

* ideas for next section
* briefly summarize what alignment means
* clarify that it's the data in each node that was previously unrelated
* fix incorrect indentation of code block
* display the tree with redundant coordinates again
* remove content about non-inherited coords for a follow-up PR
* remove todo
* remove todo now that aggregations are re-implemented
* remove link to (unmerged) migration guide
* remove todo about improving error message
* correct statement in data-structures docs
* fix internal link

---------

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
Co-authored-by: Stephan Hoyer <shoyer@google.com>
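The coordinate inheritance these docs describe can be summarised with a small example; the tree below is purely illustrative and is not taken from the documentation pages themselves:

```python
import numpy as np
import xarray as xr

# The parent defines a "time" coordinate; the child only carries data along
# that dimension, so it inherits (and must align with) the parent coordinate.
tree = xr.DataTree.from_dict(
    {
        "/": xr.Dataset(coords={"time": np.arange(4)}),
        "/child": xr.Dataset({"signal": ("time", np.random.rand(4))}),
    }
)

print(tree["child"].dataset.coords)  # shows the inherited "time" coordinate
```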
* test unary op
* implement and generate unary ops
* test for unary op with inherited coordinates
* re-enable arithmetic tests
* implementation for binary ops
* test ds * dt commutativity
* ensure other types defer to DataTree, thus fixing #9365
* test for inplace binary op
* pseudocode implementation of inplace binary op, and xfail test
* remove some unneeded type: ignore comments
* return type should be DataTree
* type datatree ops as accepting dataset-compatible types too
* use same type hinting hack as Dataset does for __eq__ not being same as Mapping
* ignore return type
* add some methods to api docs
* don't try to import DataTree.astype in API docs
* test to check that single-node trees aren't broadcast
* return NotImplemented
* remove pseudocode for inplace binary ops
* map_over_subtree -> map_over_datasets
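A rough sketch of the ds * dt commutativity these ops are tested for (assumed behaviour inferred from the bullets above, not the PR's own test code): when a Dataset appears on the left, its operator returns NotImplemented and the DataTree takes over.

```python
import xarray as xr

tree = xr.DataTree.from_dict({"/a": xr.Dataset({"v": ("x", [1.0, 2.0])})})
ds = xr.Dataset({"v": ("x", [10.0, 20.0])})

left = tree * ds   # DataTree.__mul__ maps the Dataset over every node
right = ds * tree  # Dataset defers, so DataTree handles this side too
assert left.identical(right)
```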
* sketch of migration guide
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* whatsnew
* add date
* spell out API changes in more detail
* details on backends integration
* explain alignment and open_groups
* explain coordinate inheritance
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* re-trigger CI
* remove bullet about map_over_subtree
* Markdown formatting for important warning block

  Co-authored-by: Matt Savoie <github@flamingbear.com>

* Reorder changes in order of importance

  Co-authored-by: Matt Savoie <github@flamingbear.com>

* Clearer wording on setting relationships

  Co-authored-by: Matt Savoie <github@flamingbear.com>

* remove "technically"

  Co-authored-by: Matt Savoie <github@flamingbear.com>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matt Savoie <github@flamingbear.com>
* Add inherit=False option to DataTree.copy() This PR adds an inherit=False option to DataTree.copy, so users can decide whether they want to inherit coordinates from parents when creating a subtree. The default behavior is `inherit=True`, which is a breaking change from the current behavior where parent coordinates are dropped (which I believe should be considered a bug). * fix typing * add migration guide note * ignore typing error
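A hedged usage sketch of the new keyword (the tree here is made up; only copy(inherit=...) comes from the commit message):

```python
import xarray as xr

tree = xr.DataTree.from_dict(
    {
        "/": xr.Dataset(coords={"lat": [10.0, 20.0]}),
        "/child": xr.Dataset({"t": ("lat", [1.0, 2.0])}),
    }
)

keeps_parent_coords = tree["child"].copy()              # default inherit=True: "lat" comes along
standalone_subtree = tree["child"].copy(inherit=False)  # drop coordinates inherited from "/"
```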
* Bug fixes for DataTree indexing and aggregation My implementation of indexing and aggregation was incorrect on child nodes, re-creating the child nodes from the root. There was also another bug when indexing inherited coordinates that meant formerly inherited coordinates were incorrectly dropped from results. * disable broken test
as suggested by @headtr1ck in #9628 (comment)
* type hints for datatree ops tests * type hints for datatree aggregations tests * type hints for datatree indexing tests * type hint a lot more tests * more type hints
* Add zip_subtrees for paired iteration over DataTrees

  This should be used for implementing DataTree arithmetic inside map_over_datasets, so the result does not depend on the order in which child nodes are defined.

  I have also added a minimal implementation of breadth-first-search with an explicit queue, replacing the current recursion-based solution in xarray.core.iterators (which has been removed). The new implementation is also slightly faster in my microbenchmark:

  In [1]: import xarray as xr
  In [2]: tree = xr.DataTree.from_dict({f"/x{i}": None for i in range(100)})
  In [3]: %timeit _ = list(tree.subtree)
  # on main
  87.2 μs ± 394 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  # with this branch
  55.1 μs ± 294 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

* fix pytype error
* Tweaks per review
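For readers unfamiliar with the approach, the general shape of a breadth-first traversal with an explicit queue looks like the sketch below; this is not the code added to xarray, just an illustration of the technique that replaces recursion here.

```python
from collections import deque


def breadth_first(node):
    """Yield `node` and all of its descendants, level by level."""
    queue = deque([node])
    while queue:
        current = queue.popleft()
        yield current
        # DataTree exposes children as a mapping of name -> child node
        queue.extend(current.children.values())
```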
If the file is empty (or contains no variables matching any filtering done by the backend), use a different error message indicating that, rather than suggesting that the file has too many variables for this function.
* Updates to DataTree.equals and DataTree.identical

  In contrast to `equals`, `identical` now also checks that any inherited variables are inherited on both objects. However, they do not need to be inherited from the same source. This aligns the behavior of `identical` with the DataTree `__repr__`.

  I've also removed the `from_root` argument from `equals` and `identical`. If a user wants to compare trees from their roots, a better (simpler) approach is to simply call these methods on the `.root` properties. I would also like to remove the `strict_names` argument, but that will require switching to use the new `zip_subtrees` (#9623) first.

* More efficient check for inherited coordinates
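In practice, dropping from_root just means spelling the root comparison out, roughly as below (an invented example, not the PR's tests):

```python
import xarray as xr

tree = xr.DataTree.from_dict({"/group": xr.Dataset({"v": ("x", [1, 2])})})
other = tree.copy()

assert tree["group"].equals(other["group"])               # compare these subtrees only
assert tree["group"].root.identical(other["group"].root)  # replaces the removed from_root= argument
```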
* Fix error and probably missing code cell in io.rst * Make this even simpler, remove link to same section
* Replace black with ruff-format * Fix formatting mistakes moving mypy comments * Replace black with ruff in the contributing guides
* Add zip_subtrees for paired iteration over DataTrees

  This should be used for implementing DataTree arithmetic inside map_over_datasets, so the result does not depend on the order in which child nodes are defined.

  I have also added a minimal implementation of breadth-first-search with an explicit queue, replacing the current recursion-based solution in xarray.core.iterators (which has been removed). The new implementation is also slightly faster in my microbenchmark:

  In [1]: import xarray as xr
  In [2]: tree = xr.DataTree.from_dict({f"/x{i}": None for i in range(100)})
  In [3]: %timeit _ = list(tree.subtree)
  # on main
  87.2 μs ± 394 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  # with this branch
  55.1 μs ± 294 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

* fix pytype error
* Re-implement map_over_datasets

  The main changes:
  - It is implemented using zip_subtrees, which means it should properly handle DataTrees where the nodes are defined in a different order.
  - For simplicity, I removed handling of `**kwargs`, in order to preserve some flexibility for adding keyword arguments.
  - I removed automatic skipping of empty nodes, because there are almost assuredly cases where that would make sense. This could be restored with an optional keyword argument.

* fix typing of map_over_datasets
* add group_subtrees
* wip fixes
* update isomorphic
* documentation and API change for map_over_datasets
* mypy fixes
* fix test
* diff formatting
* more mypy
* doc fix
* more doc fix
* add api docs
* add utility for joining path on windows
* docstring
* add an overload for two return values from map_over_datasets
* partial fixes per review
* fixes per review
* remove a couple of xfails
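A short usage sketch of the re-implemented function, under the assumption (taken from the bullets above) that the mapped callable receives each node's Dataset and that `**kwargs` are no longer forwarded; the data are invented:

```python
import numpy as np
import xarray as xr

tree = xr.DataTree.from_dict(
    {
        "/coarse": xr.Dataset({"v": ("x", np.arange(3.0))}),
        "/fine": xr.Dataset({"v": ("x", np.arange(6.0))}),
    }
)

# Apply a Dataset -> Dataset function to every node; nodes are paired by path,
# not by the order in which children happen to be defined.
anomalies = tree.map_over_datasets(lambda ds: ds - ds.mean())
```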
* _inherited_vars -> inherited_vars * implementation using Coordinates * datatree.DataTree -> xarray.DataTree * only show inherited coordinates on root * test that there is an Inherited coordinates header
* flox: Properly propagate multiindex Closes #9648 * skip test on old pandas * small optimization * fix
* Fix multiple grouping with missing groups Closes #9360 * Small repr improvement * Small optimization in mask * Add whats-new * fix doctests
…ests (#9651) * Add close() method to DataTree and clean-up open files in tests This removes a bunch of warnings that were previously issued in unit-tests. * Unit tests for closing functionality
…ap_blocks`` (#9658) * Reduce graph size through writing indexes directly into graph for map_blocks * Reduce graph size through writing indexes directly into graph for map_blocks * Update xarray/core/parallel.py --------- Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
* Remove zarr pin
* Define zarr_v3 helper
* zarr-v3: filters / compressors -> codecs
* zarr-v3: update tests to avoid values equal to fillValue
* Various test fixes
* zarr_version fixes
* removed open_consolidated workarounds
* removed _store_version check
* pass through zarr_version
* fixup! zarr-v3: filters / compressors -> codecs
* fixup! fixup! zarr-v3: filters / compressors -> codecs
* fixup
* path / key normalization in set_variables
* fixes
* workaround nested consolidated metadata
* test: avoid fill_value
* test: Adjust call counts
* zarr-python 3.x Array.resize doesn't mutate
* test compatibility
  - skip write_empty_chunks on 3.x
  - update patch targets
* skip ZipStore with_mode
* test: more fill_value avoidance
* test: more fill_value avoidance
* v3 compat for instrumented test
* Handle zarr_version / zarr_format deprecation
* wip
* most Zarr tests passing
* unskip tests
* add custom Zarr _FillValue encoding / decoding
* relax dtype comparison in test_roundtrip_empty_vlen_string_array
* fix test_explicitly_omit_fill_value_via_encoding_kwarg
* fix test_append_string_length_mismatch_raises
* fix test_check_encoding_is_consistent_after_append for v3
* skip roundtrip_endian for zarr v3
* unskip datetimes and fix test_compressor_encoding
* unskip tests
* add back dtype skip
* point upstream to v3 branch
* Create temporary directory before using it
* Avoid zarr.storage.zip on v2
* fixed close_store_on_close bug
* Remove workaround, fixed upstream
* Restore original `w` mode.
* workaround for store closing with mode=w
* typing fixes
* compat
* Remove unnecessary pop
* fixed skip
* fixup types
* fixup types
* [test-upstream]
* Update install-upstream-wheels.sh
* set use_consolidated to false when user provides consolidated=False
* fix: import consolidated_metadata from package root
* fix: relax instrumented store checks for v3
* Adjust 2.18.3 thresholds
* skip datatree zarr tests w/ zarr 3 for now
* fixed kvstore usage
* typing fixes
* move zarr.codecs import
* fixup ignores
* storage options fix, skip
* fixed types
* Update ci/install-upstream-wheels.sh
* type fixes
* whats-new
* Update xarray/tests/test_backends_datatree.py
* fix type import
* set mapper, chunk_mapper
* Pass through zarr_format
* Fixup
* more cleanup
* revert test changes
* Update xarray/backends/zarr.py
* cleanup
* update docstring
* fix rtd
* tweak

---------

Co-authored-by: Ryan Abernathey <ryan.abernathey@gmail.com>
Co-authored-by: Joe Hamman <joe@earthmover.io>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <deepak@cherian.net>
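As a rough illustration of the user-facing side of this work (a sketch under the assumption, from the bullets above, that to_zarr accepts zarr_format= in place of the deprecated zarr_version=):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"temperature": (("x", "y"), np.random.rand(4, 4))})

# Write a Zarr v3 store; zarr_version= is deprecated in favour of zarr_format=.
ds.to_zarr("example.zarr", mode="w", zarr_format=3)

roundtripped = xr.open_zarr("example.zarr")
```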
* rewrite the `min_deps_check` script * call the new script * unpin `micromamba` * install `rich-click` * enforce a minimum width of 120 * remove the background colors * remove old min-deps script * more changing of colors * some more styling * ... aaand some more styling * move the style definition in one place * compare versions *before* formatting * move the definition `console` into `main` * properly add two columns to the warnings tables * define the styles using the class and RGB values
… group (#9763) Bumps the actions group with 1 update: [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish). Updates `pypa/gh-action-pypi-publish` from 1.11.0 to 1.12.2 - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases) - [Commits](pypa/gh-action-pypi-publish@v1.11.0...v1.12.2) --- updated-dependencies: - dependency-name: pypa/gh-action-pypi-publish dependency-type: direct:production update-type: version-update:semver-minor dependency-group: actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Use ``map_overlap`` for rolling reducers with Dask * Enable argmin test * Update
* Optimize polyfit Closes #5629 1. Use Variable instead of DataArray 2. Use `reshape_blockwise` when possible following #5629 (comment) * clean up little more * more clean up * Add one comment * Update doc/whats-new.rst * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix whats-new * Update doc/whats-new.rst Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
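The optimization does not change the call signature, so usage stays roughly like the sketch below (invented data; the Variable-based math and reshape_blockwise changes are internal):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.rand(100, 10),
    dims=("time", "station"),
    coords={"time": np.arange(100)},
)

fit = da.polyfit(dim="time", deg=1)              # Dataset with "polyfit_coefficients"
trend = fit.polyfit_coefficients.sel(degree=1)   # the linear slope per station
```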
* Allow wrapping astropy.units.Quantity * allow all np.ndarray subclasses * whats new * test np.matrix * fix comment --------- Co-authored-by: tvo <tvo.email@proton.me> Co-authored-by: Justus Magin <keewis@users.noreply.github.com> Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
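A hedged sketch of what the relaxed check permits (astropy is optional here, so the import is guarded; the arrays are invented):

```python
import numpy as np
import xarray as xr

# np.ndarray subclasses are now accepted without being coerced to a base ndarray.
mat = xr.DataArray(np.matrix([[1.0, 2.0], [3.0, 4.0]]), dims=("row", "col"))

try:
    import astropy.units as u

    length = xr.DataArray(u.Quantity([1.0, 2.0, 3.0], "m"), dims="x")
except ImportError:
    pass  # astropy not installed; the np.matrix example above still runs
```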
* fix cf decoding of grid_mapping
* fix linter
* unnest list, add tests
* add whats-new.rst entry
* check for second warning, copy to prevent windows error (?)
* revert copy, but set allow_cleanup_failures=ON_WINDOWS
* add itertools
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Update xarray/conventions.py

  Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

* Update conventions.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* add test in test_conventions.py
* add comment
* revert backend tests

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
* Reduce the number of tasks when the limit parameter is set on the push function
* Reduce the number of tasks when the limit parameter is set on the push function, and incorporate the method parameter for the cumreduction on the push method
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Update xarray/core/dask_array_ops.py

  Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

* Use last instead of creating a custom function, and add a keepdims parameter to last and first to make them compatible with the Blelloch method
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Remove the keepdims on the last and first methods and use the nanlast method directly, since they already have the parameter
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Include the optimization of ffill and bfill in whats-new.rst
* Use map_overlap when n is smaller than all the chunks
* Avoid creating a numpy array to check whether all the chunks are bigger than N in the push method
* Updating the whats-new.rst
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Patrick Hoefler <phofl@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
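For context, this is the user-facing operation whose dask graph is being shrunk (illustrative data; the cumreduction/map_overlap choices happen under the hood):

```python
import numpy as np
import xarray as xr

values = np.where(np.random.rand(1_000) < 0.2, np.nan, 1.0)
da = xr.DataArray(values, dims="time").chunk(time=100)

# Forward-fill gaps of at most 3 steps; with `limit` set, the optimized push
# can use map_overlap instead of a full cumulative reduction across chunks.
filled = da.ffill(dim="time", limit=3)
```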
* add 'User-Agent'-header to pooch.retrieve * try sys.modules * Apply suggestions from code review Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add whats-new.rst entry --------- Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
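The fix itself lives inside xarray's tutorial-data machinery; the general pooch pattern it relies on looks roughly like the sketch below. The URL and header value are placeholders, and routing headers through pooch.HTTPDownloader is an assumption on my part rather than the PR's exact code.

```python
import pooch

# Some servers reject the default "python-requests" agent, so send an explicit one.
downloader = pooch.HTTPDownloader(headers={"User-Agent": "xarray-example/1.0"})

fname = pooch.retrieve(
    url="https://example.com/sample-dataset.nc",  # placeholder URL
    known_hash=None,
    downloader=downloader,
)
```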
* Fix open_mfdataset for list of fsspec files * Rewrite to for loop * Fixup
* add ReadBuffer Protocol for open_mfdataset * finally fix LSP violation * move import out of TYPE_CHECKING
…9793) Bumps the actions group with 1 update: [codecov/codecov-action](https://github.com/codecov/codecov-action). Updates `codecov/codecov-action` from 4.6.0 to 5.0.2 - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md) - [Commits](codecov/codecov-action@v4.6.0...v5.0.2) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-type: direct:production update-type: version-update:semver-major dependency-group: actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…wn to `sliding_window_view` (#9720) * sliding_window_view: add new `automatic_rechunk` kwarg Closes #9550 xref #4325 * Switch to ``sliding_window_kwargs`` * Add one more * better docstring * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Rename to sliding_window_view_kwargs --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix code touching text * Fix type ignore syntax * Use type annotations instead of comments * Fix code with two backticks in rst files * Add pygrep-hooks pre-commit * Fix typos in docs and code * Add prettier pre-commit hook * Apply suggestions from code review * Update .pre-commit-config.yaml Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * add to .gitignore --------- Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
* initial namespace-aware implementation * use np subclass, test duck dask arrays * remove dask special casing and numpy fallback * add isnat * hard code the supported ufuncs * handle np versions, separate unary/binary path * explicit unary/binary creators * add to api docs * add whats new * move numpy version check to tests * fix docs for aliased np funcs * fix whats new --------- Co-authored-by: Stephan Hoyer <shoyer@google.com>
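A usage sketch of the namespace-aware wrappers; the module path and function names here (xarray.ufuncs, sin as one of the aliased numpy funcs, isnat as the explicitly added one) are inferred from the bullets above, so treat the exact spelling as an assumption:

```python
import numpy as np
import xarray as xr
import xarray.ufuncs as xu

da = xr.DataArray(np.linspace(0.0, np.pi, 5), dims="x")
print(xu.sin(da))  # dispatches to the wrapped array's own namespace, not hard-coded numpy

times = xr.DataArray(np.array(["2024-01-01", "NaT"], dtype="datetime64[ns]"), dims="t")
print(xu.isnat(times))
```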
* Bump minimum versions * tweak * Update doc/whats-new.rst Co-authored-by: Justus Magin <keewis@users.noreply.github.com> --------- Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
* ENH, TST: aux func for importing optional deps
* ENH: use our new helper func for importing optional deps
* FIX: use aux func for a few more cftime imports
* FIX: remove cruft....
* FIX: Make it play well with mypy

  Per the proposal at #9561 (comment). This pairs any use of (a now simplified) `attempt_import` with a direct import of the same module, guarded by an `if TYPE_CHECKING` block.

* FIX, TST: match error
* Update xarray/tests/test_utils.py

  Co-authored-by: Michael Niklas <mick.niklas@gmail.com>

* DOC: add examples section to docstring
* refactor: use try-except clause and return original error to user

  - Also change raise ImportError to raise RuntimeError, since we are catching both ImportError and ModuleNotFoundError

* TST: test import of submodules
* FIX: Incorporate @headtr1ck suggestions

  From #9561 (comment) and #9561 (comment)

---------

Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Michael Niklas <mick.niklas@gmail.com>
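The shape of that pattern is roughly the following; this is a generic sketch rather than xarray's actual helper, and the error wording is invented:

```python
from importlib import import_module
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # The static import gives mypy the real module types...
    import cftime


def attempt_import(module: str):
    """Import an optional dependency, raising an actionable error if it is missing."""
    try:
        return import_module(module)
    except (ImportError, ModuleNotFoundError) as err:
        raise RuntimeError(
            f"The optional dependency {module!r} is required here; "
            f"install it with e.g. `python -m pip install {module}`."
        ) from err


# ...while the runtime lookup stays lazy and produces a friendly error.
cftime = attempt_import("cftime")
```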
* Add utility for opening remote files with fsspec * Apply Joe's suggestions from code review Co-authored-by: Joe Hamman <jhamman1@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Lint * Add what's new entry * Type hint * Make mypy happy --------- Co-authored-by: Joe Hamman <jhamman1@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add GroupBy.shuffle()
* Cleanup
* Cleanup
* fix
* return groupby instance from shuffle
* Fix nD by
* Skip if no dask
* fix tests
* Add `chunks` to signature
* Fix self
* Another Self fix
* Forward chunks too
* [revert]
* undo flox limit
* [revert]
* fix types
* Add DataArray.shuffle_by, Dataset.shuffle_by
* Add doctest
* Refactor
* tweak docstrings
* fix typing
* Fix
* fix docstring
* bump min version to dask>=2024.08.1
* Fix typing
* Fix types
* remove shuffle_by for now.
* Add tests
* Support shuffling with multiple groupers
* Revert "remove shuffle_by for now."

  This reverts commit 7a99c8f.

* bad merge
* Add a test
* Add docs
* bugfix
* Refactor out Dataset._shuffle
* fix types
* fix tests
* Handle by is chunked
* Some refactoring
* Remove shuffle_by
* shuffle -> distributed_shuffle
* return xarray object from distributed_shuffle
* fix
* fix doctest
* fix api
* Rename to `shuffle_to_chunks`
* update docs
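Based on the final name in the bullets above, usage should look roughly like this (an assumption-laden sketch: shuffle_to_chunks is taken to return the shuffled xarray object, which is then grouped again and reduced):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"v": ("sample", np.random.rand(1_000))},
    coords={"label": ("sample", np.random.randint(0, 10, 1_000))},
).chunk(sample=100)

# Rearrange dask chunks so each group's members are contiguous, then reduce.
shuffled = ds.groupby("label").shuffle_to_chunks()
means = shuffled.groupby("label").mean()
```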
* Compatibility with Zarr v3b2 * More guards with mode="w" * refactoring * tweak expected requests * compat * more compat * fix
* Faster chunk checking for backend datasets * limit size * fix test * optimize
* new blank whatsnew * add note on map_over_subtree -> map_over_datasets
* ListedColormap: don't pass N colors * fix somewhere else * fix typing
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )