[pull] main from pydata:main #563

Merged
merged 56 commits into from
Jul 9, 2024
b636e68
add link to CF conventions on packed data in doc/user-guide/io.rst (#…
kmuehlbauer Jun 11, 2024
f0ee037
add order for polynomial interpolation, fixes #8762 (#9079)
nkarasiak Jun 11, 2024
2013e7f
Fix upcasting with python builtin numbers and numpy 2 (#8946)
djhoese Jun 11, 2024
a814d27
Add Eni to CITATION.cff (#9095)
eni-awowale Jun 11, 2024
938c8b6
add Jessica to citation (#9096)
JessicaS11 Jun 11, 2024
52fb457
(fix): don't handle time-dtypes as extension arrays in `from_datafram…
ilan-gold Jun 11, 2024
d9e4de6
Micro optimizations to improve indexing (#9002)
hmaarrfk Jun 11, 2024
cb3663d
Migrate datatree io.py and common.py into xarray/core (#9011)
owenlittlejohns Jun 12, 2024
3967351
open_datatree performance improvement on NetCDF, H5, and Zarr files (…
aladinor Jun 12, 2024
aacfeba
[skip-ci] Fix skip-ci for hypothesis (#9102)
dcherian Jun 12, 2024
7ec0952
Adds Matt Savoie to CITATION.cff (#9103)
flamingbear Jun 12, 2024
b221808
skip the `pandas` datetime roundtrip test with `pandas=3.0` (#9104)
keewis Jun 12, 2024
2e0dd6f
Add user survey announcement to docs (#9101)
jhamman Jun 12, 2024
cea4dd1
add remaining core-dev citations [skip-ci][skip-rtd] (#9110)
keewis Jun 12, 2024
ce196d5
Undo custom padding-top. (#9107)
dcherian Jun 12, 2024
6554855
[skip-ci] Try fixing hypothesis CI trigger (#9112)
dcherian Jun 13, 2024
b31a495
release notes for 2024.06.0 (#9092)
keewis Jun 13, 2024
bef0406
release v2024.06.0 (#9113)
keewis Jun 13, 2024
9237f90
new whats-new section (#9115)
keewis Jun 13, 2024
380979f
Move Sphinx directives out of `See also` (#8466)
max-sixty Jun 13, 2024
211d313
Add test for rechunking to a size string (#9117)
dcherian Jun 14, 2024
1265310
Update docstring in api.py for open_mfdataset(), clarifying "chunks" …
arthur-e Jun 14, 2024
599b779
Grouper refactor (#9122)
dcherian Jun 14, 2024
5ac8394
adjust repr tests to account for different platforms (#9127) (#9128)
mgorny Jun 16, 2024
32e1f33
Bump the actions group with 2 updates (#9130)
dependabot[bot] Jun 17, 2024
be8e17e
Support duplicate dimensions in `.chunk` (#9099)
mraspaud Jun 17, 2024
b1f3fea
Update zendoo badge link (#9133)
max-sixty Jun 17, 2024
3fd162e
Split out distributed writes in zarr docs (#9132)
max-sixty Jun 19, 2024
af722f0
Improve `to_zarr` docs (#9139)
max-sixty Jun 21, 2024
2645d7f
groupby: remove some internal use of IndexVariable (#9123)
dcherian Jun 21, 2024
deb2082
Improve zarr chunks docs (#9140)
max-sixty Jun 22, 2024
fe4fb06
Include numbagg in type checks (#9159)
max-sixty Jun 24, 2024
c8ff731
Remove mypy exclusions for a couple more libraries (#9160)
max-sixty Jun 24, 2024
872c1c5
Add test for #9155 (#9161)
max-sixty Jun 24, 2024
56209bd
Docs: Add page with figure for navigating help resources (#9147)
JessicaS11 Jun 24, 2024
b518074
switch to unit `"D"` (#9170)
keewis Jun 25, 2024
07b1756
Slightly improve DataTree repr (#9064)
shoyer Jun 26, 2024
19d0fbf
Fix example code formatting for CachingFileManager (#9178)
djhoese Jun 26, 2024
651bd12
Change np.core.defchararray to np.char (#9165) (#9166)
pont-us Jun 26, 2024
fa41cc0
temporarily pin `numpy<2` (#9181)
keewis Jun 27, 2024
48a4f7a
temporarily remove `pydap` from CI (#9183)
keewis Jun 27, 2024
f4183ec
also pin `numpy` in the all-but-dask CI (#9184)
keewis Jun 27, 2024
42ed6d3
promote floating-point numeric datetimes to 64-bit before decoding (#…
keewis Jun 28, 2024
caed274
`"source"` encoding for datasets opened from `fsspec` objects (#8923)
keewis Jun 30, 2024
3deee7b
properly diff objects with arrays as attributes on variables (#9169)
keewis Jun 30, 2024
fff8253
Allow str in static typing of reindex, ffill etc. (#9194)
headtr1ck Jun 30, 2024
24ab84c
Fix dark-theme in `html[data-theme=dark]`-tags (#9200)
prisae Jul 1, 2024
90e4486
Add open_datatree benchmark (#9158)
aladinor Jul 1, 2024
6c2d8c3
use a `composite` strategy to generate the dataframe with a tz-aware …
keewis Jul 1, 2024
a86c3ff
Hierarchical coordinates in DataTree (#9063)
shoyer Jul 3, 2024
52a7371
avoid converting custom indexes to pandas indexes when formatting coo…
keewis Jul 5, 2024
971d71d
Fix reductions for `np.complex_` dtypes with numbagg (#9210)
max-sixty Jul 7, 2024
04b38a0
Consolidate some numbagg tests (#9211)
max-sixty Jul 7, 2024
bac01c0
Use numpy 2.0-compat `np.complex64` dtype in test (#9217)
max-sixty Jul 8, 2024
179c670
Fix two bugs in DataTree.update() (#9214)
shoyer Jul 8, 2024
3024655
Only use necessary dims when creating temporary dataarray (#9206)
Illviljan Jul 9, 2024
8 changes: 4 additions & 4 deletions .github/workflows/ci-additional.yaml
Expand Up @@ -130,7 +130,7 @@ jobs:
python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

- name: Upload mypy coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: mypy_report/cobertura.xml
flags: mypy
Expand Down Expand Up @@ -184,7 +184,7 @@ jobs:
python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

- name: Upload mypy coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: mypy_report/cobertura.xml
flags: mypy39
Expand Down Expand Up @@ -245,7 +245,7 @@ jobs:
python -m pyright xarray/

- name: Upload pyright coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: pyright_report/cobertura.xml
flags: pyright
Expand Down Expand Up @@ -304,7 +304,7 @@ jobs:
python -m pyright xarray/

- name: Upload pyright coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: pyright_report/cobertura.xml
flags: pyright39
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
Expand Up @@ -159,7 +159,7 @@ jobs:
path: pytest.xml

- name: Upload code coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: ./coverage.xml
flags: unittests
Expand Down
6 changes: 3 additions & 3 deletions .github/workflows/hypothesis.yaml
Expand Up @@ -39,9 +39,9 @@ jobs:
if: |
always()
&& (
(github.event_name == 'schedule' || github.event_name == 'workflow_dispatch')
|| needs.detect-ci-trigger.outputs.triggered == 'true'
|| contains( github.event.pull_request.labels.*.name, 'run-slow-hypothesis')
needs.detect-ci-trigger.outputs.triggered == 'false'
&& ( (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch')
|| contains( github.event.pull_request.labels.*.name, 'run-slow-hypothesis'))
)
defaults:
run:
Expand Down
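Note on the `if:` change above: a minimal Python sketch of the two conditions, assuming the first three condition lines in the hunk are the removed version and the last three are the added version (the flattened view has lost the +/- markers). The function names and the string-valued `triggered` argument are illustrative only; `always()` is left out because it is common to both versions.

```python
# Hypothetical model of the workflow condition; not part of the repository.
SCHEDULED_EVENTS = ("schedule", "workflow_dispatch")
SLOW_LABEL = "run-slow-hypothesis"


def old_condition(event_name, triggered, labels):
    """Removed version: any one of the three clauses starts the job."""
    return (
        event_name in SCHEDULED_EVENTS
        or triggered == "true"
        or SLOW_LABEL in labels
    )


def new_condition(event_name, triggered, labels):
    """Added version: `triggered` must be 'false', and then either a
    scheduled/manual event or the slow-test label starts the job."""
    return triggered == "false" and (
        event_name in SCHEDULED_EVENTS or SLOW_LABEL in labels
    )


# The two versions disagree on every input where triggered == "true".
for event in ("schedule", "pull_request"):
    for labels in ([], [SLOW_LABEL]):
        print(event, labels,
              old_condition(event, "true", labels),
              new_condition(event, "true", labels))
```

Under this reading, `triggered == 'true'` changes from a sufficient reason to run the job into a veto on running it.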
4 changes: 2 additions & 2 deletions .github/workflows/pypi-release.yaml
Expand Up @@ -88,7 +88,7 @@ jobs:
path: dist
- name: Publish package to TestPyPI
if: github.event_name == 'push'
uses: pypa/gh-action-pypi-publish@v1.8.14
uses: pypa/gh-action-pypi-publish@v1.9.0
with:
repository_url: https://test.pypi.org/legacy/
verbose: true
Expand All @@ -111,6 +111,6 @@ jobs:
name: releases
path: dist
- name: Publish package to PyPI
uses: pypa/gh-action-pypi-publish@v1.8.14
uses: pypa/gh-action-pypi-publish@v1.9.0
with:
verbose: true
2 changes: 1 addition & 1 deletion .github/workflows/upstream-dev-ci.yaml
Expand Up @@ -146,7 +146,7 @@ jobs:
run: |
python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report
- name: Upload mypy coverage to Codecov
uses: codecov/codecov-action@v4.4.1
uses: codecov/codecov-action@v4.5.0
with:
file: mypy_report/cobertura.xml
flags: mypy
Expand Down
10 changes: 10 additions & 0 deletions CITATION.cff
Expand Up @@ -79,6 +79,16 @@ authors:
- family-names: "Henderson"
given-names: "Scott"
orcid: "https://orcid.org/0000-0003-0624-4965"
- family-names: "Awowale"
given-names: "Eniola Olufunke"
- family-names: "Scheick"
given-names: "Jessica"
orcid: "https://orcid.org/0000-0002-3421-4459"
- family-names: "Savoie"
given-names: "Matthew"
orcid: "https://orcid.org/0000-0002-8881-2550"
- family-names: "Littlejohns"
given-names: "Owen"
title: "xarray"
abstract: "N-D labeled arrays and datasets in Python."
license: Apache-2.0
Expand Down
22 changes: 11 additions & 11 deletions README.md
Expand Up @@ -7,7 +7,7 @@
[![Available on pypi](https://img.shields.io/pypi/v/xarray.svg)](https://pypi.python.org/pypi/xarray/)
[![Formatted with black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![Mirror on zendoo](https://zenodo.org/badge/DOI/10.5281/zenodo.598201.svg)](https://doi.org/10.5281/zenodo.598201)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.11183201.svg)](https://doi.org/10.5281/zenodo.11183201)
[![Examples on binder](https://img.shields.io/badge/launch-binder-579ACA.svg?logo=)](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/weather-data.ipynb)
[![Twitter](https://img.shields.io/twitter/follow/xarray_dev?style=social)](https://twitter.com/xarray_dev)

Expand Down Expand Up @@ -46,15 +46,15 @@ provide a powerful and concise interface. For example:

- Apply operations over dimensions by name: `x.sum('time')`.
- Select values by label instead of integer location:
`x.loc['2014-01-01']` or `x.sel(time='2014-01-01')`.
`x.loc['2014-01-01']` or `x.sel(time='2014-01-01')`.
- Mathematical operations (e.g., `x - y`) vectorize across multiple
dimensions (array broadcasting) based on dimension names, not shape.
dimensions (array broadcasting) based on dimension names, not shape.
- Flexible split-apply-combine operations with groupby:
`x.groupby('time.dayofyear').mean()`.
`x.groupby('time.dayofyear').mean()`.
- Database like alignment based on coordinate labels that smoothly
handles missing values: `x, y = xr.align(x, y, join='outer')`.
handles missing values: `x, y = xr.align(x, y, join='outer')`.
- Keep track of arbitrary metadata in the form of a Python dictionary:
`x.attrs`.
`x.attrs`.

## Documentation

Expand All @@ -73,12 +73,12 @@ page](https://docs.xarray.dev/en/stable/contributing.html).
## Get in touch

- Ask usage questions ("How do I?") on
[GitHub Discussions](https://github.com/pydata/xarray/discussions).
[GitHub Discussions](https://github.com/pydata/xarray/discussions).
- Report bugs, suggest features or view the source code [on
GitHub](https://github.com/pydata/xarray).
GitHub](https://github.com/pydata/xarray).
- For less well defined questions or ideas, or to announce other
projects of interest to xarray users, use the [mailing
list](https://groups.google.com/forum/#!forum/xarray).
projects of interest to xarray users, use the [mailing
list](https://groups.google.com/forum/#!forum/xarray).

## NumFOCUS

Expand Down Expand Up @@ -114,7 +114,7 @@ Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You may
obtain a copy of the License at

<https://www.apache.org/licenses/LICENSE-2.0>
<https://www.apache.org/licenses/LICENSE-2.0>

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
Expand Down
113 changes: 112 additions & 1 deletion asv_bench/benchmarks/dataset_io.py
Expand Up @@ -7,6 +7,8 @@
import pandas as pd

import xarray as xr
from xarray.backends.api import open_datatree
from xarray.core.datatree import DataTree

from . import _skip_slow, parameterized, randint, randn, requires_dask

Expand All @@ -16,7 +18,6 @@
except ImportError:
pass


os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

_ENGINES = tuple(xr.backends.list_engines().keys() - {"store"})
Expand Down Expand Up @@ -469,6 +470,116 @@ def create_delayed_write():
return ds.to_netcdf("file.nc", engine="netcdf4", compute=False)


class IONestedDataTree:
"""
A few examples that benchmark reading/writing a heavily nested netCDF datatree with
xarray
"""

timeout = 300.0
repeat = 1
number = 5

def make_datatree(self, nchildren=10):
# multiple Dataset
self.ds = xr.Dataset()
self.nt = 1000
self.nx = 90
self.ny = 45
self.nchildren = nchildren

self.block_chunks = {
"time": self.nt / 4,
"lon": self.nx / 3,
"lat": self.ny / 3,
}

self.time_chunks = {"time": int(self.nt / 36)}

times = pd.date_range("1970-01-01", periods=self.nt, freq="D")
lons = xr.DataArray(
np.linspace(0, 360, self.nx),
dims=("lon",),
attrs={"units": "degrees east", "long_name": "longitude"},
)
lats = xr.DataArray(
np.linspace(-90, 90, self.ny),
dims=("lat",),
attrs={"units": "degrees north", "long_name": "latitude"},
)
self.ds["foo"] = xr.DataArray(
randn((self.nt, self.nx, self.ny), frac_nan=0.2),
coords={"lon": lons, "lat": lats, "time": times},
dims=("time", "lon", "lat"),
name="foo",
attrs={"units": "foo units", "description": "a description"},
)
self.ds["bar"] = xr.DataArray(
randn((self.nt, self.nx, self.ny), frac_nan=0.2),
coords={"lon": lons, "lat": lats, "time": times},
dims=("time", "lon", "lat"),
name="bar",
attrs={"units": "bar units", "description": "a description"},
)
self.ds["baz"] = xr.DataArray(
randn((self.nx, self.ny), frac_nan=0.2).astype(np.float32),
coords={"lon": lons, "lat": lats},
dims=("lon", "lat"),
name="baz",
attrs={"units": "baz units", "description": "a description"},
)

self.ds.attrs = {"history": "created for xarray benchmarking"}

self.oinds = {
"time": randint(0, self.nt, 120),
"lon": randint(0, self.nx, 20),
"lat": randint(0, self.ny, 10),
}
self.vinds = {
"time": xr.DataArray(randint(0, self.nt, 120), dims="x"),
"lon": xr.DataArray(randint(0, self.nx, 120), dims="x"),
"lat": slice(3, 20),
}
root = {f"group_{group}": self.ds for group in range(self.nchildren)}
nested_tree1 = {
f"group_{group}/subgroup_1": xr.Dataset() for group in range(self.nchildren)
}
nested_tree2 = {
f"group_{group}/subgroup_2": xr.DataArray(np.arange(1, 10)).to_dataset(
name="a"
)
for group in range(self.nchildren)
}
nested_tree3 = {
f"group_{group}/subgroup_2/sub-subgroup_1": self.ds
for group in range(self.nchildren)
}
dtree = root | nested_tree1 | nested_tree2 | nested_tree3
self.dtree = DataTree.from_dict(dtree)


class IOReadDataTreeNetCDF4(IONestedDataTree):
def setup(self):
# TODO: Lazily skipped in CI as it is very demanding and slow.
# Improve times and remove errors.
_skip_slow()

requires_dask()

self.make_datatree()
self.format = "NETCDF4"
self.filepath = "datatree.nc4.nc"
dtree = self.dtree
dtree.to_netcdf(filepath=self.filepath)

def time_load_datatree_netcdf4(self):
open_datatree(self.filepath, engine="netcdf4").load()

def time_open_datatree_netcdf4(self):
open_datatree(self.filepath, engine="netcdf4")


class IOWriteNetCDFDask:
timeout = 60
repeat = 1
Expand Down
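Note on `make_datatree` above: the benchmark builds one flat dict keyed by "/"-separated group paths, merges four such dicts with `|`, and passes the result to `DataTree.from_dict`. The sketch below reproduces only the key layout with the standard library, using placeholder strings instead of `xr.Dataset` objects; `build_tree_dict` and the placeholder values are hypothetical, and the `|` operator on dicts needs Python 3.9 or later.

```python
# Hypothetical stand-in for the path dict built in make_datatree; the values
# are placeholder strings, not xarray objects.
def build_tree_dict(nchildren=10):
    groups = range(nchildren)
    root = {f"group_{g}": "full dataset" for g in groups}
    nested_tree1 = {f"group_{g}/subgroup_1": "empty dataset" for g in groups}
    nested_tree2 = {f"group_{g}/subgroup_2": "small dataset" for g in groups}
    nested_tree3 = {
        f"group_{g}/subgroup_2/sub-subgroup_1": "full dataset" for g in groups
    }
    # The four key sets are disjoint, so the union keeps every entry.
    return root | nested_tree1 | nested_tree2 | nested_tree3


tree = build_tree_dict(nchildren=2)
for path in sorted(tree):
    print(path, "->", tree[path])

# Deepest path has three components: group / subgroup / sub-subgroup.
depth = max(path.count("/") for path in tree) + 1
print("nodes:", len(tree), "max depth:", depth)
```

With the default `nchildren=10` this layout gives 40 paths below the root, three levels deep, and the full dataset appears in 20 of them.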
4 changes: 2 additions & 2 deletions ci/requirements/all-but-dask.yml
Expand Up @@ -22,12 +22,12 @@ dependencies:
- netcdf4
- numba
- numbagg
- numpy
- numpy<2
- packaging
- pandas
- pint>=0.22
- pip
- pydap
# - pydap
- pytest
- pytest-cov
- pytest-env
Expand Down
3 changes: 2 additions & 1 deletion ci/requirements/doc.yml
Expand Up @@ -21,7 +21,7 @@ dependencies:
- nbsphinx
- netcdf4>=1.5
- numba
- numpy>=1.21
- numpy>=1.21,<2
- packaging>=21.3
- pandas>=1.4,!=2.1.0
- pooch
Expand All @@ -42,5 +42,6 @@ dependencies:
- sphinxext-rediraffe
- zarr>=2.10
- pip:
- sphinxcontrib-mermaid
# relative to this file. Needs to be editable to be accepted.
- -e ../..
4 changes: 2 additions & 2 deletions ci/requirements/environment-windows.yml
Expand Up @@ -23,13 +23,13 @@ dependencies:
- netcdf4
- numba
- numbagg
- numpy
- numpy<2
- packaging
- pandas
# - pint>=0.22
- pip
- pre-commit
- pydap
# - pydap
- pytest
- pytest-cov
- pytest-env
Expand Down
4 changes: 2 additions & 2 deletions ci/requirements/environment.yml
Expand Up @@ -26,7 +26,7 @@ dependencies:
- numba
- numbagg
- numexpr
- numpy
- numpy<2
- opt_einsum
- packaging
- pandas
Expand All @@ -35,7 +35,7 @@ dependencies:
- pooch
- pre-commit
- pyarrow # pandas raises a deprecation warning without this, breaking doctests
- pydap
# - pydap
- pytest
- pytest-cov
- pytest-env
Expand Down
7 changes: 2 additions & 5 deletions doc/_static/style.css
Expand Up @@ -7,9 +7,8 @@ table.docutils td {
word-wrap: break-word;
}

div.bd-header-announcement {
background-color: unset;
color: #000;
.bd-header-announcement {
background-color: var(--pst-color-info-bg);
}

/* Reduce left and right margins */
Expand Down Expand Up @@ -222,8 +221,6 @@ main *:target::before {
}

body {
/* Add padding to body to avoid overlap with navbar. */
padding-top: var(--navbar-height);
width: 100%;
}

Expand Down