Add hypothesis test for netCDF4 roundtrip #3283

Open
wants to merge 9 commits into
base: main
2 changes: 2 additions & 0 deletions properties/README.md
@@ -8,6 +8,8 @@ They are stored in a separate directory because they tend to run more examples
and thus take longer, and so that local development can run a test suite
without needing to `pip install hypothesis`.

To run these tests, run `pytest` in this directory.

## Hang on, "property-based" tests?

Instead of making assertions about operations on a particular piece of
51 changes: 51 additions & 0 deletions properties/test_netcdf_roundtrip.py
@@ -0,0 +1,51 @@
"""
Property-based tests for round-tripping data to netCDF
"""
import hypothesis.extra.numpy as npst
import hypothesis.strategies as st
from hypothesis import given

import xarray as xr


an_array = npst.arrays(
    dtype=st.one_of(
        npst.unsigned_integer_dtypes(),
        npst.integer_dtypes(),
        # NetCDF does not support float16
        # https://www.unidata.ucar.edu/software/netcdf/docs/data_type.html
        npst.floating_dtypes(sizes=(32, 64)),
        npst.byte_string_dtypes(),
        npst.unicode_string_dtypes(),
        npst.datetime64_dtypes(),
        npst.timedelta64_dtypes(),
    ),
    shape=npst.array_shapes(max_side=3),  # max_side specified for performance
)

compatible_names = st.text(
    alphabet=st.characters(
        # Limit characters to upper & lowercase letters and decimal digits
        whitelist_categories=("Ll", "Lu", "Nd"),
        # It looks like netCDF should allow unicode names, but removing
        # this causes a failure with 'ά'
        # https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_data_set_components.html#Permitted
        max_codepoint=255,
    ),
    min_size=1,
)

Review comment (Contributor), on the "It looks like netCDF should allow unicode names" line above:

Where is this specified? It would be useful to provide a link and perhaps explain what's going on with the categories too 😄


@given(st.data(), an_array)
def test_netcdf_roundtrip(tmp_path, data, arr):
    names = data.draw(
        st.lists(
            compatible_names, min_size=arr.ndim, max_size=arr.ndim, unique=True
        ).map(tuple)
    )
    var = xr.Variable(names, arr)
    original = xr.Dataset({"data": var})
    original.to_netcdf(tmp_path / "test.nc")

    with xr.open_dataset(tmp_path / "test.nc") as roundtripped:
        xr.testing.assert_identical(original, roundtripped)
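
On the reviewer's question about the categories: `"Ll"`, `"Lu"` and `"Nd"` are Unicode general categories (lowercase letter, uppercase letter, decimal digit). The sketch below uses only the standard-library `unicodedata` module to show which characters a filter like the one in `compatible_names` would presumably admit. `is_allowed` is a hypothetical helper written for illustration; it is not part of this PR and only approximates what `st.characters` does.

```python
import unicodedata

# Same values as in compatible_names; repeated here so the sketch is standalone.
ALLOWED_CATEGORIES = ("Ll", "Lu", "Nd")
MAX_CODEPOINT = 255


def is_allowed(char: str) -> bool:
    """Hypothetical helper: approximate the category and codepoint filter."""
    return (
        unicodedata.category(char) in ALLOWED_CATEGORIES
        and ord(char) <= MAX_CODEPOINT
    )


# "\u00e9" is 'é' (Latin-1 range); "\u03ac" is 'ά', the failing character
# mentioned in the code comment.
for char in ("a", "Z", "7", "\u00e9", "\u03ac", "_"):
    print(repr(char), unicodedata.category(char), ord(char), is_allowed(char))
```

Note that 'ά' is itself a lowercase letter (`Ll`), so the category whitelist alone would not exclude it; in this sketch it is the `max_codepoint=255` cap that rules it out, while accented Latin-1 letters such as 'é' remain allowed.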