What happened: I have an xr.Dataset with a dask-array-valued variable that includes a zero-length dimension (the other variables are non-empty). I tried saving it to zarr, but it fails with a ZeroDivisionError.
What you expected to happen: I expected it to save without any errors.
Minimal Complete Verifiable Example: the following commands fail.
import numpy as np
import xarray as xr
ds = xr.Dataset(
    {
        "x": (("a", "b", "c"), np.empty((75, 0, 30))),  # dimension "b" has length zero
        "y": (("a", "c"), np.random.normal(size=(75, 30))),
    },
    {"a": np.arange(75), "b": [], "c": np.arange(30)},
).chunk({})  # chunk({}) makes every variable dask-backed
ds.to_zarr("fails.zarr")  # raises ZeroDivisionError
Anything else we need to know?: If we load all the empty arrays into numpy first, it saves correctly. That is:
ds["x"].load() # run on all variables that have a zero dimension
ds.to_zarr("works.zarr") # successfully runs
I'll make a PR using this workaround, but I'm not sure whether this is a deeper bug that should instead be fixed in zarr, or whether there is a nicer way to fix it.
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.72-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0
xarray: 0.19.0
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.7.1
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.9.3
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.08.1
distributed: 2021.08.1
matplotlib: 3.4.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: None
IPython: 7.22.0
sphinx: None