Zarr ZipStore versus DirectoryStore: ZipStore requires .close() #4076

@Huite

Description

I was saving my dataset into a ZipStore (apparently successfully), but then I couldn't reopen it.

The issue appears to be that a regular DirectoryStore behaves a little differently: it doesn't need to be closed, while a ZipStore does.

(I'm not sure how this relates to #2586, the remarks there don't appear to be applicable anymore.)

MCVE Code Sample

The DirectoryStore round trip works, but the ZipStore one errors:

import xarray as xr
import zarr

# works as expected 
ds = xr.Dataset({'foo': [2,3,4], 'bar': ('x', [1, 2]), 'baz': 3.14})
ds.to_zarr(zarr.DirectoryStore("test.zarr"))
print(xr.open_zarr(zarr.DirectoryStore("test.zarr")))

# raises ValueError: group not found at path ''
ds.to_zarr(zarr.ZipStore("test.zip"))
print(xr.open_zarr(zarr.ZipStore("test.zip")))

Calling close(), or using a with block, does the trick:

store = zarr.ZipStore("test2.zip")
ds.to_zarr(store)
store.close()
print(xr.open_zarr(zarr.ZipStore("test2.zip")))

with zarr.ZipStore("test3.zip") as store:
    ds.to_zarr(store)
print(xr.open_zarr(zarr.ZipStore("test3.zip")))
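
My understanding (an assumption on my part, I haven't traced it through the zarr source) is that ZipStore sits on top of Python's zipfile module, and a zip archive only gets its central directory written when it is closed. Here is a minimal standard-library sketch of that effect, with no xarray or zarr involved. The helper name and the ".zgroup" entry are just for illustration:

```python
import os
import tempfile
import zipfile


def zip_valid_before_and_after_close(path):
    """Write one entry and report whether the archive is readable
    before and after close()."""
    zf = zipfile.ZipFile(path, mode="w")
    # Illustrative entry only; this is not a complete zarr group.
    zf.writestr(".zgroup", '{"zarr_format": 2}')
    before = zipfile.is_zipfile(path)  # central directory not written yet
    zf.close()                         # close() writes the central directory
    after = zipfile.is_zipfile(path)
    return before, after


with tempfile.TemporaryDirectory() as tmp:
    before, after = zip_valid_before_and_after_close(os.path.join(tmp, "demo.zip"))
    print(before, after)
```

If that is indeed what happens inside ZipStore, it would explain why the unclosed "test.zip" cannot be read back while "test2.zip" and "test3.zip" can.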

Expected Output

I think it would be preferable for to_zarr to close the ZipStore in this case. But I might be missing something?
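
To make the idea concrete, here is a rough sketch of the "write, then always close" pattern I have in mind. This is not how xarray implements to_zarr; write_then_close, open_store, write and FakeStore are all hypothetical names, and the fake store only stands in for something like a ZipStore so the example runs without zarr installed:

```python
from contextlib import closing


def write_then_close(open_store, write):
    """Open a store, run write(store), and close the store even if write fails.

    Both arguments are hypothetical callables, for example
    open_store=lambda: zarr.ZipStore("test.zip") and write=ds.to_zarr.
    """
    with closing(open_store()) as store:
        write(store)


class FakeStore:
    """Stand-in for a store that must be closed; used only for this demo."""

    def __init__(self):
        self.data = {}
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeStore()
write_then_close(lambda: fake, lambda s: s.data.update(foo=1))
print(fake.closed, fake.data)
```

The helper only closes a store it opened itself. Closing a store object that the caller passed in and may still want to use could be surprising, which is perhaps the part I'm missing.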

Problem Description

Because to_zarr works in this situation with a DirectoryStore, it's easy to assume a ZipStore will work similarly. However, I couldn't get it to read my data back in this case.

Versions

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.15.2.dev41+g8415eefa.d20200419
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.3.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.2
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.14.0+23.gbea4c9a2
distributed: 2.14.0
matplotlib: 3.1.2
cartopy: None
seaborn: 0.10.0
numbagg: None
pint: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.3.4
IPython: 7.13.0

Labels: stale, topic-zarr (related to zarr storage library)