The gist of the issue is that when storing an array-like object to a Zarr Group, Zarr looks for the chunks attribute to decide how to chunk the data. On Dask Arrays, chunks does not hold a single, global chunk shape; it lists the size of every chunk along each dimension, and those sizes need not be uniform. Zarr understandably stumbles over this, as the format is not what it expects.
Even if Zarr could handle the Dask chunking format somehow, there is still the question of what to do with non-uniform chunk sizes. There are two main options to consider: support non-uniform chunking in Zarr ( https://github.com/zarr-developers/zarr/issues/245 ) and/or rechunk Dask Arrays to be uniform ( dask/dask#3302 ). So there are things to think about on both fronts. This report should help give both of those issues more context.
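To make the format mismatch concrete, here is a minimal sketch in plain Python (no Zarr or Dask import needed). It assumes Dask-style chunks are a tuple of per-dimension tuples of chunk sizes, and that Zarr wants one chunk shape for the whole array, with only the last chunk along each dimension allowed to be smaller. The helper name regular_chunk_shape is hypothetical and is not part of either library.

```python
from typing import Optional, Sequence, Tuple

# Dask-style chunks: one sequence of chunk sizes per dimension,
# e.g. ((4, 4, 2), (3, 3)) for a 10 x 6 array.
DaskStyleChunks = Sequence[Sequence[int]]


def regular_chunk_shape(chunks: DaskStyleChunks) -> Optional[Tuple[int, ...]]:
    """Hypothetical helper: collapse Dask-style chunks to one chunk shape.

    Returns a tuple with one chunk size per dimension if the chunks form a
    regular grid (only the final chunk along a dimension may be smaller),
    otherwise None.
    """
    shape = []
    for sizes in chunks:
        if not sizes:
            return None
        first, *rest = sizes
        # Every chunk except the last must match the first one.
        if any(size != first for size in rest[:-1]):
            return None
        # The last chunk may be a smaller edge chunk, but never a larger one.
        if rest and rest[-1] > first:
            return None
        shape.append(first)
    return tuple(shape)


regular = ((4, 4, 2), (3, 3))    # regular grid with a partial edge chunk
irregular = ((4, 2, 4), (3, 3))  # middle chunk differs, so no single shape fits

print(regular_chunk_shape(regular))    # (4, 3)
print(regular_chunk_shape(irregular))  # None
```

When the helper returns None, one of the two options above is needed: either Zarr learns to store non-uniform chunks, or the Dask Array is rechunked to a regular grid before it is written.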
Ideally I would like to be able to store Dask Arrays to Zarr with little more than __setitem__. In practice this doesn't work. That said, this borders more on a feature request than a bug report. Given that Dask Arrays are really not NumPy arrays, we may need a from_dask_array method.
Version and installation information
Please provide the following:
Value of zarr.__version__: 2.2.0
Value of numcodecs.__version__: 0.5.4
Version of Python interpreter: 3.6.4
Operating system (Linux/Windows/Mac): Mac
How Zarr was installed (e.g., "using pip into virtual environment", or "using conda"): conda
Also, if you think it might be relevant, please provide the output from pip freeze or conda env export depending on which was used to install Zarr.
cc @mrocklin