Store axes as _ARRAY_DIMENSIONS
#166
Comments
👍 but if I remember correctly the status of my testing showed that attempting to access the zgroup with xarray led to an exception. The workaround suggested by @aurghs and @alexamici was to use differently named axes per resolution level > 0. |
Cheers, I wasn't aware of the limitation and that's definitely something that should be tested with the writer addition. If some of the axes names need to be resolution-aware in the |
Is there a concrete issue for this? I couldn't find any mention of this problem in the one you linked (zarr-developers/zarr-specs#125). |
I'm still not entirely clear from the NGFF spec and the xarray page whether |
Yes exactly, this metadata is expected to be defined in the array attributes |
Apologies, @constantinpape, no. This happened to date on the Zarr side with setting up the funding and getting B-Open up to speed. |
I was able to reproduce the error with the following local changes:

(base) sbesson@ls30630:ome-zarr-py (xarray_investigation) $ git diff --cached
diff --git a/.isort.cfg b/.isort.cfg
index b3d36cb..367ece7 100644
--- a/.isort.cfg
+++ b/.isort.cfg
@@ -1,5 +1,5 @@
[settings]
-known_third_party = dask,numpy,pytest,scipy,setuptools,skimage,zarr
+known_third_party = dask,numpy,pytest,scipy,setuptools,skimage,xarray,zarr
multi_line_output = 3
include_trailing_comma = True
force_grid_wrap = 0
diff --git a/ome_zarr/writer.py b/ome_zarr/writer.py
index 83bbe6d..497ed32 100644
--- a/ome_zarr/writer.py
+++ b/ome_zarr/writer.py
@@ -206,8 +206,9 @@ def write_multiscale(
datasets: List[dict] = []
for path, data in enumerate(pyramid):
# TODO: chunks here could be different per layer
- group.create_dataset(str(path), data=data, chunks=chunks)
+ dataset = group.create_dataset(str(path), data=data, chunks=chunks)
datasets.append({"path": str(path)})
+ dataset.attrs["_ARRAY_DIMENSIONS"] = [x["name"] for x in axes]
if coordinate_transformations is None:
shapes = [data.shape for data in pyramid]
diff --git a/requirements/requirements-test.txt b/requirements/requirements-test.txt
index 9cf8484..cc5774c 100644
--- a/requirements/requirements-test.txt
+++ b/requirements/requirements-test.txt
@@ -1,3 +1,4 @@
pytest
pytest-cov
codecov
+xarray
diff --git a/tests/test_xarray.py b/tests/test_xarray.py
new file mode 100644
index 0000000..187681f
--- /dev/null
+++ b/tests/test_xarray.py
@@ -0,0 +1,14 @@
+import pytest
+import xarray
+
+from ome_zarr.data import create_zarr
+
+
+class TestXarray:
+ @pytest.fixture(autouse=True)
+ def initdir(self, tmpdir):
+ self.path = tmpdir.mkdir("data")
+ create_zarr(str(self.path))
+
+ def test_open_zarr(self):
+ xarray.open_zarr(str(self.path))

Running

The workaround mentioned in #166 (comment), i.e. suffixing the axes names with the resolution level, is sufficient to let this simple test pass:

diff --git a/ome_zarr/writer.py b/ome_zarr/writer.py
index 497ed32..c5438a3 100644
--- a/ome_zarr/writer.py
+++ b/ome_zarr/writer.py
@@ -208,7 +208,7 @@ def write_multiscale(
# TODO: chunks here could be different per layer
dataset = group.create_dataset(str(path), data=data, chunks=chunks)
datasets.append({"path": str(path)})
- dataset.attrs["_ARRAY_DIMENSIONS"] = [x["name"] for x in axes]
+ dataset.attrs["_ARRAY_DIMENSIONS"] = ["%s_%s" %(x["name"], path) for x in axes]
if coordinate_transformations is None:
shapes = [data.shape for data in pyramid]

Discussing briefly with @joshmoore, I feel like we need to make a decision before we release OME-NGFF 0.4 and implement the feature across libraries. I can see three viable options:
/cc @aurghs @alexamici @constantinpape happy to discuss on the image.sc zulip or get on a call if that would help resolving this outstanding issue. |
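The renaming workaround from the second diff can be sketched in isolation as follows. This is a minimal illustration, not the `ome_zarr.writer` implementation: `dimension_names` and its `suffix_level_zero` flag are hypothetical names, and the `axes` structure (a list of dicts with a `"name"` key) is assumed from the `[x["name"] for x in axes]` expression in the diff.

```python
from typing import Dict, List


def dimension_names(
    axes: List[Dict[str, str]], level: int, suffix_level_zero: bool = True
) -> List[str]:
    """Return the _ARRAY_DIMENSIONS value for one resolution level.

    Hypothetical helper. With suffix_level_zero=True it mirrors the
    '"%s_%s" % (x["name"], path)' expression from the diff; with False,
    only levels > 0 are renamed and level 0 keeps the plain axis names.
    """
    if level == 0 and not suffix_level_zero:
        return [axis["name"] for axis in axes]
    return [f"{axis['name']}_{level}" for axis in axes]


# Assumed example axes; any list of {"name": ...} dicts works.
axes = [{"name": "c"}, {"name": "y"}, {"name": "x"}]
for level in range(3):
    print(level, dimension_names(axes, level))
```

Either variant gives every resolution level its own set of dimension names, at the cost of the names no longer matching the NGFF axes names exactly.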
@joshmoore @sbesson I am still missing some context here about what the actual problem is. Could you please give me a few line TLDR? |
After some discussions on zulip: let's remove |
Closing for now; the requirement has been removed from the 0.4 specification. Various layouts currently allow using |
Sorry for joining the party late. @aurghs and I are still getting up to speed with the microscopy data formats; we come from geospatial / satellite (GeoTIFF, HDF5, etc.) and climate / scientific (netCDF / CF Conventions, GRIB, etc.) backgrounds. As identified in our first call with @joshmoore, the main issue in the interaction between the NGFF spec and Xarray is that the Xarray Zarr backend assumes a few structural features of the Zarr store that make it similar to a netCDF file, and NGFF appears to break some of those assumptions. As soon as we feel comfortable with the microscopy data format we will engage the community more. |
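One structural feature of the netCDF data model is that a dimension name has a single length within a group. Whether that is the exact check that made `xarray.open_zarr` fail in this thread is an assumption, since the error message is not quoted here. The sketch below only illustrates the mismatch in plain Python: `conflicting_dimensions` is a hypothetical helper, and the shapes are made-up example values for a two-level pyramid.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Each array is described as (dimension names, shape).
ArrayInfo = Tuple[Sequence[str], Sequence[int]]


def conflicting_dimensions(arrays: Dict[str, ArrayInfo]) -> Dict[str, List[int]]:
    """Return the dimension names that appear with more than one size.

    Hypothetical helper: maps each offending name to its sorted sizes.
    """
    sizes: Dict[str, Set[int]] = {}
    for dims, shape in arrays.values():
        for name, size in zip(dims, shape):
            sizes.setdefault(name, set()).add(size)
    return {name: sorted(found) for name, found in sizes.items() if len(found) > 1}


# Two resolution levels sharing the same dimension names (assumed shapes).
shared = {
    "0": (("y", "x"), (512, 512)),
    "1": (("y", "x"), (256, 256)),
}
# The same levels with per-level names, as in the renaming workaround.
renamed = {
    "0": (("y_0", "x_0"), (512, 512)),
    "1": (("y_1", "x_1"), (256, 256)),
}
print(conflicting_dimensions(shared))
print(conflicting_dimensions(renamed))
```

With shared names, `y` and `x` each map to two sizes; with per-level names no conflict remains, which is consistent with the workaround letting the simple test pass.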
Noticed while reviewing glencoesoftware/bioformats2raw#121, the ome_zarr.writer API currently does not store the axes metadata as _ARRAY_DIMENSIONS under the .zattrs of each resolution array.

Since this is a requirement of the spec, introduced in 0.3 (see https://ngff.openmicroscopy.org/0.3/#multiscale-md), the relevant writer APIs should be updated before releasing 0.3.0.
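For illustration, the attribute being requested would look roughly like this in the `.zattrs` JSON of each resolution array. This is a sketch using only the standard library rather than the `ome_zarr.writer` or `zarr` APIs: `array_dimensions_attrs` is a hypothetical helper, and the five-axis `axes` list is an assumed example in the list-of-dicts form used by the diffs in this thread.

```python
import json
from typing import Dict, List


def array_dimensions_attrs(axes: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Build the attribute mapping to store on one resolution array.

    Hypothetical helper: the value is the ordered list of axis names,
    matching the array's dimension order.
    """
    return {"_ARRAY_DIMENSIONS": [axis["name"] for axis in axes]}


# Assumed example axes for a 5D image.
axes = [{"name": n} for n in ("t", "c", "z", "y", "x")]
print(json.dumps(array_dimensions_attrs(axes), indent=4))
```

In a real writer the same mapping would be assigned through the array's attributes, as the first diff does with `dataset.attrs["_ARRAY_DIMENSIONS"]`.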