Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resizing an array overwrites the dimension separator metadata #1533

Closed
ziw-liu opened this issue Sep 29, 2023 · 2 comments · Fixed by #1540
Closed

Resizing an array overwrites the dimension separator metadata #1533

ziw-liu opened this issue Sep 29, 2023 · 2 comments · Fixed by #1540
Labels
bug Potential issues with the zarr-python library

Comments

@ziw-liu
Copy link
Contributor

ziw-liu commented Sep 29, 2023

Zarr version

v2.15.0, v2.16.1

Numcodecs version

v0.11.0

Python Version

3.9.16, 3.10.12

Operating System

Linux, macOS

Installation

using pip into a conda environment

Description

When an array with "dimension_separator": "/" is resized, the metadata field is overwritten (removed from the .zarray file). Subsequent reading operations will silently fall back to the default ("."), and fill zeros because it will not find the chunks with the expected path name.

Steps to reproduce

import zarr

with zarr.open(
    zarr.DirectoryStore("img.zarr", dimension_separator="/"), mode="w"
) as img:
    img[0] = zarr.ones((100, 100), chunks=(10, 10))

with zarr.open("img.zarr") as img:
    assert img[0]._dimension_separator == "/"

with zarr.open("img.zarr") as img:
    img[0].resize(100, 200)

with zarr.open("img.zarr") as img:
    assert img[0]._dimension_separator == "/", "Dimension separator changed"

Additional output

No response

@ziw-liu ziw-liu added the bug Potential issues with the zarr-python library label Sep 29, 2023
@ziw-liu
Copy link
Contributor Author

ziw-liu commented Sep 29, 2023

I traced this to Array._flush_metadata_nosync, specifically this branching statement:

zarr-python/zarr/core.py

Lines 343 to 357 in 6ec746e

if getattr(self._store, "_store_version", 2) == 2:
meta.update(dict(chunks=self._chunks, dtype=self._dtype, order=self._order))
else:
meta.update(
dict(
chunk_grid=dict(
type="regular",
chunk_shape=self._chunks,
separator=self._dimension_separator,
),
data_type=self._dtype,
chunk_memory_layout=self._order,
attributes=self.attrs.asdict(),
)
)

The dimension separator is not added to the updated metadata if the underlying store is v2.

@ziw-liu
Copy link
Contributor Author

ziw-liu commented Sep 29, 2023

Is it of interest to verify the metadata against the specification before writing operations? Even something as simple as a TypedDict instead of a dict used here can catch these bugs without much overhead.

Edit: #1526 seems to be doing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Potential issues with the zarr-python library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant