`StoreTest` has some odd parameterizations #2698

y4n9squared · 2025-01-13T21:56:09Z

Zarr version

3.0.0

Numcodecs version

0.14.1

Python Version

3.13

Operating System

Linux

Installation

Using pip

Description

These parameterizations for test_set and these combinations in test_set_many end up testing for patterns that won't occur: A zarr.json whose contents are arbitrary bytes or empty or contain the string "zarr.json".

I was adapting a custom store from v2 land and hit a few test case failures due to this (during JSON deserialization of the zarr.json file). Any interest in making the pytest parameterizations a bit more realistic? I can take a stab at it if so.

Steps to reproduce

Write a custom store implementation that actually performs JSON deserialization of zarr.json.

Additional output

No response

The text was updated successfully, but these errors were encountered:

d-v-b · 2025-01-13T22:04:59Z

currently the Store classes are designed to be agnostic to the zarr format -- the stores don't know that a key called zarr.json will encode a zarr metadata document. Given that fact, those test cases make sense.

Out of curiosity, what was the custom store you were trying to adapt? Because the 2.x store base classes were designed in the same way as 3.0 -- they just mapped strings to bytes, without applying any zarr semantics.

y4n9squared · 2025-01-13T22:10:37Z

We are using an in-house store implementation that loosely targets some goals of Icechunk (predates Icechunk). I only came across the StoreTest class after going through the release notes. It's possible our class wouldn't have passed the original but we ran a different set of tests (driven by Xarray's to_zarr() implementation).

The implementation is sensitive to whether the key is a chunk or metadata key because the metadata updates are handled specially to ensure read consistency. If you'd like to keep StoreTest agnostic to Zarr format, we can simply override the problematic test cases.

d-v-b · 2025-01-13T22:15:01Z

The implementation is sensitive to whether the key is a chunk or metadata key because the metadata updates are handled specially to ensure read consistency.

Right, in this case our store classes are at a lower abstraction level than this, so we should expect their tests to fail for something higher level. For what it's worth, we have considered making the stores less general and more zarr-format-specific, in part because chunks files and metadata might benefit from different treatment (e.g., separate caches). But I don't think there are active proposals for this right now.

d-v-b · 2025-01-13T22:18:19Z

If you'd like to keep StoreTest agnostic to Zarr format, we can simply override the problematic test cases.

I think overriding those cases is the right course of action here -- since modelling the zarr format is not part of the store design, even if we changed the relevant test cases to use "real" zarr keys and "real" zarr metadata documents, it's possible that we might revert those changes that one day. It's probably safer if you explicitly ensure on your end that the store tests model your store class properly.

y4n9squared · 2025-01-13T22:22:29Z

Sounds good. Closing.

jhamman · 2025-01-13T22:39:21Z

I would like to see us update these tests to use valid zarr metadata when writing to these stores. Alternatively, we could move the offending tests to specific stores (e.g. ZipStore).

d-v-b · 2025-01-13T22:45:00Z

doesn't the rest of the zarr test suite essentially test the ability of the store to read and write valid zarr metadata documents?

d-v-b · 2025-01-13T22:51:18Z

We have a lot of tests like this:

zarr-python/src/zarr/testing/store.py

Line 230 in b982224

await store.set("foo/zarr.json", self.buffer_cls.from_bytes(b"bar"))

I think it's important that tests are short and readable. Using b"bar" as the payload, instead of writing out a valid zarr.json document, helps with this. That being said, if it's possible to use valid zarr metadata, but have a comparably concise and readable test case, then I don't see a problem with the idea.

y4n9squared · 2025-01-13T23:00:48Z

What I have essentially done on my side is create a constant that is

"""{
  "attributes": {},
  "zarr_format": 3,
  "consolidated_metadata": null,
  "node_type": "group"
}""".encode()

and used that as payloads for zarr.json.

d-v-b · 2025-01-14T10:50:22Z

we will need multiple pieces of data, to ensure that our store routines don't inadvertantly associate multiple keys with the same payload. These could be generated with a function that takes a dict D as a parameter, and it returns encoded zarr v3 group metadata but with the attributes field set to D. I think this would be easy enough to invoke without bloating the tests.

y4n9squared added the bug Potential issues with the zarr-python library label Jan 13, 2025

y4n9squared closed this as completed Jan 13, 2025

This was referenced Feb 1, 2025

Monthly issue metrics report #2787

Open

Monthly issue metrics report sanketverma1704/zarr-python#7

Open

Monthly issue metrics report enthusiastdev121/zarr-python#16

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`StoreTest` has some odd parameterizations #2698

`StoreTest` has some odd parameterizations #2698

y4n9squared commented Jan 13, 2025

d-v-b commented Jan 13, 2025

y4n9squared commented Jan 13, 2025 •

edited

Loading

d-v-b commented Jan 13, 2025

d-v-b commented Jan 13, 2025

y4n9squared commented Jan 13, 2025

jhamman commented Jan 13, 2025

d-v-b commented Jan 13, 2025

d-v-b commented Jan 13, 2025 •

edited

Loading

y4n9squared commented Jan 13, 2025 •

edited

Loading

d-v-b commented Jan 14, 2025

StoreTest has some odd parameterizations #2698

StoreTest has some odd parameterizations #2698

Comments

y4n9squared commented Jan 13, 2025

Zarr version

Numcodecs version

Python Version

Operating System

Installation

Description

Steps to reproduce

Additional output

d-v-b commented Jan 13, 2025

y4n9squared commented Jan 13, 2025 • edited Loading

d-v-b commented Jan 13, 2025

d-v-b commented Jan 13, 2025

y4n9squared commented Jan 13, 2025

jhamman commented Jan 13, 2025

d-v-b commented Jan 13, 2025

d-v-b commented Jan 13, 2025 • edited Loading

y4n9squared commented Jan 13, 2025 • edited Loading

d-v-b commented Jan 14, 2025

`StoreTest` has some odd parameterizations #2698

`StoreTest` has some odd parameterizations #2698

y4n9squared commented Jan 13, 2025 •

edited

Loading

d-v-b commented Jan 13, 2025 •

edited

Loading

y4n9squared commented Jan 13, 2025 •

edited

Loading