As a preamble, I wanted to highlight that the Zarr v3 specification provides a list of officially supported codecs, each with its own specification, e.g. blosc. Even though https://zarr-specs.readthedocs.io/en/latest/v3/codecs.html is still marked as under construction, this is a noticeable improvement over the Zarr v2 specification. Having an official registry of codecs also allows new additions to be proposed using the standard Zarr Enhancement Proposals process.
This issue is motivated by a compatibility problem initially raised in zarr-developers/jzarr#14: a new feature of the dev.zarr:jzarr:0.4.0 implementation added an extra key (numThreads) to the blosc object, which in turn prevented the resulting Zarr data from being opened using zarr-python, because zarr-python applies stricter semantics when reading the blosc dictionary. In that case, the extra key is not essential and a fix is under review to remove it.
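To make the failure mode concrete, here is a minimal sketch of how a strict reader can end up rejecting the extra key. The `StrictBlosc` class and `codec_from_config` helper are hypothetical stand-ins written for illustration, not the actual zarr-python or numcodecs code, and the metadata values are assumed. The sketch relies on a common pattern where the codec dictionary is unpacked into constructor keyword arguments, so any key the constructor does not declare raises an error.

```python
class StrictBlosc:
    """Hypothetical codec class that only accepts the fields it declares."""

    def __init__(self, cname="lz4", clevel=5, shuffle=1, blocksize=0):
        self.cname = cname
        self.clevel = clevel
        self.shuffle = shuffle
        self.blocksize = blocksize


def codec_from_config(config):
    """Build a codec by unpacking the metadata dictionary into keyword arguments."""
    params = dict(config)      # copy so the caller's metadata is left untouched
    params.pop("id", None)     # "id" selects the codec; it is not a parameter
    return StrictBlosc(**params)


# Assumed example of a blosc object carrying the extra numThreads key.
metadata = {
    "id": "blosc",
    "cname": "lz4",
    "clevel": 5,
    "shuffle": 1,
    "blocksize": 0,
    "numThreads": 1,
}

try:
    codec_from_config(metadata)
    error = None
except TypeError as exc:
    # The unknown keyword argument is rejected by the constructor.
    error = str(exc)
```

Under this pattern the strictness is a side effect of the implementation rather than a rule stated in the specification, which is why two implementations can disagree on the same metadata.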
This issue raises the wider question of how implementations should deal with codec objects containing unknown metadata fields. The must_understand key/value pair introduced in the v3 specification aims to handle similar scenarios. However, as per the current terms:
Future versions of this specification may also add new core features by adding new top-level metadata keys. Such features are required by default. However, if the value of an unknown feature is an object containing the key-value pair "must_understand": false, it can be ignored.
...
The array metadata object must not contain any other names. Those are reserved for future versions of this specification. An implementation must fail to open zarr hierarchies, groups or arrays with unknown metadata fields, with the exception of objects with a "must_understand": false key-value pair.
...
The group metadata object must not contain any other names. Those are reserved for future versions of this specification. An implementation must fail to open zarr hierarchies or groups with unknown metadata fields, with the exception of objects with a "must_understand": false key-value pair.
the scope of this key seems to be limited to unspecified top-level objects.
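As a rough illustration of the top-level rule quoted above, the following sketch lists the unknown fields that would force an implementation to fail. The function name `unknown_required_fields` and the `KNOWN_GROUP_KEYS` set are assumptions made for this example (the set is an illustrative subset, not a complete list from the specification). Note that the check only looks at top-level keys, so it says nothing about extra keys nested inside a codec object.

```python
# Illustrative subset of recognised top-level group metadata keys (assumption).
KNOWN_GROUP_KEYS = {"zarr_format", "node_type", "attributes"}


def unknown_required_fields(metadata, known_keys):
    """Return the unknown top-level keys that may not be ignored.

    An unknown key is ignorable only if its value is an object containing
    the key-value pair "must_understand": false.
    """
    required = []
    for key, value in metadata.items():
        if key in known_keys:
            continue
        if isinstance(value, dict) and value.get("must_understand") is False:
            continue  # explicitly marked as safe to ignore
        required.append(key)
    return required


# Hypothetical group metadata with one ignorable and one required extension.
group_metadata = {
    "zarr_format": 3,
    "node_type": "group",
    "optional_ext": {"must_understand": False, "note": "safe to skip"},
    "required_ext": {"setting": 1},
}

blocking = unknown_required_fields(group_metadata, KNOWN_GROUP_KEYS)
if blocking:
    print(f"cannot open: unknown metadata fields {blocking}")
```

A reader following the quoted wording would refuse to open this group because of `required_ext`, while `optional_ext` is skipped.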
Ideally, the expectations regarding unspecified codec metadata fields should be enforced at the specification level. Note also that there is an ongoing discussion in #72 (comment) about whether must_understand should be defined and supported at arbitrary levels, which might be relevant to this issue.