From 75b696819481363c1095f9a076a2a257c66b3fe0 Mon Sep 17 00:00:00 2001
From: Stephan Hoyer
Date: Sun, 15 Jul 2018 16:43:07 -0500
Subject: [PATCH 1/2] DOC: zarr spec v3: adds optional dimensions and the "netZDF" format

xref GH167
---
 docs/spec/v2.rst | 96 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 92 insertions(+), 4 deletions(-)

diff --git a/docs/spec/v2.rst b/docs/spec/v2.rst
index 2a3bbd9a54..0483f9f2a3 100644
--- a/docs/spec/v2.rst
+++ b/docs/spec/v2.rst
@@ -1,6 +1,6 @@
 .. _spec_v2:
 
-Zarr storage specification version 2
+Zarr storage specification version 3
 ====================================
 
 This document provides a technical specification of the protocol and format
@@ -78,6 +78,14 @@ filters
     filters are to be applied. Each codec configuration object MUST contain a
     ``"id"`` key identifying the codec to be used.
 
+The following keys MAY be present:
+
+dimensions
+    A list of string or ``null`` values providing optional names for each of the
+    array's dimensions. If provided, the list MUST have length equal to the
+    number of array dimensions. If omitted, the array MUST be treated
+    equivalently to providing dimensions as a list of all ``null`` values.
+
 Other keys MUST NOT be present within the metadata object.
 
 For example, the JSON object below defines a 2-dimensional array of 64-bit
@@ -98,6 +106,10 @@ using the Blosc compression library prior to storage::
             "clevel": 5,
             "shuffle": 1
         },
+        "dimensions": [
+            "row",
+            "column"
+        ],
         "dtype": ">> print(open('data/group.zarr/.zgroup').read())
     {
-        "zarr_format": 2
+        "zarr_format": 3
     }
 
 Create a sub-group::
@@ -495,6 +576,13 @@ What has been stored::
 Changes
 -------
 
+Version 3 changes
+~~~~~~~~~~~~~~~~~~~
+
+* Optional support for named dimensions on arrays and groups.
+* Added a description of the more restrictive "netZDF" format, inspired by the
+  `netCDF `_ data model.
+
 Version 2 clarifications
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
From 70b681a192d28b947b52eb8cd4185a003672146a Mon Sep 17 00:00:00 2001
From: Stephan Hoyer
Date: Mon, 16 Jul 2018 20:31:37 -0500
Subject: [PATCH 2/2] Clarify that dimensions on groups are optional

---
 docs/spec/v2.rst | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/docs/spec/v2.rst b/docs/spec/v2.rst
index 0483f9f2a3..bff989278d 100644
--- a/docs/spec/v2.rst
+++ b/docs/spec/v2.rst
@@ -346,11 +346,11 @@ Dimensions
 Groups and arrays can be associated with optional dimension names. This feature
 is intended to facilitate self-described datasets.
 
-Dimensions are required to be consistent. Any dimensions set on an array
-(any non-``null`` value), MUST also be defined on an ancestor group. Dimension
-sizes can be overwritten in descendent groups, but the size of each named
-dimensions on an array MUST match the size of that dimension on the most direct
-ancestor group on which it is defined.
+Setting dimensions on groups is an OPTIONAL way to indicate that arrays that
+reuse the same dimension have a consistent size. When a dimension is set on
+a group, the size of that dimension on arrays inside that group is REQUIRED to
+match. This includes arrays inside descendent groups, unless the dimension is
+explicitly overwritten by dimensions on a descendent group.
 
 For example, the JSON objects below describe a hierarchy of arrays, where the
 dimension ``x`` has size 1000 on the array ``foo`` and size 2000 on the array
@@ -379,6 +379,10 @@ dimension ``x`` has size 1000 on the array ``foo`` and size 2000 on the array
         ...
     }
 
+If dimensions were removed from ``nested/.zarray`` then the array store would
+be invalid, because the size of dimension ``x`` on the array ``nested/bar``
+would be inconsistent with the size of that dimension in the root group.
+
 .. _spec_v2_netzdf:
 
 NetZDF
@@ -390,6 +394,7 @@ following changes:
 * The "dimensions" field is REQUIRED for all arrays.
 * All entries in "dimensions" on arrays MUST be strings: ``null`` dimensions
   are not allowed.
+* Every dimension on an array MUST also be defined on an ancestor group.
 * The "netzdf" field is REQUIRED, with a value of ``true``, on all groups that
   that obey the netZDF spec.
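
The consistency rules introduced by this series can be sketched as a small checker. This is only an illustration under stated assumptions, not part of the patch or of any zarr API: the helper names (``ancestor_groups``, ``check_dimensions``) are hypothetical, and group dimensions are assumed to be a mapping from dimension name to size, since the group metadata layout is not visible in this excerpt.

```python
import posixpath


def ancestor_groups(path):
    """Yield group paths from the array's parent up to the root ('')."""
    parent = posixpath.dirname(path)
    while True:
        yield parent
        if not parent:
            return
        parent = posixpath.dirname(parent)


def check_dimensions(groups, arrays, netzdf=False):
    """Return a list of violations (empty when the hierarchy is consistent).

    groups: {group_path: {dim_name: size}}  -- assumed layout, hypothetical
    arrays: {array_path: {"shape": [...], "dimensions": [...] (optional)}}
    netzdf: also enforce the stricter netZDF rules described in the patch
    """
    errors = []
    for path, meta in arrays.items():
        shape = meta["shape"]
        dims = meta.get("dimensions")
        if dims is None:
            if netzdf:
                errors.append(f"{path}: 'dimensions' is required")
            # Omitted dimensions are equivalent to a list of all nulls.
            dims = [None] * len(shape)
        if len(dims) != len(shape):
            errors.append(f"{path}: expected {len(shape)} dimension names")
            continue
        for name, size in zip(dims, shape):
            if name is None:
                if netzdf:
                    errors.append(f"{path}: null dimensions are not allowed")
                continue
            # The nearest ancestor group that defines the name takes effect,
            # so a descendent group can overwrite a size set higher up.
            declared = next(
                (groups[g][name] for g in ancestor_groups(path)
                 if name in groups.get(g, {})),
                None,
            )
            if declared is None:
                if netzdf:
                    errors.append(f"{path}: '{name}' not on any ancestor group")
            elif declared != size:
                errors.append(
                    f"{path}: '{name}' has size {size}, expected {declared}"
                )
    return errors


# Hierarchy mirroring the example in the patch: ``x`` is 1000 at the root
# and overwritten to 2000 in the ``nested`` group.
groups = {"": {"x": 1000}, "nested": {"x": 2000}}
arrays = {
    "foo": {"shape": [1000], "dimensions": ["x"]},
    "nested/bar": {"shape": [2000], "dimensions": ["x"]},
}
print(check_dimensions(groups, arrays))
```

Dropping the ``nested`` entry from ``groups`` makes the same call report a size mismatch for ``nested/bar``, because the root group's size for ``x`` then applies to it.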