DOC: zarr spec v3: adds optional dimensions and the "netZDF" format #276
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
.. _spec_v2: | ||
|
||
Zarr storage specification version 2 | ||
Zarr storage specification version 3 | ||
==================================== | ||
|
||
This document provides a technical specification of the protocol and format | ||
|
@@ -78,6 +78,14 @@ filters | |
filters are to be applied. Each codec configuration object MUST contain a | ||
``"id"`` key identifying the codec to be used. | ||
|
||
The following keys MAY be present: | ||
|
||
dimensions | ||
A list of string or ``null`` values providing optional names for each of the | ||
array's dimensions. If provided, the list MUST have length equal to the | ||
number of array dimensions. If omitted, the array MUST be treated | ||
equivalently to providing dimensions as a list of all ``null`` values. | ||
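The rule above can be sketched in Python as a small normalisation step. This is only an illustration of the proposed wording, not code from Zarr: the function name `normalize_dimensions` and its signature are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize_dimensions(
    shape: Sequence[int],
    dimensions: Optional[Sequence[Optional[str]]] = None,
) -> List[Optional[str]]:
    """Return one dimension name (or None) per array axis.

    Hypothetical helper: illustrates the draft rule, not a Zarr API.
    """
    if dimensions is None:
        # An omitted "dimensions" key is equivalent to a list of all nulls.
        return [None] * len(shape)
    if len(dimensions) != len(shape):
        raise ValueError(
            f"expected {len(shape)} dimension names, got {len(dimensions)}"
        )
    for name in dimensions:
        # Each entry must be a string or null (None in Python).
        if name is not None and not isinstance(name, str):
            raise TypeError(f"dimension names must be str or None, got {name!r}")
    return list(dimensions)
```

For instance, `normalize_dimensions([10000, 10000])` yields two `None` entries, while passing `["row", "column"]` returns the names unchanged.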
|
||
Other keys MUST NOT be present within the metadata object. | ||
|
||
For example, the JSON object below defines a 2-dimensional array of 64-bit | ||
|
@@ -98,6 +106,10 @@ using the Blosc compression library prior to storage:: | |
"clevel": 5, | ||
"shuffle": 1 | ||
}, | ||
"dimensions": [ | ||
"row", | ||
"column" | ||
], | ||
"dtype": "<f8", | ||
"fill_value": "NaN", | ||
"filters": [ | ||
|
@@ -108,7 +120,7 @@ using the Blosc compression library prior to storage:: | |
10000, | ||
10000 | ||
], | ||
"zarr_format": 2 | ||
"zarr_format": 3 | ||
} | ||
|
||
.. _spec_v2_array_dtype: | ||
|
@@ -284,6 +296,20 @@ zarr_format | |
An integer defining the version of the storage specification to which the | ||
array store adheres. | ||
|
||
The following keys are OPTIONAL: | ||
|
||
dimensions | ||
A JSON object defining a map from string dimension names to integer sizes. | ||
All arrays in a group or its descendants with dimension names MUST have | ||
matching size along their named dimensions, unless any of those dimensions | ||
are overridden by dimensions in a descendant group. | ||
netzdf | ||
An optional boolean indicating whether arrays within the group and its | ||
descendants adhere to the more restrictive "netZDF" file-format (detailed | ||
below), in which dimensions are REQUIRED for all arrays. If omitted, | ||
software SHOULD NOT make assumptions about whether or not dimensions can be | ||
found on all arrays. | ||
|
||
Other keys MUST NOT be present within the metadata object. | ||
|
||
The members of a group are arrays and groups stored under logical paths that | ||
|
@@ -312,6 +338,61 @@ For example, the JSON object below encodes three attributes named | |
"baz": [1, 2, 3, 4] | ||
} | ||
|
||
.. _spec_v2_dimensions: | ||
|
||
Dimensions | ||
---------- | ||
|
||
Groups and arrays can be associated with optional dimension names. This feature | ||
is intended to facilitate self-described datasets. | ||
|
||
Dimensions are required to be consistent. Any dimension set on an array | ||
(that is, any non-``null`` value) MUST also be defined on an ancestor group. Dimension | ||
sizes can be overridden in descendant groups, but the size of each named | ||
dimension on an array MUST match the size of that dimension on the most direct | ||
ancestor group on which it is defined. | ||
Review comment: I think I'm going to change this, to make group dimensions and consistency entirely optional:
|
||
|
||
For example, the JSON objects below describe a hierarchy of arrays, where the | ||
dimension ``x`` has size 1000 on the array ``foo`` and size 2000 on the array | ||
``nested/bar``:: | ||
|
||
.zgroup: | ||
{ | ||
"dimensions": {"x": 1000}, | ||
... | ||
} | ||
foo/.zarray: | ||
{ | ||
"dimensions": ["x"], | ||
"shape": [1000], | ||
... | ||
} | ||
nested/.zgroup: | ||
{ | ||
"dimensions": {"x": 2000}, | ||
... | ||
} | ||
nested/bar/.zarray: | ||
{ | ||
"dimensions": ["x"], | ||
"shape": [2000], | ||
... | ||
} | ||
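As a rough sketch of the consistency rule as drafted in this diff (the review comment above indicates it may become optional), the lookup of the "most direct ancestor group" could work as follows. The function names and the dict-of-paths layout are assumptions made for illustration. Real stores keep this metadata in `.zgroup` and `.zarray` documents.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the metadata in the example above.
# Keys are group paths; "" is the root group.
GROUPS: Dict[str, dict] = {
    "": {"dimensions": {"x": 1000}},
    "nested": {"dimensions": {"x": 2000}},
}
ARRAYS: Dict[str, dict] = {
    "foo": {"dimensions": ["x"], "shape": [1000]},
    "nested/bar": {"dimensions": ["x"], "shape": [2000]},
}


def resolve_dimension_size(
    groups: Dict[str, dict], array_path: str, name: str
) -> Optional[int]:
    """Return the size of `name` on the nearest ancestor group, or None."""
    parts = array_path.split("/")[:-1]  # drop the array's own name
    while True:
        dims = groups.get("/".join(parts), {}).get("dimensions", {})
        if name in dims:
            return dims[name]
        if not parts:  # the root group has been checked
            return None
        parts.pop()  # move one level up the hierarchy


def check_array(groups: Dict[str, dict], array_path: str, meta: dict) -> None:
    """Raise ValueError if a named dimension is undefined or mismatched."""
    names = meta.get("dimensions") or [None] * len(meta["shape"])
    for name, size in zip(names, meta["shape"]):
        if name is None:
            continue  # unnamed dimensions are unconstrained
        expected = resolve_dimension_size(groups, array_path, name)
        if expected is None:
            raise ValueError(f"{array_path}: dimension {name!r} is not defined")
        if expected != size:
            raise ValueError(
                f"{array_path}: dimension {name!r} has size {size}, "
                f"expected {expected}"
            )


for path, meta in ARRAYS.items():
    check_array(GROUPS, path, meta)  # both example arrays pass
```

Searching from the closest group upward is what lets `nested/.zgroup` override the root's size for ``x`` without affecting ``foo``.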
|
||
.. _spec_v2_netzdf: | ||
|
||
NetZDF | ||
------ | ||
|
||
NetZDF is a more restricted variant of the Zarr storage format, with the | ||
following changes: | ||
|
||
* The "dimensions" field is REQUIRED for all arrays. | ||
* All entries in "dimensions" on arrays MUST be strings: ``null`` dimensions are | ||
not allowed. | ||
* The "netzdf" field is REQUIRED, with a value of ``true``, on all groups | ||
that obey the netZDF spec. | ||
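The three netZDF restrictions listed above could be checked with something like the following sketch. The helper names are hypothetical and the metadata is assumed to be already parsed into Python dicts. Nothing here is an existing Zarr function.

```python
def is_netzdf_array(array_meta: dict) -> bool:
    """True if an array's metadata satisfies the draft netZDF rules."""
    dims = array_meta.get("dimensions")
    if dims is None:
        return False  # "dimensions" is REQUIRED on every array
    if len(dims) != len(array_meta["shape"]):
        return False  # one name per array dimension
    # Every entry must be a string; null (None) names are not allowed.
    return all(isinstance(name, str) for name in dims)


def is_netzdf_group(group_meta: dict) -> bool:
    """True if a group declares itself netZDF with "netzdf": true."""
    return group_meta.get("netzdf") is True
```

Comparing with `is True` rather than relying on truthiness avoids accepting values such as `1` or `"true"` for the `netzdf` field.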
|
||
.. _spec_v2_examples: | ||
|
||
Examples | ||
|
@@ -358,7 +439,7 @@ Inspect the array metadata:: | |
20, | ||
20 | ||
], | ||
"zarr_format": 2 | ||
"zarr_format": 3 | ||
} | ||
|
||
Chunks are initialized on demand. E.g., set some data:: | ||
|
@@ -433,7 +514,7 @@ Inspect the group metadata:: | |
|
||
>>> print(open('data/group.zarr/.zgroup').read()) | ||
{ | ||
"zarr_format": 2 | ||
"zarr_format": 3 | ||
} | ||
|
||
Create a sub-group:: | ||
|
@@ -495,6 +576,13 @@ What has been stored:: | |
Changes | ||
------- | ||
|
||
Version 3 changes | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
* Optional support for named dimensions on arrays and groups. | ||
* Added a description of the more restrictive "netZDF" format, inspired by the | ||
`netCDF <https://www.unidata.ucar.edu/netcdf>`_ data model. | ||
|
||
Version 2 clarifications | ||
~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
|
s/ofthe/of the/