diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index 623c0bc853..12bfa7bda7 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -7,6 +7,7 @@ This file contains a high-level description of this package's evolution. Release ## 4.9.3 - TBD +* Convert NCZarr V2 to store all netcdf-4 specific info as attributes. This improves interoperability with other Zarr implementations by no longer using non-standard keys. The price to be paid is that lazy attribute reading cannot be supported. See [Github #2936](https://github.com/Unidata/netcdf-c/issues/2936) for more information. * Cleanup the option code for NETCDF_ENABLE_SET_LOG_LEVEL\[_FUNC\] See [Github #2931](https://github.com/Unidata/netcdf-c/issues/2931) for more information. * Fix duplicate definition when using aws-sdk-cpp. See [Github #2928](https://github.com/Unidata/netcdf-c/issues/2928) for more information. * Cleanup various obsolete options and do some code refactoring. See [Github #2926](https://github.com/Unidata/netcdf-c/issues/2926) for more information. diff --git a/docs/nczarr.md b/docs/nczarr.md index 4b3f258255..ee41e4ffe5 100644 --- a/docs/nczarr.md +++ b/docs/nczarr.md @@ -8,13 +8,15 @@ The NetCDF NCZarr Implementation # NCZarr Introduction {#nczarr_introduction} -Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to provide access to cloud storage (e.g. Amazon S3 [1] ). +Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to support data stored using the Zarr data model and storage format [4,6]. As part of this support, netCDF adds support for accessing data stored using cloud storage (e.g. Amazon S3 [1] ). -The goal of this project is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 [4] data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets. 
+The goal of this project, then, is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 [4] data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets. -In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the Zarr data model. +In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the *Zarr* data model. This extended model is referred to as *NCZarr*. -An important goal is that those extensions not interfere with reading of those extended datasets by other Zarr specification conforming implementations. This means that one can write a dataset using the NCZarr extensions and expect that dataset to be readable by other Zarr implementations. +Additionally, another goal is to ensure interoperability between *NCZarr* +formatted files and standard (aka pure) *Zarr* formatted files. +This means that (1) an *NCZarr* file can be read by any other *Zarr* library (and especially the Zarr-python library), and (2) a standard *Zarr* file can be read by netCDF. Of course, there are limitations in that other *Zarr* libraries will not use the extra, *NCZarr* meta-data, and netCDF will have to "fake" meta-data not provided by a pure *Zarr* file. As a secondary -- but equally important -- goal, it must be possible to use the NCZarr library to read and write datasets that are pure Zarr, @@ -29,14 +31,12 @@ Notes on terminology in this document. # The NCZarr Data Model {#nczarr_data_model} -NCZarr uses a data model [4] that, by design, extends the Zarr Version 2 Specification [6] to add support for the NetCDF-4 data model. +NCZarr uses a data model that, by design, extends the Zarr Version 2 Specification. -__Note Carefully__: a legal _NCZarr_ dataset is also a legal _Zarr_ dataset under a specific assumption. 
This assumption is that within Zarr meta-data objects, like ''.zarray'', unrecognized dictionary keys are ignored. -If this assumption is true of an implementation, then the _NCZarr_ dataset is a legal _Zarr_ dataset and should be readable by that _Zarr_ implementation. -The inverse is true also. A legal _Zarr_ dataset is also a legal _NCZarr_ -dataset, where "legal" means it conforms to the Zarr version 2 specification. +__Note Carefully__: a legal _NCZarr_ dataset is expected to also be a legal _Zarr_ dataset. +The inverse is true also. A legal _Zarr_ dataset is expected to also be a legal _NCZarr_ dataset, where "legal" means it conforms to the Zarr specification(s). In addition, certain non-Zarr features are allowed and used. -Specifically the XArray ''\_ARRAY\_DIMENSIONS'' attribute is one such. +Specifically the XArray [7] ''\_ARRAY\_DIMENSIONS'' attribute is one such. There are two other, secondary assumption: @@ -45,9 +45,10 @@ There are two other, secondary assumption: filters](./md_filters.html "filters") for details. Briefly, the data model supported by NCZarr is netcdf-4 minus -the user-defined types. However, a restricted form of String type -is supported (see Appendix E). -As with netcdf-4 chunking is supported. Filters and compression +the user-defined types and full String type support. +However, a restricted form of String type +is supported (see Appendix D). +As with netcdf-4, chunking is supported. Filters and compression are also [supported](./md_filters.html "filters"). Specifically, the model supports the following. @@ -74,8 +75,8 @@ When specified, they are treated as chunked where the file consists of only one This means that testing for contiguous or compact is not possible; the _nc_inq_var_chunking_ function will always return NC_CHUNKED and the chunksizes will be the same as the dimension sizes of the variable's dimensions. 
Additionally, it should be noted that NCZarr supports scalar variables, -but Zarr does not; Zarr only supports dimensioned variables. -In order to support interoperability, NCZarr does the following. +but Zarr Version 2 does not; Zarr V2 only supports dimensioned variables. +In order to support interoperability, NCZarr V2 does the following. 1. A scalar variable is recorded in the Zarr metadata as if it has a shape of **[1]**. 2. A note is stored in the NCZarr metadata that this is actually a netCDF scalar variable. @@ -108,55 +109,62 @@ using URLs. There are, however, some details that are important. - Protocol: this should be _https_ or _s3_,or _file_. - The _s3_ scheme is equivalent to "https" plus setting "mode=nczarr,s3" (see below). Specifying "file" is mostly used for testing, but is used to support directory tree or zipfile format storage. + The _s3_ scheme is equivalent to "https" plus setting "mode=s3". + Specifying "file" is mostly used for testing, but also for directory tree or zipfile format storage. ## Client Parameters The fragment part of a URL is used to specify information that is interpreted to specify what data format is to be used, as well as additional controls for that data format. -For NCZarr support, the following _key=value_ pairs are allowed. -- mode=nczarr|zarr|noxarray|file|zip|s3 +For reading, _key=value_ pairs are provided for specifying the storage format. +- mode=nczarr|zarr -Typically one will specify two mode flags: one to indicate what format -to use and one to specify the way the dataset is to be stored. -For example, a common one is "mode=zarr,file" +Additional pairs are provided to specify the Zarr version. +- mode=v2 +Additional pairs are provided to specify the storage medium: Amazon S3 vs File tree vs Zip file. +- mode=file|zip|s3 + +Note that when reading, an attempt will be made to infer the +format and Zarr version and storage medium format by probing the +file. If inferencing fails, then it is reported. 
In this case, +the client may need to add specific mode flags to avoid +inferencing. + +Typically one will specify three mode flags: one to indicate what format +to use and one to specify the way the dataset is to be stored. +For example, a common one is "mode=zarr,file" + + +Obviously, when creating a file, inferring the type of file to create +is not possible so the mode flags must be set specifically. +This means that both the storage medium and the exact storage +format must be specified. Using _mode=nczarr_ causes the URL to be interpreted as a reference to a dataset that is stored in NCZarr format. -The _zarr_ mode tells the library to -use NCZarr, but to restrict its operation to operate on pure -Zarr Version 2 datasets. +The _zarr_ mode tells the library to use NCZarr, but to restrict its operation to operate on pure Zarr. + -The modes _s3_, _file_, and _zip_ tell the library what storage +The modes _s3_, _file_, and _zip_ tell the library what storage medium driver to use. -* The _s3_ driver is the default and indicates using Amazon S3 or some equivalent. -* The _file_ format stores data in a directory tree. -* The _zip_ format stores data in a local zip file. +* The _s3_ driver stores data using Amazon S3 or some equivalent. +* The _file_ driver stores data in a directory tree. +* The _zip_ driver stores data in a local zip file. -Note that It should be the case that zipping a _file_ +As an aside, it should be the case that zipping a _file_ format directory tree will produce a file readable by the _zip_ storage format, and vice-versa. -By default, the XArray convention is supported and used for -both NCZarr files and pure Zarr files. This -means that every variable in the root group whose named dimensions +By default, the XArray convention is supported for Zarr Version 2 +and used for both NCZarr files and pure Zarr files. 
+ +This means that every variable in the root group whose named dimensions are also in the root group will have an attribute called *\_ARRAY\_DIMENSIONS* that stores those dimension names. The _noxarray_ mode tells the library to disable the XArray support. -The netcdf-c library is capable of inferring additional mode flags based on the flags it finds. Currently we have the following inferences. -- _zarr_ => _nczarr_ - -So for example: ````...#mode=zarr,zip```` is equivalent to this. -````...#mode=nczarr,zarr,zip -```` - - # NCZarr Map Implementation {#nczarr_mapimpl} Internally, the nczarr implementation has a map abstraction that allows different storage formats to be used. @@ -192,7 +200,7 @@ be a prefix of any other key. There several other concepts of note. 1. __Dataset__ - a dataset is the complete tree contained by the key defining -the root of the dataset. +the root of the dataset. The term __File__ will often be used as a synonym. Technically, the root of the tree is the key \/.zgroup, where .zgroup can be considered the _superblock_ of the dataset. 2. __Object__ - equivalent of the S3 object; Each object has a unique key and "contains" data in the form of an arbitrary sequence of 8-bit bytes. @@ -277,14 +285,15 @@ As with other URLS (e.g. DAP), these kind of URLS can be passed as the path argu # NCZarr versus Pure Zarr. {#nczarr_purezarr} -The NCZARR format extends the pure Zarr format by adding extra keys such as ''\_NCZARR\_ARRAY'' inside the ''.zarray'' object. -It is possible to suppress the use of these extensions so that the netcdf library can read and write a pure zarr formatted file. -This is controlled by using ''mode=zarr'', which is an alias for the -''mode=nczarr,zarr'' combination. -The primary effects of using pure zarr are described in the [Translation Section](@ref nczarr_translation). - -There are some constraints on the reading of Zarr datasets using the NCZarr implementation. 
+The NCZARR format extends the pure Zarr format by adding extra attributes such as ''\_nczarr\_array'' inside the ''.zattr'' object. +It is possible to suppress the use of these extensions so that the netcdf library can write a pure zarr formatted file. But this is probably unnecessary +since these attributes should be readable by any other Zarr implementation. +But these extra attributes might be seen as clutter and so it is possible +to suppress them when writing using *mode=zarr*. +Reading of pure Zarr files created using other implementations is a necessary +compatibility feature of NCZarr. +This requirement imposed some constraints on the reading of Zarr datasets using the NCZarr implementation. 1. Zarr allows some primitive types not recognized by NCZarr. Over time, the set of unrecognized types is expected to diminish. Examples of currently unsupported types are as follows: @@ -333,13 +342,14 @@ The reason for this is that the bucket name forms the initial segment in the key ## Data Model -The NCZarr storage format is almost identical to that of the the standard Zarr version 2 format. +The NCZarr storage format is almost identical to that of the standard Zarr format. The data model differs as follows. 1. Zarr only supports anonymous dimensions -- NCZarr supports only shared (named) dimensions. 2. Zarr attributes are untyped -- or perhaps more correctly characterized as of type string. +3. Zarr does not explicitly support unlimited dimensions -- NCZarr does support them. -## Storage Format +## Storage Medium Consider both NCZarr and Zarr, and assume S3 notions of bucket and object. In both systems, Groups and Variables (Array in Zarr) map to S3 objects. @@ -347,8 +357,7 @@ Containment is modeled using the fact that the dataset's key is a prefix of the So for example, if variable _v1_ is contained in top level group g1 -- _/g1 -- then the key for _v1_ is _/g1/v_. Additional meta-data information is stored in special objects whose name start with ".z". 
-In Zarr, the following special objects exist. - +In Zarr Version 2, the following special objects exist. 1. Information about a group is kept in a special object named _.zgroup_; so for example the object _/g1/.zgroup_. 2. Information about an array is kept as a special object named _.zarray_; @@ -359,45 +368,46 @@ so for example the objects _/g1/.zattr_ and _/g1/v1/.zattr_. The first three contain meta-data objects in the form of a string representing a JSON-formatted dictionary. The NCZarr format uses the same objects as Zarr, but inserts NCZarr -specific key-value pairs in them to hold NCZarr specific information -The value of each of these keys is a JSON dictionary containing a variety +specific attributes in the *.zattr* object to hold NCZarr specific information +The value of each of these attributes is a JSON dictionary containing a variety of NCZarr specific information. -These keys are as follows: +These NCZarr-specific attributes are as follows: -_\_nczarr_superblock\__ -- this is in the top level group -- key _/.zarr_. +_\_nczarr_superblock\__ -- this is in the top level group's *.zattr* object. It is in effect the "superblock" for the dataset and contains any netcdf specific dataset level information. It is also used to verify that a given key is the root of a dataset. -Currently it contains the following key(s): -* "version" -- the NCZarr version defining the format of the dataset. +Currently it contains keys that are ignored and exist only to ensure that +older netcdf library versions do not crash. +* "version" -- the NCZarr version defining the format of the dataset (deprecated). -_\_nczarr_group\__ -- this key appears in every _.zgroup_ object. +_\_nczarr_group\__ -- this key appears in every group's _.zattr_ object. It contains any netcdf specific group information. 
Specifically it contains the following keys: -* "dims" -- the name and size of shared dimensions defined in this group, as well an optional flag indictating if the dimension is UNLIMITED. -* "vars" -- the name of variables defined in this group. +* "dimensions" -- the name and size of shared dimensions defined in this group, as well as an optional flag indicating if the dimension is UNLIMITED. +* "arrays" -- the name of variables defined in this group. * "groups" -- the name of sub-groups defined in this group. These lists allow walking the NCZarr dataset without having to use the potentially costly search operation. -_\_nczarr_array\__ -- this key appears in every _.zarray_ object. +_\_nczarr_array\__ -- this key appears in the *.zattr* object associated +with a _.zarray_ object. It contains netcdf specific array information. Specifically it contains the following keys: -* dimrefs -- the names of the shared dimensions referenced by the variable. -* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense. +* dimension_references -- the fully qualified names of the shared dimensions referenced by the variable. +* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense. Also signals if a variable is scalar. -_\_nczarr_attr\__ -- this key appears in every _.zattr_ object. -This means that technically, it is attribute, but one for which access -is normally surpressed . +_\_nczarr_attr\__ -- this attribute appears in every _.zattr_ object. Specifically it contains the following keys: -* types -- the types of all of the other attributes in the _.zattr_ object. +* types -- the types of all attributes in the _.zattr_ object. ## Translation {#nczarr_translation} -With some constraints, it is possible for an nczarr library to read the pure Zarr format and for a zarr library to read the nczarr format. 
-The latter case, zarr reading nczarr is possible if the zarr library is willing to ignore keys whose name it does not recognize; specifically anything beginning with _\_nczarr\__. +With some loss of netcdf-4 information, it is possible for an nczarr library to read the pure Zarr format and for other zarr libraries to read the nczarr format. -The former case, nczarr reading zarr is also possible if the nczarr can simulate or infer the contents of the missing _\_nczarr\_xxx_ objects. +The latter case, zarr reading nczarr, is trival because all of the nczarr metadata is stored as ordinary, String valued (but JSON syntax), attributes. + +The former case, nczarr reading zarr is possible assuming the nczarr code can simulate or infer the contents of the missing _\_nczarr\_xxx_ attributes. As a rule this can be done as follows. 1. _\_nczarr_group\__ -- The list of contained variables and sub-groups can be computed using the search API to list the keys "contained" in the key for a group. The search looks for occurrences of _.zgroup_, _.zattr_, _.zarray_ to infer the keys for the contained groups, attribute sets, and arrays (variables). @@ -405,9 +415,8 @@ Constructing the set of "shared dimensions" is carried out by walking all the variables in the whole dataset and collecting the set of unique integer shapes for the variables. For each such dimension length, a top level dimension is created -named ".zdim_" where len is the integer length. -2. _\_nczarr_array\__ -- The dimrefs are inferred by using the shape -in _.zarray_ and creating references to the simulated shared dimension. +named "_Anonymous_Dimension_" where len is the integer length. +2. _\_nczarr_array\__ -- The dimension referencess are inferred by using the shape in _.zarray_ and creating references to the simulated shared dimensions. netcdf specific information. 3. _\_nczarr_attr\__ -- The type of each attribute is inferred by trying to parse the first attribute value string. 
@@ -417,13 +426,15 @@ In order to accomodate existing implementations, certain mode tags are provided ## XArray -The Xarray [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification) Zarr implementation uses its own mechanism for specifying shared dimensions. +The Xarray [7] Zarr implementation uses its own mechanism for specifying shared dimensions. It uses a special attribute named ''_ARRAY_DIMENSIONS''. The value of this attribute is a list of dimension names (strings). An example might be ````["time", "lon", "lat"]````. -It is essentially equivalent to the ````_nczarr_array "dimrefs" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset. +It is almost equivalent to the ````_nczarr_array "dimension_references" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset. The Xarray dimension list differs from the netcdf-4 shared dimensions in two ways. +1. Specifying Xarray in a non-root group has no meaning in the current Xarray specification. +2. A given name can be associated with different lengths, even within a single array. This is considered an error in NCZarr. -As of _netcdf-c_ version 4.8.2, The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr. +The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr. If possible, this attribute will be read/written by default, but can be suppressed if the mode value "noxarray" is specified. If detected, then these dimension names are used to define shared dimensions. @@ -431,6 +442,8 @@ The following conditions will cause ''_ARRAY_DIMENSIONS'' to not be written. * The variable is not in the root group, * Any dimension referenced by the variable is not in the root group. +Note that this attribute is not needed for Zarr Version 3, and is ignored. 
+ # Examples {#nczarr_examples} Here are a couple of examples using the _ncgen_ and _ncdump_ utilities. @@ -453,34 +466,17 @@ Here are a couple of examples using the _ncgen_ and _ncdump_ utilities. ``` 5. Create an nczarr file using the s3 protocol with a specific profile ``` - ncgen -4 -lb -o 's3://datasetbucket/rootkey\#mode=nczarr,awsprofile=unidata' dataset.cdl + ncgen -4 -lb -o "s3://datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata" dataset.cdl ``` Note that the URL is internally translated to this - ``` - 'https://s2.<region>.amazonaws.com/datasetbucket/rootkey#mode=nczarr,awsprofile=unidata' dataset.cdl - ``` - -# References {#nczarr_bib} - -[1] [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)
-[2] [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)
-[3] [The LibZip Library](https://libzip.org/)
-[4] [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)
-[5] [Python Documentation: 8.3. -collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)
-[6] [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)
-[7] [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)
-[8] [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)
-[9] [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)
-[10] [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)
-[11] [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)
-[12] [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)
- + ```` + "https://s3.<region>.amazonaws.com/datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata" + ```` # Appendix A. Building NCZarr Support {#nczarr_build} Currently the following build cases are known to work. Note that this does not include S3 support. -A separate tabulation of S3 support is in the document cloud.md. +A separate tabulation of S3 support is in the document _cloud.md_.
Operating SystemBuild SystemNCZarr @@ -551,24 +547,9 @@ Some of the relevant limits are as follows: Note that the limit is defined in terms of bytes and not (Unicode) characters. This affects the depth to which groups can be nested because the key encodes the full path name of a group. -# Appendix C. NCZarr Version 1 Meta-Data Representation. {#nczarr_version1} - -In NCZarr Version 1, the NCZarr specific metadata was represented using new objects rather than as keys in existing Zarr objects. -Due to conflicts with the Zarr specification, that format is deprecated in favor of the one described above. -However the netcdf-c NCZarr support can still read the version 1 format. - -The version 1 format defines three specific objects: _.nczgroup_, _.nczarray_,_.nczattr_. -These are stored in parallel with the corresponding Zarr objects. So if there is a key of the form "/x/y/.zarray", then there is also a key "/x/y/.nczarray". -The content of these objects is the same as the contents of the corresponding keys. So the value of the ''_NCZARR_ARRAY'' key is the same as the content of the ''.nczarray'' object. The list of connections is as follows: - -* ''.nczarr'' <=> ''_nczarr_superblock_'' -* ''.nczgroup <=> ''_nczarr_group_'' -* ''.nczarray <=> ''_nczarr_array_'' -* ''.nczattr <=> ''_nczarr_attr_'' - -# Appendix D. JSON Attribute Convention. {#nczarr_json} +# Appendix C. JSON Attribute Convention. {#nczarr_json} -The Zarr V2 specification is somewhat vague on what is a legal +The Zarr V2 specification is somewhat vague on what is a legal value for an attribute. The examples all show one of two cases: 1. A simple JSON scalar atomic values (e.g. int, float, char, etc), or 2. A JSON array of such values. @@ -581,7 +562,7 @@ complex JSON expression. An example is the GDAL Driver convention [12], where the value is a complex JSON dictionary. 
-In order for NCZarr to be as consistent as possible with Zarr Version 2, +In order for NCZarr to be as consistent as possible with Zarr, it is desirable to support this convention for attribute values. This means that there must be some way to handle an attribute whose value is not either of the two cases above. That is, its value @@ -611,12 +592,12 @@ There are mutiple cases to consider. 3. The netcdf attribute **is** of type NC_CHAR and its value – taken as a single sequence of characters – **is** parseable as a legal JSON expression. * Parse to produce a JSON expression and write that expression. - * Use "|U1" as the dtype and store in the NCZarr metadata. + * Use "|J0" as the dtype and store in the NCZarr metadata. 4. The netcdf attribute **is** of type NC_CHAR and its value – taken as a single sequence of characters – **is not** parseable as a legal JSON expression. * Convert to a JSON string and write that expression - * Use "|U1" as the dtype and store in the NCZarr metadata. + * Use ">S1" as the dtype and store in the NCZarr metadata. ## Reading an attribute: @@ -640,10 +621,7 @@ and then store it as the equivalent netcdf vector. * If the dtype is not defined, then infer the dtype based on the first JSON value in the array, and then store it as the equivalent netcdf vector. -3. The JSON expression is an array some of whose values are dictionaries or (sub-)arrays. - * Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR. - -3. The JSON expression is a dictionary. +3. The attribute is any other JSON structure. * Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR. ## Notes @@ -654,7 +632,7 @@ actions "read-write-read" is equivalent to a single "read" and "write-read-write The "almost" caveat is necessary because (1) whitespace may be added or lost during the sequence of operations, and (2) numeric precision may change. -# Appendix E. 
Support for string types +# Appendix D. Support for string types Zarr supports a string type, but it is restricted to fixed size strings. NCZarr also supports such strings, @@ -702,6 +680,182 @@ the above types should always appear as strings, and the type that signals NC_CHAR (in NCZarr) would be handled by Zarr as a string of length 1. + + +# References {#nczarr_bib} + +[1] [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)
+[2] [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)
+[3] [The LibZip Library](https://libzip.org/)
+[4] [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)
+[5] [Python Documentation: 8.3. +collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)
+[6] [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)
+[7] [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)
+[8] [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)
+[9] [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)
+[10] [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)
+[11] [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)
+[12] [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)
+ + # Change Log {#nczarr_changelog} [Note: minor text changes are not included.] @@ -710,6 +864,12 @@ intended to be a detailed chronology. Rather, it provides highlights that will be of interest to NCZarr users. In order to see exact changes, It is necessary to use the 'git diff' command. +## 03/31/2024 +1. Document the change to V2 to using attributes to hold NCZarr metadata. + +## 01/31/2024 +1. Add description of support for Zarr version 3 as an appendix. + ## 3/10/2023 1. Move most of the S3 text to the cloud.md document. @@ -729,4 +889,4 @@ include arbitrary JSON expressions; see Appendix D for more details. __Author__: Dennis Heimbigner
__Email__: dmh at ucar dot edu
__Initial Version__: 4/10/2020
-__Last Revised__: 3/8/2023 +__Last Revised__: 4/02/2024 diff --git a/include/nc4internal.h b/include/nc4internal.h index 56be310865..9a2aac02be 100644 --- a/include/nc4internal.h +++ b/include/nc4internal.h @@ -512,6 +512,5 @@ extern void NC_initialize_reserved(void); #define NC_NCZARR_GROUP "_nczarr_group" #define NC_NCZARR_ARRAY "_nczarr_array" #define NC_NCZARR_ATTR "_nczarr_attr" -#define NC_NCZARR_ATTR_UC "_NCZARR_ATTR" /* deprecated */ #endif /* _NC4INTERNAL_ */ diff --git a/include/ncjson.h b/include/ncjson.h index df24c0a569..62426778e4 100644 --- a/include/ncjson.h +++ b/include/ncjson.h @@ -57,7 +57,7 @@ typedef struct NCjson { int sort; /* of this object */ char* string; /* sort != DICT|ARRAY */ struct NCjlist { - int len; + size_t len; struct NCjson** contents; } list; /* sort == DICT|ARRAY */ } NCjson; @@ -96,7 +96,7 @@ OPTEXPORT int NCJnewstring(int sort, const char* value, NCjson** jsonp); OPTEXPORT int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); /* Get dict key value by name */ -OPTEXPORT int NCJdictget(const NCjson* dict, const char* key, NCjson** valuep); +OPTEXPORT int NCJdictget(const NCjson* dict, const char* key, const NCjson** valuep); /* Convert one json sort to value of another type; don't use union so we can know when to reclaim sval */ OPTEXPORT int NCJcvt(const NCjson* value, int outsort, struct NCJconst* output); @@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s); OPTEXPORT int NCJappend(NCjson* object, NCjson* value); /* Insert key-value pair into a dict object. key will be copied */ -OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value); +OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value); + +/* Insert key-value pair as strings into a dict object. 
+ key and value will be copied */ +OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value); + +/* Insert key-value pair where value is an int */ +OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue); /* Unparser to convert NCjson object to text in buffer */ OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp); @@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json); #define NCJsort(x) ((x)->sort) #define NCJstring(x) ((x)->string) #define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len) +#define NCJdictlength(x) ((x)==NULL ? 0 : (x)->list.len/2) #define NCJcontents(x) ((x)->list.contents) #define NCJith(x,i) ((x)->list.contents[i]) +#define NCJdictith(x,i) ((x)->list.contents[2*(i)]) /* Setters */ #define NCJsetsort(x,s) (x)->sort=(s) diff --git a/include/netcdf_json.h b/include/netcdf_json.h index 6879edf899..5d77cadb34 100644 --- a/include/netcdf_json.h +++ b/include/netcdf_json.h @@ -57,7 +57,7 @@ typedef struct NCjson { int sort; /* of this object */ char* string; /* sort != DICT|ARRAY */ struct NCjlist { - int len; + size_t len; struct NCjson** contents; } list; /* sort == DICT|ARRAY */ } NCjson; @@ -96,7 +96,7 @@ OPTEXPORT int NCJnewstring(int sort, const char* value, NCjson** jsonp); OPTEXPORT int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); /* Get dict key value by name */ -OPTEXPORT int NCJdictget(const NCjson* dict, const char* key, NCjson** valuep); +OPTEXPORT int NCJdictget(const NCjson* dict, const char* key, const NCjson** valuep); /* Convert one json sort to value of another type; don't use union so we can know when to reclaim sval */ OPTEXPORT int NCJcvt(const NCjson* value, int outsort, struct NCJconst* output); @@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s); OPTEXPORT int NCJappend(NCjson* object, NCjson* value); /* Insert key-value pair into a dict object. 
key will be copied */ -OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value); +OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value); + +/* Insert key-value pair as strings into a dict object. + key and value will be copied */ +OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value); + +/* Insert key-value pair where value is an int */ +OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue); /* Unparser to convert NCjson object to text in buffer */ OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp); @@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json); #define NCJsort(x) ((x)->sort) #define NCJstring(x) ((x)->string) #define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len) +#define NCJdictlength(x) ((x)==NULL ? 0 : (x)->list.len/2) #define NCJcontents(x) ((x)->list.contents) #define NCJith(x,i) ((x)->list.contents[i]) +#define NCJdictith(x,i) ((x)->list.contents[2*i]) /* Setters */ #define NCJsetsort(x,s) (x)->sort=(s) @@ -278,7 +287,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp); static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); static int NCJclone(const NCjson* json, NCjson** clonep); static int NCJaddstring(NCjson* json, int sort, const char* s); -static int NCJinsert(NCjson* object, char* key, NCjson* jvalue); +static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue); +static int NCJinsertstring(NCjson* object, const char* key, const char* value); +static int NCJinsertint(NCjson* object, const char* key, long long ivalue); static int NCJappend(NCjson* object, NCjson* value); static int NCJunparse(const NCjson* json, unsigned flags, char** textp); #else /*!NETCDF_JSON_H*/ @@ -764,7 +775,7 @@ NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp) } OPTSTATIC int -NCJdictget(const NCjson* dict, const char* key, NCjson** valuep) +NCJdictget(const NCjson* dict, 
const char* key, const NCjson** valuep) { int i,stat = NCJ_OK; @@ -1050,7 +1061,7 @@ NCJaddstring(NCjson* json, int sort, const char* s) /* Insert key-value pair into a dict object. key will be strdup'd */ OPTSTATIC int -NCJinsert(NCjson* object, char* key, NCjson* jvalue) +NCJinsert(NCjson* object, const char* key, NCjson* jvalue) { int stat = NCJ_OK; NCjson* jkey = NULL; @@ -1063,6 +1074,36 @@ NCJinsert(NCjson* object, char* key, NCjson* jvalue) return NCJTHROW(stat); } +/* Insert key-value pair as strings into a dict object. + key and value will be strdup'd */ +OPTSTATIC int +NCJinsertstring(NCjson* object, const char* key, const char* value) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + if(value == NULL) + NCJnew(NCJ_NULL,&jvalue); + else + NCJnewstring(NCJ_STRING,value,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + +/* Insert key-value pair with value being an integer */ +OPTSTATIC int +NCJinsertint(NCjson* object, const char* key, long long ivalue) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + char digits[128]; + snprintf(digits,sizeof(digits),"%lld",ivalue); + NCJnewstring(NCJ_STRING,digits,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + /* Append value to an array or dict object. 
*/ OPTSTATIC int NCJappend(NCjson* object, NCjson* value) diff --git a/libdispatch/ncjson.c b/libdispatch/ncjson.c index 363b24ffef..148415666e 100644 --- a/libdispatch/ncjson.c +++ b/libdispatch/ncjson.c @@ -128,7 +128,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp); static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); static int NCJclone(const NCjson* json, NCjson** clonep); static int NCJaddstring(NCjson* json, int sort, const char* s); -static int NCJinsert(NCjson* object, char* key, NCjson* jvalue); +static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue); +static int NCJinsertstring(NCjson* object, const char* key, const char* value); +static int NCJinsertint(NCjson* object, const char* key, long long ivalue); static int NCJappend(NCjson* object, NCjson* value); static int NCJunparse(const NCjson* json, unsigned flags, char** textp); #else /*!NETCDF_JSON_H*/ @@ -614,7 +616,7 @@ NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp) } OPTSTATIC int -NCJdictget(const NCjson* dict, const char* key, NCjson** valuep) +NCJdictget(const NCjson* dict, const char* key, const NCjson** valuep) { int i,stat = NCJ_OK; @@ -900,7 +902,7 @@ NCJaddstring(NCjson* json, int sort, const char* s) /* Insert key-value pair into a dict object. key will be strdup'd */ OPTSTATIC int -NCJinsert(NCjson* object, char* key, NCjson* jvalue) +NCJinsert(NCjson* object, const char* key, NCjson* jvalue) { int stat = NCJ_OK; NCjson* jkey = NULL; @@ -913,6 +915,36 @@ NCJinsert(NCjson* object, char* key, NCjson* jvalue) return NCJTHROW(stat); } +/* Insert key-value pair as strings into a dict object. 
+ key and value will be strdup'd */ +OPTSTATIC int +NCJinsertstring(NCjson* object, const char* key, const char* value) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + if(value == NULL) + NCJnew(NCJ_NULL,&jvalue); + else + NCJnewstring(NCJ_STRING,value,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + +/* Insert key-value pair with value being an integer */ +OPTSTATIC int +NCJinsertint(NCjson* object, const char* key, long long ivalue) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + char digits[128]; + snprintf(digits,sizeof(digits),"%lld",ivalue); + NCJnewstring(NCJ_STRING,digits,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + /* Append value to an array or dict object. */ OPTSTATIC int NCJappend(NCjson* object, NCjson* value) diff --git a/libdispatch/ncs3sdk_h5.c b/libdispatch/ncs3sdk_h5.c index f8263293b7..0f99dfb473 100644 --- a/libdispatch/ncs3sdk_h5.c +++ b/libdispatch/ncs3sdk_h5.c @@ -122,7 +122,7 @@ NC_s3sdkinitialize(void) } /* Get environment information */ - NC_s3sdkenvironment(void); + NC_s3sdkenvironment(); return NC_NOERR; } diff --git a/libnczarr/zarr.c b/libnczarr/zarr.c index 832b0d7c40..9ff7893a7f 100644 --- a/libnczarr/zarr.c +++ b/libnczarr/zarr.c @@ -239,61 +239,6 @@ NCZ_get_superblock(NC_FILE_INFO_T* file, int* superblockp) /**************************************************/ /* Utilities */ -#if 0 -/** -@internal Open the root group object -@param dataset - [in] the root dataset object -@param rootp - [out] created root group -@return NC_NOERR -@author Dennis Heimbigner -*/ -static int -ncz_open_rootgroup(NC_FILE_INFO_T* dataset) -{ - int stat = NC_NOERR; - int i; - NCZ_FILE_INFO_T* zfile = NULL; - NC_GRP_INFO_T* root = NULL; - void* content = NULL; - char* rootpath = NULL; - NCjson* json = NULL; - - ZTRACE(3,"dataset=",dataset->hdr.name); - - zfile = dataset->format_file_info; - - /* Root should already be defined */ - root = dataset->root_grp; - - assert(root != NULL); - - 
if((stat=nczm_concat(NULL,ZGROUP,&rootpath))) - goto done; - if((stat = NCZ_downloadjson(zfile->map, rootpath, &json))) - goto done; - /* Process the json */ - for(i=0;icontents);i+=2) { - const NCjson* key = nclistget(json->contents,i); - const NCjson* value = nclistget(json->contents,i+1); - if(strcmp(NCJstring(key),"zarr_format")==0) { - int zversion; - if(sscanf(NCJstring(value),"%d",&zversion)!=1) - {stat = NC_ENOTNC; goto done;} - /* Verify against the dataset */ - if(zversion != zfile->zarr.zarr_version) - {stat = NC_ENOTNC; goto done;} - } - } - -done: - if(json) NCJreclaim(json); - nullfree(rootpath); - nullfree(content); - return ZUNTRACE(stat); -} -#endif - - static const char* controllookup(NClist* controls, const char* key) { @@ -315,7 +260,7 @@ applycontrols(NCZ_FILE_INFO_T* zinfo) int stat = NC_NOERR; const char* value = NULL; NClist* modelist = nclistnew(); - int noflags = 0; /* track non-default negative flags */ + size64_t noflags = 0; /* track non-default negative flags */ if((value = controllookup(zinfo->controllist,"mode")) != NULL) { if((stat = NCZ_comma_parse(value,modelist))) goto done; @@ -352,76 +297,3 @@ applycontrols(NCZ_FILE_INFO_T* zinfo) nclistfreeall(modelist); return stat; } - -#if 0 -/** -@internal Rewrite attributes into a group or var -@param map - [in] the map object for storage -@param container - [in] the containing object -@param jattrs - [in] the json for .zattrs -@param jtypes - [in] the json for .ztypes -@return NC_NOERR -@author Dennis Heimbigner -*/ -int -ncz_unload_jatts(NCZ_FILE_INFO_T* zinfo, NC_OBJ* container, NCjson* jattrs, NCjson* jtypes) -{ - int stat = NC_NOERR; - char* fullpath = NULL; - char* akey = NULL; - char* tkey = NULL; - NCZMAP* map = zinfo->map; - - assert((NCJsort(jattrs) == NCJ_DICT)); - assert((NCJsort(jtypes) == NCJ_DICT)); - - if(container->sort == NCGRP) { - NC_GRP_INFO_T* grp = (NC_GRP_INFO_T*)container; - /* Get grp's fullpath name */ - if((stat = NCZ_grpkey(grp,&fullpath))) - goto done; - } 
else { - NC_VAR_INFO_T* var = (NC_VAR_INFO_T*)container; - /* Get var's fullpath name */ - if((stat = NCZ_varkey(var,&fullpath))) - goto done; - } - - /* Construct the path to the .zattrs object */ - if((stat = nczm_concat(fullpath,ZATTRS,&akey))) - goto done; - - /* Always write as V2 */ - - { - NCjson* k = NULL; - NCjson* v = NULL; - /* remove any previous version */ - if(!NCJremove(jattrs,NCZ_V2_ATTRS,1,&k,&v)) { - NCJreclaim(k); NCJreclaim(v); - } - } - - if(!(zinfo->controls.flags & FLAG_PUREZARR)) { - /* Insert the jtypes into the set of attributes */ - if((stat = NCJinsert(jattrs,NCZ_V2_ATTRS,jtypes))) goto done; - } - - /* Upload the .zattrs object */ - if((stat=NCZ_uploadjson(map,tkey,jattrs))) - goto done; - -done: - if(stat) { - NCJreclaim(jattrs); - NCJreclaim(jtypes); - } - nullfree(fullpath); - nullfree(akey); - nullfree(tkey); - return stat; -} -#endif - - - diff --git a/libnczarr/zarr.h b/libnczarr/zarr.h index 22dd2d1cfc..714fb3bcea 100644 --- a/libnczarr/zarr.h +++ b/libnczarr/zarr.h @@ -41,15 +41,15 @@ EXTERNL int ncz_unload_jatts(NCZ_FILE_INFO_T*, NC_OBJ* container, NCjson* jattrs EXTERNL int ncz_close_file(NC_FILE_INFO_T* file, int abort); /* zcvt.c */ -EXTERNL int NCZ_json2cvt(NCjson* jsrc, struct ZCVT* zcvt, nc_type* typeidp); -EXTERNL int NCZ_convert1(NCjson* jsrc, nc_type, NCbytes*); +EXTERNL int NCZ_json2cvt(const NCjson* jsrc, struct ZCVT* zcvt, nc_type* typeidp); +EXTERNL int NCZ_convert1(const NCjson* jsrc, nc_type, NCbytes*); EXTERNL int NCZ_stringconvert1(nc_type typid, char* src, NCjson* jvalue); EXTERNL int NCZ_stringconvert(nc_type typid, size_t len, void* data0, NCjson** jdatap); /* zsync.c */ EXTERNL int ncz_sync_file(NC_FILE_INFO_T* file, int isclose); EXTERNL int ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose); -EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, int isclose); +EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, NCjson* jatts, NCjson* 
jtypes, int isclose); EXTERNL int ncz_read_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); EXTERNL int ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container); EXTERNL int ncz_read_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); @@ -62,12 +62,10 @@ EXTERNL int NCZ_grpkey(const NC_GRP_INFO_T* grp, char** pathp); EXTERNL int NCZ_varkey(const NC_VAR_INFO_T* var, char** pathp); EXTERNL int NCZ_dimkey(const NC_DIM_INFO_T* dim, char** pathp); EXTERNL int ncz_splitkey(const char* path, NClist* segments); -EXTERNL int NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp); -EXTERNL int NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp); EXTERNL int ncz_nctypedecode(const char* snctype, nc_type* nctypep); EXTERNL int ncz_nctype2dtype(nc_type nctype, int endianness, int purezarr,int len, char** dnamep); EXTERNL int ncz_dtype2nctype(const char* dtype, nc_type typehint, int purezarr, nc_type* nctypep, int* endianp, int* typelenp); -EXTERNL int NCZ_inferattrtype(NCjson* value, nc_type typehint, nc_type* typeidp); +EXTERNL int NCZ_inferattrtype(const NCjson* value, nc_type typehint, nc_type* typeidp); EXTERNL int NCZ_inferinttype(unsigned long long u64, int negative); EXTERNL int ncz_fill_value_sort(nc_type nctype, int*); EXTERNL int NCZ_createobject(NCZMAP* zmap, const char* key, size64_t size); @@ -89,7 +87,7 @@ EXTERNL int NCZ_get_maxstrlen(NC_OBJ* obj); EXTERNL int NCZ_fixed2char(const void* fixed, char** charp, size_t count, int maxstrlen); EXTERNL int NCZ_char2fixed(const char** charp, void* fixed, size_t count, int maxstrlen); EXTERNL int NCZ_copy_data(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, const void* memory, size_t count, int reading, void* copy); -EXTERNL int NCZ_iscomplexjson(NCjson* value, nc_type typehint); +EXTERNL int NCZ_iscomplexjson(const NCjson* value, nc_type typehint); /* zwalk.c */ EXTERNL int NCZ_read_chunk(int ncid, int varid, size64_t* zindices, void* chunkdata); diff --git a/libnczarr/zattr.c b/libnczarr/zattr.c index 
29d8e693fb..7f3ef55454 100644 --- a/libnczarr/zattr.c +++ b/libnczarr/zattr.c @@ -51,7 +51,7 @@ ncz_getattlist(NC_GRP_INFO_T *grp, int varid, NC_VAR_INFO_T **varp, NCindex **at { NC_VAR_INFO_T *var; - if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid))) + if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid))) return NC_ENOTVAR; assert(var->hdr.id == varid); @@ -120,7 +120,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name, /* The global reserved attributes */ if(strcmp(name,NCPROPS)==0) { - int len; + size_t len; if(h5->provenance.ncproperties == NULL) {stat = NC_ENOTATT; goto done;} if(mem_type == NC_NAT) mem_type = NC_CHAR; @@ -138,7 +138,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name, if(strcmp(name,SUPERBLOCKATT)==0) iv = (unsigned long long)h5->provenance.superblockversion; else /* strcmp(name,ISNETCDF4ATT)==0 */ - iv = NCZ_isnetcdf4(h5); + iv = (unsigned long long)NCZ_isnetcdf4(h5); if(mem_type == NC_NAT) mem_type = NC_INT; if(data) switch (mem_type) { @@ -279,8 +279,8 @@ NCZ_del_att(int ncid, int varid, const char *name) NC_FILE_INFO_T *h5; NC_ATT_INFO_T *att; NCindex* attlist = NULL; - int i; - size_t deletedid; + size_t i; + int deletedid; int retval; /* Name must be provided. */ @@ -516,7 +516,7 @@ ncz_put_att(NC_GRP_INFO_T* grp, int varid, const char *name, nc_type file_type, /* For an existing att, if we're not in define mode, the len must not be greater than the existing len for classic model. */ if (!(h5->flags & NC_INDEF) && - len * nc4typelen(file_type) > (size_t)att->len * nc4typelen(att->nc_typeid)) + len * (size_t)nc4typelen(file_type) > (size_t)att->len * (size_t)nc4typelen(att->nc_typeid)) { if (h5->cmode & NC_CLASSIC_MODEL) return NC_ENOTINDEFINE; @@ -980,7 +980,7 @@ int ncz_create_fillvalue(NC_VAR_INFO_T* var) { int stat = NC_NOERR; - int i; + size_t i; NC_ATT_INFO_T* fv = NULL; /* Have the var's attributes been read? 
*/ diff --git a/libnczarr/zchunking.c b/libnczarr/zchunking.c index da0da5951f..442f53e0c7 100644 --- a/libnczarr/zchunking.c +++ b/libnczarr/zchunking.c @@ -258,7 +258,7 @@ NCZ_compute_all_slice_projections( NCZSliceProjections* results) { int stat = NC_NOERR; - size64_t r; + int r; for(r=0;rrank;r++) { /* Compute each of the rank SliceProjections instances */ diff --git a/libnczarr/zclose.c b/libnczarr/zclose.c index 7515bfcce7..3dbba0d6be 100644 --- a/libnczarr/zclose.c +++ b/libnczarr/zclose.c @@ -72,7 +72,7 @@ zclose_group(NC_GRP_INFO_T *grp) { int stat = NC_NOERR; NCZ_GRP_INFO_T* zgrp; - int i; + size_t i; assert(grp && grp->format_grp_info != NULL); LOG((3, "%s: grp->name %s", __func__, grp->hdr.name)); @@ -103,6 +103,9 @@ zclose_group(NC_GRP_INFO_T *grp) /* Close the zgroup. */ zgrp = grp->format_grp_info; LOG((4, "%s: closing group %s", __func__, grp->hdr.name)); + nullfree(zgrp->zgroup.prefix); + NCJreclaim(zgrp->zgroup.obj); + NCJreclaim(zgrp->zgroup.atts); nullfree(zgrp); grp->format_grp_info = NULL; /* avoid memory errors */ @@ -123,7 +126,7 @@ zclose_gatts(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_ATT_INFO_T *att; - int a; + size_t a; for(a = 0; a < ncindexsize(grp->att); a++) { NCZ_ATT_INFO_T* zatt = NULL; att = (NC_ATT_INFO_T* )ncindexith(grp->att, a); @@ -149,10 +152,9 @@ NCZ_zclose_var1(NC_VAR_INFO_T* var) int stat = NC_NOERR; NCZ_VAR_INFO_T* zvar; NC_ATT_INFO_T* att; - int a; + size_t a; assert(var && var->format_var_info); - zvar = var->format_var_info;; for(a = 0; a < ncindexsize(var->att); a++) { NCZ_ATT_INFO_T* zatt; att = (NC_ATT_INFO_T*)ncindexith(var->att, a); @@ -170,9 +172,14 @@ NCZ_zclose_var1(NC_VAR_INFO_T* var) #endif /* Reclaim the type */ if(var->type_info) (void)zclose_type(var->type_info); + /* reclaim dispatch info */ + zvar = var->format_var_info;; if(zvar->cache) NCZ_free_chunk_cache(zvar->cache); /* reclaim xarray */ if(zvar->xarray) nclistfreeall(zvar->xarray); + nullfree(zvar->zarray.prefix); + 
NCJreclaim(zvar->zarray.obj); + NCJreclaim(zvar->zarray.atts); nullfree(zvar); var->format_var_info = NULL; /* avoid memory errors */ return stat; @@ -191,7 +198,7 @@ zclose_vars(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_VAR_INFO_T* var; - int i; + size_t i; for(i = 0; i < ncindexsize(grp->vars); i++) { var = (NC_VAR_INFO_T*)ncindexith(grp->vars, i); @@ -215,7 +222,7 @@ zclose_dims(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_DIM_INFO_T* dim; - int i; + size_t i; for(i = 0; i < ncindexsize(grp->dim); i++) { NCZ_DIM_INFO_T* zdim; @@ -265,7 +272,7 @@ static int zclose_types(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; - int i; + size_t i; NC_TYPE_INFO_T* type; for(i = 0; i < ncindexsize(grp->type); i++) @@ -289,7 +296,7 @@ static int zwrite_vars(NC_GRP_INFO_T *grp) { int stat = NC_NOERR; - int i; + size_t i; assert(grp && grp->format_grp_info != NULL); LOG((3, "%s: grp->name %s", __func__, grp->hdr.name)); diff --git a/libnczarr/zcvt.c b/libnczarr/zcvt.c index 26dc936b07..879c5e8c20 100644 --- a/libnczarr/zcvt.c +++ b/libnczarr/zcvt.c @@ -15,7 +15,7 @@ Code taken directly from libdap4/d4cvt.c */ -static const int ncz_type_size[NC_MAX_ATOMIC_TYPE+1] = { +static const size_t ncz_type_size[NC_MAX_ATOMIC_TYPE+1] = { 0, /*NC_NAT*/ sizeof(char), /*NC_BYTE*/ sizeof(char), /*NC_CHAR*/ @@ -101,7 +101,7 @@ NCZ_string2cvt(char* src, nc_type srctype, struct ZCVT* zcvt, nc_type* typeidp) /* Warning: not free returned zcvt.strv; it may point into a string in jsrc */ int -NCZ_json2cvt(NCjson* jsrc, struct ZCVT* zcvt, nc_type* typeidp) +NCZ_json2cvt(const NCjson* jsrc, struct ZCVT* zcvt, nc_type* typeidp) { int stat = NC_NOERR; nc_type srctype = NC_NAT; @@ -154,7 +154,7 @@ NCZ_json2cvt(NCjson* jsrc, struct ZCVT* zcvt, nc_type* typeidp) /* Convert a singleton NCjson value to a memory equivalent value of specified dsttype; */ int -NCZ_convert1(NCjson* jsrc, nc_type dsttype, NCbytes* buf) +NCZ_convert1(const NCjson* jsrc, nc_type dsttype, NCbytes* buf) { int stat = NC_NOERR; nc_type 
srctype; @@ -536,7 +536,7 @@ int NCZ_stringconvert(nc_type typeid, size_t len, void* data0, NCjson** jdatap) { int stat = NC_NOERR; - int i; + size_t i; char* src = data0; /* so we can do arithmetic on it */ size_t typelen; char* str = NULL; diff --git a/libnczarr/zfilter.c b/libnczarr/zfilter.c index 6f4a8b9730..7481c4ab57 100644 --- a/libnczarr/zfilter.c +++ b/libnczarr/zfilter.c @@ -979,7 +979,7 @@ NCZ_filter_build(const NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, const NCjson* j { int i,stat = NC_NOERR; NCZ_Filter* filter = NULL; - NCjson* jvalue = NULL; + const NCjson* jvalue = NULL; NCZ_Plugin* plugin = NULL; NCZ_Codec codec = codec_empty; NCZ_HDF5 hdf5 = hdf5_empty; diff --git a/libnczarr/zinternal.h b/libnczarr/zinternal.h index 3c3f706f91..2548ad54ba 100644 --- a/libnczarr/zinternal.h +++ b/libnczarr/zinternal.h @@ -22,7 +22,6 @@ #define NCZ_CHUNKSIZE_FACTOR (10) #define NCZ_MIN_CHUNK_SIZE (2) - /**************************************************/ /* Constants */ @@ -39,56 +38,43 @@ # endif #endif -/* V1 reserved objects */ -#define NCZMETAROOT "/.nczarr" -#define NCZGROUP ".nczgroup" -#define NCZARRAY ".nczarray" -#define NCZATTRS ".nczattrs" -/* Deprecated */ -#define NCZVARDEP ".nczvar" -#define NCZATTRDEP ".nczattr" - #define ZMETAROOT "/.zgroup" +#define ZMETAATTR "/.zattrs" #define ZGROUP ".zgroup" #define ZATTRS ".zattrs" #define ZARRAY ".zarray" -/* Pure Zarr pseudo names */ -#define ZDIMANON "_zdim" - /* V2 Reserved Attributes */ /* -Inserted into /.zgroup +For nczarr version 2.x.x, the following (key,value) +pairs are stored in .zgroup and/or .zarray. + +Inserted into /.zattrs in root group _nczarr_superblock: {"version": "2.0.0"} -Inserted into any .zgroup + +Inserted into any group level .zattrs "_nczarr_group": "{ -\"dimensions\": {\"d1\": \"1\", \"d2\": \"1\",...} -\"variables\": [\"v1\", \"v2\", ...] +\"dimensions\": [{name: , size: , unlimited: 1|0},...], +\"arrays\": [\"v1\", \"v2\", ...] \"groups\": [\"g1\", \"g2\", ...] 
}" -Inserted into any .zarray + +Inserted into any array level .zattrs "_nczarr_array": "{ -\"dimensions\": [\"/g1/g2/d1\", \"/d2\",...] -\"storage\": \"scalar\"|\"contiguous\"|\"compact\"|\"chunked\" +\"dimension_references\": [\"/g1/g2/d1\", \"/d2\",...] +\"storage\": \"scalar\"|\"contiguous\"|\"chunked\" }" -Inserted into any .zattrs ? or should it go into the container? + +Inserted into any .zattrs "_nczarr_attr": "{ \"types\": {\"attr1\": \" NC_CHAR. -+ */ #define NCZ_V2_SUPERBLOCK "_nczarr_superblock" #define NCZ_V2_GROUP "_nczarr_group" #define NCZ_V2_ARRAY "_nczarr_array" -#define NCZ_V2_ATTR NC_NCZARR_ATTR - -#define NCZ_V2_SUPERBLOCK_UC "_NCZARR_SUPERBLOCK" -#define NCZ_V2_GROUP_UC "_NCZARR_GROUP" -#define NCZ_V2_ARRAY_UC "_NCZARR_ARRAY" -#define NCZ_V2_ATTR_UC NC_NCZARR_ATTR_UC +#define NCZ_V2_ATTR "_nczarr_attr" /* Must match value in include/nc4internal.h */ #define NCZARRCONTROL "nczarr" #define PUREZARRCONTROL "zarr" @@ -154,7 +140,7 @@ typedef struct NCZ_FILE_INFO { # define FLAG_SHOWFETCH 2 # define FLAG_LOGGING 4 # define FLAG_XARRAYDIMS 8 -# define FLAG_NCZARR_V1 16 +# define FLAG_NCZARR_KEY 16 /* _nczarr_xxx keys are stored in object and not in _nczarr_attrs */ NCZM_IMPL mapimpl; } controls; int default_maxstrlen; /* default max str size for variables of type string */ @@ -173,18 +159,13 @@ typedef struct NCZ_ATT_INFO { /* Struct to hold ZARR-specific info for a group. */ typedef struct NCZ_GRP_INFO { NCZcommon common; -#if 0 - /* The jcontent field stores the following: - 1. List of (name,length) for dims in the group - 2. List of (name,type) for user-defined types in the group - 3. List of var names in the group - 4. 
List of subgroups names in the group - */ - NClist* dims; - NClist* types; /* currently not used */ - NClist* vars; - NClist* grps; -#endif + /* Read .zgroup and .zattrs once */ + struct ZARROBJ { + char* prefix; /* prefix of .zgroup and .zattrs */ + NCjson* obj; /* .zgroup|.zarray */ + NCjson* atts; + int nczv1; /* 1 => _nczarr_xxx are in obj and not attributes */ + } zgroup; } NCZ_GRP_INFO_T; /* Struct to hold ZARR-specific info for a variable. */ @@ -199,6 +180,9 @@ typedef struct NCZ_VAR_INFO { char dimension_separator; /* '.' | '/' */ NClist* incompletefilters; int maxstrlen; /* max length of strings for this variable */ + /* Read .zarray and .zattrs once */ + struct ZARROBJ zarray; + struct ZARROBJ zattrs; } NCZ_VAR_INFO_T; /* Struct to hold ZARR-specific info for a field. */ diff --git a/libnczarr/zsync.c b/libnczarr/zsync.c index 4d8ee9d9ca..5fcb2547da 100644 --- a/libnczarr/zsync.c +++ b/libnczarr/zsync.c @@ -8,7 +8,7 @@ #include #ifndef nulldup - #define nulldup(x) ((x)?strdup(x):(x)) +#define nulldup(x) ((x)?strdup(x):(x)) #endif #undef FILLONCLOSE @@ -21,28 +21,33 @@ static int ncz_collect_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NCjson** jdimsp); static int ncz_sync_var(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose); -static int load_jatts(NCZMAP* map, NC_OBJ* container, int nczarrv1, NCjson** jattrsp, NClist** atypes); -static int zconvert(NCjson* src, nc_type typeid, size_t typelen, int* countp, NCbytes* dst); -static int computeattrinfo(const char* name, NClist* atypes, nc_type typehint, int purezarr, NCjson* values, +static int download_jatts(NC_FILE_INFO_T* file, NC_OBJ* container, const NCjson** jattsp, const NCjson** jtypesp); +static int zconvert(const NCjson* src, nc_type typeid, size_t typelen, int* countp, NCbytes* dst); +static int computeattrinfo(const char* name, const NCjson* jtypes, nc_type typehint, int purezarr, NCjson* values, nc_type* typeidp, size_t* typelenp, size_t* lenp, void** datap); -static int 
parse_group_content(NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps); +static int parse_group_content(const NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps); static int parse_group_content_pure(NCZ_FILE_INFO_T* zinfo, NC_GRP_INFO_T* grp, NClist* varnames, NClist* subgrps); static int define_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); static int define_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* diminfo); static int define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames); +static int define_var1(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, const char* varname); static int define_subgrps(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* subgrpnames); static int searchvars(NCZ_FILE_INFO_T*, NC_GRP_INFO_T*, NClist*); static int searchsubgrps(NCZ_FILE_INFO_T*, NC_GRP_INFO_T*, NClist*); static int locategroup(NC_FILE_INFO_T* file, size_t nsegs, NClist* segments, NC_GRP_INFO_T** grpp); static int createdim(NC_FILE_INFO_T* file, const char* name, size64_t dimlen, NC_DIM_INFO_T** dimp); static int parsedimrefs(NC_FILE_INFO_T*, NClist* dimnames, size64_t* shape, NC_DIM_INFO_T** dims, int create); -static int decodeints(NCjson* jshape, size64_t* shapes); -static int computeattrdata(nc_type typehint, nc_type* typeidp, NCjson* values, size_t* typelenp, size_t* lenp, void** datap); +static int decodeints(const NCjson* jshape, size64_t* shapes); +static int computeattrdata(nc_type typehint, nc_type* typeidp, const NCjson* values, size_t* typelenp, size_t* lenp, void** datap); static int computedimrefs(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int purezarr, int xarray, int ndims, NClist* dimnames, size64_t* shapes, NC_DIM_INFO_T** dims); -static int json_convention_read(NCjson* jdict, NCjson** jtextp); -static int jtypes2atypes(NCjson* jtypes, NClist* atypes); - +static int json_convention_read(const NCjson* jdict, NCjson** jtextp); static int ncz_validate(NC_FILE_INFO_T* file); +static int insert_attr(NCjson* 
jatts, NCjson* jtypes, const char* aname, NCjson* javalue, const char* atype); +static int insert_nczarr_attr(NCjson* jatts, NCjson* jtypes); +static int upload_attrs(NC_FILE_INFO_T* file, NC_OBJ* container, NCjson* jatts); +static int getnczarrkey(NC_OBJ* container, const char* name, const NCjson** jncxxxp); +static int downloadzarrobj(NC_FILE_INFO_T*, struct ZARROBJ* zobj, const char* fullpath, const char* objname); +static int dictgetalt(const NCjson* jdict, const char* name, const char* alt, const NCjson** jvaluep); /**************************************************/ /**************************************************/ @@ -93,7 +98,8 @@ ncz_sync_file(NC_FILE_INFO_T* file, int isclose) static int ncz_collect_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NCjson** jdimsp) { - int i, stat=NC_NOERR; + int stat=NC_NOERR; + size_t i; NCjson* jdims = NULL; NCjson* jdimsize = NULL; NCjson* jdimargs = NULL; @@ -107,24 +113,24 @@ ncz_collect_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NCjson** jdimsp) char slen[128]; snprintf(slen,sizeof(slen),"%llu",(unsigned long long)dim->len); - if((stat = NCJnewstring(NCJ_INT,slen,&jdimsize))) goto done; + NCJnewstring(NCJ_INT,slen,&jdimsize); /* If dim is not unlimited, then write in the old format to provide maximum back compatibility. 
*/ if(dim->unlimited) { NCJnew(NCJ_DICT,&jdimargs); - if((stat = NCJaddstring(jdimargs,NCJ_STRING,"size"))) goto done; - if((stat = NCJappend(jdimargs,jdimsize))) goto done; + if((stat = NCJaddstring(jdimargs,NCJ_STRING,"size"))<0) {stat = NC_EINVAL; goto done;} + if((stat = NCJappend(jdimargs,jdimsize))<0) {stat = NC_EINVAL; goto done;} jdimsize = NULL; - if((stat = NCJaddstring(jdimargs,NCJ_STRING,"unlimited"))) goto done; - if((stat = NCJaddstring(jdimargs,NCJ_INT,"1"))) goto done; + if((stat = NCJaddstring(jdimargs,NCJ_STRING,"unlimited"))<0) {stat = NC_EINVAL; goto done;} + if((stat = NCJaddstring(jdimargs,NCJ_INT,"1"))<0) {stat = NC_EINVAL; goto done;} } else { /* !dim->unlimited */ jdimargs = jdimsize; jdimsize = NULL; } - if((stat = NCJaddstring(jdims,NCJ_STRING,dim->hdr.name))) goto done; - if((stat = NCJappend(jdims,jdimargs))) goto done; + if((stat = NCJaddstring(jdims,NCJ_STRING,dim->hdr.name))<0) {stat = NC_EINVAL; goto done;} + if((stat = NCJappend(jdims,jdimargs))<0) {stat = NC_EINVAL; goto done;} } if(jdimsp) {*jdimsp = jdims; jdims = NULL;} done: @@ -144,7 +150,8 @@ ncz_collect_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NCjson** jdimsp) int ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NCZ_FILE_INFO_T* zinfo = NULL; char version[1024]; int purezarr = 0; @@ -156,8 +163,11 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) NCjson* jdims = NULL; NCjson* jvars = NULL; NCjson* jsubgrps = NULL; + NCjson* jnczgrp = NULL; NCjson* jsuper = NULL; NCjson* jtmp = NULL; + NCjson* jatts = NULL; + NCjson* jtypes = NULL; LOG((3, "%s: dims: %s", __func__, key)); ZTRACE(3,"file=%s grp=%s isclose=%d",file->controller->path,grp->hdr.name,isclose); @@ -169,76 +179,83 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) /* Construct grp key */ if((stat = NCZ_grpkey(grp,&fullpath))) + goto done; + + /* build ZGROUP contents */ + 
NCJnew(NCJ_DICT,&jgroup); + snprintf(version,sizeof(version),"%d",zinfo->zarr.zarr_version); + if((stat = NCJaddstring(jgroup,NCJ_STRING,"zarr_format"))<0) {stat = NC_EINVAL; goto done;} + if((stat = NCJaddstring(jgroup,NCJ_INT,version))<0) {stat = NC_EINVAL; goto done;} + /* build ZGROUP path */ + if((stat = nczm_concat(fullpath,ZGROUP,&key))) goto done; + /* Write to map */ + if((stat=NCZ_uploadjson(map,key,jgroup))) goto done; + nullfree(key); key = NULL; if(!purezarr) { + if(grp->parent == NULL) { /* Root group */ + /* create superblock */ + snprintf(version,sizeof(version),"%lu.%lu.%lu", + zinfo->zarr.nczarr_version.major, + zinfo->zarr.nczarr_version.minor, + zinfo->zarr.nczarr_version.release); + NCJnew(NCJ_DICT,&jsuper); + if((stat = NCJinsertstring(jsuper,"version",version))<0) {stat = NC_EINVAL; goto done;} + } /* Create dimensions dict */ if((stat = ncz_collect_dims(file,grp,&jdims))) goto done; /* Create vars list */ - if((stat = NCJnew(NCJ_ARRAY,&jvars))) - goto done; + NCJnew(NCJ_ARRAY,&jvars); for(i=0; i<ncindexsize(grp->vars); i++) { NC_VAR_INFO_T* var = (NC_VAR_INFO_T*)ncindexith(grp->vars,i); - if((stat = NCJaddstring(jvars,NCJ_STRING,var->hdr.name))) goto done; + if((stat = NCJaddstring(jvars,NCJ_STRING,var->hdr.name))<0) {stat = NC_EINVAL; goto done;} } /* Create subgroups list */ - if((stat = NCJnew(NCJ_ARRAY,&jsubgrps))) - goto done; + NCJnew(NCJ_ARRAY,&jsubgrps); for(i=0; i<ncindexsize(grp->children); i++) { NC_GRP_INFO_T* g = (NC_GRP_INFO_T*)ncindexith(grp->children,i); - if((stat = NCJaddstring(jsubgrps,NCJ_STRING,g->hdr.name))) goto done; + if((stat = NCJaddstring(jsubgrps,NCJ_STRING,g->hdr.name))<0) {stat = NC_EINVAL; goto done;} } /* Create the "_nczarr_group" dict */ - if((stat = NCJnew(NCJ_DICT,&json))) - goto done; + NCJnew(NCJ_DICT,&jnczgrp); /* Insert the various dicts and arrays */ - if((stat = NCJinsert(json,"dims",jdims))) goto done; + if((stat = NCJinsert(jnczgrp,"dimensions",jdims))<0) {stat = NC_EINVAL; goto done;} jdims = NULL; /* avoid memory problems */ -
if((stat = NCJinsert(json,"vars",jvars))) goto done; + if((stat = NCJinsert(jnczgrp,"arrays",jvars))<0) {stat = NC_EINVAL; goto done;} jvars = NULL; /* avoid memory problems */ - if((stat = NCJinsert(json,"groups",jsubgrps))) goto done; + if((stat = NCJinsert(jnczgrp,"groups",jsubgrps))<0) {stat = NC_EINVAL; goto done;} jsubgrps = NULL; /* avoid memory problems */ } - /* build ZGROUP contents */ - if((stat = NCJnew(NCJ_DICT,&jgroup))) - goto done; - snprintf(version,sizeof(version),"%d",zinfo->zarr.zarr_version); - if((stat = NCJaddstring(jgroup,NCJ_STRING,"zarr_format"))) goto done; - if((stat = NCJaddstring(jgroup,NCJ_INT,version))) goto done; - if(!purezarr && grp->parent == NULL) { /* Root group */ - snprintf(version,sizeof(version),"%lu.%lu.%lu", - zinfo->zarr.nczarr_version.major, - zinfo->zarr.nczarr_version.minor, - zinfo->zarr.nczarr_version.release); - if((stat = NCJnew(NCJ_DICT,&jsuper))) goto done; - if((stat-NCJnewstring(NCJ_STRING,version,&jtmp))) goto done; - if((stat = NCJinsert(jsuper,"version",jtmp))) goto done; - jtmp = NULL; - if((stat = NCJinsert(jgroup,NCZ_V2_SUPERBLOCK,jsuper))) goto done; - jsuper = NULL; + /* Build the .zattrs object */ + assert(grp->att); + NCJnew(NCJ_DICT,&jatts); + NCJnew(NCJ_DICT,&jtypes); + if((stat = ncz_sync_atts(file, (NC_OBJ*)grp, grp->att, jatts, jtypes, isclose))) goto done; + + if(!purezarr && jnczgrp != NULL) { + /* Insert _nczarr_group */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_GROUP,jnczgrp,"|J0"))) goto done; + jnczgrp = NULL; } - if(!purezarr) { - /* Insert the "_NCZARR_GROUP" dict */ - if((stat = NCJinsert(jgroup,NCZ_V2_GROUP,json))) goto done; - json = NULL; + if(!purezarr && jsuper != NULL) { + /* Insert superblock */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_SUPERBLOCK,jsuper,"|J0"))) goto done; + jsuper = NULL; } - /* build ZGROUP path */ - if((stat = nczm_concat(fullpath,ZGROUP,&key))) - goto done; - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jgroup))) - goto done; - nullfree(key); key = 
NULL; + /* As a last mod to jatts, insert the jtypes as an attribute */ + if(!purezarr && jtypes != NULL) { + if((stat = insert_nczarr_attr(jatts,jtypes))) goto done; + jtypes = NULL; + } - /* Build the .zattrs object */ - assert(grp->att); - if((stat = ncz_sync_atts(file,(NC_OBJ*)grp, grp->att, isclose))) - goto done; + /* Write out the .zattrs */ + if((stat = upload_attrs(file,(NC_OBJ*)grp,jatts))) goto done; /* Now synchronize all the variables */ for(i=0; i<ncindexsize(grp->vars); i++) { @@ -260,6 +277,9 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) NCJreclaim(jdims); NCJreclaim(jvars); NCJreclaim(jsubgrps); + NCJreclaim(jnczgrp); + NCJreclaim(jtypes); + NCJreclaim(jatts); nullfree(fullpath); nullfree(key); return ZUNTRACE(THROW(stat)); @@ -292,6 +312,8 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) NCjson* jdimrefs = NULL; NCjson* jtmp = NULL; NCjson* jfill = NULL; + NCjson* jatts = NULL; + NCjson* jtypes = NULL; char* dtypename = NULL; int purezarr = 0; size64_t shape[NC_MAX_VAR_DIMS]; @@ -326,13 +348,12 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) goto done; /* Create the zarray json object */ - if((stat = NCJnew(NCJ_DICT,&jvar))) - goto done; + NCJnew(NCJ_DICT,&jvar); /* zarr_format key */ snprintf(number,sizeof(number),"%d",zinfo->zarr.zarr_version); - if((stat = NCJaddstring(jvar,NCJ_STRING,"zarr_format"))) goto done; - if((stat = NCJaddstring(jvar,NCJ_INT,number))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"zarr_format"))<0) {stat = NC_EINVAL; goto done;} + if((stat = NCJaddstring(jvar,NCJ_INT,number))<0) {stat = NC_EINVAL; goto done;} /* Collect the shape vector */ for(i=0;i<var->ndims;i++) { @@ -346,25 +367,25 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) /* shape key */ /* Integer list defining the length of each dimension of the array.*/ /* Create the list */ - if((stat = NCJnew(NCJ_ARRAY,&jtmp))) goto done; + NCJnew(NCJ_ARRAY,&jtmp); 
if(zvar->scalar) { NCJaddstring(jtmp,NCJ_INT,"1"); } else for(i=0;i<var->ndims;i++) { snprintf(number,sizeof(number),"%llu",shape[i]); NCJaddstring(jtmp,NCJ_INT,number); } - if((stat = NCJinsert(jvar,"shape",jtmp))) goto done; + if((stat = NCJinsert(jvar,"shape",jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; /* dtype key */ /* A string or list defining a valid data type for the array. */ - if((stat = NCJaddstring(jvar,NCJ_STRING,"dtype"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"dtype"))<0) {stat = NC_EINVAL; goto done;} { /* Add the type name */ int endianness = var->type_info->endianness; int atomictype = var->type_info->hdr.id; assert(atomictype > 0 && atomictype <= NC_MAX_ATOMIC_TYPE); if((stat = ncz_nctype2dtype(atomictype,endianness,purezarr,NCZ_get_maxstrlen((NC_OBJ*)var),&dtypename))) goto done; - if((stat = NCJaddstring(jvar,NCJ_STRING,dtypename))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,dtypename))<0) {stat = NC_EINVAL; goto done;} nullfree(dtypename); dtypename = NULL; } @@ -373,9 +394,9 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) of contiguous (or compact), so it will never appear in the read case. 
*/ /* list of chunk sizes */ - if((stat = NCJaddstring(jvar,NCJ_STRING,"chunks"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"chunks"))<0) {stat = NC_EINVAL; goto done;} /* Create the list */ - if((stat = NCJnew(NCJ_ARRAY,&jtmp))) goto done; + NCJnew(NCJ_ARRAY,&jtmp); if(zvar->scalar) { NCJaddstring(jtmp,NCJ_INT,"1"); /* one chunk of size 1 */ } else for(i=0;i<var->ndims;i++) { @@ -383,12 +404,12 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) snprintf(number,sizeof(number),"%lld",len); NCJaddstring(jtmp,NCJ_INT,number); } - if((stat = NCJappend(jvar,jtmp))) goto done; + if((stat = NCJappend(jvar,jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; /* fill_value key */ if(var->no_fill) { - if((stat=NCJnew(NCJ_NULL,&jfill))) goto done; + NCJnew(NCJ_NULL,&jfill); } else {/*!var->no_fill*/ int atomictype = var->type_info->hdr.id; if(var->fill_value == NULL) { @@ -398,21 +419,21 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) if((stat = NCZ_stringconvert(atomictype,1,var->fill_value,&jfill))) goto done; assert(jfill->sort != NCJ_ARRAY); } - if((stat = NCJinsert(jvar,"fill_value",jfill))) goto done; + if((stat = NCJinsert(jvar,"fill_value",jfill))<0) {stat = NC_EINVAL; goto done;} jfill = NULL; /* order key */ - if((stat = NCJaddstring(jvar,NCJ_STRING,"order"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"order"))<0) {stat = NC_EINVAL; goto done;} /* "C" means row-major order, i.e., the last dimension varies fastest; "F" means column-major order, i.e., the first dimension varies fastest.*/ /* Default to C for now */ - if((stat = NCJaddstring(jvar,NCJ_STRING,"C"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"C"))<0) {stat = NC_EINVAL; goto done;} /* Compressor and Filters */ /* compressor key */ /* From V2 Spec: A JSON object identifying the primary compression codec and providing configuration parameters, or ``null`` if no compressor is to be used. 
*/ - if((stat = NCJaddstring(jvar,NCJ_STRING,"compressor"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"compressor"))<0) {stat = NC_EINVAL; goto done;} #ifdef NETCDF_ENABLE_NCZARR_FILTERS filterchain = (NClist*)var->filters; if(nclistlength(filterchain) > 0) { @@ -423,9 +444,9 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) #endif { /* no filters at all */ /* Default to null */ - if((stat = NCJnew(NCJ_NULL,&jtmp))) goto done; + NCJnew(NCJ_NULL,&jtmp); } - if(jtmp && (stat = NCJappend(jvar,jtmp))) goto done; + if(jtmp && (stat = NCJappend(jvar,jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; /* filters key */ @@ -434,24 +455,23 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) object MUST contain a "id" key identifying the codec to be used. */ /* A list of JSON objects providing codec configurations, or ``null`` if no filters are to be applied. */ - if((stat = NCJaddstring(jvar,NCJ_STRING,"filters"))) goto done; + if((stat = NCJaddstring(jvar,NCJ_STRING,"filters"))<0) {stat = NC_EINVAL; goto done;} #ifdef NETCDF_ENABLE_NCZARR_FILTERS if(nclistlength(filterchain) > 1) { size_t k; /* jtmp holds the array of filters */ - if((stat = NCJnew(NCJ_ARRAY,&jtmp))) goto done; + NCJnew(NCJ_ARRAY,&jtmp); for(k=0;kdimension_separator;/* make separator a string*/ sep[1] = '\0'; - if((stat = NCJnewstring(NCJ_STRING,sep,&jtmp))) goto done; - if((stat = NCJinsert(jvar,"dimension_separator",jtmp))) goto done; + NCJnewstring(NCJ_STRING,sep,&jtmp); + if((stat = NCJinsert(jvar,"dimension_separator",jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; } + /* build .zarray path */ + if((stat = nczm_concat(fullpath,ZARRAY,&key))) + goto done; + + /* Write to map */ + if((stat=NCZ_uploadjson(map,key,jvar))) + goto done; + nullfree(key); key = NULL; + /* Capture dimref names as FQNs */ if(var->ndims > 0) { if((dimrefs = nclistnew())==NULL) {stat = NC_ENOMEM; goto done;} @@ -479,54 +508,53 @@ 
ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) /* Build the NCZ_V2_ARRAY object */ { /* Create the dimrefs json object */ - if((stat = NCJnew(NCJ_ARRAY,&jdimrefs))) - goto done; + NCJnew(NCJ_ARRAY,&jdimrefs); for(i=0;indims == 0) { - if((stat = NCJnewstring(NCJ_INT,"1",&jtmp)))goto done; - if((stat = NCJinsert(jncvar,"scalar",jtmp))) goto done; + NCJnewstring(NCJ_INT,"1",&jtmp); + if((stat = NCJinsert(jncvar,"scalar",jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; } /* everything looks like it is chunked */ - if((stat = NCJnewstring(NCJ_STRING,"chunked",&jtmp)))goto done; - if((stat = NCJinsert(jncvar,"storage",jtmp))) goto done; + NCJnewstring(NCJ_STRING,"chunked",&jtmp); + if((stat = NCJinsert(jncvar,"storage",jtmp))<0) {stat = NC_EINVAL; goto done;} jtmp = NULL; + } - if(!(zinfo->controls.flags & FLAG_PUREZARR)) { - if((stat = NCJinsert(jvar,NCZ_V2_ARRAY,jncvar))) goto done; - jncvar = NULL; - } + /* Build .zattrs object */ + assert(var->att); + NCJnew(NCJ_DICT,&jatts); + NCJnew(NCJ_DICT,&jtypes); + if((stat = ncz_sync_atts(file,(NC_OBJ*)var, var->att, jatts, jtypes, isclose))) goto done; + + if(!purezarr && jncvar != NULL) { + /* Insert _nczarr_array */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_ARRAY,jncvar,"|J0"))) goto done; + jncvar = NULL; } - /* build .zarray path */ - if((stat = nczm_concat(fullpath,ZARRAY,&key))) - goto done; + /* As a last mod to jatts, optionally insert the jtypes as an attribute and add _nczarr_attr as attribute*/ + if(!purezarr && jtypes != NULL) { + if((stat = insert_nczarr_attr(jatts,jtypes))) goto done; + jtypes = NULL; + } - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jvar))) - goto done; - nullfree(key); key = NULL; + /* Write out the .zattrs */ + if((stat = upload_attrs(file,(NC_OBJ*)var,jatts))) goto done; var->created = 1; - /* Build .zattrs object */ - assert(var->att); - if((stat = ncz_sync_atts(file,(NC_OBJ*)var, var->att, isclose))) - goto done; - done: nclistfreeall(dimrefs); 
nullfree(fullpath); @@ -537,6 +565,8 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) NCJreclaim(jncvar); NCJreclaim(jtmp); NCJreclaim(jfill); + NCJreclaim(jatts); + NCJreclaim(jtypes); return ZUNTRACE(THROW(stat)); } @@ -653,25 +683,25 @@ ncz_write_var(NC_VAR_INFO_T* var) /** * @internal Synchronize attribute data from memory to map. * + * @param file * @param container Pointer to grp|var struct containing the attributes - * @param key the name of the map entry + * @param attlist + * @param jattsp + * @param jtypesp * * @return ::NC_NOERR No error. * @author Dennis Heimbigner */ int -ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isclose) +ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, NCjson* jatts, NCjson* jtypes, int isclose) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NCZ_FILE_INFO_T* zinfo = NULL; - NCjson* jatts = NULL; - NCjson* jtypes = NULL; - NCjson* jtype = NULL; NCjson* jdimrefs = NULL; NCjson* jdict = NULL; NCjson* jint = NULL; NCjson* jdata = NULL; - NCZMAP* map = NULL; char* fullpath = NULL; char* key = NULL; char* content = NULL; @@ -684,6 +714,8 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc int purezarr = 0; int endianness = (NC_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); + NC_UNUSED(isclose); + LOG((3, "%s", __func__)); ZTRACE(3,"file=%s container=%s |attlist|=%u",file->controller->path,container->name,(unsigned)ncindexsize(attlist)); @@ -696,46 +728,33 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc } zinfo = file->format_file_info; - map = zinfo->map; - purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; if(zinfo->controls.flags & FLAG_XARRAYDIMS) isxarray = 1; - /* Create the attribute dictionary */ - if((stat = NCJnew(NCJ_DICT,&jatts))) goto done; - if(ncindexsize(attlist) > 0) { - /* Create the jncattr.types object */ - if((stat = NCJnew(NCJ_DICT,&jtypes))) - 
goto done; /* Walk all the attributes convert to json and collect the dtype */ for(i=0;ihdr.name); - /* If reserved and hidden, then ignore */ - if(ra && (ra->flags & HIDDENATTRFLAG)) continue; -#endif if(a->nc_typeid > NC_MAX_ATOMIC_TYPE) {stat = (THROW(NC_ENCZARR)); goto done;} if(a->nc_typeid == NC_STRING) - typesize = NCZ_get_maxstrlen(container); + typesize = (size_t)NCZ_get_maxstrlen(container); else {if((stat = NC4_inq_atomic_type(a->nc_typeid,NULL,&typesize))) goto done;} /* Convert to storable json */ if((stat = NCZ_stringconvert(a->nc_typeid,a->len,a->data,&jdata))) goto done; - if((stat = NCJinsert(jatts,a->hdr.name,jdata))) goto done; - jdata = NULL; /* Collect the corresponding dtype */ - { - if((stat = ncz_nctype2dtype(a->nc_typeid,endianness,purezarr,typesize,&tname))) goto done; - if((stat = NCJnewstring(NCJ_STRING,tname,&jtype))) goto done; - nullfree(tname); tname = NULL; - if((stat = NCJinsert(jtypes,a->hdr.name,jtype))) goto done; /* add {name: type} */ - jtype = NULL; - } + if((stat = ncz_nctype2dtype(a->nc_typeid,endianness,purezarr,typesize,&tname))) goto done; + + /* Insert the attribute; consumes jdata */ + if((stat = insert_attr(jatts,jtypes,a->hdr.name, jdata, tname))) goto done; + + /* cleanup */ + nullfree(tname); tname = NULL; + jdata = NULL; + } } @@ -751,8 +770,7 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc if(inrootgroup && isxarray) { int dimsinroot = 1; /* Insert the XARRAY _ARRAY_ATTRIBUTE attribute */ - if((stat = NCJnew(NCJ_ARRAY,&jdimrefs))) - goto done; + NCJnew(NCJ_ARRAY,&jdimrefs); /* Fake the scalar case */ if(var->ndims == 0) { NCJaddstring(jdimrefs,NCJ_STRING,XARRAYSCALAR); @@ -776,7 +794,7 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc nullfree(dimname); dimname = NULL; } /* Add the _ARRAY_DIMENSIONS attribute */ - if((stat = NCJinsert(jatts,NC_XARRAY_DIMS,jdimrefs))) goto done; + if((stat = NCJinsert(jatts,NC_XARRAY_DIMS,jdimrefs))<0) {stat = 
NC_EINVAL; goto done;} jdimrefs = NULL; } } @@ -785,56 +803,31 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc if(container->sort == NCVAR && var && var->quantize_mode > 0) { char mode[64]; snprintf(mode,sizeof(mode),"%d",var->nsd); - if((stat = NCJnewstring(NCJ_INT,mode,&jint))) - goto done; + NCJnewstring(NCJ_INT,mode,&jint); /* Insert the quantize attribute */ switch (var->quantize_mode) { case NC_QUANTIZE_BITGROOM: - if((stat = NCJinsert(jatts,NC_QUANTIZE_BITGROOM_ATT_NAME,jint))) goto done; + if((stat = NCJinsert(jatts,NC_QUANTIZE_BITGROOM_ATT_NAME,jint))<0) {stat = NC_EINVAL; goto done;} jint = NULL; break; case NC_QUANTIZE_GRANULARBR: - if((stat = NCJinsert(jatts,NC_QUANTIZE_GRANULARBR_ATT_NAME,jint))) goto done; + if((stat = NCJinsert(jatts,NC_QUANTIZE_GRANULARBR_ATT_NAME,jint))<0) {stat = NC_EINVAL; goto done;} jint = NULL; break; case NC_QUANTIZE_BITROUND: - if((stat = NCJinsert(jatts,NC_QUANTIZE_BITROUND_ATT_NAME,jint))) goto done; + if((stat = NCJinsert(jatts,NC_QUANTIZE_BITROUND_ATT_NAME,jint))<0) {stat = NC_EINVAL; goto done;} jint = NULL; break; default: break; } } - if(NCJlength(jatts) > 0) { - if(!(zinfo->controls.flags & FLAG_PUREZARR)) { - /* Insert the _NCZARR_ATTR attribute */ - if((stat = NCJnew(NCJ_DICT,&jdict))) - goto done; - if(jtypes != NULL) - {if((stat = NCJinsert(jdict,"types",jtypes))) goto done;} - jtypes = NULL; - if(jdict != NULL) - {if((stat = NCJinsert(jatts,NCZ_V2_ATTR,jdict))) goto done;} - jdict = NULL; - } - /* write .zattrs path */ - if((stat = nczm_concat(fullpath,ZATTRS,&key))) - goto done; - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jatts))) - goto done; - nullfree(key); key = NULL; - } - done: nullfree(fullpath); nullfree(key); nullfree(content); nullfree(dimpath); nullfree(tname); - NCJreclaim(jatts); - NCJreclaim(jtypes); - NCJreclaim(jtype); NCJreclaim(jdimrefs); NCJreclaim(jdict); NCJreclaim(jint); @@ -850,115 +843,79 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* 
container, NCindex* attlist, int isc the corresponding NCjson dict. @param map - [in] the map object for storage @param container - [in] the containing object -@param jattrsp - [out] the json for .zattrs -@param jtypesp - [out] the json for .ztypes +@param jattsp - [out] the json for .zattrs || NULL if not found +@param jtypesp - [out] the json attribute type dict || NULL +@param jnczgrp - [out] the json for _nczarr_group || NULL +@param jnczarray - [out] the json for _nczarr_array || NULL @return NC_NOERR +@return NC_EXXX @author Dennis Heimbigner */ static int -load_jatts(NCZMAP* map, NC_OBJ* container, int nczarrv1, NCjson** jattrsp, NClist** atypesp) +download_jatts(NC_FILE_INFO_T* file, NC_OBJ* container, const NCjson** jattsp, const NCjson** jtypesp) { int stat = NC_NOERR; - char* fullpath = NULL; - char* key = NULL; - NCjson* jnczarr = NULL; - NCjson* jattrs = NULL; - NCjson* jncattr = NULL; - NClist* atypes = NULL; /* envv list */ + const NCjson* jatts = NULL; + const NCjson* jtypes = NULL; + const NCjson* jnczattr = NULL; + NC_GRP_INFO_T* grp = NULL; + NC_VAR_INFO_T* var = NULL; + NCZ_GRP_INFO_T* zgrp = NULL; + NCZ_VAR_INFO_T* zvar = NULL; + NCZ_FILE_INFO_T* zinfo = (NCZ_FILE_INFO_T*)file->format_file_info; + int purezarr = 0; + int zarrkey = 0; - ZTRACE(3,"map=%p container=%s nczarrv1=%d",map,container->name,nczarrv1); + ZTRACE(3,"map=%p container=%s ",map,container->name); - /* alway return (possibly empty) list of types */ - atypes = nclistnew(); + purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; + zarrkey = (zinfo->controls.flags & FLAG_NCZARR_KEY)?1:0; if(container->sort == NCGRP) { - NC_GRP_INFO_T* grp = (NC_GRP_INFO_T*)container; - /* Get grp's fullpath name */ - if((stat = NCZ_grpkey(grp,&fullpath))) - goto done; + grp = (NC_GRP_INFO_T*)container; + zgrp = (NCZ_GRP_INFO_T*)grp->format_grp_info; + jatts = zgrp->zgroup.atts; } else { - NC_VAR_INFO_T* var = (NC_VAR_INFO_T*)container; - /* Get var's fullpath name */ - if((stat = 
NCZ_varkey(var,&fullpath))) - goto done; + var = (NC_VAR_INFO_T*)container; + zvar = (NCZ_VAR_INFO_T*)var->format_var_info; + jatts = zvar->zarray.atts; } - - /* Construct the path to the .zattrs object */ - if((stat = nczm_concat(fullpath,ZATTRS,&key))) - goto done; - - /* Download the .zattrs object: may not exist if not NCZarr V1 */ - switch ((stat=NCZ_downloadjson(map,key,&jattrs))) { - case NC_NOERR: break; - case NC_EEMPTY: stat = NC_NOERR; break; /* did not exist */ - default: goto done; /* failure */ - } - nullfree(key); key = NULL; - - if(jattrs != NULL) { - if(nczarrv1) { - /* Construct the path to the NCZATTRS object */ - if((stat = nczm_concat(fullpath,NCZATTRS,&key))) goto done; - /* Download the NCZATTRS object: may not exist if pure zarr or using deprecated name */ - stat=NCZ_downloadjson(map,key,&jncattr); - if(stat == NC_EEMPTY) { - /* try deprecated name */ - nullfree(key); key = NULL; - if((stat = nczm_concat(fullpath,NCZATTRDEP,&key))) goto done; - stat=NCZ_downloadjson(map,key,&jncattr); - } - } else {/* Get _nczarr_attr from .zattrs */ - stat = NCJdictget(jattrs,NCZ_V2_ATTR,&jncattr); - if(!stat && jncattr == NULL) - {stat = NCJdictget(jattrs,NCZ_V2_ATTR_UC,&jncattr);} - } - nullfree(key); key = NULL; - switch (stat) { - case NC_NOERR: break; - case NC_EEMPTY: stat = NC_NOERR; jncattr = NULL; break; - default: goto done; /* failure */ - } - if(jncattr != NULL) { - NCjson* jtypes = NULL; - /* jncattr attribute should be a dict */ - if(NCJsort(jncattr) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} - /* Extract "types; may not exist if only hidden attributes are defined */ - if((stat = NCJdictget(jncattr,"types",&jtypes))) goto done; + assert(purezarr || zarrkey || jatts != NULL); + + if(jatts != NULL) { + /* Get _nczarr_attr from .zattrs */ + if((stat = NCJdictget(jatts,NCZ_V2_ATTR,&jnczattr))<0) {stat = NC_EINVAL; goto done;} + if(jnczattr != NULL) { + /* jnczattr attribute should be a dict */ + if(NCJsort(jnczattr) != NCJ_DICT) {stat = 
(THROW(NC_ENCZARR)); goto done;} + /* Extract "types"; may not exist if only hidden attributes are defined */ + if((stat = NCJdictget(jnczattr,"types",&jtypes))<0) {stat = NC_EINVAL; goto done;} if(jtypes != NULL) { if(NCJsort(jtypes) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} - /* Convert to an envv list */ - if((stat = jtypes2atypes(jtypes,atypes))) goto done; } } } - if(jattrsp) {*jattrsp = jattrs; jattrs = NULL;} - if(atypesp) {*atypesp = atypes; atypes = NULL;} + if(jattsp) {*jattsp = jatts; jatts = NULL;} + if(jtypes) {*jtypesp = jtypes; jtypes = NULL;} done: - if(nczarrv1) - NCJreclaim(jncattr); - if(stat) { - NCJreclaim(jnczarr); - nclistfreeall(atypes); - } - nullfree(fullpath); - nullfree(key); return ZUNTRACE(THROW(stat)); } /* Convert a JSON singleton or array of strings to a single string */ static int -zcharify(NCjson* src, NCbytes* buf) +zcharify(const NCjson* src, NCbytes* buf) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; struct NCJconst jstr = NCJconst_empty; if(NCJsort(src) != NCJ_ARRAY) { /* singleton */ - if((stat = NCJcvt(src, NCJ_STRING, &jstr))) goto done; + if((stat = NCJcvt(src, NCJ_STRING, &jstr))<0) {stat = NC_EINVAL; goto done;} ncbytescat(buf,jstr.sval); } else for(i=0;icontroller->path,grp->hdr.name); zinfo = file->format_file_info; - map = zinfo->map; + zgrp = grp->format_grp_info; + + purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; /* Construct grp path */ - if((stat = NCZ_grpkey(grp,&fullpath))) - goto done; + if((stat = NCZ_grpkey(grp,&fullpath))) goto done; - if(zinfo->controls.flags & FLAG_PUREZARR) { + /* Download .zgroup and .zattrs */ + if((stat = downloadzarrobj(file,&zgrp->zgroup,fullpath,ZGROUP))) goto done; + jgroup = zgrp->zgroup.obj; + jattrs = zgrp->zgroup.atts; + + if(purezarr) { if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; purezarr = 1; } else { /*!purezarr*/ - if(zinfo->controls.flags & FLAG_NCZARR_V1) { - /* build NCZGROUP path */ - if((stat = 
nczm_concat(fullpath,NCZGROUP,&key))) - goto done; - /* Read */ - jdict = NULL; - stat=NCZ_downloadjson(map,key,&jdict); - v1 = 1; - } else { - /* build ZGROUP path */ - if((stat = nczm_concat(fullpath,ZGROUP,&key))) - goto done; - /* Read */ - switch (stat=NCZ_downloadjson(map,key,&jgroup)) { - case NC_NOERR: /* Extract the NCZ_V2_GROUP dict */ - if((stat = NCJdictget(jgroup,NCZ_V2_GROUP,&jdict))) goto done; - if(!stat && jdict == NULL) - {if((stat = NCJdictget(jgroup,NCZ_V2_GROUP_UC,&jdict))) goto done;} - break; - case NC_EEMPTY: /* does not exist, use search */ - if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) - goto done; - purezarr = 1; - break; - default: goto done; - } + if(jgroup == NULL) { /* does not exist, use search */ + if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; + purezarr = 1; + } + if(jattrs == NULL) { /* does not exist, use search */ + if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; + purezarr = 1; + } else { /* Extract the NCZ_V2_GROUP attribute*/ + if((stat = getnczarrkey((NC_OBJ*)grp,NCZ_V2_GROUP,&jnczgrp))) goto done; } nullfree(key); key = NULL; - if(jdict) { + if(jnczgrp) { /* Pull out lists about group content */ - if((stat = parse_group_content(jdict,dimdefs,varnames,subgrps))) + if((stat = parse_group_content(jnczgrp,dimdefs,varnames,subgrps))) goto done; } } @@ -1234,9 +1181,7 @@ define_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp) if((stat = define_subgrps(file,grp,subgrps))) goto done; done: - if(v1) NCJreclaim(jdict); NCJreclaim(json); - NCJreclaim(jgroup); nclistfreeall(dimdefs); nclistfreeall(varnames); nclistfreeall(subgrps); @@ -1259,48 +1204,46 @@ int ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) { int stat = NC_NOERR; - int i; + size_t i; char* fullpath = NULL; char* key = NULL; NCZ_FILE_INFO_T* zinfo = NULL; NC_VAR_INFO_T* var = NULL; NCZ_VAR_INFO_T* zvar = NULL; NC_GRP_INFO_T* grp = NULL; - NCZMAP* map = NULL; + NCZ_GRP_INFO_T* zgrp = 
NULL; NC_ATT_INFO_T* att = NULL; NCindex* attlist = NULL; - NCjson* jattrs = NULL; - NClist* atypes = NULL; nc_type typeid; size_t len, typelen; void* data = NULL; NC_ATT_INFO_T* fillvalueatt = NULL; nc_type typehint = NC_NAT; - int purezarr; + int purezarr,zarrkeys; + const NCjson* jattrs = NULL; + const NCjson* jtypes = NULL; + struct ZARROBJ* zobj = NULL; ZTRACE(3,"file=%s container=%s",file->controller->path,container->name); zinfo = file->format_file_info; - map = zinfo->map; - purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; + zarrkeys = (zinfo->controls.flags & FLAG_NCZARR_KEY)?1:0; if(container->sort == NCGRP) { grp = ((NC_GRP_INFO_T*)container); attlist = grp->att; + zgrp = (NCZ_GRP_INFO_T*)(grp->format_grp_info); + zobj = &zgrp->zgroup; } else { var = ((NC_VAR_INFO_T*)container); - zvar = (NCZ_VAR_INFO_T*)(var->format_var_info); attlist = var->att; + zvar = (NCZ_VAR_INFO_T*)(var->format_var_info); + zobj = &zvar->zarray; } + assert(purezarr || zarrkeys || zobj->obj != NULL); - switch ((stat = load_jatts(map, container, (zinfo->controls.flags & FLAG_NCZARR_V1), &jattrs, &atypes))) { - case NC_NOERR: break; - case NC_EEMPTY: /* container has no attributes */ - stat = NC_NOERR; - break; - default: goto done; /* true error */ - } + if((stat = download_jatts(file, container, &jattrs, &jtypes))) goto done; if(jattrs != NULL) { /* Iterate over the attributes to create the in-memory attributes */ @@ -1334,7 +1277,7 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) /* case 2: name = _ARRAY_DIMENSIONS, sort==NCVAR, flags & HIDDENATTRFLAG */ if(strcmp(aname,NC_XARRAY_DIMS)==0 && var != NULL && (ra->flags & HIDDENATTRFLAG)) { /* store for later */ - int i; + size_t i; assert(NCJsort(value) == NCJ_ARRAY); if((zvar->xarray = nclistnew())==NULL) {stat = NC_ENOMEM; goto done;} @@ -1352,7 +1295,7 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) typehint = var->type_info->hdr.id ; /* if unknown use the var's type for _FillValue */ /* Create the 
attribute */ /* Collect the attribute's type and value */ - if((stat = computeattrinfo(aname,atypes,typehint,purezarr,value, + if((stat = computeattrinfo(aname,jtypes,typehint,purezarr,value, &typeid,&typelen,&len,&data))) goto done; if((stat = ncz_makeattr(container,attlist,aname,typeid,len,data,&att))) @@ -1383,8 +1326,6 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) done: if(data != NULL) stat = NC_reclaim_data(file->controller,att->nc_typeid,data,len); - NCJreclaim(jattrs); - nclistfreeall(atypes); nullfree(fullpath); nullfree(key); return ZUNTRACE(THROW(stat)); @@ -1435,378 +1376,371 @@ define_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* diminfo) return ZUNTRACE(THROW(stat)); } + /** - * @internal Materialize vars into memory; + * @internal Materialize single var into memory; * Take xarray and purezarr into account. * * @param file Pointer to file info struct. * @param grp Pointer to grp info struct. - * @param varnames List of names of variables in this group + * @param varname name of variable in this group * * @return ::NC_NOERR No error. 
* @author Dennis Heimbigner */ static int -define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames) +define_var1(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, const char* varname) { int stat = NC_NOERR; - size_t i,j; + size_t j; NCZ_FILE_INFO_T* zinfo = NULL; - NCZMAP* map = NULL; int purezarr = 0; int xarray = 0; - int formatv1 = 0; + /* per-variable info */ + NC_VAR_INFO_T* var = NULL; + NCZ_VAR_INFO_T* zvar = NULL; + const NCjson* jvar = NULL; + const NCjson* jatts = NULL; /* corresponding to jvar */ + const NCjson* jncvar = NULL; + const NCjson* jdimrefs = NULL; + const NCjson* jvalue = NULL; + char* varpath = NULL; + char* key = NULL; + size64_t* shapes = NULL; + NClist* dimnames = NULL; + int varsized = 0; + int suppress = 0; /* Abort processing of this variable */ + nc_type vtype = NC_NAT; + int vtypelen = 0; + size_t rank = 0; + size_t zarr_rank = 0; /* Need to watch out for scalars */ +#ifdef NETCDF_ENABLE_NCZARR_FILTERS + const NCjson* jfilter = NULL; + int chainindex = 0; +#endif - ZTRACE(3,"file=%s grp=%s |varnames|=%u",file->controller->path,grp->hdr.name,nclistlength(varnames)); + ZTRACE(3,"file=%s grp=%s varname=%s",file->controller->path,grp->hdr.name,varname); zinfo = file->format_file_info; - map = zinfo->map; if(zinfo->controls.flags & FLAG_PUREZARR) purezarr = 1; - if(zinfo->controls.flags & FLAG_NCZARR_V1) formatv1 = 1; if(zinfo->controls.flags & FLAG_XARRAYDIMS) {xarray = 1;} - /* Load each var in turn */ - for(i = 0; i < nclistlength(varnames); i++) { - /* per-variable info */ - NC_VAR_INFO_T* var = NULL; - NCZ_VAR_INFO_T* zvar = NULL; - NCjson* jvar = NULL; - NCjson* jncvar = NULL; - NCjson* jdimrefs = NULL; - NCjson* jvalue = NULL; - char* varpath = NULL; - char* key = NULL; - const char* varname = NULL; - size64_t* shapes = NULL; - NClist* dimnames = NULL; - int varsized = 0; - int suppress = 0; /* Abort processing of this variable */ - nc_type vtype = NC_NAT; - int vtypelen = 0; - int rank = 0; - int zarr_rank = 0; /* Need to 
watch out for scalars */ -#ifdef NETCDF_ENABLE_NCZARR_FILTERS - NCjson* jfilter = NULL; - int chainindex = 0; -#endif - - dimnames = nclistnew(); - varname = nclistget(varnames,i); - - if((stat = nc4_var_list_add2(grp, varname, &var))) - goto done; + dimnames = nclistnew(); - /* And its annotation */ - if((zvar = calloc(1,sizeof(NCZ_VAR_INFO_T)))==NULL) - {stat = NC_ENOMEM; goto done;} - var->format_var_info = zvar; - zvar->common.file = file; + if((stat = nc4_var_list_add2(grp, varname, &var))) + goto done; - /* pretend it was created */ - var->created = 1; + /* And its annotation */ + if((zvar = calloc(1,sizeof(NCZ_VAR_INFO_T)))==NULL) + {stat = NC_ENOMEM; goto done;} + var->format_var_info = zvar; + zvar->common.file = file; - /* Indicate we do not have quantizer yet */ - var->quantize_mode = -1; + /* pretend it was created */ + var->created = 1; - /* Construct var path */ - if((stat = NCZ_varkey(var,&varpath))) - goto done; + /* Indicate we do not have quantizer yet */ + var->quantize_mode = -1; - /* Construct the path to the zarray object */ - if((stat = nczm_concat(varpath,ZARRAY,&key))) - goto done; - /* Download the zarray object */ - if((stat=NCZ_readdict(map,key,&jvar))) - goto done; - nullfree(key); key = NULL; - assert(NCJsort(jvar) == NCJ_DICT); + /* Construct var path */ + if((stat = NCZ_varkey(var,&varpath))) + goto done; - /* Extract the .zarray info from jvar */ + /* Download */ + if((stat = downloadzarrobj(file,&zvar->zarray,varpath,ZARRAY))) goto done; + jvar = zvar->zarray.obj; + jatts = zvar->zarray.atts; + assert(jvar == NULL || NCJsort(jvar) == NCJ_DICT); + assert(jatts == NULL || NCJsort(jatts) == NCJ_DICT); - /* Verify the format */ - { - int version; - if((stat = NCJdictget(jvar,"zarr_format",&jvalue))) goto done; - sscanf(NCJstring(jvalue),"%d",&version); - if(version != zinfo->zarr.zarr_version) - {stat = (THROW(NC_ENCZARR)); goto done;} - } + /* Verify the format */ + { + int version; + if((stat = 
NCJdictget(jvar,"zarr_format",&jvalue))<0) {stat = NC_EINVAL; goto done;} + sscanf(NCJstring(jvalue),"%d",&version); + if(version != zinfo->zarr.zarr_version) + {stat = (THROW(NC_ENCZARR)); goto done;} + } - /* Set the type and endianness of the variable */ - { - int endianness; - if((stat = NCJdictget(jvar,"dtype",&jvalue))) goto done; - /* Convert dtype to nc_type + endianness */ - if((stat = ncz_dtype2nctype(NCJstring(jvalue),NC_NAT,purezarr,&vtype,&endianness,&vtypelen))) + /* Set the type and endianness of the variable */ + { + int endianness; + if((stat = NCJdictget(jvar,"dtype",&jvalue))<0) {stat = NC_EINVAL; goto done;} + /* Convert dtype to nc_type + endianness */ + if((stat = ncz_dtype2nctype(NCJstring(jvalue),NC_NAT,purezarr,&vtype,&endianness,&vtypelen))) + goto done; + if(vtype > NC_NAT && vtype <= NC_MAX_ATOMIC_TYPE) { + /* Locate the NC_TYPE_INFO_T object */ + if((stat = ncz_gettype(file,grp,vtype,&var->type_info))) goto done; - if(vtype > NC_NAT && vtype <= NC_MAX_ATOMIC_TYPE) { - /* Locate the NC_TYPE_INFO_T object */ - if((stat = ncz_gettype(file,grp,vtype,&var->type_info))) - goto done; - } else {stat = NC_EBADTYPE; goto done;} + } else {stat = NC_EBADTYPE; goto done;} #if 0 /* leave native in place */ - if(endianness == NC_ENDIAN_NATIVE) - endianness = zinfo->native_endianness; - if(endianness == NC_ENDIAN_NATIVE) - endianness = (NCZ_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); - if(endianness == NC_ENDIAN_LITTLE || endianness == NC_ENDIAN_BIG) { - var->endianness = endianness; - } else {stat = NC_EBADTYPE; goto done;} -#else + if(endianness == NC_ENDIAN_NATIVE) + endianness = zinfo->native_endianness; + if(endianness == NC_ENDIAN_NATIVE) + endianness = (NCZ_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); + if(endianness == NC_ENDIAN_LITTLE || endianness == NC_ENDIAN_BIG) { var->endianness = endianness; + } else {stat = NC_EBADTYPE; goto done;} +#else + var->endianness = endianness; #endif - var->type_info->endianness = var->endianness; 
/* Propagate */ - if(vtype == NC_STRING) { - zvar->maxstrlen = vtypelen; - vtypelen = sizeof(char*); /* in-memory len */ - if(zvar->maxstrlen <= 0) zvar->maxstrlen = NCZ_get_maxstrlen((NC_OBJ*)var); - } + var->type_info->endianness = var->endianness; /* Propagate */ + if(vtype == NC_STRING) { + zvar->maxstrlen = vtypelen; + vtypelen = sizeof(char*); /* in-memory len */ + if(zvar->maxstrlen <= 0) zvar->maxstrlen = NCZ_get_maxstrlen((NC_OBJ*)var); } + } - if(!purezarr) { - /* Extract the _NCZARR_ARRAY values */ - /* Do this first so we know about storage esp. scalar */ - if(formatv1) { - /* Construct the path to the zarray object */ - if((stat = nczm_concat(varpath,NCZARRAY,&key))) - goto done; - /* Download the nczarray object */ - if((stat=NCZ_readdict(map,key,&jncvar))) - goto done; - nullfree(key); key = NULL; - } else {/* format v2 */ - /* Extract the NCZ_V2_ARRAY dict */ - if((stat = NCJdictget(jvar,NCZ_V2_ARRAY,&jncvar))) goto done; - if(!stat && jncvar == NULL) - {if((stat = NCJdictget(jvar,NCZ_V2_ARRAY_UC,&jncvar))) goto done;} - } - if(jncvar == NULL) {stat = NC_ENCZARR; goto done;} - assert((NCJsort(jncvar) == NCJ_DICT)); - /* Extract scalar flag */ - if((stat = NCJdictget(jncvar,"scalar",&jvalue))) - goto done; - if(jvalue != NULL) { - var->storage = NC_CHUNKED; - zvar->scalar = 1; - } - /* Extract storage flag */ - if((stat = NCJdictget(jncvar,"storage",&jvalue))) - goto done; - if(jvalue != NULL) { - var->storage = NC_CHUNKED; - } - /* Extract dimrefs list */ - switch ((stat = NCJdictget(jncvar,"dimrefs",&jdimrefs))) { - case NC_NOERR: /* Extract the dimref names */ - assert((NCJsort(jdimrefs) == NCJ_ARRAY)); - if(zvar->scalar) { - assert(NCJlength(jdimrefs) == 0); - } else { - rank = NCJlength(jdimrefs); - for(j=0;jstorage = NC_CHUNKED; + zvar->scalar = 1; + } + /* Extract storage flag */ + if((stat = NCJdictget(jncvar,"storage",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(jvalue != NULL) + var->storage = NC_CHUNKED; + /* Extract dimrefs list */ + 
if((stat = dictgetalt(jncvar,"dimension_references","dimrefs",&jdimrefs))) goto done; + if(jdimrefs != NULL) { /* Extract the dimref names */ + assert((NCJsort(jdimrefs) == NCJ_ARRAY)); + if(zvar->scalar) { + assert(NCJlength(jdimrefs) == 0); + } else { + rank = NCJlength(jdimrefs); + for(j=0;jdimension_separator = 0; + if((stat = NCJdictget(jvar,"dimension_separator",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(jvalue != NULL) { + /* Verify its value */ + if(NCJsort(jvalue) == NCJ_STRING && NCJstring(jvalue) != NULL && strlen(NCJstring(jvalue)) == 1) + zvar->dimension_separator = NCJstring(jvalue)[0]; } + /* If value is invalid, then use global default */ + if(!islegaldimsep(zvar->dimension_separator)) + zvar->dimension_separator = ngs->zarr.dimension_separator; /* use global value */ + assert(islegaldimsep(zvar->dimension_separator)); /* we are hosed */ + } - /* Capture dimension_separator (must precede chunk cache creation) */ - { - NCglobalstate* ngs = NC_getglobalstate(); - assert(ngs != NULL); - zvar->dimension_separator = 0; - if((stat = NCJdictget(jvar,"dimension_separator",&jvalue))) goto done; - if(jvalue != NULL) { - /* Verify its value */ - if(NCJsort(jvalue) == NCJ_STRING && NCJstring(jvalue) != NULL && strlen(NCJstring(jvalue)) == 1) - zvar->dimension_separator = NCJstring(jvalue)[0]; - } - /* If value is invalid, then use global default */ - if(!islegaldimsep(zvar->dimension_separator)) - zvar->dimension_separator = ngs->zarr.dimension_separator; /* use global value */ - assert(islegaldimsep(zvar->dimension_separator)); /* we are hosed */ + /* fill_value; must precede calls to adjust cache */ + { + if((stat = NCJdictget(jvar,"fill_value",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(jvalue == NULL || NCJsort(jvalue) == NCJ_NULL) + var->no_fill = 1; + else { + size_t fvlen; + nc_type atypeid = vtype; + var->no_fill = 0; + if((stat = computeattrdata(var->type_info->hdr.id, &atypeid, jvalue, NULL, &fvlen, &var->fill_value))) + goto done; + 
assert(atypeid == vtype); + /* Note that we do not create the _FillValue + attribute here to avoid having to read all + the attributes and thus foiling lazy read.*/ } + } - /* fill_value; must precede calls to adjust cache */ - { - if((stat = NCJdictget(jvar,"fill_value",&jvalue))) goto done; - if(jvalue == NULL || NCJsort(jvalue) == NCJ_NULL) - var->no_fill = 1; - else { - size_t fvlen; - nc_type atypeid = vtype; - var->no_fill = 0; - if((stat = computeattrdata(var->type_info->hdr.id, &atypeid, jvalue, NULL, &fvlen, &var->fill_value))) - goto done; - assert(atypeid == vtype); - /* Note that we do not create the _FillValue - attribute here to avoid having to read all - the attributes and thus foiling lazy read.*/ - } + /* shape */ + { + if((stat = NCJdictget(jvar,"shape",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(NCJsort(jvalue) != NCJ_ARRAY) {stat = (THROW(NC_ENCZARR)); goto done;} + + /* Process the rank */ + zarr_rank = NCJlength(jvalue); + if(zarr_rank == 0) { + /* suppress variable */ + ZLOG(NCLOGWARN,"Empty shape for variable %s suppressed",var->hdr.name); + suppress = 1; + goto suppressvar; } - /* shape */ - { - if((stat = NCJdictget(jvar,"shape",&jvalue))) goto done; - if(NCJsort(jvalue) != NCJ_ARRAY) {stat = (THROW(NC_ENCZARR)); goto done;} - - /* Process the rank */ - zarr_rank = NCJlength(jvalue); - if(zarr_rank == 0) { - /* suppress variable */ - ZLOG(NCLOGWARN,"Empty shape for variable %s suppressed",var->hdr.name); - suppress = 1; - goto suppressvar; - } + if(zvar->scalar) { + rank = 0; + zarr_rank = 1; /* Zarr does not support scalars */ + } else + rank = (zarr_rank = NCJlength(jvalue)); - if(zvar->scalar) { - rank = 0; - zarr_rank = 1; /* Zarr does not support scalars */ - } else - rank = (zarr_rank = NCJlength(jvalue)); - - if(zarr_rank > 0) { - /* Save the rank of the variable */ - if((stat = nc4_var_set_ndims(var, rank))) goto done; - /* extract the shapes */ - if((shapes = (size64_t*)malloc(sizeof(size64_t)*(size_t)zarr_rank)) == NULL) - 
{stat = (THROW(NC_ENOMEM)); goto done;} - if((stat = decodeints(jvalue, shapes))) goto done; - } + if(zarr_rank > 0) { + /* Save the rank of the variable */ + if((stat = nc4_var_set_ndims(var, rank))) goto done; + /* extract the shapes */ + if((shapes = (size64_t*)malloc(sizeof(size64_t)*(size_t)zarr_rank)) == NULL) + {stat = (THROW(NC_ENOMEM)); goto done;} + if((stat = decodeints(jvalue, shapes))) goto done; } + } - /* chunks */ - { - size64_t chunks[NC_MAX_VAR_DIMS]; - if((stat = NCJdictget(jvar,"chunks",&jvalue))) goto done; - if(jvalue != NULL && NCJsort(jvalue) != NCJ_ARRAY) + /* chunks */ + { + size64_t chunks[NC_MAX_VAR_DIMS]; + if((stat = NCJdictget(jvar,"chunks",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(jvalue != NULL && NCJsort(jvalue) != NCJ_ARRAY) + {stat = (THROW(NC_ENCZARR)); goto done;} + /* Verify the rank */ + if(zvar->scalar || zarr_rank == 0) { + if(var->ndims != 0) {stat = (THROW(NC_ENCZARR)); goto done;} - /* Verify the rank */ - if(zvar->scalar || zarr_rank == 0) { - if(var->ndims != 0) - {stat = (THROW(NC_ENCZARR)); goto done;} - zvar->chunkproduct = 1; - zvar->chunksize = zvar->chunkproduct * var->type_info->size; - /* Create the cache */ - if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) - goto done; - } else {/* !zvar->scalar */ - if(zarr_rank == 0) {stat = NC_ENCZARR; goto done;} - var->storage = NC_CHUNKED; - if(var->ndims != rank) + zvar->chunkproduct = 1; + zvar->chunksize = zvar->chunkproduct * var->type_info->size; + /* Create the cache */ + if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) + goto done; + } else {/* !zvar->scalar */ + if(zarr_rank == 0) {stat = NC_ENCZARR; goto done;} + var->storage = NC_CHUNKED; + if(var->ndims != rank) + {stat = (THROW(NC_ENCZARR)); goto done;} + if((var->chunksizes = malloc(sizeof(size_t)*(size_t)zarr_rank)) == NULL) + {stat = NC_ENOMEM; goto done;} + 
if((stat = decodeints(jvalue, chunks))) goto done; + /* validate the chunk sizes */ + zvar->chunkproduct = 1; + for(j=0;jchunksizes = malloc(sizeof(size_t)*(size_t)zarr_rank)) == NULL) - {stat = NC_ENOMEM; goto done;} - if((stat = decodeints(jvalue, chunks))) goto done; - /* validate the chunk sizes */ - zvar->chunkproduct = 1; - for(j=0;jchunksizes[j] = (size_t)chunks[j]; - zvar->chunkproduct *= chunks[j]; - } - zvar->chunksize = zvar->chunkproduct * var->type_info->size; - /* Create the cache */ - if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) - goto done; + var->chunksizes[j] = (size_t)chunks[j]; + zvar->chunkproduct *= chunks[j]; } - if((stat = NCZ_adjust_var_cache(var))) goto done; - } - /* Capture row vs column major; currently, column major not used*/ - { - if((stat = NCJdictget(jvar,"order",&jvalue))) goto done; - if(strcmp(NCJstring(jvalue),"C") > 0) - ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 1; - else ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 0; + zvar->chunksize = zvar->chunkproduct * var->type_info->size; + /* Create the cache */ + if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) + goto done; } - /* filters key */ - /* From V2 Spec: A list of JSON objects providing codec configurations, - or null if no filters are to be applied. Each codec configuration - object MUST contain a "id" key identifying the codec to be used. 
*/ - /* Do filters key before compressor key so final filter chain is in correct order */ - { + if((stat = NCZ_adjust_var_cache(var))) goto done; + } + /* Capture row vs column major; currently, column major not used*/ + { + if((stat = NCJdictget(jvar,"order",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(strcmp(NCJstring(jvalue),"C") > 0) + ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 1; + else ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 0; + } + /* filters key */ + /* From V2 Spec: A list of JSON objects providing codec configurations, + or null if no filters are to be applied. Each codec configuration + object MUST contain a "id" key identifying the codec to be used. */ + /* Do filters key before compressor key so final filter chain is in correct order */ + { #ifdef NETCDF_ENABLE_NCZARR_FILTERS - if(var->filters == NULL) var->filters = (void*)nclistnew(); - if(zvar->incompletefilters == NULL) zvar->incompletefilters = (void*)nclistnew(); - chainindex = 0; /* track location of filter in the chain */ - if((stat = NCZ_filter_initialize())) goto done; - if((stat = NCJdictget(jvar,"filters",&jvalue))) goto done; - if(jvalue != NULL && NCJsort(jvalue) != NCJ_NULL) { - int k; - if(NCJsort(jvalue) != NCJ_ARRAY) {stat = NC_EFILTER; goto done;} - for(k=0;;k++) { - jfilter = NULL; - jfilter = NCJith(jvalue,k); - if(jfilter == NULL) break; /* done */ - if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} - if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; - } + if(var->filters == NULL) var->filters = (void*)nclistnew(); + if(zvar->incompletefilters == NULL) zvar->incompletefilters = (void*)nclistnew(); + chainindex = 0; /* track location of filter in the chain */ + if((stat = NCZ_filter_initialize())) goto done; + if((stat = NCJdictget(jvar,"filters",&jvalue))<0) {stat = NC_EINVAL; goto done;} + if(jvalue != NULL && NCJsort(jvalue) != NCJ_NULL) { + int k; + if(NCJsort(jvalue) != NCJ_ARRAY) {stat = NC_EFILTER; goto done;} + 
for(k=0;;k++) { + jfilter = NULL; + jfilter = NCJith(jvalue,k); + if(jfilter == NULL) break; /* done */ + if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} + if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; } -#endif } +#endif + } - /* compressor key */ - /* From V2 Spec: A JSON object identifying the primary compression codec and providing - configuration parameters, or ``null`` if no compressor is to be used. */ + /* compressor key */ + /* From V2 Spec: A JSON object identifying the primary compression codec and providing + configuration parameters, or ``null`` if no compressor is to be used. */ #ifdef NETCDF_ENABLE_NCZARR_FILTERS - { - if(var->filters == NULL) var->filters = (void*)nclistnew(); - if((stat = NCZ_filter_initialize())) goto done; - if((stat = NCJdictget(jvar,"compressor",&jfilter))) goto done; - if(jfilter != NULL && NCJsort(jfilter) != NCJ_NULL) { - if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} - if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; - } + { + if(var->filters == NULL) var->filters = (void*)nclistnew(); + if((stat = NCZ_filter_initialize())) goto done; + if((stat = NCJdictget(jvar,"compressor",&jfilter))<0) {stat = NC_EINVAL; goto done;} + if(jfilter != NULL && NCJsort(jfilter) != NCJ_NULL) { + if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} + if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; } - /* Suppress variable if there are filters and var is not fixed-size */ - if(varsized && nclistlength((NClist*)var->filters) > 0) - suppress = 1; + } + /* Suppress variable if there are filters and var is not fixed-size */ + if(varsized && nclistlength((NClist*)var->filters) > 0) + suppress = 1; #endif - if(zarr_rank > 0) { - if((stat = computedimrefs(file, var, purezarr, xarray, rank, dimnames, shapes, var->dim))) - goto done; - if(!zvar->scalar) { - /* Extract the dimids */ - for(j=0;j<rank;j++) - var->dimids[j] = var->dim[j]->hdr.id; - } + 
if(zarr_rank > 0) { + if((stat = computedimrefs(file, var, purezarr, xarray, rank, dimnames, shapes, var->dim))) + goto done; + if(!zvar->scalar) { + /* Extract the dimids */ + for(j=0;j<rank;j++) + var->dimids[j] = var->dim[j]->hdr.id; } + } #ifdef NETCDF_ENABLE_NCZARR_FILTERS - if(!suppress) { - /* At this point, we can finalize the filters */ - if((stat = NCZ_filter_setup(var))) goto done; - } + if(!suppress) { + /* At this point, we can finalize the filters */ + if((stat = NCZ_filter_setup(var))) goto done; + } #endif suppressvar: - if(suppress) { - /* Reclaim NCZarr variable specific info */ - (void)NCZ_zclose_var1(var); - /* Remove from list of variables and reclaim the top level var object */ - (void)nc4_var_list_del(grp, var); - var = NULL; - } + if(suppress) { + /* Reclaim NCZarr variable specific info */ + (void)NCZ_zclose_var1(var); + /* Remove from list of variables and reclaim the top level var object */ + (void)nc4_var_list_del(grp, var); + var = NULL; + } - /* Clean up from last cycle */ - nclistfreeall(dimnames); dimnames = NULL; - nullfree(varpath); varpath = NULL; - nullfree(shapes); shapes = NULL; - nullfree(key); key = NULL; - if(formatv1) {NCJreclaim(jncvar); jncvar = NULL;} - NCJreclaim(jvar); jvar = NULL; - var = NULL; +done: + nclistfreeall(dimnames); dimnames = NULL; + nullfree(varpath); varpath = NULL; + nullfree(shapes); shapes = NULL; + nullfree(key); key = NULL; + return THROW(stat); +} + +/** + * @internal Materialize vars into memory; + * Take xarray and purezarr into account. + * + * @param file Pointer to file info struct. + * @param grp Pointer to grp info struct. + * @param varnames List of names of variables in this group + * + * @return ::NC_NOERR No error. 
+ * @author Dennis Heimbigner + */ +static int +define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames) +{ + int stat = NC_NOERR; + size_t i; + + ZTRACE(3,"file=%s grp=%s |varnames|=%u",file->controller->path,grp->hdr.name,nclistlength(varnames)); + + /* Load each var in turn */ + for(i = 0; i < nclistlength(varnames); i++) { + const char* varname = (const char*)nclistget(varnames,i); + if((stat = define_var1(file,grp,varname))) goto done; + varname = nclistget(varnames,i); } done: @@ -1861,78 +1795,82 @@ int ncz_read_superblock(NC_FILE_INFO_T* file, char** nczarrvp, char** zarrfp) { int stat = NC_NOERR; - NCjson* jnczgroup = NULL; - NCjson* jzgroup = NULL; - NCjson* jsuper = NULL; - NCjson* jtmp = NULL; + const NCjson* jnczgroup = NULL; + const NCjson* jnczattr = NULL; + const NCjson* jzgroup = NULL; + const NCjson* jsuper = NULL; + const NCjson* jtmp = NULL; char* nczarr_version = NULL; char* zarr_format = NULL; - NCZ_FILE_INFO_T* zinfo = (NCZ_FILE_INFO_T*)file->format_file_info; + NCZ_FILE_INFO_T* zinfo = NULL; + NC_GRP_INFO_T* root = NULL; + NCZ_GRP_INFO_T* zroot = NULL; + char* fullpath = NULL; ZTRACE(3,"file=%s",file->controller->path); - /* See if the V1 META-Root is being used */ - switch(stat = NCZ_downloadjson(zinfo->map, NCZMETAROOT, &jnczgroup)) { - case NC_EEMPTY: /* not there */ - stat = NC_NOERR; - break; - case NC_NOERR: - if((stat = NCJdictget(jnczgroup,"nczarr_version",&jtmp))) goto done; - nczarr_version = strdup(NCJstring(jtmp)); - break; - default: goto done; - } - /* Get Zarr Root Group, if any */ - switch(stat = NCZ_downloadjson(zinfo->map, ZMETAROOT, &jzgroup)) { - case NC_NOERR: - break; - case NC_EEMPTY: /* not there */ - stat = NC_NOERR; - assert(jzgroup == NULL); - break; - default: goto done; - } - if(jzgroup != NULL) { - /* See if this NCZarr V2 */ - if((stat = NCJdictget(jzgroup,NCZ_V2_SUPERBLOCK,&jsuper))) goto done; - if(!stat && jsuper == NULL) { /* try uppercase name */ - if((stat = 
NCJdictget(jzgroup,NCZ_V2_SUPERBLOCK_UC,&jsuper))) goto done; - } - if(jsuper != NULL) { - /* Extract the equivalent attribute */ - if(jsuper->sort != NCJ_DICT) - {stat = NC_ENCZARR; goto done;} - if((stat = NCJdictget(jsuper,"version",&jtmp))) goto done; - nczarr_version = nulldup(NCJstring(jtmp)); - } - /* In any case, extract the zarr format */ - if((stat = NCJdictget(jzgroup,"zarr_format",&jtmp))) goto done; - assert(zarr_format == NULL); - zarr_format = nulldup(NCJstring(jtmp)); - } + root = file->root_grp; + assert(root != NULL); + + zinfo = (NCZ_FILE_INFO_T*)file->format_file_info; + zroot = (NCZ_GRP_INFO_T*)root->format_grp_info; + + /* Construct grp key */ + if((stat = NCZ_grpkey(root,&fullpath))) goto done; + + /* Download the root group .zgroup and associated .zattrs */ + if((stat = downloadzarrobj(file, &zroot->zgroup, fullpath, ZGROUP))) goto done; + jzgroup = zroot->zgroup.obj; + + /* Look for superblock; first in .zattrs and then in .zgroup */ + if((stat = getnczarrkey((NC_OBJ*)root,NCZ_V2_SUPERBLOCK,&jsuper))) goto done; + /* Set the format flags */ - if(jnczgroup == NULL && jsuper == NULL) { + + /* Set where _nczarr_xxx are stored */ + if(jsuper != NULL && zroot->zgroup.nczv1) { + zinfo->controls.flags |= FLAG_NCZARR_KEY; + /* Also means file is read only */ + file->no_write = 1; + } + + if(jsuper == NULL) { /* See if this is looks like a NCZarr/Zarr dataset at all by looking for anything here of the form ".z*" */ if((stat = ncz_validate(file))) goto done; /* ok, assume pure zarr with no groups */ zinfo->controls.flags |= FLAG_PUREZARR; - zinfo->controls.flags &= ~(FLAG_NCZARR_V1); if(zarr_format == NULL) zarr_format = strdup("2"); - } else if(jnczgroup != NULL) { - zinfo->controls.flags |= FLAG_NCZARR_V1; - /* Also means file is read only */ - file->no_write = 1; - } else if(jsuper != NULL) { - /* ! FLAG_NCZARR_V1 && ! 
FLAG_PUREZARR */ } + + /* Look for _nczarr_group */ + if((stat = getnczarrkey((NC_OBJ*)root,NCZ_V2_GROUP,&jnczgroup))) goto done; + + /* Look for _nczarr_attr*/ + if((stat = getnczarrkey((NC_OBJ*)root,NCZ_V2_ATTR,&jnczattr))) goto done; + + if(jsuper != NULL) { + if(jsuper->sort != NCJ_DICT) {stat = NC_ENCZARR; goto done;} + if((stat = NCJdictget(jsuper,"version",&jtmp))<0) {stat = NC_EINVAL; goto done;} + nczarr_version = nulldup(NCJstring(jtmp)); + } + + if(jzgroup != NULL) { + if(jzgroup->sort != NCJ_DICT) {stat = NC_ENCZARR; goto done;} + /* In any case, extract the zarr format */ + if((stat = NCJdictget(jzgroup,"zarr_format",&jtmp))<0) {stat = NC_EINVAL; goto done;} + if(zarr_format == NULL) + zarr_format = nulldup(NCJstring(jtmp)); + else if(strcmp(zarr_format,NCJstring(jtmp))!=0) + {stat = NC_ENCZARR; goto done;} + } + if(nczarrvp) {*nczarrvp = nczarr_version; nczarr_version = NULL;} if(zarrfp) {*zarrfp = zarr_format; zarr_format = NULL;} done: + nullfree(fullpath); nullfree(zarr_format); nullfree(nczarr_version); - NCJreclaim(jzgroup); - NCJreclaim(jnczgroup); return ZUNTRACE(THROW(stat)); } @@ -1940,21 +1878,22 @@ ncz_read_superblock(NC_FILE_INFO_T* file, char** nczarrvp, char** zarrfp) /* Utilities */ static int -parse_group_content(NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps) +parse_group_content(const NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps) { - int i,stat = NC_NOERR; - NCjson* jvalue = NULL; + int stat = NC_NOERR; + size_t i; + const NCjson* jvalue = NULL; ZTRACE(3,"jcontent=|%s| |dimdefs|=%u |varnames|=%u |subgrps|=%u",NCJtotext(jcontent),(unsigned)nclistlength(dimdefs),(unsigned)nclistlength(varnames),(unsigned)nclistlength(subgrps)); - if((stat=NCJdictget(jcontent,"dims",&jvalue))) goto done; + if((stat=dictgetalt(jcontent,"dimensions","dims",&jvalue))) goto done; if(jvalue != NULL) { if(NCJsort(jvalue) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} /* Extract the dimensions defined in 
this group */ for(i=0;imap,zakey,&jvar))) - goto done; - assert((NCJsort(jvar) == NCJ_DICT)); - nullfree(varkey); varkey = NULL; - nullfree(zakey); zakey = NULL; - /* Extract the shape */ - if((stat=NCJdictget(jvar,"shape",&jvalue))) goto done; - if((stat = decodeints(jvalue, shapes))) goto done; - -done: - NCJreclaim(jvar); - NCJreclaim(jvalue); - nullfree(varkey); varkey = NULL; - nullfree(zakey); zakey = NULL; - return ZUNTRACE(THROW(stat)); -} -#endif - static int searchvars(NCZ_FILE_INFO_T* zfile, NC_GRP_INFO_T* grp, NClist* varnames) { @@ -2132,9 +2038,10 @@ searchsubgrps(NCZ_FILE_INFO_T* zfile, NC_GRP_INFO_T* grp, NClist* subgrpnames) /* Convert a list of integer strings to 64 bit dimension sizes (shapes) */ static int -decodeints(NCjson* jshape, size64_t* shapes) +decodeints(const NCjson* jshape, size64_t* shapes) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; for(i=0;icommon.file->controller->path); - - /* If V2, then do not create a superblock per-se */ - if(!(zinfo->controls.flags & FLAG_NCZARR_V1)) goto done; - - map = zinfo->map; - - /* create superblock json */ - if((stat = NCJnew(NCJ_DICT,&json))) - goto done; - - /* fill */ - snprintf(version,sizeof(version),"%d",zinfo->zarr.zarr_version); - if((stat = NCJaddstring(json,NCJ_STRING,"zarr_format"))) goto done; - if((stat = NCJaddstring(json,NCJ_INT,version))) goto done; - if((stat = NCJaddstring(json,NCJ_STRING,NCZ_V2_VERSION))) goto done; - { - char ver[1024]; - snprintf(ver,sizeof(ver),"%lu.%lu.%lu", - zinfo->zarr.nczarr_version.major, - zinfo->zarr.nczarr_version.minor, - zinfo->zarr.nczarr_version.release); - if((stat = NCJaddstring(json,NCJ_STRING,ver))) goto done; - } - /* Write back to map */ - if((stat=NCZ_uploadjson(map,NCZMETAROOT,json))) - goto done; -done: - NCJreclaim(json); - return ZUNTRACE(stat); -} -#endif - /* Compute the set of dim refs for this variable, taking purezarr and xarray into account */ static int computedimrefs(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, 
int purezarr, int xarray, int ndims, NClist* dimnames, size64_t* shapes, NC_DIM_INFO_T** dims) @@ -2406,11 +2271,12 @@ computedimrefs(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int purezarr, int xarra /* If pure zarr and we have no dimref names, then fake it */ if(purezarr && nclistlength(dimnames) == 0) { + int i; createdims = 1; for(i=0;icontroller->path,grp->hdr.name); + + if(jatts == NULL) goto done; + + zinfo = file->format_file_info; + map = zinfo->map; + + if(container->sort == NCVAR) { + var = (NC_VAR_INFO_T*)container; + } else if(container->sort == NCGRP) { + grp = (NC_GRP_INFO_T*)container; + } + + /* Construct container path */ + if(container->sort == NCGRP) + stat = NCZ_grpkey(grp,&fullpath); + else + stat = NCZ_varkey(var,&fullpath); + if(stat) goto done; + + /* write .zattrs*/ + if((stat = nczm_concat(fullpath,ZATTRS,&key))) goto done; + if((stat=NCZ_uploadjson(map,key,jatts))) goto done; + nullfree(key); key = NULL; + +done: + nullfree(fullpath); + return ZUNTRACE(THROW(stat)); +} + +#if 0 +/** +@internal Get contents of a meta object; fail it it does not exist +@param zmap - [in] map +@param key - [in] key of the object +@param jsonp - [out] return parsed json || NULL if not exists +@return NC_NOERR +@return NC_EXXX +@author Dennis Heimbigner +*/ +static int +readarray(NCZMAP* zmap, const char* key, NCjson** jsonp) +{ + int stat = NC_NOERR; + NCjson* json = NULL; + + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json != NULL && NCJsort(json) != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} + if(jsonp) {*jsonp = json; json = NULL;} +done: + NCJreclaim(json); + return stat; +} +#endif + +/* Get one of two key values from a dict */ +static int +dictgetalt(const NCjson* jdict, const char* name, const char* alt, const NCjson** jvaluep) +{ + int stat = NC_NOERR; + const NCjson* jvalue = NULL; + if((stat = NCJdictget(jdict,name,&jvalue))<0) {stat = NC_EINVAL; goto done;} /* try this first */ + if(jvalue == NULL) { + if((stat = 
NCJdictget(jdict,alt,&jvalue))<0) {stat = NC_EINVAL; goto done;} /* try this alternative*/ + } + if(jvaluep) *jvaluep = jvalue; +done: + return THROW(stat); +} + +/* Get _nczarr_xxx from either .zXXX or .zattrs */ +static int +getnczarrkey(NC_OBJ* container, const char* name, const NCjson** jncxxxp) +{ + int stat = NC_NOERR; + const NCjson* jxxx = NULL; + NC_GRP_INFO_T* grp = NULL; + NC_VAR_INFO_T* var = NULL; + struct ZARROBJ* zobj = NULL; + + /* Decode container */ + if(container->sort == NCGRP) { + grp = (NC_GRP_INFO_T*)container; + zobj = &((NCZ_GRP_INFO_T*)grp->format_grp_info)->zgroup; + } else { + var = (NC_VAR_INFO_T*)container; + zobj = &((NCZ_VAR_INFO_T*)var->format_var_info)->zarray; + } + + /* Try .zattrs first */ + if(zobj->atts != NULL) { + jxxx = NULL; + if((stat = NCJdictget(zobj->atts,name,&jxxx))<0) {stat = NC_EINVAL; goto done;} + } + if(jxxx == NULL) { + jxxx = NULL; + /* Try .zxxx second */ + if(zobj->obj != NULL) { + if((stat = NCJdictget(zobj->obj,name,&jxxx))<0) {stat = NC_EINVAL; goto done;} + } + /* Mark as old style with _nczarr_xxx in obj not attributes */ + zobj->nczv1 = 1; + } + if(jncxxxp) *jncxxxp = jxxx; +done: + return THROW(stat); +} + +static int +downloadzarrobj(NC_FILE_INFO_T* file, struct ZARROBJ* zobj, const char* fullpath, const char* objname) +{ + int stat = NC_NOERR; + char* key = NULL; + NCZMAP* map = ((NCZ_FILE_INFO_T*)file->format_file_info)->map; + + /* Download .zXXX and .zattrs */ + nullfree(zobj->prefix); + zobj->prefix = strdup(fullpath); + NCJreclaim(zobj->obj); zobj->obj = NULL; + NCJreclaim(zobj->atts); zobj->atts = NULL; + if((stat = nczm_concat(fullpath,objname,&key))) goto done; + if((stat=NCZ_downloadjson(map,key,&zobj->obj))) goto done; + nullfree(key); key = NULL; + if((stat = nczm_concat(fullpath,ZATTRS,&key))) goto done; + if((stat=NCZ_downloadjson(map,key,&zobj->atts))) goto done; +done: + nullfree(key); + return THROW(stat); +} diff --git a/libnczarr/zutil.c b/libnczarr/zutil.c index 
1d49738542..8ca4602b24 100644 --- a/libnczarr/zutil.c +++ b/libnczarr/zutil.c @@ -226,8 +226,9 @@ ncz_splitkey(const char* key, NClist* segments) @internal Down load a .z... structure into memory @param zmap - [in] controlling zarr map @param key - [in] .z... object to load -@param jsonp - [out] root of the loaded json +@param jsonp - [out] root of the loaded json (NULL if key does not exist) @return NC_NOERR +@return NC_EXXX @author Dennis Heimbigner */ int @@ -238,17 +239,22 @@ NCZ_downloadjson(NCZMAP* zmap, const char* key, NCjson** jsonp) char* content = NULL; NCjson* json = NULL; - if((stat = nczmap_len(zmap, key, &len))) - goto done; + switch(stat = nczmap_len(zmap, key, &len)) { + case NC_NOERR: break; + case NC_ENOOBJECT: case NC_EEMPTY: + stat = NC_NOERR; + goto exit; + default: goto done; + } if((content = malloc(len+1)) == NULL) {stat = NC_ENOMEM; goto done;} if((stat = nczmap_read(zmap, key, 0, len, (void*)content))) goto done; content[len] = '\0'; - if((stat = NCJparse(content,0,&json)) < 0) {stat = NC_ENCZARR; goto done;} +exit: if(jsonp) {*jsonp = json; json = NULL;} done: @@ -310,13 +316,9 @@ NCZ_createdict(NCZMAP* zmap, const char* key, NCjson** jsonp) NCjson* json = NULL; /* See if it already exists */ - stat = NCZ_downloadjson(zmap,key,&json); - if(stat != NC_NOERR) { - if(stat == NC_EEMPTY) {/* create it */ - if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) - goto done; - } else - goto done; + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json == NULL) { + if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done; } else { /* Already exists, fail */ stat = NC_EINVAL; @@ -346,18 +348,14 @@ NCZ_createarray(NCZMAP* zmap, const char* key, NCjson** jsonp) int stat = NC_NOERR; NCjson* json = NULL; - stat = NCZ_downloadjson(zmap,key,&json); - if(stat != NC_NOERR) { - if(stat == NC_EEMPTY) {/* create it */ - if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) - goto done; - /* Create the initial array */ - if((stat = NCJnew(NCJ_ARRAY,&json))) - goto 
done; - } else { - stat = NC_EINVAL; - goto done; - } + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json == NULL) { /* create it */ + if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done; + /* Create the initial array */ + if((stat = NCJnew(NCJ_ARRAY,&json))) goto done; + } else { + stat = NC_EINVAL; + goto done; } if(json->sort != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} if(jsonp) {*jsonp = json; json = NULL;} @@ -367,54 +365,6 @@ NCZ_createarray(NCZMAP* zmap, const char* key, NCjson** jsonp) } #endif /*0*/ -/** -@internal Get contents of a meta object; fail it it does not exist -@param zmap - [in] map -@param key - [in] key of the object -@param jsonp - [out] return parsed json -@return NC_NOERR -@return NC_EEMPTY [object did not exist] -@author Dennis Heimbigner -*/ -int -NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp) -{ - int stat = NC_NOERR; - NCjson* json = NULL; - - if((stat = NCZ_downloadjson(zmap,key,&json))) - goto done; - if(NCJsort(json) != NCJ_DICT) {stat = NC_ENCZARR; goto done;} - if(jsonp) {*jsonp = json; json = NULL;} -done: - NCJreclaim(json); - return stat; -} - -/** -@internal Get contents of a meta object; fail it it does not exist -@param zmap - [in] map -@param key - [in] key of the object -@param jsonp - [out] return parsed json -@return NC_NOERR -@return NC_EEMPTY [object did not exist] -@author Dennis Heimbigner -*/ -int -NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp) -{ - int stat = NC_NOERR; - NCjson* json = NULL; - - if((stat = NCZ_downloadjson(zmap,key,&json))) - goto done; - if(NCJsort(json) != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} - if(jsonp) {*jsonp = json; json = NULL;} -done: - NCJreclaim(json); - return stat; -} - #if 0 /** @internal Given an nc_type, produce the corresponding @@ -664,7 +614,7 @@ primarily on the first atomic value encountered recursively. 
*/ int -NCZ_inferattrtype(NCjson* value, nc_type typehint, nc_type* typeidp) +NCZ_inferattrtype(const NCjson* value, nc_type typehint, nc_type* typeidp) { int i,stat = NC_NOERR; nc_type typeid; @@ -1093,7 +1043,7 @@ checksimplejson(NCjson* json, int depth) /* Return 1 if the attribute will be stored as a complex JSON valued attribute; return 0 otherwise */ int -NCZ_iscomplexjson(NCjson* json, nc_type typehint) +NCZ_iscomplexjson(const NCjson* json, nc_type typehint) { int i, stat = 0; diff --git a/libnczarr/zxcache.c b/libnczarr/zxcache.c index 957ed15250..f4ab040d72 100644 --- a/libnczarr/zxcache.c +++ b/libnczarr/zxcache.c @@ -78,7 +78,7 @@ NCZ_set_var_chunk_cache(int ncid, int varid, size_t cachesize, size_t nelems, fl assert(grp && h5); /* Find the var. */ - if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid))) + if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid))) {retval = NC_ENOTVAR; goto done;} assert(var && var->hdr.id == varid); @@ -140,7 +140,7 @@ fprintf(stderr,"xxx: adjusting cache for: %s\n",var->hdr.name); zcache->chunksize = zvar->chunksize; zcache->chunkcount = 1; if(var->ndims > 0) { - int i; + size_t i; for(i=0;i<var->ndims;i++) { zcache->chunkcount *= var->chunksizes[i]; } @@ -184,7 +184,7 @@ NCZ_create_chunk_cache(NC_VAR_INFO_T* var, size64_t chunksize, char dimsep, NCZC cache->chunkcount = 1; if(var->ndims > 0) { - int i; + size_t i; for(i=0;i<var->ndims;i++) { cache->chunkcount *= var->chunksizes[i]; } @@ -297,7 +297,7 @@ NCZ_read_cache_chunk(NCZChunkCache* cache, const size64_t* indices, void** datap /* Create a new entry */ if((entry = calloc(1,sizeof(NCZCacheEntry)))==NULL) {stat = NC_ENOMEM; goto done;} - memcpy(entry->indices,indices,rank*sizeof(size64_t)); + memcpy(entry->indices,indices,(size_t)rank*sizeof(size64_t)); /* Create the key for this cache */ if((stat = NCZ_buildchunkpath(cache,indices,&entry->key))) goto done; entry->hashkey = hkey; @@ -496,7 +496,8 @@ NCZ_flush_chunk_cache(NCZChunkCache* cache) int 
NCZ_ensure_fill_chunk(NCZChunkCache* cache) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NC_VAR_INFO_T* var = cache->var; nc_type typeid = var->type_info->hdr.id; size_t typesize = var->type_info->size; @@ -605,7 +606,7 @@ int NCZ_buildchunkkey(size_t R, const size64_t* chunkindices, char dimsep, char** keyp) { int stat = NC_NOERR; - int r; + size_t r; NCbytes* key = ncbytesnew(); if(keyp) *keyp = NULL; @@ -670,7 +671,7 @@ put_chunk(NCZChunkCache* cache, NCZCacheEntry* entry) if((stat = NC_reclaim_data_all(file->controller,tid,entry->data,cache->chunkcount))) goto done; entry->data = NULL; entry->data = strchunk; strchunk = NULL; - entry->size = cache->chunkcount * maxstrlen; + entry->size = (cache->chunkcount * (size64_t)maxstrlen); entry->isfixedstring = 1; } @@ -865,7 +866,7 @@ NCZ_dumpxcacheentry(NCZChunkCache* cache, NCZCacheEntry* e, NCbytes* buf) { char s[8192]; char idx[64]; - int i; + size_t i; ncbytescat(buf,"{"); snprintf(s,sizeof(s),"modified=%u isfiltered=%u indices=", diff --git a/libsrc4/nc4internal.c b/libsrc4/nc4internal.c index f7f32c7ca8..3274c89a6c 100644 --- a/libsrc4/nc4internal.c +++ b/libsrc4/nc4internal.c @@ -49,13 +49,15 @@ static NC_reservedatt NC_reserved[] = { {NC_ATT_FORMAT, READONLYFLAG}, /*_Format*/ {ISNETCDF4ATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_IsNetcdf4*/ {NCPROPS,READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCProperties*/ - {NC_NCZARR_ATTR_UC, READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCZARR_ATTR */ {NC_ATT_COORDINATES, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Coordinates*/ {NC_ATT_DIMID_NAME, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Dimid*/ {SUPERBLOCKATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_SuperblockVersion*/ {NC_ATT_NC3_STRICT_NAME, READONLYFLAG}, /*_nc3_strict*/ {NC_ATT_NC3_STRICT_NAME, READONLYFLAG}, /*_nc3_strict*/ {NC_NCZARR_ATTR, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_attr */ + {NC_NCZARR_GROUP, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_group */ + {NC_NCZARR_ARRAY, 
READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_array */ + {NC_NCZARR_SUPERBLOCK, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_superblock */ }; #define NRESERVED (sizeof(NC_reserved) / sizeof(NC_reservedatt)) /*|NC_reservedatt*/ diff --git a/nczarr_test/CMakeLists.txt b/nczarr_test/CMakeLists.txt index e25cd2fca1..382b46d91e 100644 --- a/nczarr_test/CMakeLists.txt +++ b/nczarr_test/CMakeLists.txt @@ -191,6 +191,9 @@ IF(NETCDF_ENABLE_TESTS) add_sh_test(nczarr_test run_quantize) add_sh_test(nczarr_test run_notzarr) + # Test back compatibility of old key format + add_sh_test(nczarr_test run_oldkeys) + # This has timeout under CMake # if(NOT ISCMAKE) add_sh_test(nczarr_test run_interop) diff --git a/nczarr_test/Makefile.am b/nczarr_test/Makefile.am index 5415267807..e6b95d1b5d 100644 --- a/nczarr_test/Makefile.am +++ b/nczarr_test/Makefile.am @@ -13,10 +13,10 @@ LDADD = ${top_builddir}/liblib/libnetcdf.la TESTS_ENVIRONMENT = TEST_EXTENSIONS = .sh -#SH_LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose -#sh_LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose -#LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose -#TESTS_ENVIRONMENT += export SETX=1; +SH_LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose +sh_LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose +LOG_DRIVER = $(SHELL) $(top_srcdir)/test-driver-verbose +TESTS_ENVIRONMENT += export SETX=1; #TESTS_ENVIRONMENT += export NCTRACING=1; AM_CPPFLAGS += -I${top_srcdir} -I${top_srcdir}/libnczarr @@ -121,6 +121,9 @@ if USE_HDF5 TESTS += run_fillonlyz.sh endif +# Test back compatibility of old key format +TESTS += run_oldkeys.sh + if BUILD_BENCHMARKS UTILSRC = bm_utils.c timer_utils.c test_utils.c bm_utils.h bm_timer.h @@ -207,7 +210,7 @@ run_filter.sh \ run_newformat.sh run_nczarr_fill.sh run_quantize.sh \ run_jsonconvention.sh run_nczfilter.sh run_unknown.sh \ run_scalar.sh run_strings.sh run_nulls.sh run_notzarr.sh run_external.sh \ -run_unlim_io.sh run_corrupt.sh +run_unlim_io.sh run_corrupt.sh 
run_oldkeys.sh EXTRA_DIST += \ ref_ut_map_create.cdl ref_ut_map_writedata.cdl ref_ut_map_writemeta2.cdl ref_ut_map_writemeta.cdl \ @@ -220,7 +223,7 @@ ref_perdimspecs.cdl ref_fillonly.cdl \ ref_whole.cdl ref_whole.txt \ ref_skip.cdl ref_skip.txt ref_skipw.cdl \ ref_rem.cdl ref_rem.dmp ref_ndims.cdl ref_ndims.dmp \ -ref_misc1.cdl ref_misc1.dmp ref_misc2.cdl \ +ref_misc1.cdl ref_misc1.dmp ref_misc2.cdl ref_zarr_test_data_meta.cdl \ ref_avail1.cdl ref_avail1.dmp ref_avail1.txt \ ref_xarray.cdl ref_purezarr.cdl ref_purezarr_base.cdl ref_nczarr2zarr.cdl \ ref_bzip2.cdl ref_filtered.cdl ref_multi.cdl \ @@ -228,16 +231,15 @@ ref_any.cdl ref_oldformat.cdl ref_oldformat.zip ref_newformatpure.cdl \ ref_groups.h5 ref_byte.zarr.zip ref_byte_fill_value_null.zarr.zip \ ref_groups_regular.cdl ref_byte.cdl ref_byte_fill_value_null.cdl \ ref_jsonconvention.cdl ref_jsonconvention.zmap \ -ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl \ -ref_nulls_nczarr.baseline ref_nulls_zarr.baseline ref_nulls.cdl ref_notzarr.tar.gz +ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl ref_scalar_nczarr.cdl \ +ref_nulls_nczarr.baseline ref_nulls_zarr.baseline ref_nulls.cdl ref_notzarr.tar.gz \ +ref_oldkeys.cdl ref_oldkeys.file.zip ref_oldkeys.zmap \ +ref_noshape.file.zip -# Interoperability files +# Interoperability files from external sources EXTRA_DIST += ref_power_901_constants_orig.zip ref_power_901_constants.cdl ref_quotes_orig.zip ref_quotes.cdl \ ref_zarr_test_data.cdl.gz ref_zarr_test_data_2d.cdl.gz -# Additional Files -EXTRA_DIST += ref_noshape.file.zip - CLEANFILES = ut_*.txt ut*.cdl tmp*.nc tmp*.cdl tmp*.txt tmp*.dmp tmp*.zip tmp*.nc tmp*.dump tmp*.tmp tmp*.zmap tmp_ngc.c ref_zarr_test_data.cdl tst_*.nc.zip ref_quotes.zip ref_power_901_constants.zip BUILT_SOURCES = test_quantize.c test_filter_vlen.c test_unlim_vars.c test_endians.c \ diff --git a/nczarr_test/ncdumpchunks.c b/nczarr_test/ncdumpchunks.c index 
0c93ca8c9a..758314aa2d 100644 --- a/nczarr_test/ncdumpchunks.c +++ b/nczarr_test/ncdumpchunks.c @@ -50,7 +50,7 @@ typedef struct Format { int debug; int linear; int holevalue; - int rank; + size_t rank; size_t dimlens[NC_MAX_VAR_DIMS]; size_t chunklens[NC_MAX_VAR_DIMS]; size_t chunkcounts[NC_MAX_VAR_DIMS]; @@ -60,7 +60,7 @@ typedef struct Format { } Format; typedef struct Odometer { - int rank; /*rank */ + size_t rank; /*rank */ size_t start[NC_MAX_VAR_DIMS]; size_t stop[NC_MAX_VAR_DIMS]; size_t max[NC_MAX_VAR_DIMS]; /* max size of ith index */ @@ -71,11 +71,11 @@ typedef struct Odometer { #define ceildiv(x,y) (((x) % (y)) == 0 ? ((x) / (y)) : (((x) / (y)) + 1)) static char* captured[4096]; -static int ncap = 0; +static size_t ncap = 0; extern int nc__testurl(const char*,char**); -Odometer* odom_new(int rank, const size_t* stop, const size_t* max); +Odometer* odom_new(size_t rank, const size_t* stop, const size_t* max); void odom_free(Odometer* odom); int odom_more(Odometer* odom); int odom_next(Odometer* odom); @@ -120,9 +120,9 @@ cleanup(void) } Odometer* -odom_new(int rank, const size_t* stop, const size_t* max) +odom_new(size_t rank, const size_t* stop, const size_t* max) { - int i; + size_t i; Odometer* odom = NULL; if((odom = calloc(1,sizeof(Odometer))) == NULL) return NULL; @@ -339,12 +339,12 @@ dump(Format* format) { void* chunkdata = NULL; /*[CHUNKPROD];*/ Odometer* odom = NULL; - int r; + size_t r; size_t offset[NC_MAX_VAR_DIMS]; int holechunk = 0; char sindices[64]; #ifdef H5 - int i; + size_t i; hid_t fileid, grpid, datasetid; hid_t dxpl_id = H5P_DEFAULT; /*data transfer property list */ unsigned int filter_mask = 0; @@ -388,7 +388,7 @@ dump(Format* format) if((chunkdata = calloc(sizeof(int),format->chunkprod))==NULL) usage(NC_ENOMEM); - printf("rank=%d dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens), + printf("rank=%zu dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens), 
printvector(format->rank,format->chunklens)); while(odom_more(odom)) { @@ -506,12 +506,14 @@ filenamefor(const char* f0) int main(int argc, char** argv) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; Format format; int ncid, varid, dimids[NC_MAX_VAR_DIMS]; int vtype, storage; int mode; int c; + int r; memset(&format,0,sizeof(format)); @@ -577,7 +579,8 @@ main(int argc, char** argv) /* Get the info about the var */ if((stat=nc_inq_varid(ncid,format.var_name,&varid))) usage(stat); - if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&format.rank,dimids,NULL))) usage(stat); + if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&r,dimids,NULL))) usage(stat); + format.rank = (size_t)r; if(format.rank == 0) usage(NC_EDIMSIZE); if((stat=nc_inq_var_chunking(ncid,varid,&storage,format.chunklens))) usage(stat); if(storage != NC_CHUNKED) usage(NC_EBADCHUNK); diff --git a/nczarr_test/ref_any.cdl b/nczarr_test/ref_any.cdl index bbbc30e860..3486f32e4d 100644 --- a/nczarr_test/ref_any.cdl +++ b/nczarr_test/ref_any.cdl @@ -4,39 +4,21 @@ dimensions: dim1 = 4 ; dim2 = 4 ; variables: - int ivar(dim0, dim1, dim2) ; - ivar:_FillValue = -2147483647 ; - ivar:_Storage = @chunked@ ; - ivar:_ChunkSizes = 4, 4, 4 ; - ivar:_Filter = @IH5@ ; - ivar:_Codecs = @ICX@ ; float fvar(dim0, dim1, dim2) ; fvar:_FillValue = 9.96921e+36f ; fvar:_Storage = @chunked@ ; fvar:_ChunkSizes = 4, 4, 4 ; fvar:_Filter = @FH5@ ; fvar:_Codecs = @FCX@ ; + int ivar(dim0, dim1, dim2) ; + ivar:_FillValue = -2147483647 ; + ivar:_Storage = @chunked@ ; + ivar:_ChunkSizes = 4, 4, 4 ; + ivar:_Filter = @IH5@ ; + ivar:_Codecs = @ICX@ ; data: - ivar = - 0, 1, 2, 3, - 4, 5, 6, 7, - 8, 9, 10, 11, - 12, 13, 14, 15, - 16, 17, 18, 19, - 20, 21, 22, 23, - 24, 25, 26, 27, - 28, 29, 30, 31, - 32, 33, 34, 35, - 36, 37, 38, 39, - 40, 41, 42, 43, - 44, 45, 46, 47, - 48, 49, 50, 51, - 52, 53, 54, 55, - 56, 57, 58, 59, - 60, 61, 62, 63 ; - fvar = 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, @@ -54,4 +36,22 @@ data: 52.5, 53.5, 54.5, 55.5, 56.5, 
57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5 ; + + ivar = + 0, 1, 2, 3, + 4, 5, 6, 7, + 8, 9, 10, 11, + 12, 13, 14, 15, + 16, 17, 18, 19, + 20, 21, 22, 23, + 24, 25, 26, 27, + 28, 29, 30, 31, + 32, 33, 34, 35, + 36, 37, 38, 39, + 40, 41, 42, 43, + 44, 45, 46, 47, + 48, 49, 50, 51, + 52, 53, 54, 55, + 56, 57, 58, 59, + 60, 61, 62, 63 ; } diff --git a/nczarr_test/ref_blosc_zmap.txt b/nczarr_test/ref_blosc_zmap.txt index fd48ac290b..97ea43278f 100644 --- a/nczarr_test/ref_blosc_zmap.txt +++ b/nczarr_test/ref_blosc_zmap.txt @@ -3,5 +3,5 @@ [2] /.nczgroup : (80) |{"dims": {"dim0": 4,"dim1": 4,"dim2": 4,"dim3": 4},"vars": ["var"],"groups": []}| [3] /.zattrs : (68) |{"_NCProperties": "version=2,netcdf=4.8.1-development,nczarr=1.0.0"}| [4] /.zgroup : (18) |{"zarr_format": 2}| -[6] /var/.nczarray : (67) |{"dimrefs": ["/dim0","/dim1","/dim2","/dim3"],"storage": "chunked"}| +[6] /var/.nczarray : (67) |{"dimension_references": ["/dim0","/dim1","/dim2","/dim3"],"storage": "chunked"}| [7] /var/.zarray : (172) |{"zarr_format": 2,"shape": [4,4,4,4],"dtype": "S1", "globalillegal": ">S1", "_NCProperties": ">S1"}}}| -[1] /.zgroup : () |{"zarr_format": 2, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_group": {"dims": {"d1": 1}, "vars": ["v"], "groups": []}}| -[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "S1", "varjson2": ">S1", "varvec1": ">S1", "varvec2": ">S1"}}}| +[0] /.zattrs : () |{"globalfloat": 1, "globalfloatvec": [1,2], "globalchar": "abc", "globalillegal": "[ [ 1.0, 0.0, 0.0 ], [ 0.0, 1.0, 0.0 ], [ 0.0, 0.0, 1.0 ", "_nczarr_group": {"dimensions": {"d1": 1}, "arrays": ["v"], "groups": []}, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_attr": {"types": {"globalfloat": "S1", "globalillegal": ">S1", "_NCProperties": ">S1", "_nczarr_group": "|J0", "_nczarr_superblock": "|J0", "_nczarr_attr": "|J0"}}}| +[1] /.zgroup : () |{"zarr_format": 2}| +[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "S1", "varjson2": ">S1", "varjson3": ">S1", 
"varchar1": ">S1", "_nczarr_array": "|J0", "_nczarr_attr": "|J0"}}}| [5] /v/0 : (4) (ubyte) |...| diff --git a/nczarr_test/ref_nczarr2zarr.cdl b/nczarr_test/ref_nczarr2zarr.cdl index 814201c816..68bf86c46c 100644 --- a/nczarr_test/ref_nczarr2zarr.cdl +++ b/nczarr_test/ref_nczarr2zarr.cdl @@ -1,8 +1,8 @@ netcdf nczarr2zarr { dimensions: - _zdim_8 = 8 ; + _Anonymous_Dim_8 = 8 ; variables: - int v(_zdim_8, _zdim_8) ; + int v(_Anonymous_Dim_8, _Anonymous_Dim_8) ; v:_FillValue = -1 ; data: diff --git a/nczarr_test/ref_newformatpure.cdl b/nczarr_test/ref_newformatpure.cdl index 51058889f9..210da3c020 100644 --- a/nczarr_test/ref_newformatpure.cdl +++ b/nczarr_test/ref_newformatpure.cdl @@ -1,8 +1,8 @@ netcdf ref_oldformat { dimensions: lat = 8 ; - _zdim_8 = 8 ; - _zdim_10 = 10 ; + _Anonymous_Dim_8 = 8 ; + _Anonymous_Dim_10 = 10 ; variables: int lat(lat) ; lat:_FillValue = -1 ; @@ -13,7 +13,7 @@ data: group: g1 { variables: - int pos(_zdim_8, _zdim_10) ; + int pos(_Anonymous_Dim_8, _Anonymous_Dim_10) ; pos:_FillValue = -1 ; string pos:pos_attr = "latXlon" ; diff --git a/nczarr_test/ref_oldkeys.cdl b/nczarr_test/ref_oldkeys.cdl new file mode 100644 index 0000000000..d1012a6cd6 --- /dev/null +++ b/nczarr_test/ref_oldkeys.cdl @@ -0,0 +1,20 @@ +netcdf ref_oldkeys { +dimensions: + d0 = 2 ; + d1 = 4 ; + d2 = 6 ; +variables: + int v(d0, d1, d2) ; + v:_FillValue = -1 ; +data: + + v = + 0, 1, 2, 3, 4, 5, + 6, 7, 8, 9, 10, 11, + 12, 13, 14, 15, 16, 17, + 18, 19, 20, 21, 22, 23, + 24, 25, 26, 27, 28, 29, + 30, 31, 32, 33, 34, 35, + 36, 37, 38, 39, 40, 41, + 42, 43, 44, 45, 46, 47 ; +} diff --git a/nczarr_test/ref_oldkeys.file.zip b/nczarr_test/ref_oldkeys.file.zip new file mode 100644 index 0000000000..0845d17f83 Binary files /dev/null and b/nczarr_test/ref_oldkeys.file.zip differ diff --git a/nczarr_test/ref_oldkeys.zmap b/nczarr_test/ref_oldkeys.zmap new file mode 100644 index 0000000000..3d7ec88997 --- /dev/null +++ b/nczarr_test/ref_oldkeys.zmap @@ -0,0 +1,5 @@ +[0] /.zattrs : 
(109) |{"_NCProperties": "version=2,netcdf=4.9.2,nczarr=2.0.0", "_nczarr_attr": {"types": {"_NCProperties": ">S1"}}}| +[1] /.zgroup : (147) |{"zarr_format": 2, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_group": {"dims": {"d0": 2, "d1": 4, "d2": 6}, "vars": ["v"], "groups": []}}| +[3] /v/.zarray : (215) |{"zarr_format": 2, "shape": [2,4,6], "dtype": " tmp_${base}_${zext}.cdl + # Dumping everything causes timeout so dump metadata only + ${NCDUMP} $metaonly $flags $url > tmp_${base}_${zext}.cdl # Find the proper ref file - diff -b ${ISOPATH}/ref_${base}_2d.cdl tmp_${base}_${zext}.cdl - set +x + diff -b ${srcdir}/ref_${base}_meta.cdl tmp_${base}_${zext}.cdl } testallcases() { diff --git a/nczarr_test/run_jsonconvention.sh b/nczarr_test/run_jsonconvention.sh index 64b629d858..9f9724cb84 100755 --- a/nczarr_test/run_jsonconvention.sh +++ b/nczarr_test/run_jsonconvention.sh @@ -23,15 +23,13 @@ deletemap $zext $file ${NCGEN} -4 -b -o "$fileurl" $srcdir/ref_jsonconvention.cdl ${NCDUMP} $fileurl > tmp_jsonconvention_${zext}.cdl ${ZMD} -h $fileurl > tmp_jsonconvention_${zext}.txt -# | sed -e 's/,key1=value1|key2=value2//' -e '/"_NCProperties"/ s/(378)/(354)/' # Clean up extraneous changes so comparisons work -# remove '\n' from ref file before comparing -#sed -e 's|\\n||g' < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl -cat < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl -cat < tmp_jsonconvention_${zext}.cdl > tmp_jsonconvention_clean_${zext}.cdl -sed -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' < tmp_jsonconvention_${zext}.txt >tmp1.tmp -sed -e 's|"_NCProperties": "version=[0-9],[^"]*",||' tmp_jsonconvention_clean_${zext}.txt -diff -b tmp_jsonconvention_clean.cdl tmp_jsonconvention_clean_${zext}.cdl +cat < tmp_jsonconvention_${zext}.cdl > tmp_jsonconvention_clean_${zext}.cdl +cat < tmp_jsonconvention_${zext}.txt > tmp_jsonconvention_clean_${zext}.txt +sed -i.bak -e 's|"_NCProperties": "version=[0-9],[^"]*",||' 
tmp_jsonconvention_clean_${zext}.txt +sed -i.bak -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' tmp_jsonconvention_clean_${zext}.txt +# compare +diff -b ${srcdir}/ref_jsonconvention.cdl tmp_jsonconvention_clean_${zext}.cdl diff -b ${srcdir}/ref_jsonconvention.zmap tmp_jsonconvention_clean_${zext}.txt } diff --git a/nczarr_test/run_oldkeys.sh b/nczarr_test/run_oldkeys.sh new file mode 100755 index 0000000000..02b273ae75 --- /dev/null +++ b/nczarr_test/run_oldkeys.sh @@ -0,0 +1,27 @@ +#!/bin/sh + +if test "x$srcdir" = x ; then srcdir=`pwd`; fi +. ../test_common.sh + +. "$srcdir/test_nczarr.sh" + +set -e + +isolate "testdir_oldkeys" +THISDIR=`pwd` +cd $ISOPATH + +testcase() { +zext=$1 +fileargs ref_oldkeys "mode=nczarr,$zext" +# need to unpack the nczarr file +rm -fr ref_oldkeys.file +unzip ${srcdir}/ref_oldkeys.file.zip >> tmp_ignore.txt +${NCDUMP} $fileurl > tmp_oldkeys_${zext}.cdl +${ZMD} -t int $fileurl > tmp_oldkeys_${zext}.zmap +diff -b ${srcdir}/ref_oldkeys.cdl tmp_oldkeys_${zext}.cdl +diff -b ${srcdir}/ref_oldkeys.zmap tmp_oldkeys_${zext}.zmap +} + +# Only test file case +testcase file diff --git a/nczarr_test/run_scalar.sh b/nczarr_test/run_scalar.sh index b7c268ee5b..3e09303ef2 100755 --- a/nczarr_test/run_scalar.sh +++ b/nczarr_test/run_scalar.sh @@ -50,7 +50,7 @@ ${NCDUMP} -n ref_scalar $nczarrurl > tmp_scalar_nczarr_${zext}.cdl ${ZMD} -h $nczarrurl > tmp_scalar_nczarr_${zext}.txt echo "*** verify" -diff -bw $top_srcdir/nczarr_test/ref_scalar.cdl tmp_scalar_nczarr_${zext}.cdl +diff -bw $top_srcdir/nczarr_test/ref_scalar_nczarr.cdl tmp_scalar_nczarr_${zext}.cdl # Fixup zarrscalar tmp_scalar_zarr_${zext}.cdl tmp_rescale_zarr_${zext}.cdl diff --git a/nczarr_test/ut_json.c b/nczarr_test/ut_json.c index 37ab65d231..9dd4d3fee4 100644 --- a/nczarr_test/ut_json.c +++ b/nczarr_test/ut_json.c @@ -159,7 +159,8 @@ jclone(NCjson* json, NCjson** clonep) static int cloneArray(NCjson* array, NCjson** clonep) { - int i, stat=NC_NOERR; + int stat=NC_NOERR; + size_t i; 
NCjson* clone = NULL; if((stat=NCJnew(NCJ_ARRAY,&clone))) goto done; for(i=0;i