It would be nice to have documentation on how to use H5Z-ZFP in the context of netCDF-4. I've spent the last couple of days struggling with making this work, and I believe some additional documentation could save a lot of people grief.
First, the netCDF filter documentation mentions that HDF5 filters can indeed be used, e.g., from command-line tools like nccopy with the -F switch (similar to, but not the same as, the h5repack -f switch). There are a few things the documentation does not mention, however:
It seems that you cannot nccopy a file that has already been compressed (say, with zlib) directly to another compressed format. nccopy will silently ignore such requests and not apply the requested compression filter. You first have to use -F none to copy the file to a temporary, uncompressed intermediate file. Also, ncdump -h will not tell you whether or not the file has been compressed; for that, you need the -s switch as well, e.g., ncdump -hs file.nc.
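The two-step workflow might look like the following sketch. The filenames and the variable name var are assumptions, and the filter parameters shown are the fixed-accuracy example discussed later in this post; the commands require netCDF's nccopy/ncdump with the H5Z-ZFP plugin on the HDF5 plugin path.

```shell
# Step 1: strip the existing (e.g., zlib) compression into a temporary file
nccopy -F none compressed_in.nc tmp_uncompressed.nc

# Step 2: recompress with H5Z-ZFP (filter ID 32013), fixed-accuracy mode
nccopy -F "var,32013,3,0,0,1072693248" tmp_uncompressed.nc zfp_out.nc

# Verify the filter was actually applied (-s shows per-variable storage info)
ncdump -hs zfp_out.nc
```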
The netCDF filter parameters are similar to yet distinct from how they're fed to h5repack. As a concrete example, suppose we want to use H5Z-ZFP in fixed-accuracy mode with a tolerance of 1.0. This would be specified to h5repack using
-f UD=32013,0,4,3,0,0,1072693248
where these numbers mean
32013: filter ID (zfp compression)
0: unused (h5repack only)
4: number of 32-bit unsigned integer compression parameters (cd_values) that follow (h5repack only)
3: zfp fixed-accuracy mode
0: unused
0,1072693248: two 32-bit unsigned integers representing a type-punned double-precision tolerance of 1.0, in little-endian order
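The two integer values above can be derived by type punning the double-precision tolerance yourself, e.g., with Python's struct module (a sketch, not part of either tool; the helper name is made up):

```python
import struct

def tolerance_to_cd_values(tol):
    """Split a double-precision tolerance into the two little-endian
    32-bit unsigned integers (cd_values) that h5repack/nccopy expect."""
    lo, hi = struct.unpack('<2I', struct.pack('<d', tol))
    return lo, hi

print(tolerance_to_cd_values(1.0))  # (0, 1072693248)
```

The IEEE-754 encoding of 1.0 is 0x3FF0000000000000, so the low word is 0 and the high word is 0x3FF00000 = 1072693248.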
With nccopy, you don't need the 0 following the filter ID, nor do you specify the number of cd_values. Rather, you would provide this:
-F varname,32013,3,0,0,1072693248
You have to tell nccopy the name of the variable (dataset) you want to apply the filter to. You can also specify * for varname to apply compression to all variables, though I believe H5Z-ZFP will fail on certain types, e.g., chars. After the filter ID, you specify only the actual cd_values given to h5repack.
One nice thing about nccopy is that it understands how to do type punning. The above example could also be specified as
-F varname,32013,3,0,1.0d
Here 1.0d is interpreted as a double-precision number. This works fine on little-endian machines; my reading of the netCDF documentation is that this would not work correctly on a big-endian machine, but who has one of those these days?
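The endianness caveat can be illustrated in Python (a sketch): the same double yields the two 32-bit words in the opposite order when laid out big-endian, so the literal cd_values 0,1072693248 are only correct for a little-endian layout.

```python
import struct

tol = 1.0
# Little-endian layout: low word first
print(struct.unpack('<2I', struct.pack('<d', tol)))  # (0, 1072693248)
# Big-endian layout: high word first
print(struct.unpack('>2I', struct.pack('>d', tol)))  # (1072693248, 0)
```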
Perhaps a short section, "Using the H5Z-ZFP Plugin with nccopy", could be added to the documentation? Maybe a separate netCDF analogue of print_h5repack_farg could even be provided, or print_h5repack_farg could print both h5repack and nccopy arguments.
I guess I had already forgotten, as I did participate in that thread. :-) But #143 deals only with how to support compression programmatically; it says nothing about how to use the CLI tools, which I suspect is the more common use case. For instance, how often do you call zlib directly versus compressing a file with gzip?