Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Software engineering improvements #108

Open
markcmiller86 opened this issue Mar 4, 2023 · 5 comments
Open

Software engineering improvements #108

markcmiller86 opened this issue Mar 4, 2023 · 5 comments

Comments

@markcmiller86
Copy link
Member

markcmiller86 commented Mar 4, 2023

Consider here several possible improvements to the way the code is designed...

Encode ZFP params only in as expert mode params

With regards to a proposal for a solution for this issue, #100 (comment) makes a compelling case for what to encode cb_values in a clean way, that is, to always encode cb_values as if expert mode was used.

#100 (comment) makes the point that the current situation is needed for a user interface point of view, to avoid making h5repack useable only for users who can juggle with expert values.

I wondered there's a way to extract/compute the expert level values from the zfp_stream_set_{rate|precision|accuracy|reversible} zfp calls. I checked what they do, and indeed, they seem to simply compute and set the 4 "expert mode" integers in the zfp header. So it's possible to turn all other zfpmode inputs into cb_values encoded in expert mode.

A patch proposal would thus be to:

  • Document that only hd5 files encoded in expert mode are portable across different architectures
  • Change print_h5repack_farg.c to always generate cb_values with zfpmode=4 (-f UD=32013,0,5,4,…)
  • Update the examples and tests accordingly
  • Possibly print a warning about portability during encoding/decoding on big endian machines, if zfpmode in cb_values is not 4/expert

This would:

  • always generate files compatible with old versions of H5Z-ZFP, which supports expert mode cb_values
  • stop generating new files that cannot be ported across architectures
  • require no change in the h5repack API: only changes to print_h5repack_farg.c and documentation

Originally posted by @spanezz in #100 (comment)

Also, see this note from @lindstro

Simplify headers and plugin/lib to single header and single source file

There are 6 header files and 2 sources files. I think this could be simplified down to a single source, H5Zzfp.c and single header file, H5Zzfp.h and still maintain all the current build options and interfaces.

This would mean putting all the property functions currently in H5Zzfp_props.c in H5Zzfp.c which would mean that source file contains more than just the code implementing the HDF5 filter interface. But, I think that is fine and there is already some of that going on there because it implements the H5Z_zfp_initialize() and H5Z_zfp_finalize() functions. Callers wishing to use the filter as a library would just explicitly link to the filter instead of only ensuring HDF5 library can find and dlopen it via HDF5_PLUGIN_PATH env. variable.

Should we eliminate GNU Make?

We're maintaining two build interfaces. I tend to use GNU Make but I know others like/want CMake. We get support on Windows only from CMake. I don't like forcing a dependence on CMake in order to build. OTOH, CMake is pretty ubiquitous now...especially if we're conservative about min version requirement. Our CI is uses GNU Make for Linux and CMake for Windows. We don't run tests with CMake...we'd need to plug that together (which may mostly already be addressed in #89).

Better organization of docs

I've got some nitty-gritty details and general usage mixed together in the documentation which I think makes things a bit confusing. So, these could be re-organized a tad better moving a lot of the details to something like an Advanced Issues section.

@markcmiller86
Copy link
Member Author

@brtnfld, @jwsblokland, @spanezz, @helmutg, @leighorf, @lindstro if any of you have comments on the above or related items, please let me know asap. Am targetting updated release end of this month with recent fixes plus these items and some misc. other items @lindstro reported to me in email back in Aug, 2022.

@lindstro
Copy link
Member

lindstro commented Mar 4, 2023

I think I agree with all of the above. There are some other potential improvements, such as support for querying the compression parameters previously used during compression (#105), support for OpenMP (de)compression, and support for GPU (de)compression, but I imagine those are well beyond the scope of what can be done over the coming weeks.

@markcmiller86
Copy link
Member Author

such as support for querying the compression parameters previously used during compression (#105)

FYI...thats included in the 1.1.1 milestone. So, I am hoping to complete that. In fact, I've coded most of it but have been puzzing over where to put the code,

H5Z-ZFP/src/H5Zzfp_props.c

Lines 114 to 281 in c9878d1

herr_t H5Pget_zfp(hid_t plist, int mode, ...)
{
static char const *_funcname_ = "H5Pget_zfp";
static size_t ctrls_sz = sizeof(h5z_zfp_controls_t);
unsigned int flags, fconfig;
unsigned int cd_vals[10];
size_t cd_nelmts = sizeof(cd_vals)/sizeof(cd_vals[0]);
char fname[100];
h5z_zfp_controls_t ctrls;
va_list ap;
herr_t retval = SUCCESS;
if (0 >= H5Pisa_class(plist, H5P_DATASET_CREATE))
H5Z_ZFP_PUSH_AND_GOTO(H5E_PLIST, H5E_BADTYPE, FAIL, "not a dataset creation property list class");
va_start(ap, mode);
if (0 < H5Pexist(plist, "zfp_controls"))
{
H5Pget(plist, "zfp_controls", &ctrls); /* should always succeed */
switch (mode)
{
case H5Z_ZFP_MODE_RATE:
{
double *rate = va_arg(ap, double*);
*rate = ctrls.details.rate;
break;
}
case H5Z_ZFP_MODE_ACCURACY:
{
double *acc = va_arg(ap, double*);
*acc = ctrls.details.acc;
break;
}
case H5Z_ZFP_MODE_PRECISION:
{
unsigned int *prec = va_arg(ap, unsigned int*);
*prec = ctrls.details.prec;
break;
}
case H5Z_ZFP_MODE_EXPERT:
{
unsigned int *minbits = va_arg(ap, unsigned int*);
*minbits = ctrls.details.expert.minbits;
unsigned int *maxbits = va_arg(ap, unsigned int*);
*maxbits = ctrls.details.expert.maxbits;
unsigned int *maxprec = va_arg(ap, unsigned int*);
*maxprec = ctrls.details.expert.maxprec;
int *minexp = va_arg(ap, int*);
*minexp = ctrls.details.expert.minexp;
break;
}
case H5Z_ZFP_MODE_REVERSIBLE:
{
break;
}
default:
{
H5Z_ZFP_PUSH_AND_GOTO(H5E_PLIST, H5E_BADVALUE, FAIL, "bad ZFP mode.");
break;
}
}
}
else if (0 <= H5Pget_filter_by_id2(plist, H5Z_FILTER_ZFP, &flags, &cd_nelmts, cd_vals, sizeof(fname), fname, &fconfig))
{
/* is this property list pre- or post- modification from having been used in the filter */
if (cd_nelmts > 0 && cd_vals[0] <= H5Z_ZFP_MODE_REVERSIBLE)
{
switch (mode)
{
case H5Z_ZFP_MODE_RATE:
{
double *rate = va_arg(ap, double*);
*rate = H5Pget_zfp_rate_cdata(cd_nelmts, cd_vals);
break;
}
case H5Z_ZFP_MODE_ACCURACY:
{
double *acc = va_arg(ap, double*);
*acc = H5Pget_zfp_accuracy_cdata(cd_nelmts, cd_vals);
break;
}
case H5Z_ZFP_MODE_PRECISION:
{
unsigned int *prec = va_arg(ap, unsigned int*);
*prec = H5Pget_zfp_precision_cdata(cd_nelmts, cd_vals);
break;
}
case H5Z_ZFP_MODE_EXPERT:
{
unsigned int *minbits = va_arg(ap, unsigned int*);
unsigned int *maxbits = va_arg(ap, unsigned int*);
unsigned int *maxprec = va_arg(ap, unsigned int*);
int *minexp = va_arg(ap, int*);
H5Pget_zfp_expert_cdata(cd_nelmts, cd_vals, (*minbits), (*maxbits), (*maxprec), (*minexp));
break;
}
case H5Z_ZFP_MODE_REVERSIBLE:
{
break;
}
default:
{
H5Z_ZFP_PUSH_AND_GOTO(H5E_PLIST, H5E_BADVALUE, FAIL, "bad ZFP mode.");
break;
}
}
}
else /* cd_vals for post-modified by filter */
{
if (cd_nelmts != H5Z_ZFP_CD_NELMTS_MAX)
H5Z_ZFP_PUSH_AND_GOTO(H5E_PLIST, H5E_CANTGET, FAIL, "ZFP filter cd-vals incorrect length");
#if 0
// cd_vals contains, starting at entry index 1, the ZFP stream header. So, now, open that as a bitstream...
bitstream *dummy_bstr = stream_open(&cd_vals[1], sizeof(cd_vals))));
zfp_stream *dummy_zstr = zfp_stream_open(dummy_bstr);
// now, query stream for info you seek...
zfp_mode zm = zfp_stream_compression_mode(dummy_zstr);
double rate = zfp_stream_rate(dummy_zstr, dim);
double accuracy = zfp_stream_accuracy(dummy_zstr);
uint precision = zfp_stream_precision(dummy_zstr);
zfp_stream_close(dummy_zstr);
stream_close(dummy_bstr);
#endif
}
}
else
{
H5Z_ZFP_PUSH_AND_GOTO(H5E_PLIST, H5E_CANTGET, FAIL, "ZFP filter properties");
}
va_end(ap);
done:
return retval;
}
herr_t H5Pget_zfp_rate(hid_t plist, double *rate)
{
return H5Pget_zfp(plist, H5Z_ZFP_MODE_RATE, rate);
}
herr_t H5Pget_zfp_precision(hid_t plist, unsigned int *prec)
{
return H5Pget_zfp(plist, H5Z_ZFP_MODE_PRECISION, prec);
}
herr_t H5Pget_zfp_accuracy(hid_t plist, double *acc)
{
return H5Pget_zfp(plist, H5Z_ZFP_MODE_ACCURACY, acc);
}
herr_t H5Pget_zfp_expert(hid_t plist,
unsigned int *minbits, unsigned int *maxbits,
unsigned int *maxprec, int *minexp)
{
return H5Pget_zfp(plist, H5Z_ZFP_MODE_EXPERT, minbits, maxbits, maxprec, minexp);
}
herr_t H5Pget_zfp_reversible(hid_t plist)
{
return H5Pget_zfp(plist, H5Z_ZFP_MODE_REVERSIBLE);
}

@brtnfld
Copy link
Collaborator

brtnfld commented Mar 6, 2023

All the improvements sound reasonable to me.

@jwsblokland
Copy link
Contributor

It all sounds good to me. Regarding PR #89. Currently, my time is limited so I do not expect to finalize this pull request soon. Having said that, I will get back to it when my schedule allows it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants