Re-implement the nc_get/put_vars operations for netcdf-4 using the corresponding HDF5 operations. #1001
Conversation
re: github issue #908
also in reference to pydata/xarray#2004

The netcdf-c library has implemented the nc_get_vars and nc_put_vars operations one element at a time. This has resulted in very slow operation.

This PR attempts to improve the situation for netcdf-4/hdf5 files by using the slab operations provided by the HDF5 library. The new implementation passes the get/put vars stride information down to the HDF5 slab operations.

The result appears to improve performance significantly: some simple tests on large 2-D arrays show speedups in excess of 150x.

Misc. other changes:
1. Fix a bug in ncgen/semantics.c, which used a list's allocated length instead of its actual length.
2. Added a temporary hook in the netcdf library plus a performance test case (tst_varsperf.c) to estimate the speedup. After users have had some experience with this, I will remove it, probably after the 4.7 release.
nc4_set_default_vars(int tf)
{
    defaultvars = (tf ? 1 : 0);
}
Can this temporary flag be removed?
Changing to the old default methods can easily be done by changing the dispatch table.
#endif
int retval = NC_NOERR, range_error = 0, i, d2;
void *bufr = NULL;
#ifndef HDF5_CONVERT
I'm not sure where this code came from. It looks like I wrote this code, but if so, why does it seem like it was just added?
In any case, the HDF5_CONVERT macro can never be defined. I have removed it from other code, and it can be removed from this code as well. HDF5 was never able to convert data and generate the same output and error results that netCDF does, and HDF5_CONVERT is never set in the build. When it is set, it inevitably results in failed tests.
Please leave it. When this gets released and actually used, I anticipate problems.
@DennisHeimbigner you are referring to the defaultvars, not the HDF_CONVERT, correct? I also believe there will be problems with this code, as it does not seem to do the correct thing for parallel I/O in all cases. After it is merged I will add some tests and probably submit a PR with additional code for parallel cases.
Yes, NCDEFAULT_get/put_vars.
You don't have to make the test case conditional on not-parallel. The parallel issue is not one that any sequentially written code will trigger - the test has to do something trickier than that. So please just let your test run on parallel builds as normal.
I am a bit confused. Are you saying that the NC4_put/get_vars code will not operate correctly?
Yes, that is what I am saying. I believe they will not operate correctly in some cases. I have some test code for a similar case for the vara get/put which is up in a PR right now. Once it gets merged I will add some tests to check similar behavior on vars. Would you like me to make the necessary changes? I can take a branch and merge this branch, then add the tests and extra code needed.
On second thought, I think it would be best if you made the changes.
OK, I'll have something tomorrow or the next day...
#define DIMSIZE1 1024
#define TOTALSIZE (DIMSIZE0*DIMSIZE1)

#define CHECK(expr) assert((expr) == NC_NOERR)
Unfortunately we cannot use asserts in tests like this.
The reason is that this will cause the test to fail to compile in -DNDEBUG builds. -DNDEBUG is usual in HPC builds (and also in code-coverage builds, which is how my CI system caught this).
So I have made minor modifications to the test to no longer use asserts, but instead to use the macros in err_macros.h.
OK, I have a version of this code which seems to work well for parallel I/O. It depends on coord checking being relaxed. If we cannot eliminate the relax-coord-bound issue, then relaxed coord bounds will have to be turned on for all parallel builds anyway. As soon as we get my current PR merged, we can move on with this issue.

@DennisHeimbigner I can confirm that this results in a ~600x speedup, which is great. As part of my vars changes I removed the vara functions, which are just a special case of the vars functions; otherwise this leads to a large amount of duplicated code.
Wires got crossed while merging.
Just a heads up, this change seems to have slowed down a different path for indexing in netCDF4-Python: |