-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
netcdf 4.9.2 and 1.14.3 : attach_dimscales: Assertion `dsid > 0' failed when closing file #2962
Comments
I have few regrets in life, but many of them have to do with HDF5 dimscales. I will explain in detail, but first suggest you try adding flag NC_NODIMSCALE_ATTACH to your file create mode, to see if that resolves the problem. Let me know if it does. Unfortunately, HDF5 does not have a dimension, and that leads to problems. What netCDF does is create a HDF5 dataset for each dimension (as well as each variable). When you have a variable and dimension which have the same name, netcdf-4 assumes that is a coordinate dimension (the values of the variable are the values along that dimension, and the variable has one dimension, which is the dimension of the same name). So, for example, you would have a dimension "longitude" with length 100. Then you would have a variable called "longitude" which has 100 values, the values of the longitude for each value along that dimension. This is a "coordinate variable" in netCDF. The problem with the above code is that you define a dimension time2, and then a variable, time. Then, you write some data to time, causing all metadata to be written to the file. Then, you go back into define mode and define a dimension named time, the name already in use by the variable. So this is bad netCDF practice, because you have a variable and a dimension of the same name, this should be a coordinate dimension, which means the dimension of variable "time" should be dimension "time", not "time2". In other words, once you define a variable, you should not add a dimension of the same name. If you want a coordinate variable, then define the dimension first, then use it as the dimension of the variable. Is there a good reason for this? Or is it just a corner case that you discovered? This is part of a larger category of renaming problems, starting with #597, but searching for "rename" on the netcdf-c issue list reveals many more. Fixing these would require a rewrite of a significant section of netcdf-4 code, which handles renames. You can also build netCDF with CFLAGS=-NDEBUG to turn off all assertions and see what happens. |
Thanks a lot for your detailed explanation! Yes, the assertion failure is indeed not triggered when I use
We have just the I fully agree that creating such NetCDF confusing and indeed bad practice. It came about by copying the NetCDF variables along with dimension and names from a pre-existing file. The user was not aware that there were different dimension with inconsistent names. In NCDatasets.jl, one can copy a NetCDF variables from one file to another and all dimensions are created implicitly if necessary. There should have been only one time dimension. But user error happend. I am ok with returning an error message in this case, but an abort is a bit drastic. You are right, that
|
Edit: Sorry I missed your comment that it would take a lot of efforts to fix the problem. I understand. First of all, @Alexander-Barth 's post is about the assertion error. So, the following discussion is not central to the issue.
I thought that we need two conditions to be met:
In the submitted example, condition 1 is met but condition 2 isn't. Indeed, if you swap @Alexander-Barth created the example specifically to produce the assertion error. I'll present a simpler code to illustrate the HDF problem: // try_netcdf_error.cc
#include <iostream>
#include "netcdf.h"
int main() {
int i, ncid, dimid_time, varid_tax, dimid_tax;
const int tax[] = {1,2,3,4,5,6,7,8,9,10};
i = nc_create("test.nc", NC_NETCDF4, &ncid);
i = nc_def_dim(ncid, "TIME", 10, &dimid_time);
i = nc_def_dim(ncid, "tax", 7, &dimid_tax); //(1) -> fine.
if (i != NC_NOERR) {std::cerr << nc_strerror(i) << std::endl;}
const int dims[] = {dimid_time};
i = nc_def_var(ncid, "tax", NC_INT, 1, dims, &varid_tax);
//i = nc_def_dim(ncid, "tax", 7, &dimid_tax); //(2) -> HDF error on put_var
//if (i != NC_NOERR) {std::cerr << nc_strerror(i) << std::endl;}
i = nc_enddef(ncid);
i = nc_put_var(ncid, varid_tax, tax);
if (i != NC_NOERR) {std::cerr << nc_strerror(i) << std::endl;}
i = nc_close(ncid);
} This program produces a correct netCDF file: netcdf test {
dimensions:
TIME = 10 ;
tax = 7 ;
variables:
int tax(TIME) ;
data:
tax = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ;
} Here there is a variable named "tax" and there is a dimension named "tax". That is, condition 1 is met. But, the variable "tax" has a dimension "TIME" and the netCDF file is happily created. In my code, swap So, I suppose this problem is fixable. Am I correct? Even if it is hard to fix it, it would be nice if a meaningful error is issued at |
What about changing the --- ./libhdf5/nc4hdf.c.orig
+++ ./libhdf5/nc4hdf.c
@@ -1466,7 +1466,9 @@
dsid = ((NC_HDF5_VAR_INFO_T *)(var->dim[d]->coord_var->format_var_info))->hdf_datasetid;
else
dsid = ((NC_HDF5_DIM_INFO_T *)var->dim[d]->format_dim_info)->hdf_dimscaleid;
- assert(dsid > 0);
+
+ if (dsid <= 0)
+ return NC_EDIMSCALE;
/* Attach the scale. */
if (H5DSattach_scale(hdf5_var->hdf_datasetid, dsid, d) < 0) The example code above runs ok with this change. |
If you wish to submit a PR for that, I would have no objection. |
…variables are named inconsistently (Unidata#2962)
This is a follow-up of the #1772 (comment).
While the issue #1772 is rather about a current limitation of the NetCDF4/HDF5 implementation, this issue is the about an assertion error than occurs in this context when the file is closed. At first glance, it seems that only recent version of HDF5 trigger this error, while #1772 itself is a long-standing issue.
NetCDF 4.9.2 and HDF5 1.14.3
Alpine Linux 3.15 (the build environment of Julia's C packages) with GCC 5.2.0
Other build dependencies (including indirect ones) are:
The NetCDF is configured as:
These features are enabled:
The following C program leads to an abort in nc4hdf.c:
The output of this program is:
Note that this error is not reproducible with NetCDF 4.9.2 compiled with HDF5 1.10.7 (from ubuntu 22.04). So far this issue could only be reproduced with HDF5 1.14.2 and 1.14.3.
Since the error depends on the HDF5 version and since we have an HDF5 error at the line
nc_put_var_int(ncid, var_myvar_id, data);
I think it is possible that the root cause is not in NetCDF itself.This issue was reported by @ryofurue in Alexander-Barth/NCDatasets.jl#261 for the julia package NCDatasets.jl. While for a typical C or Fortran program, an HDF5 error would end the program, in an interactive julia,python,R,... session however it is quite common that the user can make an error which should not result in aborting the session (actually, here it is not really a user error, but rather an limitation of NetCDF4 not occurring on NetCDF3 for example as discussed in #1772).
While solving #1772 would be quite desirable, the purpose of this issue, as I see it, is just about the assertion failure.
Thank you for your time and your insights!
The text was updated successfully, but these errors were encountered: