segfault while reading virtual datasets #1799

Closed
d70-t opened this issue Jul 22, 2020 · 6 comments · Fixed by #1828

Comments

@d70-t
Contributor

d70-t commented Jul 22, 2020

I am trying to create a netCDF4-compatible dataset that is composed of several different sources. The source files may not be changed and are too large to afford making copies of the data, so I decided to create new datasets using the HDF5 external and virtual dataset APIs: most of my original data can stay in place while I build a new, more user-friendly, higher-level view of it. The first step was to convert a bunch of non-netCDF binary files into a single "virtual" netCDF file using HDF5 external storage, which works great. In the second step, which in my case means changing some coordinate variables and replacing broken data with new data, I tried to use the virtual dataset feature.

This leads to a segmentation fault when opening the dataset with netCDF4 (e.g. ncdump -h). The segfault is in nc4_adjust_var_cache, and I assume it is related to get_chunking_info, which is missing a case for H5D_VIRTUAL, one of the layouts H5Pget_layout can return. I have not yet tested this further, but my guess is that var->storage then defaults to 0, which is NC_CHUNKED, while var->chunksizes is never allocated and is later read by nc4_adjust_var_cache.
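
For illustration, a minimal sketch of the kind of layout handling described above. The field names var->storage and var->chunksizes are the ones used in this report; the real get_chunking_info in netcdf-c is structured differently, and mapping H5D_VIRTUAL to NC_CONTIGUOUS here is only an assumption, not the fix that was eventually merged.

#include <hdf5.h>
#include <netcdf.h>

/* Hypothetical sketch: classify the HDF5 layout so that a virtual dataset
 * does not silently fall through as NC_CHUNKED with unallocated chunk sizes. */
static int
classify_layout(hid_t plistid, int *storagep)
{
    switch (H5Pget_layout(plistid))
    {
    case H5D_CHUNKED:
        *storagep = NC_CHUNKED;     /* caller must also allocate chunksizes */
        break;
    case H5D_CONTIGUOUS:
    case H5D_COMPACT:
        *storagep = NC_CONTIGUOUS;
        break;
    case H5D_VIRTUAL:               /* the case missing in 4.6.0 and master */
        *storagep = NC_CONTIGUOUS;  /* assumption: report as non-chunked so
                                       chunksizes is never dereferenced */
        break;
    default:
        return NC_EHDFERR;
    }
    return NC_NOERR;
}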

Notably, opening the dataset using h5netcdf works as expected.

I tested this on Ubuntu 18.04.4 with netCDF version 4.6.0, but since the missing case is still present in master, I assume the error shows up there as well.

@d70-t
Contributor Author

d70-t commented Jul 22, 2020

I did another test with a freshly compiled master version of netcdf. I get the same segmentation fault and can confirm that there is indeed a null-pointer access on chunksizes. I made a very crude fix in d70-t/netcdf-c@2e6a342. This lets ncdump -h run without complaints, but actually accessing the variable with ncdump -v time still produces an error:

NetCDF: HDF error
Location: file .../netcdf-c/ncdump/vardata.c; line 478
 time = Segmentation fault (core dumped)

The data must be available in the dataset though, as h5netcdf is again happy with it.
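
For reference, the same access pattern can be reproduced outside ncdump with the plain netCDF C API; in this sketch the file name "virtual.nc", the variable name "time", and the length of 5 are placeholders for whatever the real dataset contains.

#include <netcdf.h>
#include <stdio.h>

/* Sketch of what ncdump does: opening the file only parses the header,
 * while reading the variable is where the HDF error / crash shows up. */
int main(void)
{
    int ncid, varid, ret;
    float vals[5];                       /* placeholder length */

    if ((ret = nc_open("virtual.nc", NC_NOWRITE, &ncid))) {
        printf("open: %s\n", nc_strerror(ret));
        return 1;
    }
    if ((ret = nc_inq_varid(ncid, "time", &varid)) ||
        (ret = nc_get_var_float(ncid, varid, vals)))
        printf("read: %s\n", nc_strerror(ret));
    nc_close(ncid);
    return 0;
}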

Do you have a suggestion on how to fix this issue in a proper way?

@edhartnett
Contributor

edhartnett commented Jul 22, 2020 via email

@d70-t
Contributor Author

d70-t commented Jul 22, 2020

I have not come up with a C example yet, but I've got some simple Python code which generates two files: a.nc directly contains one variable, and b.nc refers to a.nc but should look the same.

Running h5dump a.nc and h5dump b.nc gives identical output (apart from the printed file name):

HDF5 "b.nc" {
GROUP "/" {
   DATASET "v" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 5 ) / ( 5 ) }
      DATA {
      (0): 0, 1, 2, 3, 4
      }
      ATTRIBUTE "CLASS" {
         DATATYPE  H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "DIMENSION_SCALE"
         }
      }
      ATTRIBUTE "NAME" {
         DATATYPE  H5T_STRING {
            STRSIZE 2;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "v"
         }
      }
   }
}
}

Running ncdump a.nc gives:

netcdf a {
dimensions:
	v = 5 ;
variables:
	float v(v) ;
data:

 v = 0, 1, 2, 3, 4 ;
}

But running ncdump b.nc leads to:

netcdf b {
dimensions:
	v = 5 ;
variables:
Segmentation fault (core dumped)

Here's the script:

import numpy as np
import h5py

def _main():
    # create a simple netcdf compatible dataset
    a = h5py.File("a.nc", "w")
    var = a.create_dataset("v", data=np.arange(5, dtype="f4"))
    var.make_scale("v")
    a.close()

    # create a dataset which refers to the former one
    b = h5py.File("b.nc", "w")
    layout = h5py.VirtualLayout(shape=(5,), dtype="f4", maxshape=(5,))
    layout[:] = h5py.VirtualSource("a.nc", "v", shape=(5,))
    var = b.create_virtual_dataset("v", layout)
    var.make_scale("v")
    b.close()


if __name__ == "__main__":
    _main()

And the two generated files for reference: virtual_datasets.zip

@edwardhartnett
Contributor

Well, since netcdf-c is C and not Python, Python tests cannot be included in the library, so they do me little good. ;-)

The first step in solving this remains to translate your simple test into C so that it can be included in the test directory nc_test4.
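
A rough C translation of the Python reproducer above might look like the sketch below. It drives the HDF5 C API directly (H5Pset_virtual for the virtual mapping, H5DSset_scale as the counterpart of make_scale) and then opens the result through nc_open; it is not the test later added in d70-t/netcdf-c@1a7dd23, only an outline of the steps such an nc_test4 test would involve.

#include <hdf5.h>
#include <hdf5_hl.h>    /* H5DSset_scale */
#include <netcdf.h>
#include <stdio.h>

#define LEN 5

int main(void)
{
    float data[LEN] = {0, 1, 2, 3, 4};
    hsize_t dims[1] = {LEN};

    /* a.nc: plain dataset acting as a dimension scale. */
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t fa = H5Fcreate("a.nc", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t dsa = H5Dcreate2(fa, "v", H5T_IEEE_F32LE, space,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dsa, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
    H5DSset_scale(dsa, "v");
    H5Dclose(dsa);
    H5Fclose(fa);

    /* b.nc: virtual dataset mapping the whole of a.nc:/v. */
    hid_t srcspace = H5Screate_simple(1, dims, NULL);
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_virtual(dcpl, space, "a.nc", "v", srcspace);
    hid_t fb = H5Fcreate("b.nc", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t dsb = H5Dcreate2(fb, "v", H5T_IEEE_F32LE, space,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5DSset_scale(dsb, "v");
    H5Dclose(dsb);
    H5Fclose(fb);
    H5Pclose(dcpl);
    H5Sclose(srcspace);
    H5Sclose(space);

    /* The reported crash happens when netCDF opens b.nc. */
    int ncid;
    int ret = nc_open("b.nc", NC_NOWRITE, &ncid);
    printf("nc_open(b.nc): %s\n", nc_strerror(ret));
    if (!ret) nc_close(ncid);
    return 0;
}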

@d70-t
Contributor Author

d70-t commented Jul 22, 2020

I'll have a look at the test directory and see what I can do. But that'll take me a while.

@d70-t
Contributor Author

d70-t commented Jul 22, 2020

In d70-t/netcdf-c@1a7dd23, I've added a test which triggers the same segmentation fault.

d70-t added a commit to d70-t/netcdf-c that referenced this issue Sep 2, 2020
It seems like it is part of the design of HDF5 virtual datasets that
objects within a file remain open while the file is already "closed".
Setting the fclose degree to SEMI would cause the library to bail out.
This commit makes nc_test4/tst_virtual_dataset succeed.

See also Unidata#1799
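
For context, the fclose degree referred to in the commit message is configured on an HDF5 file-access property list. A minimal sketch of the call involved (not the actual change in the referenced commit; the choice of H5F_CLOSE_WEAK is only an assumption):

#include <hdf5.h>

/* Sketch: with H5F_CLOSE_SEMI, H5Fclose fails if objects in the file are
 * still open, which is what the commit message above reports for virtual
 * datasets; a weaker degree such as H5F_CLOSE_WEAK tolerates it. */
static hid_t open_with_weak_close(const char *path)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fclose_degree(fapl, H5F_CLOSE_WEAK);   /* vs. H5F_CLOSE_SEMI */
    hid_t fid = H5Fopen(path, H5F_ACC_RDONLY, fapl);
    H5Pclose(fapl);
    return fid;
}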