Allow lazy-read of netCDF-4/HDF5 files #857

Closed
edhartnett opened this issue Feb 9, 2018 · 15 comments

Comments

@edhartnett
Contributor

At this point, this represents more an aspiration than a plan, but there has been some discussion (see PR #849) of how to enable lazy reads of netCDF-4 file metadata.

Files with a very large amount of metadata take a long time to load because netCDF reads all metadata at file open. For classic files, this doesn't seem to bother people much. But for netCDF-4/HDF5 files, it does. Perhaps this can be explained by the use of netCDF-4/HDF5 for some really complex and large datasets, which end up with tens of thousands of attributes, variables, dimensions, and/or groups. Or perhaps the classic formats, having all their metadata in a block at the beginning of the file, just load faster.

This has already cost us satellite users: the NPP uses netCDF-4, but the follow-on JPSS spacecraft switched to HDF5 without netCDF, due to the slow load times. I was told a similar story about an ESA satellite system by a very active netCDF user in the Netherlands. (Satellite L2 data files generally contain a very large number of attributes, some of which may be reasonably large arrays.)

One idea I suggested is to read each group only as needed. This would be pretty easy to implement, I think. It would help where there are lots of groups. @DennisHeimbigner points out that this will not help with files that contain lots of vars. He indicates a known use case with a very large number of vars, all in the root group.

Well, that's another good idea all shot to hell. ;-)

In order to do lazy reads as Dennis suggests I think much of the libsrc4 code would have to be rewritten. (The good news is that with #849 soon to merge, and #856 to follow, the libsrc4 code will be a fair bit smaller than it is now.)

For example, if we open a file and read nothing, and then the user does an nc_inq(), we need to find out how many variables there are. In the current code, we count our list, because we have already read them. In the lazy-read code, we would need to see whether there's a way to get the numbers we need without reading every variable's metadata. That is probably possible in HDF5, but it is not how the code is currently written.

Handling dimensions in a lazy-read is going to be particularly tricky. They may be in different groups from a variable. So if the user opens a file and does a nc_inq_var() on a var deep in the group structure, we will have to have code smart enough to find all the dimensions in whatever group they are in. All this information is in the HDF5 file, but the code to read it and use it properly remains to be written.
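
To make the dimension lookup concrete, here is a minimal sketch, not library code. It relies on the netCDF-4 rule that a dimension is visible in the group that defines it and in that group's descendants, so the search starts at the variable's group and walks up through the parents. The `group_t` type and `find_dim_group()` are hypothetical names invented for this illustration. In a lazy-read design, each group visited on the way up would have to be loaded on demand before it could be searched.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIMS 4

/* Hypothetical, simplified group record; not the library's own struct. */
typedef struct group {
    struct group *parent;   /* NULL for the root group */
    int dimids[MAX_DIMS];   /* ids of the dimensions defined in this group */
    int ndims;              /* number of valid entries in dimids */
} group_t;

/* Search grp, then each ancestor in turn, for dimid.
 * Returns the group that defines the dimension, or NULL if none does. */
static const group_t *find_dim_group(const group_t *grp, int dimid)
{
    for (; grp != NULL; grp = grp->parent) {
        int i;
        for (i = 0; i < grp->ndims; i++)
            if (grp->dimids[i] == dimid)
                return grp;
    }
    return NULL;
}
```

A var in a deeply nested group would call `find_dim_group()` once per dimension id it uses, starting from its own group.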

@DennisHeimbigner
Collaborator

I should note that the metadata read speedup that I already have
still reads all the metadata at one time (i.e., not lazily). It gets
its speedup from pervasive use of hashing. I just put up a PR
for the initial integration of this speed up. Waiting for the netcdf-4
rename stuff to be completed before I do the final step.

@DennisHeimbigner
Collaborator

Another issue I discovered was that if we do lazy eval, then the various integer
ids (dimid, varid, etc) will probably change depending on the order in which
the meta-data elements are read. Specifically, any user code that assumes anything
about the ids is potentially going to fail. Issue #851 is relevant here.
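
A small sketch of the id problem, under the assumption that an id is simply the position at which an object was first added to an in-memory list. The `var_list_t` type and `lazy_varid()` are hypothetical and exist only for this illustration; they are not the library's data structures.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 8

/* Hypothetical in-memory list: a variable's id is its slot index. */
typedef struct {
    const char *names[MAX_VARS];
    int nvars;
} var_list_t;

/* Return the id of name, assigning the next free id on first access.
 * Returns -1 when the list is full. */
static int lazy_varid(var_list_t *list, const char *name)
{
    int i;
    for (i = 0; i < list->nvars; i++)
        if (strcmp(list->names[i], name) == 0)
            return i;
    if (list->nvars == MAX_VARS)
        return -1;
    list->names[list->nvars] = name;
    return list->nvars++;
}
```

With this scheme, a program that touches "T" and then "U" sees ids 0 and 1, while a program that touches "U" first sees them swapped. Once assigned, an id stays stable within that open, but it is no longer a property of the file.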

@edhartnett
Contributor Author

Maybe an easy way to do this would be a special attribute that contains a table of metadata. So instead of reading each object in a lazy way, we read the table. Then all our data structure code continues to work.

Files that do not have the table of metadata cannot be read quickly. But they could add the table, and then quick reads would work on the file.

We could store the metadata in various ways, like a file level attribute that lists all the dims in the file, a group-level table that lists all vars in a group, etc. Or we can package it up as one big metadata element (perhaps just store the output of ncdump -h as a char attribute).

@DennisHeimbigner
Collaborator

Not a bad idea. The problem is that if we have large amounts of metadata, then
that extra attribute itself becomes very large. We would need some very
good encoding method.

@edhartnett
Contributor Author

Even a very large ncdump -h output, stored as a char array, would read hugely faster than opening all the objects in HDF5 and querying each of them. The whole thing would be 1 disk access, instead of tens or hundreds of thousands. Essentially it would be O(1) instead of O(N), right?

The more I think about it, the more I like it. Instead of re-writing all the libsrc4 code, it all will still work just fine.

@edhartnett
Contributor Author

edhartnett commented Feb 9, 2018

Would it be at all possible/practical to put the existing ncgen/ncdump parsing code into the library, so that we could literally use the output of ncdump -h? We could zip it to make it smaller.

Or is that just nutty?

@DennisHeimbigner
Collaborator

That is just nutty :-)
It also occurs to me that we need to allow for non-lazy (bulk) read of the
metadata when the caller is e.g. something like ncdump.

@edhartnett
Contributor Author

Let us say we have an optional text attribute in the root group with a protected name which contains some compressed, easy to parse representation of the metadata in the file.

Each time any metadata is added or altered, the attribute is updated so it is always correct.

We add a function nc_put_metadata() which does the bulk read of metadata and (re-)creates this attribute.

Then why wouldn't ncdump be able to use the metadata too? Seems like that would be useful and a lot faster.

@DennisHeimbigner
Collaborator

But this does not solve anything; instead of reading all the metadata at once
as we do now, we are reading the equivalent in that single attribute. Plus
we have the overhead of parsing it, which is potentially a big cost. This
reminds me of the HDF-EOS solution.

@edhartnett
Contributor Author

It does remind me of the HDF-EOS thing too, and that is certainly not a compliment to the idea. So perhaps not the way to go.

@DennisHeimbigner
Collaborator

Before this goes much further, we need to have
a software architecture and design document to
codify the API and the critical implementation
details. Something in markdown like a cross between
this https://github.com/Unidata/netcdf-c/blob/master/docs/filters.md
and this https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf

@edhartnett
Contributor Author

Having worked with this code a bunch, I do see an optimization which would be pretty easy, and could result in significant speed-up of file opens.

In NC_VAR_INFO_T we add a field atts_read.

When opening the file we don't read any variable atts.

When the user asks for a variable att, we check atts_read. If zero, we read all atts for that variable and set it to 1.
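
The three steps above can be sketched as follows. This is an illustration only: `var_info_t` is a cut-down stand-in for NC_VAR_INFO_T, and `read_all_atts()`, `get_natts()` and the `disk_reads` counter are hypothetical names. Only the atts_read flag itself comes from the proposal.

```c
#include <assert.h>

/* Cut-down stand-in for NC_VAR_INFO_T (hypothetical). */
typedef struct {
    int atts_read;  /* 0 until this variable's attributes have been loaded */
    int natts;      /* attribute count, valid only once atts_read is 1 */
} var_info_t;

/* Counts how often the simulated expensive read runs. */
static int disk_reads = 0;

/* Placeholder for the code that reads every attribute of one variable
 * from the file; here it just pretends the variable has 10 attributes. */
static void read_all_atts(var_info_t *var)
{
    disk_reads++;
    var->natts = 10;
}

/* Every attribute inquiry goes through this check first. */
static int get_natts(var_info_t *var)
{
    if (!var->atts_read) {
        read_all_atts(var);
        var->atts_read = 1;
    }
    return var->natts;
}
```

The first inquiry on a variable pays for the read, and later inquiries on the same variable are served from memory. Variables whose attributes are never requested cost nothing at open time.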

Anyone have any thoughts or opinions on this approach?

@Dave-Allured
Contributor

Developers, have you considered no attribute caching as an alternative strategy to caching and lazy reads, in the case of netCDF4? HDF5 offers efficient attribute access by name and by index number. It seems to me that a thin wrapper approach for attributes would result in less memory demand and simpler library code, with little impact on performance in most real cases.

This approach may also be valid for other alternative storage formats such as cloud.

@edhartnett
Contributor Author

edhartnett commented Dec 7, 2018

An update on this ticket:

  • Lazy att reads were part of the last release. Atts for a group or var are all read when any information is requested about any atts in that group/var. That is, if there are 10 atts in a var, all 10 will be read into memory if you ask about any of them.

  • Lazy var reads are more of a problem. I have worked out a faster way to read vars, but am now trying to make it less impactful on the code and the data format. (That is, I am looking for a lazier and easier way to obtain the same speedup.)

Here's some gprof output for the following code:

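     #include <stdio.h>
     #include <stdlib.h>
     #include <netcdf.h>

     /* Stand-in for the ERR macro from the netCDF test headers (an assumption here). */
     #define ERR do { fprintf(stderr, "failed at line %d\n", __LINE__); return 1; } while (0)

     int main(void)
     {
        int ncid;
        int i;
        for (i = 0; i < 1000; i++)
        {
           char new[NC_MAX_NAME + 1];
           char cmd[NC_MAX_NAME + 1];
           /* Work on a fresh copy of the sample file in every iteration. */
           snprintf(new, sizeof(new), "wb_%d.nc", i);
           snprintf(cmd, sizeof(cmd), "/bin/cp -p '%s' '%s'", "wrfbdy_d01", new);
           system(cmd);
           if (nc_open(new, NC_WRITE, &ncid)) ERR;
           if (nc_close(ncid)) ERR;
           remove(new);
        }
        return 0;
     }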

The file used is from a WRF model run. It has lots of vars and attributes, but it is a real data file which is read a gadjillion times a day, all over the world, by anyone using the WRF model.

(I have cut off the call graph where times dropped below .1 s.)

This is with current master, + PR #1234 + some additional minor cleanups and optimizations that will be in a future PR.

index % time    self  children    called     name
                                                 <spontaneous>
[1]     66.6    0.02    0.26                 read_hdf5_obj [1]
                0.00    0.26  140000/140000      read_dataset [2]
-----------------------------------------------
                0.00    0.26  140000/140000      read_hdf5_obj [1]
[2]     61.8    0.00    0.26  140000         read_dataset [2]
                0.14    0.12  131000/131000      read_var [3]
                0.00    0.00    9000/9000        read_scale [33]
-----------------------------------------------
                0.14    0.12  131000/131000      read_dataset [2]
[3]     61.0    0.14    0.12  131000         read_var [3]
                0.01    0.06  131000/131000      nc4_var_list_add [13]
                0.04    0.00  131000/131000      get_type_info2 [21]
                0.01    0.00  131000/131000      nc4_adjust_var_cache [28]
-----------------------------------------------
                                                 <spontaneous>
[4]     28.7    0.00    0.12                 main [4]
                0.00    0.08    1000/1000        nc_close [9]
                0.00    0.04    1000/1000        nc_open [18]
-----------------------------------------------
                0.00    0.08    1000/1000        nc_close [9]
[5]     19.0    0.00    0.08    1000         NC4_close [5]
                0.00    0.08    1000/1000        nc4_close_hdf5_file [6]
                0.00    0.00    1000/1000        nc4_find_nc_grp_h5 [78]
-----------------------------------------------
                0.00    0.08    1000/1000        NC4_close [5]
[6]     19.0    0.00    0.08    1000         nc4_close_hdf5_file [6]
                0.00    0.08    1000/1000        nc4_close_netcdf4_file [7]
                0.00    0.00    1000/1000        sync_netcdf4_file [90]
                0.00    0.00    1000/1000        nc4_rec_grp_HDF5_del [79]
-----------------------------------------------
                0.00    0.08    1000/1000        nc4_close_hdf5_file [6]
[7]     19.0    0.00    0.08    1000         nc4_close_netcdf4_file [7]
                0.00    0.08    1000/1000        nc4_rec_grp_del [8]
                0.00    0.00    3000/143002      nclistfree [43]
                0.00    0.00    1000/1000        NC4_free_provenance [60]
-----------------------------------------------
                0.00    0.08    1000/1000        nc4_close_netcdf4_file [7]
[8]     19.0    0.00    0.08    1000         nc4_rec_grp_del [8]
                0.01    0.07  131000/131000      var_free [10]
                0.00    0.00    5000/136000      ncindexfree [12]
                0.00    0.00  140000/2931000     ncindexith [37]
                0.00    0.00    9000/9000        dim_free [50]
-----------------------------------------------
                0.00    0.08    1000/1000        main [4]
[9]     19.0    0.00    0.08    1000         nc_close [9]
                0.00    0.08    1000/1000        NC4_close [5]
                0.00    0.00    1000/2000        NC_check_id [53]
                0.00    0.00    1000/1000        del_from_NCList [72]
                0.00    0.00    1000/1000        free_NC [75]
-----------------------------------------------
                0.01    0.07  131000/131000      nc4_rec_grp_del [8]
[10]    18.4    0.01    0.07  131000         var_free [10]
                0.00    0.07  131000/136000      ncindexfree [12]
                0.00    0.00  131000/131000      nc4_type_free [45]
-----------------------------------------------
                0.07    0.00  136000/136000      ncindexfree [12]
[11]    16.7    0.07    0.00  136000         NC_hashmapfree [11]
-----------------------------------------------
                0.00    0.00    5000/136000      nc4_rec_grp_del [8]
                0.00    0.07  131000/136000      var_free [10]
[12]    16.7    0.00    0.07  136000         ncindexfree [12]
                0.07    0.00  136000/136000      NC_hashmapfree [11]
                0.00    0.00  136000/143002      nclistfree [43]
-----------------------------------------------
                0.01    0.06  131000/131000      read_var [3]
[13]    15.8    0.01    0.06  131000         nc4_var_list_add [13]
                0.00    0.06  131000/131000      nc4_var_list_add2 [14]
                0.01    0.00  131000/131000      nc4_var_set_ndims [31]
-----------------------------------------------
                0.00    0.06  131000/131000      nc4_var_list_add [13]
[14]    13.4    0.00    0.06  131000         nc4_var_list_add2 [14]
                0.00    0.03  131000/140000      ncindexadd [25]
                0.00    0.01  131000/141000      NC_hashmapkey [26]
                0.01    0.00  131000/136000      ncindexnew [27]
-----------------------------------------------
                0.00    0.04    1000/1000        NC_open [16]
[15]     9.6    0.00    0.04    1000         NC4_open [15]
                0.00    0.04    1000/1000        nc4_open_file [17]
-----------------------------------------------
                0.00    0.04    1000/1000        nc_open [18]
[16]     9.6    0.00    0.04    1000         NC_open [16]
                0.00    0.04    1000/1000        NC4_open [15]
                0.00    0.00    1000/1000        NC_urlmodel [67]
                0.00    0.00    1000/1000        NC_check_file_type [65]
                0.00    0.00    1000/1000        new_NC [84]
                0.00    0.00    1000/1000        add_to_NCList [68]
                0.00    0.00       1/1           nc_initialize [111]
-----------------------------------------------
                0.00    0.04    1000/1000        NC4_open [15]
[17]     9.6    0.00    0.04    1000         nc4_open_file [17]
                0.04    0.00    1000/1000        rec_match_dimscales [22]
                0.00    0.00    1000/1000        nc4_nc4f_list_add [35]
                0.00    0.00    1000/1000        rec_read_metadata [88]
                0.00    0.00    1000/1000        check_for_classic_model [70]
                0.00    0.00    1000/1000        NC4_read_ncproperties [63]
-----------------------------------------------
                0.00    0.04    1000/1000        main [4]
[18]     9.6    0.00    0.04    1000         nc_open [18]
                0.00    0.04    1000/1000        NC_open [16]
-----------------------------------------------
                0.00    0.01  141000/379000      NC_hashmapkey [26]
                0.00    0.03  238000/379000      NC_hashmapadd <cycle 1> [24]
[19]     9.5    0.00    0.04  379000         NC_crc32 [19]
                0.04    0.00  379000/379000      crc32_z [20]
-----------------------------------------------
                0.04    0.00  379000/379000      NC_crc32 [19]
[20]     9.5    0.04    0.00  379000         crc32_z [20]
-----------------------------------------------
                0.04    0.00  131000/131000      read_var [3]
[21]     9.5    0.04    0.00  131000         get_type_info2 [21]
-----------------------------------------------
                0.04    0.00    1000/1000        nc4_open_file [17]
[22]     9.5    0.04    0.00    1000         rec_match_dimscales [22]
                0.00    0.00 2116000/2931000     ncindexith [37]
                0.00    0.00  494000/494000      nc4_find_dim [38]
-----------------------------------------------
[23]     8.4    0.01    0.03  140000+99000   <cycle 1 as a whole> [23]
                0.01    0.03  238000             NC_hashmapadd <cycle 1> [24]
                0.00    0.00    1000             rehash <cycle 1> [89]
-----------------------------------------------
                               98000             rehash <cycle 1> [89]
                0.01    0.03  140000/140000      ncindexadd [25]
[24]     8.4    0.01    0.03  238000         NC_hashmapadd <cycle 1> [24]
                0.00    0.03  238000/379000      NC_crc32 [19]
                0.00    0.00  238000/238000      locate [39]
                                1000             rehash <cycle 1> [89]
-----------------------------------------------
                0.00    0.00    9000/140000      nc4_dim_list_add [32]
                0.00    0.03  131000/140000      nc4_var_list_add2 [14]
[25]     8.4    0.00    0.04  140000         ncindexadd [25]
                0.01    0.03  140000/140000      NC_hashmapadd <cycle 1> [24]
                0.00    0.00  140000/146031      nclistpush [41]
-----------------------------------------------
                0.00    0.00    1000/141000      nc4_grp_list_add [34]
                0.00    0.00    9000/141000      nc4_dim_list_add [32]
                0.00    0.01  131000/141000      nc4_var_list_add2 [14]
[26]     3.5    0.00    0.01  141000         NC_hashmapkey [26]
                0.00    0.01  141000/379000      NC_crc32 [19]
-----------------------------------------------
                0.00    0.00    5000/136000      nc4_grp_list_add [34]
                0.01    0.00  131000/136000      nc4_var_list_add2 [14]
[27]     2.4    0.01    0.00  136000         ncindexnew [27]
                0.00    0.00  136000/143004      nclistnew [42]
                0.00    0.00  136000/149003      nclistsetalloc [40]
                0.00    0.00  136000/136000      NC_hashmapnew [44]
-----------------------------------------------
                0.01    0.00  131000/131000      read_var [3]
[28]     2.4    0.01    0.00  131000         nc4_adjust_var_cache [28]
-----------------------------------------------
                                                 <spontaneous>
[29]     2.4    0.01    0.00                 create_group [29]
-----------------------------------------------
                                                 <spontaneous>
[30]     2.4    0.01    0.00                 read_coord_dimids [30]
-----------------------------------------------
                0.01    0.00  131000/131000      nc4_var_list_add [13]
[31]     1.2    0.01    0.00  131000         nc4_var_set_ndims [31]
-----------------------------------------------
                0.00    0.00    9000/9000        read_scale [33]
[32]     0.8    0.00    0.00    9000         nc4_dim_list_add [32]
                0.00    0.00    9000/140000      ncindexadd [25]
                0.00    0.00    9000/141000      NC_hashmapkey [26]
                0.00    0.00    9000/10000       obj_track [49]
-----------------------------------------------
                0.00    0.00    9000/9000        read_dataset [2]
[33]     0.8    0.00    0.00    9000         read_scale [33]
                0.00    0.00    9000/9000        nc4_dim_list_add [32]
                0.00    0.00    1000/1000        nc4_find_dim_len [77]
-----------------------------------------------
                0.00    0.00    1000/1000        nc4_nc4f_list_add [35]
[34]     0.1    0.00    0.00    1000         nc4_grp_list_add [34]
                0.00    0.00    5000/136000      ncindexnew [27]
                0.00    0.00    1000/141000      NC_hashmapkey [26]
                0.00    0.00    1000/10000       obj_track [49]
-----------------------------------------------
                0.00    0.00    1000/1000        nc4_open_file [17]
[35]     0.1    0.00    0.00    1000         nc4_nc4f_list_add [35]
                0.00    0.00    1000/1000        nc4_grp_list_add [34]
                0.00    0.00    3000/143004      nclistnew [42]

Here's the significant part of the flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 33.33      0.14     0.14   131000     1.07     1.96  read_var
 16.67      0.21     0.07   136000     0.51     0.51  NC_hashmapfree
  9.52      0.25     0.04   379000     0.11     0.11  crc32_z
  9.52      0.29     0.04   131000     0.31     0.31  get_type_info2
  9.52      0.33     0.04     1000    40.00    40.00  rec_match_dimscales
  4.76      0.35     0.02                             read_hdf5_obj
  2.38      0.36     0.01   238000     0.04     0.15  NC_hashmapadd
  2.38      0.37     0.01   136000     0.07     0.07  ncindexnew
  2.38      0.38     0.01   131000     0.08     0.08  nc4_adjust_var_cache
  2.38      0.39     0.01   131000     0.08     0.59  var_free
  2.38      0.40     0.01                             create_group
  2.38      0.41     0.01                             read_coord_dimids
  1.19      0.42     0.01   131000     0.04     0.51  nc4_var_list_add
  1.19      0.42     0.01   131000     0.04     0.04  nc4_var_set_ndims
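
As a rough reading of the listing, the call counts can be divided by the 1000 loop iterations to get per-file figures. The sketch below does only that arithmetic; the constants are copied from the gprof output, and the helper names are made up for this illustration. The total is the last cumulative value shown in the truncated flat profile, so treat the per-cycle time as approximate.

```c
#include <assert.h>

/* Figures copied from the gprof listing. */
#define ITERATIONS        1000    /* loop count in the test program */
#define READ_VAR_CALLS    131000  /* calls to read_var */
#define READ_SCALE_CALLS  9000    /* calls to read_scale */
#define TOTAL_SECONDS     0.42    /* last cumulative value in the flat profile */

/* Variables read per nc_open. */
static int vars_per_file(void)
{
    return READ_VAR_CALLS / ITERATIONS;
}

/* Dimension scales read per nc_open. */
static int scales_per_file(void)
{
    return READ_SCALE_CALLS / ITERATIONS;
}

/* Sampled time per open/close cycle, in microseconds. */
static double usec_per_cycle(void)
{
    return TOTAL_SECONDS / ITERATIONS * 1e6;
}
```

This gives 131 variables and 9 dimension scales per file, and roughly 420 microseconds of sampled time per open/close cycle.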

I am analyzing this now to see where the time is really being spent opening and closing a file.

@edhartnett
Contributor Author

I am going to close, as the changes discussed here have all been merged.
