Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NetCDF should reject hdf5 files with cycles of symbolic links #320

Closed
4 of 12 tasks
Yurlungur opened this issue Oct 8, 2016 · 15 comments · Fixed by #1119
Closed
4 of 12 tasks

NetCDF should reject hdf5 files with cycles of symbolic links #320

Yurlungur opened this issue Oct 8, 2016 · 15 comments · Fixed by #1119

Comments

@Yurlungur
Copy link

Yurlungur commented Oct 8, 2016

Hello,

I am contributing to a project which reads files in a large number of formats, including NetCDF and a format built on hdf5 which contains cycles of symbolic links. While working on integrating all these formats, we noticed that we can feed this hdf5 format into the NetCDF API and cause a crash after a long delay.

I would like to suggest adding some heuristics for the hdf5 backend that check whether or not the hdf5 file is in fact a NetCDF file.

(I note that I have only tried this with the Python API.)

Thanks for your time.

Environment Information

  • What platform are you using? (please provide specific distribution/version in summary)
    • Linux
    • Windows
    • OSX
    • Other
    • NA
  • 32 and/or 64 bit?
    • 32-bit
    • 64-bit
  • What build system are you using?
    • autotools (configure)
    • cmake
  • Can you provide a sample netCDF file or C code to recreate the issue?
    • Yes (please attach to this issue, thank you!)
    • No
    • Not at this time

Steps to reproduce the behavior

The following Python script creates an hdf5 file and then reads it into NetCDF, reproducing the problem.

import h5py
import numpy as np
f = h5py.File("test.h5","w")
grp=f.create_group('discretizations')
subgrp=grp.create_group('root-disc')
subgrp.create_dataset('disc',data=np.array([0,1,2]))
subsubgrp=subgrp.create_group('first-level')
f['discretizations/root-disc/first-level/parent']=h5py.SoftLink('/discretizations/root-disc')
subsubgrp.create_dataset('disc',data=np.array([0,1,2]))
f.close()

from netCDF4 import Dataset
f=Dataset('test.h5')
@DennisHeimbigner
Copy link
Collaborator

Currently, netcdf-c explicitly will not accept and HDF5 file
with group cycles (including symlinks). Testing this on a file
can be somewhat time consuming, so it would best be pushed off
to a tool.

@Yurlungur
Copy link
Author

Thanks for your reply. That makes total sense.

In our case, we use heuristics such as file name or opening the file via a backend to check it looks like a NetCDF file. We just wanted to make sure that you were aware that when faced with such a file the API does not fail gracefully.

@ngoldbaum
Copy link

It may be worth noting that the crash is in the form of a segmentation fault, so this is a possible security/DOS issue.

@edhartnett
Copy link
Contributor

Can you send me your test file that causes a crash?

@ngoldbaum
Copy link

Is the test script in the issue not sufficient? It’s been a couple years now and we may not be able to track down the original problematic file.

@DennisHeimbigner
Copy link
Collaborator

We should probably figure out where this seg fault is occurring.
My guess is that the code is either recursing until the stack fails
or we are accumulating group pointers in some kind of list and
it is running out of memory. If we can identify what is causing this,
then we could define a "MAX_PATH_LENGTH" to prevent this
from causing a crash.

@edhartnett
Copy link
Contributor

I will come up with a test file, probably tomorrow, and start to explore this issue. I will see if I can reproduce the seg fault.

My first thought for a solution is that I can look at the objinfo and get a unique set of integers that identify a group. If I do that for each group, I should be able to tell that I am in the same group again. In that way, I can detect loops.

We can just error at that point or, if we can figure out what to do that is useful, we can do that. First I will see if I can detect that it is happening, and I would like to ensure that checking it does not impact performance, since I've been working hard to increase performance and reduce what the library does when opening a file. Most files do not have loops, and I don't want to kill performance for everyone checking for loops.

@edhartnett
Copy link
Contributor

I came up with a test file, and current master errors out when I try to open it, just as it should. So what is the bug? There does not seem to be a long delay for me - HDF5 errors out right away.

Here's the file I'm creating in HDF5:

HDF5 "tst_files6.h5" {
GROUP "/" {
   GROUP "Britany_Fans" {
      GROUP "Toxic_Fans" {
         SOFTLINK "True_Toxic_Fans" {
            LINKTARGET "Britany_Fans"
         }
      }
   }
}
}

I created this file with the following code (in tst_files6.c):

      /* First use HDF5 to create a file with circular group
       * struct. */
      if ((hdfid = H5Fcreate(HDF5_FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                             H5P_DEFAULT)) < 0) ERR;
      if ((grpid = H5Gcreate2(hdfid, GROUP_NAME, H5P_DEFAULT, H5P_DEFAULT,
                              H5P_DEFAULT)) < 0) ERR;
      if ((grpid2 = H5Gcreate2(grpid, GROUP_NAME_2, H5P_DEFAULT, H5P_DEFAULT,
                              H5P_DEFAULT)) < 0) ERR;
      if (H5Lcreate_soft(GROUP_NAME, grpid2, TRUE_FANS, H5P_DEFAULT,
                         H5P_DEFAULT) < 0) ERR;
      if (H5Fclose(hdfid) < 0) ERR;
      H5close(); /* Force HDF5 to forget about this file. */

When I open this file in netcdf, nc_open returns an error (NC_EHDFERR), the HDF5 error stack (lengthy) looks like this:

HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 0:
  #000: H5O.c line 256 in H5Oopen(): unable to open object
    major: Object header
    minor: Can't open object
  #001: H5O.c line 1505 in H5O_open_name(): object not found
    major: Object header
    minor: Object not found
  #002: H5Gloc.c line 428 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 867 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #004: H5Gtraverse.c line 615 in H5G_traverse_real(): special link traversal failed
    major: Links
    minor: Link traversal failure
  #005: H5Gtraverse.c line 418 in H5G__traverse_special(): symbolic link traversal failed
    major: Links
    minor: Link traversal failure
  #006: H5Gtraverse.c line 360 in H5G_traverse_slink(): unable to follow symbolic link
    major: Symbol table
    minor: Object not found
  #007: H5Gtraverse.c line 639 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #008: H5Gtraverse.c line 134 in H5G_traverse_slink_cb(): component not found
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 0:
  #000: H5L.c line 1239 in H5Literate(): link iteration failed
    major: Symbol table
    minor: Iteration failed
  #001: H5Gint.c line 830 in H5G_iterate(): error iterating over links
    major: Symbol table
    minor: Iteration failed
  #002: H5Gobj.c line 708 in H5G__obj_iterate(): can't iterate over symbol table
    major: Symbol table
    minor: Iteration failed
  #003: H5Gstab.c line 564 in H5G__stab_iterate(): iteration operator failed
    major: Symbol table
    minor: Can't move to next iterator location
  #004: H5B.c line 1225 in H5B_iterate(): B-tree iteration failed
    major: B-Tree node
    minor: Iteration failed
  #005: H5B.c line 1181 in H5B__iterate_helper(): B-tree iteration failed
    major: B-Tree node
    minor: Iteration failed
  #006: H5Gnode.c line 1037 in H5G__node_iterate(): iteration operator failed
    major: Symbol table
    minor: Can't move to next iterator location
  #007: H5O.c line 256 in H5Oopen(): unable to open object
    major: Object header
    minor: Can't open object
  #008: H5O.c line 1505 in H5O_open_name(): object not found
    major: Object header
    minor: Object not found
  #009: H5Gloc.c line 428 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #010: H5Gtraverse.c line 867 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #011: H5Gtraverse.c line 615 in H5G_traverse_real(): special link traversal failed
    major: Links
    minor: Link traversal failure
  #012: H5Gtraverse.c line 418 in H5G__traverse_special(): symbolic link traversal failed
    major: Links
    minor: Link traversal failure
  #013: H5Gtraverse.c line 360 in H5G_traverse_slink(): unable to follow symbolic link
    major: Symbol table
    minor: Object not found
  #014: H5Gtraverse.c line 639 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #015: H5Gtraverse.c line 134 in H5G_traverse_slink_cb(): component not found
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 0:
  #000: H5L.c line 1239 in H5Literate(): link iteration failed
    major: Symbol table
    minor: Iteration failed
  #001: H5Gint.c line 830 in H5G_iterate(): error iterating over links
    major: Symbol table
    minor: Iteration failed
  #002: H5Gobj.c line 708 in H5G__obj_iterate(): can't iterate over symbol table
    major: Symbol table
    minor: Iteration failed
  #003: H5Gstab.c line 564 in H5G__stab_iterate(): iteration operator failed
    major: Symbol table
    minor: Can't move to next iterator location
  #004: H5B.c line 1225 in H5B_iterate(): B-tree iteration failed
    major: B-Tree node
    minor: Iteration failed
  #005: H5B.c line 1181 in H5B__iterate_helper(): B-tree iteration failed
    major: B-Tree node
    minor: Iteration failed
  #006: H5Gnode.c line 1037 in H5G__node_iterate(): iteration operator failed
    major: Symbol table
    minor: Can't move to next iterator location
  #007: H5O.c line 256 in H5Oopen(): unable to open object
    major: Object header
    minor: Can't open object
  #008: H5O.c line 1505 in H5O_open_name(): object not found
    major: Object header
    minor: Object not found
  #009: H5Gloc.c line 428 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #010: H5Gtraverse.c line 867 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #011: H5Gtraverse.c line 615 in H5G_traverse_real(): special link traversal failed
    major: Links
    minor: Link traversal failure
  #012: H5Gtraverse.c line 418 in H5G__traverse_special(): symbolic link traversal failed
    major: Links
    minor: Link traversal failure
  #013: H5Gtraverse.c line 360 in H5G_traverse_slink(): unable to follow symbolic link
    major: Symbol table
    minor: Object not found
  #014: H5Gtraverse.c line 639 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #015: H5Gtraverse.c line 134 in H5G_traverse_slink_cb(): component not found
    major: Symbol table
    minor: Object not found

So is there an issue here? Was this with an older version of HDF5? Perhaps HDF5 improvements have rendered this issue moot?

@ngoldbaum
Copy link

On python3.7, using h5py 2.8.0 (not sure what the underlying hdf5 version is), numpy 1.15.0, and netCDF4 1.4.0 running the script @Yurlungur attached in the initial report consumes ~25 GB of RAM on my MacOS 10.13 system. It doesn't crash but it does cause my system to start hitting the page file pretty hard and uses all available RAM.

@ngoldbaum
Copy link

Ah, here we are:

Summary of the h5py configuration
---------------------------------

h5py    2.8.0
HDF5    1.10.2
Python  3.7.0 (default, Jun 29 2018, 20:13:13)
[Clang 9.1.0 (clang-902.0.39.2)]
sys.platform    darwin
sys.maxsize     9223372036854775807
numpy   1.15.0

@edhartnett
Copy link
Contributor

I have tested with both 1.10.1 and 1.10.2, and in both cases it rejects my circularly linked file with NC_EHDFERR. There is no detectable difference in time to execute with or without the statement that attempts to open the file.

I have a test for this which will be added to the code base when it gets merged, so we can detect if this fails in the future, or in other circumstances (i.e. one of the many CI test variants).

@DennisHeimbigner
Copy link
Collaborator

Since detecting cycles is O(V+E), you are unlikely to see any major performance
hit unless you have a large number of groups.
It occurs to me that we might be able to do incremental cycle detection
as the netcdf library walks part or all of the groups. As each group is read,
we could do the cycle check for just that group. The cost would be the same
but would be spread out over the group read costs. Since we do not do
lazy group reading, this might not save much.

@edhartnett
Copy link
Contributor

OK, good point, but the sample given to recreate the problem uses only a couple of groups.

@Yurlungur did the problems only occur on files with lots of groups? Do you have any sample files on which a problem occurred?

@Yurlungur
Copy link
Author

@edhartnett I can try to track down the original problem file for you, but it's been many years since I encountered this problem. The original problem indeed had a large number of groups. However, as @ngoldbaum points out, the test script I provided, which generates a file, caused problems for me. And that file produces only four groups.

@Yurlungur
Copy link
Author

Thank you for looking into this, by the way!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants