Possible memory leak #585
Comments
I will take a look at this to see if I can track it down or duplicate it in the C library; environment info and data files would be helpful if they are easy to provide. Thanks :)
Great, thanks! Here's the environment, although it can probably be more easily created just by doing: ... The data is here, and a script to reproduce the crash is here. Unfortunately you'll also have to download this plugin and point to it by setting CIS_PLUGIN_HOME to wherever you download it. I did say it was hard to reproduce!
Hi @WardF, have you had a chance to look into this? I'm happy to help if I can.
I can seemingly fix the problem if I remove the call to Dataset.filepath(). This is also where one of the valgrind memory errors was coming from, so a good place to look would probably be around line 13596 in _netCDF4.c and the call to nc_inq_path, but I can't see anything obviously wrong with my rusty C...
I haven't yet, no, but will investigate as soon as I can. I'll also take a look around.
Off the top of my head, if the pointers being passed to ...
Ok, we'll probably need one of the python devs/experts to weigh in. Looking at the valgrind output you provided, I see the following:
First, I should probably replace the call to ... The way I'm reading this is that ... I do see some other Valgrind issues reported from libnetcdf, although not directly related to ...
Here's the relevant cython code from netcdf4-python:

```
with nogil:
    ierr = nc_inq_path(self._grpid, &pathlen, NULL)
if ierr != NC_NOERR:
    raise RuntimeError((<char *>nc_strerror(ierr)).decode('ascii'))
c_path = <char *>malloc(sizeof(char) * pathlen)
if not c_path:
    raise MemoryError()
try:
    with nogil:
        ierr = nc_inq_path(self._grpid, &pathlen, c_path)
    if ierr != NC_NOERR:
        raise RuntimeError((<char *>nc_strerror(ierr)).decode('ascii'))
    py_path = c_path[:pathlen]  # makes a copy of pathlen bytes from c_string
finally:
    free(c_path)
return py_path.decode('ascii')
```

nc_inq_path is being called with NULL as the third argument to get the length of the string.
Thanks @jswhit. @duncanwp, I'm looking at the stack.txt file you attached above and I'm seeing the crash is somewhere down in libhdf5. I notice you're using Python 2.7; is it possible to test this with Python 3.x?
I'm trying to recreate this on OSX as described above, but am seeing the following. The output is the same regardless of whether I'm using ...

Also, the data lives in ...
I've tried it with Python 3.5 and it has exactly the same issue. Feel free to test either.
Ah, sorry about that - I was hacking around some of our code to see if the deferred loading of data files might be causing the problem (which it doesn't seem to be) and this got committed by mistake. If you take the plugin file again it should be fixed now. Looking through the output again, and comparing with the cython @jswhit provided, it looks like the allocation of ... and then the second call to nc_inq_path (...).
I'm getting a similar segmentation fault. The following minimal script results in a segmentation fault on my system. Note that ONLY an 88-character filename results in a segmentation fault.

```
import netCDF4

dset = netCDF4.Dataset('./filename_of_length_88_filler_filler_filler_filler_filler_filler_filler_filler_filler_.nc', 'w')
dset.close()

infile = 'filename_of_length_88_filler_filler_filler_filler_filler_filler_filler_filler_filler_.nc'
dataset = netCDF4.Dataset(infile)
print(dataset.filepath())
```
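For what it's worth, the 88-character figure checks out; a quick sanity check (which exact length triggers a visible crash presumably depends on the allocator's size rounding, so this is just the arithmetic, not the crash itself):

```python
# The filename from the repro script above, split for readability.
name = ('filename_of_length_88_'
        'filler_filler_filler_filler_filler_'
        'filler_filler_filler_filler_.nc')
print(len(name))      # 88 characters
# As a C string the path occupies one extra byte for the terminating NUL,
# so copying it needs an 89-byte buffer, one more than an 88-byte malloc.
print(len(name) + 1)  # 89
```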
I was not able to get cygdb to work, but I was able to build a version of netcdf4-python from the current master (97f1515) with debugging symbols against a debug version of Python 3.5.2 and get a backtrace of the segmentation fault in the above script.

Line 13715 of _netCDF4.c is the free statement:

```
/* "netCDF4/_netCDF4.pyx":1889
 *                 py_path = c_path[:pathlen] # makes a copy of pathlen bytes from c_string
 *             finally:
 *                 free(c_path)             # <<<<<<<<<<<<<<
 *                 return py_path.decode('ascii')
 *         ELSE:
 */
/*finally:*/ {
  /*normal exit:*/{
    free(__pyx_v_c_path);
    goto __pyx_L10;
  }
```

Is it possible that the line `py_path = c_path[:pathlen]` makes a reference rather than a copy, and that Python is freeing this memory prior to the Cython free statement?
Does ...? Specifically, the following change to _netCDF4.pyx removes the segmentation fault in my test case:

```
@@ -1876,7 +1876,7 @@ open/create the Dataset. Requires netcdf >= 4.1.2"""
             ierr = nc_inq_path(self._grpid, &pathlen, NULL)
         if ierr != NC_NOERR:
             raise RuntimeError((<char *>nc_strerror(ierr)).decode('ascii'))
-        c_path = <char *>malloc(sizeof(char) * pathlen)
+        c_path = <char *>malloc(sizeof(char) * (pathlen + 1))
         if not c_path:
             raise MemoryError()
         try:
```
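The +1 here is the classic C-string convention: a string of n characters occupies n + 1 bytes once the terminating NUL is included. ctypes follows the same rule when it builds a C buffer from Python bytes, which makes for a quick illustration (the 88-byte string just mirrors the repro above; it is not the real path):

```python
import ctypes

s = b"x" * 88                         # an 88-character stand-in for the path
buf = ctypes.create_string_buffer(s)  # ctypes sizes the buffer like a C string
print(len(s))       # 88: characters in the string
print(len(buf))     # 89: bytes the C copy occupies, including the NUL
print(buf.raw[-1])  # 0: the terminating NUL byte
```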
I can confirm that the ... The following instrumented version of the code writes sentinel bytes (42) around the end of the buffer and reads them back after the second nc_inq_path call:

```
cdef int ierr
cdef size_t pathlen
cdef char *c_path
IF HAS_NC_INQ_PATH:
    with nogil:
        ierr = nc_inq_path(self._grpid, &pathlen, NULL)
    if ierr != NC_NOERR:
        raise RuntimeError((<char *>nc_strerror(ierr)).decode('ascii'))
    c_path = <char *>malloc(sizeof(char) * (pathlen + 2))
    if not c_path:
        raise MemoryError()
    c_path[pathlen-1] = 42  # sentinel on the last byte of the string region
    c_path[pathlen] = 42    # sentinel where a terminating NUL would land
    c_path[pathlen+1] = 42  # sentinel one byte further out
    try:
        with nogil:
            ierr = nc_inq_path(self._grpid, &pathlen, c_path)
        if ierr != NC_NOERR:
            raise RuntimeError((<char *>nc_strerror(ierr)).decode('ascii'))
        py_path_0 = c_path[pathlen-1]
        py_path_1 = c_path[pathlen]
        py_path_2 = c_path[pathlen+1]
    finally:
        free(c_path)
    return py_path_0, py_path_1, py_path_2
```

This returns ..., which indicates that nc_inq_path writes a terminating NULL at c_path[pathlen], one byte past the pathlen-byte buffer the original code allocated.
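The same sentinel probe can be mimicked in pure Python with ctypes, using a memmove of pathlen + 1 bytes as a stand-in for the NUL-including copy that nc_inq_path evidently performs (an illustration of the technique only, not the real library call; the path is made up):

```python
import ctypes

path = b"/data/example.nc"  # hypothetical path
pathlen = len(path)

buf = (ctypes.c_ubyte * (pathlen + 2))()  # over-allocate by two bytes
buf[pathlen - 1] = 42   # sentinel on the last byte of the string region
buf[pathlen] = 42       # sentinel where a terminating NUL would land
buf[pathlen + 1] = 42   # sentinel one byte further out

# Copy the string plus its NUL, mimicking the behavior the probe exposed.
ctypes.memmove(buf, ctypes.create_string_buffer(path), pathlen + 1)

# The middle sentinel is clobbered by the NUL; the outer one survives.
print(buf[pathlen - 1], buf[pathlen], buf[pathlen + 1])  # 99 0 42
```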
Increase the buffer used to store the filepath by one character to accommodate the NULL character copied by the nc_inq_path function. closes Unidata#585
Great work tracking it down @jjhelmus, hopefully this can be merged in time for the next release.
Yes, kudos @jjhelmus! Before I merge, I'd like to hear from the Unidata devs (@WardF and/or @DennisHeimbigner) as to whether nc_inq_path indeed returns the null terminator, and if so, whether this is the expected behavior or a bug.
I'm getting consistent seg-faults somewhere out of netCDF-python/netCDF-c/HDF and I'm struggling to track them down.

The only way I can reliably reproduce it is on OSX with a specific set of input data files and a specific output filename... It also occurs on Windows, but less reliably. Predictably, I've been unable to reproduce it on Linux with a debug-enabled build.

I've attached the stack trace from OSX when it occurs, and also some valgrind output from the Linux test (which doesn't seg-fault but does report potential memory leaks). The most interesting lines in the valgrind output are the "Uninitialised value was created by a heap allocation" warnings starting around line 560, then the "Invalid write of size 1" at 660, and "Address 0x600e530 is 0 bytes after a block of size 16 alloc'd" at 673.

It's also possible these last two might be related to #506.

I'm using HDF 1.8.17, netCDF 4.4.0, netcdf4-python 1.2.4.

Apologies that I've not given you much to go on; if it helps, I can share a recipe for the environment I'm using on OSX and the input data files (about 50Mb).