Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible memory leak in _get_att() when reading a string attribute #495

Closed
ckhroulev opened this issue Dec 7, 2015 · 8 comments
Closed

Comments

@ckhroulev
Copy link
Contributor

Unlike nc_get_att_text, the NetCDF C API function nc_get_att_string allocates storage for the attribute. The user is expected to free this storage using nc_free_string. (Unfortunately the documentation of nc_get_att_string is wrong.)

The implementation of _get_att calls nc_get_att_string, but does not seem to free this storage.

Please let me know if I'm just confused about this (and feel free to close this issue).

Thanks!

@jswhit
Copy link
Collaborator

jswhit commented Dec 8, 2015

Don't know if this is a problem or not. Haven't had any reports of memory leaks - have you?

@jswhit
Copy link
Collaborator

jswhit commented Dec 8, 2015

I tried a simple test that writes a string attribute and then reads it back in a loop, and indeed it does seem to leak. I've created pull request #496 to add an nc_free_string call after reading the attribute, as you suggest. I've also added an nc_free_string call after reading a VLEN string variable. The C lib also has nc_free_vlen, which servers a similar purpose for VLEN variable reads. However, when I add a call to nc_free_vlen after reading VLEN data, tst_vlen.py fails, so I've left this call commented out.

@dopplershift
Copy link
Member

attn: @WardF (regarding C-API docs)

@ckhroulev
Copy link
Contributor Author

@jswhit : I did not know if this is a problem (and no, I haven't seen any memory leak reports)-- I was just reading _netCDF4.pyx and noticed that nc_free_string is not used.

Thanks for the fix in #496. I looked at the test failure you mention; please consider using the patch included below.

From a050ff5020a38363efb4b8fb6e303febfe3db72b Mon Sep 17 00:00:00 2001
From: Constantine Khroulev <ckhroulev@alaska.edu>
Date: Tue, 8 Dec 2015 09:41:05 -0900
Subject: [PATCH] Fix "regular vlen" case in Variable._get(...).

We need to manually copy data from nc_vlen_t structures to NumPy arrays.

In the code prior to this commit the internal storage of dataarr
is *replaced* with the array allocated by the NetCDF library, leaking
memory allocated when dataarr is created and making it impossible to
de-allocate vldata using nc_free_vlens().
---
 netCDF4/_netCDF4.pyx | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/netCDF4/_netCDF4.pyx b/netCDF4/_netCDF4.pyx
index c5cad63..2c82ac8 100644
--- a/netCDF4/_netCDF4.pyx
+++ b/netCDF4/_netCDF4.pyx
@@ -943,6 +943,8 @@ import_array()
 include "netCDF4.pxi"
 include "constants.pyx"

+from libc.string cimport memcpy
+
 # check for required version of netcdf-4 and hdf5.

 def _gethdf5libversion():
@@ -4364,15 +4366,15 @@ The default value of `mask` is `True`
                 for i from 0<=i<totelem:
                     arrlen  = vldata[i].len
                     dataarr = numpy.empty(arrlen, self.dtype)
-                    dataarr.data = <char *>vldata[i].p
+                    memcpy(<void*>dataarr.data, vldata[i].p, dataarr.nbytes)
                     data[i] = dataarr
                 # reshape the output array
                 data = numpy.reshape(data, shapeout)
                 # free vlen data internally allocated in netcdf C lib.
                 # (tst_vlen.py fails with this line uncommented)
-                #ierr = nc_free_vlens(totelem, vldata)
+                ierr = nc_free_vlens(totelem, vldata)
                 # free the pointer array.
-                #free(vldata)
+                free(vldata)
         free(startp)
         free(countp)
         free(stridep)
-- 
2.6.2

@dopplershift : Please see Unidata/netcdf-c#171 .

@ckhroulev
Copy link
Contributor Author

Oh, by the way: I noticed that vlen string array attributes are not supported.

Now that I understand how to use nc_get_att_string I think this would be easy to support and am willing to implement it (and create a pull request). What do you think?

@jswhit
Copy link
Collaborator

jswhit commented Dec 8, 2015

Thanks - the pull request has been updated with your memcpy trick.

@jswhit
Copy link
Collaborator

jswhit commented Dec 8, 2015

Sure, having support for vlen string array attributes would be nice, although I can't think of a use case offhand.

jswhit added a commit that referenced this issue Dec 9, 2015
free string data internally allocated in netcdf-c using nc_free_string (issue #495)
@jswhit
Copy link
Collaborator

jswhit commented Dec 9, 2015

pull request #496 merged, closing issue now. Thanks again for flagging this.

@jswhit jswhit closed this as completed Dec 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants