Memory leak and netCDF4 #1021
Comments
Sorry, the plots in my original post did not show up; I have corrected this now. Note that the increase in memory usage (~5 MB) is larger in 1.5.3 than in 1.4.3 and 1.3.0 (~1.3 MB). cc @TimoRoth
Oops, sorry for not doing my homework: this seems related to #986. The current netCDF4 wheels still seem to ship with the old netcdf-c version, though. We will try with an updated netcdf-c and report back...
Yes, please do report back when you have a chance to test with netcdf-c 4.7.4.
This is the result of running against netcdf-c 4.7.3 with Unidata/netcdf-c@3bcdb5f backported. Edit: I massively increased the number of loops, and it seems there is still a bit of leakage, though orders of magnitude less than in the unpatched versions. That might even just be the Python GC being unmotivated to act on such low memory use. However, adding a gc.collect() before taking a final memory measurement did not show a drop in usage.
Thanks @TimoRoth!
I don't know much about GC, but it sounds likely that something is leaking in C if the GC cannot pick it up...
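To help tell a Python-level leak from a C-level one, here is a minimal sketch that measures only the memory retained by the Python allocator, using the standard library's tracemalloc. The helper name python_heap_growth is hypothetical and not part of netCDF4 or of the scripts in this thread. The idea, under these assumptions: if the process RSS keeps growing while this number stays flat, the growth is probably happening in native code, which gc.collect() cannot reclaim.

```python
import gc
import tracemalloc


def python_heap_growth(func, iterations=1000):
    """Return the net bytes of Python-level allocations still held
    after calling func() `iterations` times (hypothetical helper)."""
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()
        # Collect cyclic garbage so only truly retained memory is counted.
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

Passing the function under suspicion (for example one that opens and closes a dataset) and comparing the result against the RSS change reported by psutil gives a rough indication of where the memory goes. tracemalloc does not see allocations made directly by C libraries, which is the point of the comparison.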
Is there any update on this? I have tracked down a memory leak in my application to some code in a third-party library that reads a NetCDF file. I am using netCDF4-python v1.6.2.
It's fixed in netcdf-c 4.7.4 and higher.
@TimoRoth Sorry, I'm not a Python expert. I just used pip to install the Python library (pip install netCDF4). How do I check or change the netcdf-c version that comes with this? In the meantime I managed to fork the third-party library I was using and refactored it to open the NetCDF file once, rather than opening and closing it in a loop. That workaround seems to work, but I would rather not have to do that.
If you're using the wheels from PyPI, I'd expect any of the last couple of
versions to have something much more recent than that, so you don't have
to do anything.
If you're building it yourself against your distro's set of libraries, make
sure you're using a modern enough distro, so that it has the fixed version.
All I did was this:
I modified @fmaussion's script to perform a no-op after opening the test file, and removed the graph generation since that would itself use memory. I simply print the memory usage.

import gc
import os

import netCDF4
import numpy as np
import psutil

process = psutil.Process(os.getpid())

# Create a small test file with two variables along an unlimited dimension.
f = 'dummy_data.nc'
if os.path.exists(f):
    os.remove(f)
with netCDF4.Dataset(f, 'w', format='NETCDF4') as nc:
    nc.createDimension('time', None)
    v = nc.createVariable('prcp', 'f4', ('time',))
    v[:] = np.arange(10000)
    v = nc.createVariable('temp', 'f4', ('time',))
    v[:] = np.arange(10000)


def dummy_read():
    # Open and immediately close the file without reading anything.
    with netCDF4.Dataset(f) as nc:
        pass
    gc.collect()


for i in range(20001):
    dummy_read()
    if i % 100 == 0:
        # Resident set size of this process, in MB.
        print("{}: {}".format(i, process.memory_info().rss * 1e-6))

The following results were printed:

0: 40.321024
I also ran the following to see if I could determine the version of netcdf-c used in the wheel:
Maybe it is using v4.8.x? Could there be another memory leak or a regression?
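For anyone else wanting to check this, one possible way to see which netcdf-c version an installed netCDF4-python is linked against is to read the module's version attributes. This is only a sketch: parse_version and has_leak_fix are hypothetical helper names, and the 4.7.4 threshold is simply the version mentioned earlier in this thread, so passing the check does not rule out a different leak or a regression.

```python
import re

# netcdf-c release that this thread reports as containing the fix.
FIXED_IN = (4, 7, 4)


def parse_version(text):
    """Turn the leading dotted numbers of a version string into a tuple,
    e.g. '4.9.2' -> (4, 9, 2). Suffixes such as '-development' are ignored."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(text))
    return tuple(int(part) for part in match.group(1).split("."))


def has_leak_fix(libversion):
    """True if the given netcdf-c version string is at least FIXED_IN."""
    return parse_version(libversion) >= FIXED_IN


if __name__ == "__main__":
    try:
        import netCDF4
    except ImportError:
        print("netCDF4 is not installed in this environment")
    else:
        print("netCDF4-python:", netCDF4.__version__)
        print("netcdf-c:", netCDF4.__netcdf4libversion__)
        print("HDF5:", netCDF4.__hdf5libversion__)
        print("at least 4.7.4:", has_leak_fix(netCDF4.__netcdf4libversion__))
```

The version strings come from the compiled extension itself, so they should reflect the library bundled in a wheel rather than any system-wide netcdf-c.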
… closing as a workaround for Unidata/netcdf4-python#1021
Hi, this is Linux, Python 3.5, latest NumPy.
While profiling memory leaks in our software (which reads and writes a lot of NetCDF files), I have found what looks like a memory leak in netCDF4.
Here is a simple script to reproduce:
The plots produced look like this with various netCDF4 versions:
Note that replacing dummy_read() with a dummy_numpy() function (just creating some arrays) does not increase memory usage: