Memory leak with opening/closing of files? #2626

Open
bartvstratum opened this issue Feb 16, 2023 · 7 comments

Comments

bartvstratum commented Feb 16, 2023

Software versions:

  • netcdf 4.9.0-3
  • hdf5 1.12.2-1
  • Manjaro Linux

While debugging a memory leak that seems to be related to NetCDF, I ran the C code from Unidata/netcdf4-python#986 and still see memory usage growing linearly in time. According to that issue, shouldn't this have been fixed by #1634? As in the previous issue, the problem disappears when using NC_64BIT_OFFSET, both with the example below and in the real-life case with our LES model (which only opens/closes its NetCDF files once per experiment).

Code to reproduce:

#include <netcdf.h>
#include <stdio.h>
#include <string>
#include <stdexcept>
#include <iostream>

void nc_check(int return_value)
{
    if (return_value != NC_NOERR)
    {
        std::string error(nc_strerror(return_value));
        throw std::runtime_error(error);
    }
}

int main()
{
    int dataset_id, time_id, dummyvar_id;
    size_t start[1] = {0};
    size_t count[1] = {100};

    double data[100];
    for (int i = 0; i < 100; ++i)
        data[i] = -99;

    // Create a NetCDF-4 (HDF5-backed) file with one unlimited dimension and
    // one double variable, write 100 values, and close it.
    nc_check( nc_create("test.nc", NC_CLOBBER | NC_NETCDF4, &dataset_id) );
    nc_check( nc_def_dim(dataset_id, "time", NC_UNLIMITED, &time_id) );
    nc_check( nc_def_var(dataset_id, "dummy", NC_DOUBLE, 1, &time_id, &dummyvar_id) );
    nc_check( nc_enddef(dataset_id) );
    nc_check( nc_put_vara(dataset_id, dummyvar_id, start, count, data) );
    nc_check( nc_close(dataset_id) );

    // Repeatedly open and close the file read-only; memory usage grows
    // with the number of iterations.
    for (int i = 0; i < 100000; ++i)
    {
        nc_check( nc_open("test.nc", NC_NOWRITE, &dataset_id) );
        nc_check( nc_close(dataset_id) );
    }
}

[Figure_1: memory usage growing linearly over the open/close loop]
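For reference, a minimal sketch of the classic-format workaround mentioned above; only the create flags differ from the program above, and with this format the reporter observes no memory growth:

// Same program as above, but writing a classic 64-bit-offset file instead of
// a NetCDF-4/HDF5 file; the rest of the code is unchanged.
nc_check( nc_create("test.nc", NC_CLOBBER | NC_64BIT_OFFSET, &dataset_id) );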

WardF (Member) commented Feb 16, 2023

Thanks; taking a look at this now.

akrherz commented Feb 16, 2023

Similar to #2589 and Unidata/netcdf4-python#1021

WardF (Member) commented Feb 16, 2023

Thanks; I'm able to recreate this, and have updated the code you provided to show the growth in real time via calls to getrusage. I'll see if I can figure out what's going on, and hopefully it's something we can manage in the netCDF library; if it's downstream in libhdf5, there potentially isn't much we can do about it.
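As an illustration, a minimal sketch of that kind of instrumentation (not necessarily the exact code used here), sampling the resident set size with getrusage inside the open/close loop:

#include <sys/resource.h>
#include <cstdio>

// Print the peak resident set size so far (kilobytes on Linux).
void print_rss(int iteration)
{
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) == 0)
        std::printf("iteration %d: max RSS = %ld kB\n", iteration, usage.ru_maxrss);
}

// ...and inside the loop in main():
//     if (i % 10000 == 0)
//         print_rss(i);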

WardF (Member) commented Feb 17, 2023

OK, so while I try to formulate my thoughts around this (and the related issues), I believe the problem might be out of our hands. The last step I need to take is to recreate this program using pure HDF5 calls, removing libnetcdf from the loop altogether, and see if the issue persists. If so, there is very little we can do other than document and highlight the issue for people.
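A rough sketch of what such a pure-HDF5 reproduction could look like (assuming test.nc was created as in the original report; this is illustrative, not the actual test code):

#include <hdf5.h>
#include <stdexcept>

int main()
{
    // Open and close the same HDF5 file repeatedly, bypassing libnetcdf.
    for (int i = 0; i < 100000; ++i)
    {
        hid_t file_id = H5Fopen("test.nc", H5F_ACC_RDONLY, H5P_DEFAULT);
        if (file_id < 0)
            throw std::runtime_error("H5Fopen failed");
        if (H5Fclose(file_id) < 0)
            throw std::runtime_error("H5Fclose failed");
    }
    // Ask libhdf5 to shut down and release its internal resources.
    H5close();
}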

I'll post some graphs generated from the various profilers I've used, but I'm not finding a memory "leak" per se; while the memory usage is growing, it doesn't appear to be a leak in the usual sense. This is supported by the fact that valgrind, memory sanitizers, and other tools fail to find any memory issues. Valgrind will report issues if the process is killed via SIGINT, but that is expected: memory that was allocated at the moment the process was interrupted had not yet been freed.

The alternative is that there is something more we should be doing when closing/freeing an HDF5 file or object. I actually hope that this is the issue, because we could fix that, although I would expect the tools I've been using to report it. Another thing dampening my optimism is that if I search for similar 'growing memory usage' issues in HDF5, ignoring netCDF, I find a lot of similar reports: a single process opening and closing HDF5 files thousands of times exhibits growing memory usage. But we will do our due diligence and see what we can do.
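One speculative angle, not confirmed in this thread: libhdf5 keeps freed memory on internal free lists, which can make a process's footprint grow without valgrind reporting a leak, and H5garbage_collect() asks the library to return that memory to the allocator. A sketch of trying it after each close in the reproduction loop (requires including hdf5.h and linking against libhdf5 directly):

// Speculative experiment: after each close, ask libhdf5 to release memory
// held on its internal free lists. If the growth is free-list related,
// this should flatten the memory-usage curve.
nc_check( nc_close(dataset_id) );
if (H5garbage_collect() < 0)
    throw std::runtime_error("H5garbage_collect failed");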

WardF (Member) commented Feb 17, 2023

Further, if the issue is in our handling of libhdf5 objects, it is non-obvious; the same tests using other storage formats (netCDF3, NCZarr) do not exhibit the same growing memory usage.

WardF (Member) commented Feb 17, 2023

Playing around with this some more, it feels like there might be dangling open resources when an HDF5 file is being closed. I should be able to follow up on this shortly.
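As an illustration of one way to check for that (a sketch, not the diagnostic actually used here): given an HDF5 file identifier such as the file_id from the pure-HDF5 loop above, libhdf5 can report how many identifiers are still open against the file before it is closed:

// Count identifiers (file, datasets, groups, attributes, datatypes) still
// open against this file; anything beyond the file handle itself suggests
// a resource that was never released. (Requires <hdf5.h> and <cstdio>.)
ssize_t n_open = H5Fget_obj_count(file_id, H5F_OBJ_ALL);
std::printf("open HDF5 identifiers before H5Fclose: %zd\n", n_open);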

Alexander-Barth (Contributor) commented

In Julia (NCDatasets.jl) we are also observing this suspected memory leak, with netCDF 4.9.2 and HDF5 1.14.3.
