
Too many open files #11

Closed
sambarluc opened this issue Oct 24, 2016 · 4 comments
Comments

@sambarluc
Contributor

Hi,
I am getting an issue with memmaps and a long simulation (several .data files).

/home/cimatori/git/xmitgcm/xmitgcm/utils.py in read_raw_data(datafile, dtype, shape, use_mmap)
    109     if use_mmap:
    110         # print("Reading %s using memmap" % datafile)
--> 111         d = np.memmap(datafile, dtype, 'r')
    112         d.close()
    113     else:

/home/cimatori/installed/miniconda2/lib/python2.7/site-packages/numpy/core/memmap.pyc in __new__(subtype, filename, dtype, mode, offset, shape, order)
    258         bytes -= start
    259         offset -= start
--> 260         mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
    261 
    262         self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,

error: [Errno 24] Too many open files

The issue is known (see e.g. this question), but I don't understand how to properly close the memmapped files, since I don't yet understand how the memmap machinery interfaces with xarray.

On the practical side, I am able to work around it by setting use_mmap=False.
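For illustration, a minimal standalone sketch (not xmitgcm's actual code, and using a made-up file name) of why memmaps exhaust file descriptors: each np.memmap keeps its file open for the lifetime of the array, so descriptors accumulate as long as the arrays stay referenced.

```python
import os
import tempfile
import numpy as np

# Write a small binary file to memory-map (stand-in for a .data file).
path = os.path.join(tempfile.mkdtemp(), "sample.data")
np.arange(10, dtype=">f4").tofile(path)

# np.memmap holds an OS-level file descriptor open for the lifetime
# of the array; it is released only when the array and every view
# onto it have been garbage-collected.
d = np.memmap(path, dtype=">f4", mode="r")
total = float(d.sum())  # data are readable while the map is alive

# Dropping the last reference lets CPython close the mapping; with
# hundreds of lazily-opened files, descriptors pile up faster than
# they are reclaimed, hence Errno 24.
del d
print(total)  # 45.0
```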

@rabernat
Member

This is a general issue with lazily reading large numbers of files. It happens also with xarray on netcdf files (pydata/xarray#463).

The root of the problem is that your system limits the number of open files that are allowed. You can change these "ulimits", which is probably the best option for you:
http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux
http://askubuntu.com/questions/162229/how-do-i-increase-the-open-files-limit-for-a-non-root-user
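As a Python-side complement to editing shell configuration, the standard-library resource module can inspect and (up to the hard limit) raise the soft limit from within the process itself, on Unix systems; a minimal sketch:

```python
import resource

# Current per-process limit on open file descriptors
# (the Python equivalent of the shell's `ulimit -n`).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)

# An unprivileged process may raise its own soft limit up to the
# hard limit; re-applying the current values, as here, is always
# permitted and leaves the limits unchanged.
resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```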

@sambarluc
Contributor Author

Hi Ryan, thanks for the feedback. Silly as it may sound, I didn't know about ulimits. I guessed that a Python-side fix would not be trivial, and you've confirmed it.
I will close the issue. Thanks again.
AC

@rabernat rabernat reopened this Oct 25, 2016
@rabernat
Member

Several people are hitting this issue, so I want to leave it open.

Some proposed developments in xarray might allow us to avoid opening the files until the data is requested.

@rabernat
Member

Since #25, the data are loaded using dask.delayed objects, which means files are only opened when they are needed. Hopefully this will resolve the "too many open files" problems.
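A rough sketch of that idea (with made-up chunk files standing in for MITgcm .data output, not xmitgcm's actual reader): wrapping each read in dask.delayed builds the task graph without opening anything, so file descriptors are consumed only when compute() runs.

```python
import os
import tempfile
import numpy as np
import dask
import dask.array as da

# Write two small "chunk" files, standing in for .data output.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmpdir, "chunk%d.data" % i)
    np.full(4, i, dtype="f8").tofile(p)
    paths.append(p)

def read_chunk(path):
    # The file is opened (and closed) only when this function runs.
    return np.fromfile(path, dtype="f8")

# dask.delayed wraps the reads without executing them, so no file
# descriptors are held at graph-construction time.
lazy = [da.from_delayed(dask.delayed(read_chunk)(p), shape=(4,), dtype="f8")
        for p in paths]
stacked = da.stack(lazy)

print(stacked.sum().compute())  # files opened only here, at compute time
```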
