autoclose with distributed doesn't seem to work #1394

Closed
rabernat opened this issue May 2, 2017 · 9 comments

rabernat (Contributor) commented May 2, 2017

I am trying to analyze a very large netCDF dataset using xarray and distributed.

I open my dataset with the new autoclose option:

ds = xr.open_mfdataset(ddir + '*.nc', decode_cf=False, autoclose=True)

However, when I run a reduction operation (e.g. ds['Salt'].mean()), I can see my open file count rise monotonically. Eventually the dask worker dies with OSError: [Errno 24] Too many open files: '/proc/65644/sta once it hits the system ulimit.
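
For context, a hypothetical way to watch the handle count (psutil and the count_open_files helper are illustrative additions, not part of the original report); with distributed the leak shows up on the workers, so the check runs there via Client.run. This assumes an existing dask.distributed Client named client and the ds opened above:

import psutil

def count_open_files():
    # number of files currently held open by the calling worker process
    return len(psutil.Process().open_files())

print(client.run(count_open_files))   # per-worker counts before the reduction
result = ds['Salt'].mean().compute()
print(client.run(count_open_files))   # a steady climb here suggests the leak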

Am I doing something wrong here? Why are the files not being closed? cc: @pwolfram

shoyer (Member) commented May 2, 2017

Just to make sure, which version of xarray are you using?

rabernat (Contributor, Author) commented May 2, 2017

0.9.3

shoyer (Member) commented May 2, 2017

> 0.9.3

OK, so that shouldn't be a problem. Hmm.

My only suggestion is that we should think about writing a fuller test suite for the auto-close functionality, using a mock or fake of some sort that we can interrogate to verify it works properly. One simple approach would be to refactor the autoclose functionality into a single, separate adaptor datastore (which wraps an underlying datastore) that we can test more easily, rather than putting it onto each of the underlying datastore classes. I'm not sure why I didn't think of that when @pwolfram was writing this before.
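
A minimal sketch of that adaptor idea (the class and helper names here are hypothetical, not xarray's actual API): the wrapper holds an opener callable, reopens the underlying store for each access, and closes it again afterwards, so all autoclose logic lives in one place that a test can instrument:

class AutocloseDataStore:
    # Hypothetical adaptor: wraps any datastore behind a zero-argument
    # `opener` callable that returns a fresh, open instance of it.
    def __init__(self, opener):
        self._opener = opener

    def _call_with_open_store(self, method_name):
        store = self._opener()
        try:
            return getattr(store, method_name)()
        finally:
            store.close()  # always release the file handle

    def get_variables(self):
        return self._call_with_open_store('get_variables')

    def get_attrs(self):
        return self._call_with_open_store('get_attrs')

A test could then pass a fake opener that counts open/close calls and assert that every open is matched by a close.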

rabernat (Contributor, Author) commented May 2, 2017

A fuller test suite is a good idea.

One problem is that many of these applications involve very large datasets, so it is hard to share reproducible examples.

pwolfram (Contributor) commented May 2, 2017

@rabernat, I would say that this is a bug. Is this with the scipy backend or netCDF4? Presumably if you are hitting this problem, we could run into it too. For the record, we are using netCDF4.

pwolfram (Contributor) commented May 2, 2017

Note that we don't use decode_cf=False. Does it still crash without that option, i.e., using the default?
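
For instance (the same call as above, with decode_cf left at its default):

ds = xr.open_mfdataset(ddir + '*.nc', autoclose=True)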

rabernat (Contributor, Author) commented May 2, 2017

netCDF4. decode_cf doesn't seem to affect anything important.

rabernat (Contributor, Author) commented May 2, 2017

I think that there is an underlying problem with the way that open_mfdataset is building the dask graph for this dataset (see #1396). Operations seem overly eager to read all the data and load it into memory. So it might not be a problem with autoclose after all.
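
One illustrative way to check that (a sketch reusing ds from above, not from the original comment): confirm the reduction stays lazy and see roughly how large the task graph is before anything is computed:

# The reduction should return a lazy, dask-backed DataArray; if data were
# being loaded eagerly, .data would be a plain numpy array instead.
salt_mean = ds['Salt'].mean()
print(type(salt_mean.data))                    # expect a dask array
print(len(salt_mean.data.__dask_graph__()))    # rough count of tasks in the graph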

I do notice that autoclose does work in certain cases. For example, after I open the dataset, it doesn't leave the files open. That's good.

jhamman (Member) commented Jan 13, 2019

Closing this old issue. I'm assuming this behavior no longer exists following the backend refactors in 2018. @rabernat (or others) please reopen if you feel there is more to do here.

jhamman closed this as completed Jan 13, 2019