This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

Automatically close files using open_datatree context manager #93

Closed
TomNicholas opened this issue May 18, 2022 · 7 comments
Labels
enhancement New feature or request IO Representation of particular file formats as trees

Comments

@TomNicholas
Member

In xarray it's possible to automatically close a dataset after opening by opening it using a context manager. From the documentation:

Datasets have a Dataset.close() method to close the associated netCDF file. However, it’s often cleaner to use a with statement:

# this automatically closes the dataset after use
In [5]: with xr.open_dataset("saved_on_disk.nc") as ds:
   ...:     print(ds.keys())
   ...: 

We currently don't have a DataTree.close() method, or any context manager behaviour for open_datatree. To add them presumably we would need to iterate over all file handles (i.e. groups) and close them one by one.
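To make the idea concrete, here is a minimal sketch of what per-node closing plus context manager support could look like. It uses a toy `Node` class as a stand-in, not the real `DataTree`, and the `closer` argument and `add_child` helper are hypothetical names chosen for illustration only.

```python
class Node:
    """Toy stand-in for a tree node. This is NOT the real DataTree class."""

    def __init__(self, name, closer=None):
        self.name = name
        self.children = {}
        # Hypothetical: a zero-argument callable that releases this node's file handle.
        self._close = closer

    def add_child(self, child):
        self.children[child.name] = child
        return child

    def close(self):
        # Close the children first, then this node's own handle.
        for child in self.children.values():
            child.close()
        if self._close is not None:
            self._close()
        # Drop the closer so that calling close() twice is harmless.
        self._close = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close; returning None lets any exception propagate.
        self.close()


closed = []
root = Node("root", closer=lambda: closed.append("root"))
root.add_child(Node("group_a", closer=lambda: closed.append("group_a")))
root.add_child(Node("group_b", closer=lambda: closed.append("group_b")))

with root as tree:
    print(sorted(tree.children))

print(closed)
```

Whether children should be closed before or after the parent is a design choice that this sketch does not settle; it depends on how the backend shares handles between groups.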

Related to #90 @jhamman @thewtex

@TomNicholas TomNicholas added enhancement New feature or request IO Representation of particular file formats as trees labels May 18, 2022
@TomNicholas TomNicholas changed the title open_datatree context manager Automatically close files using open_datatree context manager May 18, 2022
@jrmagers

Could there be a load_datatree() method to be consistent with xr.load_dataset()?

@TomNicholas
Member Author

Could there be a load_datatree() method

Sure, once we have a .load() method too then writing a load_datatree() function would be simple, just like the code for xr.load_dataset() is simple.
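For reference, xr.load_dataset() essentially opens the dataset inside a with block and returns the result of .load(), so the file is closed before the function returns. A rough sketch of the same pattern for trees, assuming a tree object that has .load(), .close() and context manager support (the `FakeTree`, `open_tree` and `load_tree` names below are hypothetical stand-ins, not existing API):

```python
class FakeTree:
    """Hypothetical stand-in for a lazily opened tree; not a real datatree object."""

    def __init__(self, path):
        self.path = path
        self.is_open = True
        self.in_memory = False

    def load(self):
        # Loading requires the underlying file to still be open.
        if not self.is_open:
            raise RuntimeError("cannot load from a closed file")
        self.in_memory = True
        return self

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


def open_tree(path):
    """Hypothetical opener standing in for open_datatree."""
    return FakeTree(path)


def load_tree(path):
    """Open, load everything into memory, then close the file."""
    with open_tree(path) as tree:
        return tree.load()


tree = load_tree("saved_on_disk.nc")
print(tree.in_memory, tree.is_open)
```

The returned object holds its data in memory and no longer keeps the file open, which is the behaviour load_dataset() gives for a single dataset.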

Though we haven't implemented dask-specific methods yet.

@TomNicholas
Member Author

@aurghs, @alexamici and @malmans2 - this issue and related backends issues seem like a good place for you guys to contribute if you wanted. You have expertise on xarray's backends, I don't, and they are pretty separable.

There are likely to be subtleties with respect to tracking multiple open file handles, and be aware that this will need to be done explicitly via a ._close attribute on DataTree after #41 moves that responsibility away from xarray.Dataset.

@malmans2
Member

Sure - we are on it!

@wohenbushuang

Is there any update on this issue or on the context manager? The file remains held open after open_datatree, which is very annoying.

@ghiggi

ghiggi commented Jul 31, 2023

I am also interested in having this fixed. Can we reuse the logic in xr.open_mfdataset? We could collect the closers of all nodes and then assign a partial function to _close, as is done there. Or would you prefer to design a multicloser class?
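A rough sketch of that suggestion, using plain Python objects rather than real xarray or datatree classes: gather each node's closer into a list and bind the list to a single function with functools.partial. The `TreeNode`, `iter_nodes`, `close_all` and `attach_combined_closer` names are made up for illustration.

```python
from functools import partial


class TreeNode:
    """Toy node; `_close` is a zero-argument callable or None (assumed convention)."""

    def __init__(self, name, closer=None, children=()):
        self.name = name
        self._close = closer
        self.children = list(children)


def iter_nodes(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def close_all(closers):
    """Call every collected closer in order."""
    for closer in closers:
        closer()


def attach_combined_closer(root):
    """Replace the root's closer with one that closes every node in the tree."""
    closers = [n._close for n in iter_nodes(root) if n._close is not None]
    # The list keeps the original callables, so overwriting root._close is safe.
    root._close = partial(close_all, closers)
    return root


log = []
leaf = TreeNode("leaf", closer=lambda: log.append("leaf"))
no_file = TreeNode("no_file")  # a node with nothing to close
root = TreeNode("root", closer=lambda: log.append("root"), children=[leaf, no_file])

attach_combined_closer(root)
root._close()
print(log)
```

A dedicated multicloser class would hold the same list of callables; the partial is simply the shorter way to write it.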

@owenlittlejohns

Closed in favour of: pydata/xarray#9337
