Writeable backends via entrypoints #5954
Comments
Thanks @rabernat for opening up this issue. I think now that the refactor for read support is completed, it is a great time to discuss the opportunities for adding write support to the plugin interface. Pinging @aurghs and @alexamici, since I know they have some thoughts developed here.
Is What about making it so that backends can add methods to |
Another option is to use a store function named similarly to the read functions:
xr.open_dataset(...)
xr.store_dataset(...)
If we do that, I'd call it |
@rabernat and all, at the time of the read-only backend refactor @aurghs and I spent quite some time analysing write support and thinking of a unifying strategy. This is my interpretation of our findings:
Also note that ATM @aurghs and I are overloaded at work and we would have very little time that we can spend on this :/
Thanks for the info @alexamici!
I'm not sure I understand this comment, specifically what is meant by "serialise writes". I often use Xarray to do distributed writes to Zarr stores using 100+ distributed dask workers. It works great. We would need the same thing from a TileDB backend. We are focusing on the user-facing API, but in the end, whether we call it
I should have added "except Zarr" 😅. All netCDF writers use Concurrent writes a la Zarr are awesome and xarray supports them now, so my point was: we can add non-concurrent write support to the plugin architecture quite easily, and that will serve a lot of users. But supporting Zarr and other advanced backends via the plugin architecture is a lot more work.
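To make the distinction in this exchange concrete, here is a toy sketch of the two write models. It assumes that "serialise writes" means funnelling every write to a single shared file through one lock, which is my reading of the comment and not something the thread confirms. `SingleFileStore`, `ChunkedStore` and `write_all` are hypothetical names, and threads stand in for distributed workers.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SingleFileStore:
    """One shared resource: every write takes the same lock (assumed model)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = []

    def write(self, key, value):
        # Only one worker at a time may append to the shared "file".
        with self._lock:
            self.records.append((key, value))


class ChunkedStore:
    """One object per chunk: writes to distinct keys are independent."""

    def __init__(self):
        self.chunks = {}

    def write(self, key, value):
        # No store-wide lock is taken; each worker owns its own key.
        self.chunks[key] = value


def write_all(store, n_chunks, workers=4):
    """Write n_chunks values into the store from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda i: store.write(f"chunk-{i}", i * i), range(n_chunks)))
    return store
```

Both stores expose the same `write` call, so the user-facing API can be identical; the difference is whether the backend has to coordinate its writers.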
The backend refactor has gone a long way towards making it easier to implement custom backend readers via entry points. However, it is still not clear how to implement a writeable backend from a third party package as an entry point. Some of the reasons for this are:
- Our reading function (open_dataset) has a generic name, but our writing functions (Dataset.to_netcdf / Dataset.to_zarr) are still format specific. (Related to load_store and dump_to_store #3638.) I propose we introduce a generic Dataset.to method and deprecate the others.
- The BackendEntrypoint base class does not have a writing method, just open_dataset (see xarray/xarray/backends/common.py, lines 356 to 370 in e0deb9c). (Related to API Design for Xarray Backends #1970.)
We should fix this situation! Here are the steps I would take.
- BackendEntrypoint base class.
- Deprecate to_zarr and to_netcdf (or at least refactor to make a shallow call to a generic method)
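For concreteness, here is a rough sketch of how the pieces proposed above could fit together. Everything in it is an assumption for illustration, not xarray's API: the `write_dataset` hook, the module-level `to` function (standing in for the proposed `Dataset.to`), the toy `JsonBackend`, and the `BACKENDS` dict that stands in for entry-point discovery. Datasets are plain dicts so the example runs with the standard library alone.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict

Dataset = Dict[str, Any]  # stand-in for xarray.Dataset


class BackendEntrypoint(ABC):
    """Toy base class: a required reader plus a hypothetical optional writer."""

    @abstractmethod
    def open_dataset(self, path: str) -> Dataset:
        ...

    def write_dataset(self, dataset: Dataset, path: str) -> None:
        # Hypothetical hook; read-only backends simply do not override it.
        raise NotImplementedError(f"{type(self).__name__} does not support writing")


class JsonBackend(BackendEntrypoint):
    """Third-party style backend that implements both directions."""

    def open_dataset(self, path):
        with open(path) as f:
            return json.load(f)

    def write_dataset(self, dataset, path):
        with open(path, "w") as f:
            json.dump(dataset, f)


class ReadOnlyBackend(BackendEntrypoint):
    """Backend that only reads, as all plugin backends do today."""

    def open_dataset(self, path):
        return {}


# Stand-in for entry-point discovery: engine name -> backend instance.
BACKENDS = {"json": JsonBackend(), "readonly": ReadOnlyBackend()}


def open_dataset(path, engine):
    return BACKENDS[engine].open_dataset(path)


def to(dataset, path, engine):
    """Generic writer, the counterpart of open_dataset."""
    BACKENDS[engine].write_dataset(dataset, path)


def to_json(dataset, path):
    """Format-specific name kept as a shallow call to the generic method."""
    to(dataset, path, engine="json")


def roundtrip(dataset):
    """Write with the format-specific wrapper, read back with the generic reader."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.json")
        to_json(dataset, path)
        return open_dataset(path, engine="json")
```

A call like `roundtrip({"temperature": [1.0, 2.0]})` goes through the wrapper, the generic dispatcher and the backend's writer, while `to({}, "unused", engine="readonly")` raises `NotImplementedError`. This sketch only covers the simple, non-concurrent case; it says nothing about how distributed writes a la Zarr would be coordinated.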