A read-only TileDB backend #4987
Comments
Hi @jp-dark! Thanks for opening this issue and the draft pull request in #4988. As you probably know, we're in the process of completing a major refactor of our storage backends system (see #4989 and #4810 for the current state of that work). One of the main feature additions in this work is the new entrypoints functionality, which will allow backends (like TileDB) to declare backend support without including the code in Xarray itself. In light of this new functionality, we'd like to see if we can put the TileDB backend in TileDB itself (or in another adjacent package). The end user functionality would be the same, as the entrypoint would be registered at install time. We'd be happy to document the TileDB backend in the Xarray documentation as well. This is the development pattern we are headed toward with most of our backends, including some of the current backends. We'd be happy to work with you to sort out the details, as I'm sure there will be one or two early adopter bumps to work through.
@jhamman - thanks for the quick response! As I work through the bumps, would it be best to comment here or on one of the other currently open issues for the backend refactor?
This is a great spot @jp-dark! Looking forward to seeing your progress.
As a provider of a third-party backend, I would love to be able to integrate with xarray without including xarray as a dependency in my library, and xarray is actually really close to a place where that would be possible. The main development effort on the xarray side would be creating a generic If there is interest in pursuing this, I can help with developing a prototype to test feasibility.
@jp-dark How would you feel about writing another small library, e.g., "xarray-tiledb", that can explicitly depend on both xarray and tiledb? We can potentially do some of xarray's backend stuff with protocols, but there are some aspects (especially for more advanced features like lazy loading) that will likely need the hard xarray dependency.
@jp-dark it is in fact possible to write an xarray backend without explicitly depending on xarray in your We use the setuptools entrypoints infrastructure, which triggers a module load only from within xarray itself. This is still a work in progress, but we are implementing this strategy in cfgrib with success. You can get inspiration from the following PR by @aurghs:
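The entrypoint mechanism described in the comment above can be illustrated with a minimal sketch that uses only the Python standard library. It lists what is registered under an entry point group without importing the plugin modules; the import happens only when the consumer (here, xarray) calls `load()` on an entry. The group name `"xarray.backends"` and the helper name `list_backend_entrypoints` are assumptions made for illustration and are not taken from this thread.

```python
from importlib.metadata import entry_points


def list_backend_entrypoints(group="xarray.backends"):
    """Return {name: "module:attr"} for entry points registered in `group`.

    The group name is an assumption for illustration. Listing entry points
    only reads installed package metadata; the plugin module itself is not
    imported until `EntryPoint.load()` is called by the consumer.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable collection of entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict mapping group name -> entry points
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for name, target in sorted(list_backend_entrypoints().items()):
        print(f"{name} -> {target}")
```

Because discovery relies on metadata written at install time, a plugin package only needs to declare its entry point; whether it must also import xarray at module level depends on how the backend class is written.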
@alexamici @shoyer To be clear, my short-term plan is absolutely to move the TileDB backend from my draft pull request here to a small plugin library (thanks for linking the cfgrib @alexamici!). I only bring up the protocol idea because the backend is really close to a place where a lot of the boilerplate for lazy loading, etc. could be provided on the xarray side with a simple API requirement on the third-party library side. I'll push up a small proof-of-concept with a read-only netCDF4 example shortly.
See PR #4998 for the example. Edit: This example should be able to read a NetCDF dataset using the
Is there a branch of xarray that currently supports loading backend engines from third-party libraries?
#4989 includes the full refactor. The plan is to merge this to xarray/master on Monday.
I was able to use this backend from an external code base with the entry point procedure as described in the new docs, and it was completely painless. Great job with the backend refactor!
@jhamman Is there a planned date for releasing the backend updates to PyPI?
Not at this point. We just released 0.17, so I would think we're at least a week or two away from 0.18.
Just to set expectations: I hadn't thought we were releasing so soon, though I'd be happy to, and it's getting easier and easier to do releases.
This is a feature request for a read-only TileDB backend for reading a dense TileDB array into an xarray Dataset.