obstore-based Store implementation #1661
base: main
Amazing @kylebarron! I'll spend some time playing with this today.
With roeap/object-store-python#9 it should be possible to fetch multiple ranges within a file concurrently with range coalescing (using That PR also adds a |
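For readers unfamiliar with range coalescing, here is a minimal sketch of the idea in plain Python. It is not the implementation from roeap/object-store-python#9: `coalesce_ranges`, `fetch_range`, `get_ranges`, and the `max_gap` parameter are hypothetical names, and an in-memory `bytes` object stands in for a remote file.

```python
import asyncio


def coalesce_ranges(ranges, max_gap=0):
    """Merge (start, end) byte ranges (end exclusive) separated by at most max_gap bytes."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it instead of adding a request.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


async def fetch_range(data: bytes, start: int, end: int) -> bytes:
    """Hypothetical stand-in for one ranged GET request."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return data[start:end]


async def get_ranges(data: bytes, ranges, max_gap=0):
    """Fetch the coalesced ranges concurrently, then slice out each requested range."""
    merged = coalesce_ranges(ranges, max_gap)
    blobs = await asyncio.gather(*(fetch_range(data, s, e) for s, e in merged))
    results = []
    for start, end in ranges:
        for (m_start, m_end), blob in zip(merged, blobs):
            if m_start <= start and end <= m_end:
                results.append(blob[start - m_start:end - m_start])
                break
    return results
```

With `max_gap=4`, the ranges `(0, 10)` and `(12, 20)` are served by a single request for `(0, 20)`, trading a few wasted bytes for one fewer round trip.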
Great work @kylebarron!
I suggest we see whether it makes any improvements first, so it's author's choice for now.
While @rabernat has seen some impressive perf improvements in some settings when making many requests with Rust's tokio runtime, which would possibly also trickle down to a Python binding, the biggest advantage I see is improved ease of installation. A common hurdle I've seen is dependency management, especially around boto3, aioboto3, etc. Versions need to be compatible at runtime with any other libraries the user also has in their environment, and Python doesn't allow multiple versions of the same dependency in one environment. With a Python library wrapping a statically linked Rust binary, you can remove all Python dependencies and this whole class of hardship. The underlying Rust object-store crate is stable and under open governance via the Apache Arrow project. We'll just have to wait on some discussion in object-store-python for exactly where that should live. I don't have an opinion myself on where this should live, but it should be on the order of 100 lines of code wherever it is (unless the v3 store API changes dramatically).
I want to keep an open mind about what the core stores provided by Zarr-Python are. My current thinking is that we should just do a |
This is no longer an issue: s3fs has much more relaxed deps than it used to. Furthermore, it's very likely to be already part of an installation environment.
I agree with that. I think it is beneficial to keep the number of dependencies of core zarr-python small. But I am open to discussion.
Sure! That is certainly useful.
This is awesome work, thank you all!!!
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
The I'd like to update this PR soonish to use that library instead. |
If the zarr group prefers object-store-rs, we can move it into the zarr-developers org, if you like. I would like to be involved in developing it, particularly if it can grow more explicit fsspec-compatible functionality.
I have a few questions because the
I like that |
This came up in the discussion at https://github.com/zarr-developers/zarr-python/pull/2426/files/5e0ffe80d039d9261517d96ce87220ce8d48e4f2#diff-bb6bb03f87fe9491ef78156256160d798369749b4b35c06d4f275425bdb6c4ad. By default, it's passed as Does it look compatible with what you need? |
Title changed from "object-store-based Store implementation" to "obstore-based Store implementation"
It looks like this is now passing all tests, with only the Read the Docs and Codecov targets not hit. What are the final steps for this PR? Where should we write documentation?
Expand test coverage
obstore 0.4.0 was released and this PR was updated to use that latest version.
Optional dependency
From comment above
CI had been failing because (I think) I removed the re-export from import zarr.storage.obstore themselves. When that import is run, Alternatively We could keep the re-export but ensure that Future follow up PRs:
Add obstore to upstream env
Regarding the import of obstore, I suggest moving all imports from obstore into the class itself. Users should get an |
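A sketch of what deferring the import into the class could look like. `LazyBackendStore` and its `backend_module` parameter are hypothetical and only illustrate the pattern; this is not the store class from this PR.

```python
import importlib
from typing import Any


class LazyBackendStore:
    """Hypothetical store that imports its optional backend only when constructed."""

    def __init__(self, backend_module: str = "obstore") -> None:
        try:
            # Deferred import: importing the module that defines this class
            # never requires the optional backend to be installed.
            self._backend: Any = importlib.import_module(backend_module)
        except ImportError as exc:
            # Re-raise with a hint so the user knows which package is missing.
            raise ImportError(
                f"{backend_module!r} is required to use LazyBackendStore; "
                "install it to enable this store."
            ) from exc
```

Users who never construct the store pay no import cost, and users without the backend get an `ImportError` that names the missing package at the point of use.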
You can have
in |
We can move the imports into the class; that's fine.
It's not clear to me how the typing would work there? How would you define the type hint for |
I'm not usually too worried by typing. I'm sure it can be fixed, but fighting with mypy is never fun. |
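On the typing question, one common pattern is a `TYPE_CHECKING` guard combined with a string annotation, so the type is visible to mypy but never imported at runtime. A sketch only: `SketchStore` is a hypothetical class, and the import path `obstore.store.ObjectStore` is an assumption that may not match the real library.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers such as mypy; skipped at runtime,
    # so the optional dependency does not need to be installed.
    from obstore.store import ObjectStore  # assumed import path


class SketchStore:
    """Hypothetical store whose annotation refers to a type-check-only import."""

    def __init__(self, store: "ObjectStore") -> None:
        # The quoted annotation is stored as a string and never evaluated at runtime.
        self.store = store
```

This addresses only the annotation; it does not by itself say anything about the pickling behaviour discussed in this thread.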
Hmm, moving the imports inside the class caused a pickling error. Maybe this won't work:
A Zarr store based on obstore, which is a Python library that uses the Rust object_store crate under the hood.

object-store is a Rust crate for interoperating with remote object stores like S3, GCS, Azure, etc. See the highlights section of its docs.

obstore maps async Rust functions to async Python functions, and is able to stream GET and LIST requests, which all make it a good candidate for use with the Zarr v3 Store protocol.

You should be able to test this branch with the latest pre-release version of obstore:

TODO: