v3 stores: Implement efficient get/set_partial_values #1106
Comments
I believe that the most important use case for this is actually uncompressed arrays! That's a much simpler code path and needs no partial reader (it also happens to be the only one important to me for now). How are you proposing that get_partial_buffer should be called? At the moment, in (v2) Array._get_selection, we iterate over the selections for each chunk, so we have the information right before handing off to the store.
Indeed, that's a great use-case!
I'll try to dump my thoughts about them: I guess there are at least two ways:
To solve this more holistically, the compressor (or a dummy for uncompressed arrays) should be able to tell whether it can decode partial data, and have some interface for "demanding" data. In the uncompressed use case, the requested array indices can be translated directly to chunk offsets, but in the blosc case, or other cases with an index, the decoder might need to read data in several passes (e.g. first getting some index, then getting the actual data based on that index). For such cases, the PartialReadBuffer is a nice abstraction that allows reloading data in several passes, depending on the decoder. If the pattern is always to maybe get some data upfront for a chunk, after which the decoder can translate indices to offsets, this might also be a viable option. PS: First, we still need to implement efficient get/set_partial_values for stores where this is possible, to gain anything from it.
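To make the uncompressed case concrete, here is a minimal sketch of how a selection inside one chunk could be turned into byte ranges. It assumes a C-ordered, uncompressed chunk and step-1 ranges per dimension; `chunk_byte_ranges` and its parameters are hypothetical names for illustration, not part of zarr.

```python
from itertools import product


def chunk_byte_ranges(chunk_shape, itemsize, selection):
    """Hypothetical helper: map a selection to (byte_offset, byte_length) pairs.

    Assumes a C-ordered, uncompressed chunk. `selection` holds one
    (start, stop) pair per dimension, with an implied step of 1.
    """
    # Strides in elements for C order: the last axis is contiguous.
    strides = [1] * len(chunk_shape)
    for i in range(len(chunk_shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * chunk_shape[i + 1]

    *outer, (last_start, last_stop) = selection
    # Each run along the last axis is one contiguous byte range.
    run_length = (last_stop - last_start) * itemsize

    ranges = []
    for idx in product(*(range(start, stop) for start, stop in outer)):
        first_elem = sum(i * s for i, s in zip(idx, strides)) + last_start
        ranges.append((first_elem * itemsize, run_length))
    return ranges


# Rows 1..2, columns 2..3 of a (4, 5) chunk of 8-byte items.
print(chunk_byte_ranges((4, 5), 8, [(1, 3), (2, 4)]))
# [(56, 16), (96, 16)]
```

Each pair could then be handed to the store as one partial read. A real implementation would probably also merge adjacent ranges and handle strided selections, which this sketch leaves out.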
This is done now on the |
In #1096 get/set_partial_values methods were introduced to Zarr v3 stores. The provided method is a viable fallback for stores that cannot read and write partial objects. Other stores however should implement optimized methods, such as fsspec-based stores (using read_block). It might be useful that stores indicate if they have fast partial read/write methods, so that strategies such as partial decompression can be selected automatically.
As a follow-up, the new get/set_partial_values methods could be used for the actual partial decompression in the PartialReadBuffer, instead of the current store-specific implementation.
Follow-up to #1096
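As a rough illustration of the fallback and the capability flag, here is a minimal sketch using a toy in-memory store. The class, the `supports_efficient_partial_values` attribute, `use_partial_reads`, and the `(key, (start, length))` request shape are all assumptions made for this example, not zarr's actual API.

```python
class ToyStore:
    """Toy in-memory store for illustration only, not a zarr class."""

    # Hypothetical flag: False means partial reads fetch the whole value.
    supports_efficient_partial_values = False

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = bytes(value)

    def __getitem__(self, key):
        return self._data[key]

    def get_partial_values(self, key_ranges):
        """Fallback: read each full value, then slice out the range.

        `key_ranges` is assumed to be an iterable of
        (key, (start, length)) pairs; length None means "to the end".
        """
        results = []
        for key, (start, length) in key_ranges:
            value = self[key]
            end = None if length is None else start + length
            results.append(value[start:end])
        return results


def use_partial_reads(store):
    """Hypothetical strategy check: only go partial if the store says it is fast."""
    return getattr(store, "supports_efficient_partial_values", False)


store = ToyStore()
store["c0"] = b"0123456789"
print(store.get_partial_values([("c0", (2, 3)), ("c0", (7, None))]))
# [b'234', b'789']
print(use_partial_reads(store))
# False
```

A store that can seek or issue ranged requests would override `get_partial_values` and set the flag to True, so callers could pick partial decompression automatically.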