Currently, when reading uncompressed data, we have the store read the data into a temporary buffer and then copy those bytes into the output buffer.
To get the best performance, we should consider adding an optional `get_into` API to `Store`. Instead of taking a prototype, this would take the actual output `Buffer` to read into. For backwards compatibility, stores must opt in by overriding `supports_get_into`:
```python
async def get_into(
    self,
    key: str,
    out: Buffer,
    byte_range: ByteRequest | None = None,
) -> bool:
    raise NotImplementedError

@property
def supports_get_into(self) -> bool:
    """Does the store support get_into?"""
    return False
```
For the special case where
- the data is uncompressed, and
- the chunk being read is a contiguous subset of the output ndarray,

the bytes on disk can be interpreted directly as an ndarray (when combined with a shape and itemsize, and maybe endianness), and we can avoid a memcpy entirely. Some early testing indicates that this might be worth doing. Over in https://github.com/TomAugspurger/zarr-python/blob/tom/zero-copy-alt/simple.py, I see about 7.5x higher throughput for reading uncompressed data with `read_into` (compared to about 2.5x higher throughput for compressed data, where this `get_into` optimization isn't an option).
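The memcpy-free path can be demonstrated with a plain file-like object and NumPy: `readinto` writes straight into the memory backing a preallocated output array, so the bytes "on disk" are reinterpreted in place (assuming the dtype, shape, and endianness match). This is a standalone illustration, not zarr code.

```python
import io

import numpy as np

# Simulate the bytes of an uncompressed, contiguous chunk on disk.
src = np.arange(8, dtype="<i4")
disk = io.BytesIO(src.tobytes())

# Preallocate the output ndarray, then read directly into its memory.
# No temporary buffer is allocated and no memcpy happens afterwards.
out = np.empty(8, dtype="<i4")
n = disk.readinto(memoryview(out).cast("B"))

assert n == out.nbytes
```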
Real-world gains will probably be lower, and remote filesystem APIs typically don't offer a way to read directly into a user-allocated output buffer the way `.readinto` does.