Description
We currently use the global, mutable config object to connect our codec registries with individual arrays. Ideally, a global, mutable object should carry as little load as possible in our codebase, especially when a simpler mechanism, function arguments, can express the same thing more clearly.
Instead of this:

    from your.module import NewBytesCodec
    from zarr.core.config import register_codec, config
    register_codec("bytes", NewBytesCodec)
    config.set({"codecs.bytes": "your.module.NewBytesCodec"})
    open_array(..., path="foo", ...)

I propose we do this:

    from your.module import NewBytesCodec
    from zarr.registry import get_codec_registry
    # get a copy of the default registry, and update 1 key to have a new value
    my_registry = get_codec_registry("default").update({"bytes": NewBytesCodec})
    # open the array with a clearly defined set of codec classes
    open_array(..., path="foo", ..., codec_registry=my_registry)

We would add a keyword-only argument codec_registry to the function signature of open_array, like this:

    def open_array(..., codec_registry: str | CodecRegistry = "default", ...):

The default value, "default", would look up a collection of codecs registered under that name. We could also have a collection named "cupy", so the argument would work something like a "profile". Users could also pass an explicit codec registry object, in which case only the codecs in that object would be used. This would allow mixing CPU-based and GPU-based codecs, in case anyone wants to do that. I think this API is much more explicit than relying on the config object, so we should definitely consider it. This would also address #3261.
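
To make the lookup rule concrete, here is a rough sketch of how a str-or-registry argument could be resolved. Everything in it is hypothetical and not existing zarr API: the CodecRegistry class, the copy-returning update method, the resolve_codec_registry helper, and the placeholder codec classes are assumptions made only for illustration.

```python
from __future__ import annotations

from collections.abc import Iterator, Mapping


class CodecRegistry(Mapping):
    """Hypothetical read-only mapping from codec name to codec class."""

    def __init__(self, codecs: Mapping[str, type] | None = None) -> None:
        self._codecs: dict[str, type] = dict(codecs or {})

    def __getitem__(self, name: str) -> type:
        return self._codecs[name]

    def __iter__(self) -> Iterator[str]:
        return iter(self._codecs)

    def __len__(self) -> int:
        return len(self._codecs)

    def update(self, overrides: Mapping[str, type]) -> CodecRegistry:
        # Assumption: update returns a new registry and leaves the
        # receiver untouched, so named registries are never mutated.
        return CodecRegistry({**self._codecs, **overrides})


# Placeholder codec classes standing in for real implementations.
class BytesCodec: ...
class GzipCodec: ...
class NewBytesCodec: ...


# Named collections of codecs ("profiles"); "cupy" could be added here.
_NAMED_REGISTRIES: dict[str, CodecRegistry] = {
    "default": CodecRegistry({"bytes": BytesCodec, "gzip": GzipCodec}),
}


def get_codec_registry(name: str) -> CodecRegistry:
    # Registries are read-only, so handing out the shared instance is safe.
    try:
        return _NAMED_REGISTRIES[name]
    except KeyError:
        raise ValueError(f"unknown codec registry: {name!r}") from None


def resolve_codec_registry(
    codec_registry: str | CodecRegistry = "default",
) -> CodecRegistry:
    # A string selects a named collection; an explicit object is used as-is.
    if isinstance(codec_registry, str):
        return get_codec_registry(codec_registry)
    return codec_registry


my_registry = get_codec_registry("default").update({"bytes": NewBytesCodec})
```

In this sketch, open_array would call resolve_codec_registry on its codec_registry argument once and then look codecs up in the result, without consulting any global state.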
thoughts?