Unifying consolidated and non-consolidated interfaces #117

Closed
manzt opened this issue Sep 26, 2023 · 3 comments

Comments

@manzt
Owner

manzt commented Sep 26, 2023

Right now there are two different worlds for consolidated vs non-consolidated metadata:

consolidated

import { openConsolidated, FetchStore } from "zarrita";

const store = new FetchStore("http://localhost:8080/data.zarr");
const { open, root, contents } = await openConsolidated(store); // reads the consolidated metadata once
const grp = root();

const knownKey = contents.keys().next().value;
const node = open(knownKey, { kind: "array" });
// just an alias for contents.get(knownKey) but type safe and allows relative paths

non-consolidated

import { open, root, FetchStore } from "zarrita";

const store = new FetchStore("http://localhost:8080/data.zarr");
const grp = await open(store);
const node = await open(grp.resolve("foo"));

But then the end users have to deal with switching between these worlds.

I can see a use case where we don't know whether the metadata is consolidated, but if it is, it would be nice to avoid fetching each node's metadata over the network separately (e.g., when we have something like AnnData).

In #109, @keller-mark had the idea to start tracking information about the stores (which we can do with WeakMaps). I wonder if we could extend this idea to consolidated metadata and keep track of the contents we've opened so far for a store:

import { openConsolidated, open, FetchStore } from "zarrita";

const store = new FetchStore("http://localhost:8080/data.zarr");
const contents = await openConsolidated(store);
// Map<AbsolutePath, Array<DataType, Store> | Group<Store>>

const grp = await open(store, { kind: "group" }); // uses the consolidated metadata for the store
const arr = await open(grp.resolve("foo"), { kind: "array" });

This would mean that if you know you have consolidated metadata, you could just use contents directly. But if you don't know if it's consolidated or not, you could "try" to open consolidated for a performance boost:

await openConsolidated(store, { returnContents: false }); // creates a tracker of the consolidated metadata
const grp = await open(store, { kind: "group" }); // uses the consolidated metadata for the store
const arr = await open(grp.resolve("foo"), { kind: "array" });
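
The per-store bookkeeping described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not zarrita's implementation: `NodeKind`, `rememberConsolidated`, and `lookupKind` are hypothetical names, and plain objects stand in for a store such as `FetchStore`.

```typescript
// Hypothetical sketch: track consolidated contents per store without
// mutating the store or keeping it alive (WeakMap keys are held weakly).
type NodeKind = "array" | "group";

const consolidatedByStore = new WeakMap<object, Map<string, NodeKind>>();

// Record what a consolidated-metadata read told us about a store.
function rememberConsolidated(store: object, contents: Map<string, NodeKind>): void {
  consolidatedByStore.set(store, contents);
}

// Return the known kind for a path, or undefined if the store was never
// registered or the path is not listed (the caller would then fall back
// to a regular metadata request).
function lookupKind(store: object, path: string): NodeKind | undefined {
  return consolidatedByStore.get(store)?.get(path);
}

// Stand-in store objects; any object identity works as a WeakMap key.
const tracked = { url: "http://localhost:8080/data.zarr" };
const untracked = { url: "http://localhost:8080/other.zarr" };

rememberConsolidated(
  tracked,
  new Map<string, NodeKind>([["/", "group"], ["/foo", "array"]]),
);

console.log(lookupKind(tracked, "/foo"));   // logs: array
console.log(lookupKind(tracked, "/bar"));   // logs: undefined
console.log(lookupKind(untracked, "/foo")); // logs: undefined
```

Because the lookup is keyed by store identity, `open` could consult it transparently, and a store that was never passed to `openConsolidated` simply takes the non-consolidated path.
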
@manzt
Owner Author

manzt commented Sep 26, 2023

Another idea is that this could just be a special store (wrapper):

interface Listable {
  contents(): Array<{ path: AbsolutePath, kind: "array" | "group" }>;
}

async function withConsolidated<Store extends Readable>(
  store: Store,
): Promise<Pick<Store, "get"> & Listable> {
  // try_consolidated, is_meta_key, and list_nodes are helpers not shown here
  const known_metadata = await try_consolidated(store);
  return {
    async get(...args: Parameters<Store["get"]>) {
      let [key, opts] = args;
      // serve metadata we already know without touching the underlying store
      if (key in known_metadata) return known_metadata[key];
      let maybe_bytes = await store.get(key, opts);
      if (is_meta_key(key) && maybe_bytes) {
        known_metadata[key] = maybe_bytes; // add to known_metadata
      }
      return maybe_bytes;
    },
    contents() {
      return list_nodes(known_metadata);
    },
  };
}

I chose contents over keys because keys on a Map would list everything in the store (including chunks), not just the nodes.

Then openConsolidated could wrap withConsolidated. Wonder what you think @keller-mark.

import { withConsolidated, FetchStore, open } from "zarrita";

let store = await withConsolidated(new FetchStore("http://localhost:8080"));
let contents = store.contents(); // [ {path: "/", kind: "group" }, { path: "/foo", kind: "array" }, ...]
let grp = await open(store, { kind: "group" });
let foo = await open(grp.resolve(contents[1].path), { kind: "array" });
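
To make the wrapper idea concrete, here is a self-contained sketch that runs against an in-memory store. Everything in it is an assumption for illustration and not the code that landed in zarrita: `SimpleReadable`, `withConsolidatedSketch`, `isMetaKey`, and `listNodes` are hypothetical names, the key check assumes Zarr v2 style `.zarray`/`.zgroup` metadata keys, and the known metadata is passed in directly rather than read from the store.

```typescript
type Bytes = Uint8Array;
type NodeEntry = { path: string; kind: "array" | "group" };

// Minimal stand-in for a readable store (hypothetical, not zarrita's Readable).
interface SimpleReadable {
  get(key: string): Promise<Bytes | undefined>;
}

// Assumes Zarr v2 naming: metadata keys end in .zarray or .zgroup; chunk keys do not.
function isMetaKey(key: string): boolean {
  return key.endsWith("/.zarray") || key.endsWith("/.zgroup");
}

// Derive node entries from metadata keys only, so chunks are never listed.
function listNodes(known: Map<string, Bytes>): NodeEntry[] {
  const nodes: NodeEntry[] = [];
  for (const key of known.keys()) {
    if (!isMetaKey(key)) continue;
    const path = key.slice(0, key.lastIndexOf("/")) || "/";
    const kind = key.endsWith(".zarray") ? "array" : "group";
    nodes.push({ path, kind });
  }
  return nodes;
}

// Wrap a store: serve known metadata from memory and remember new metadata.
function withConsolidatedSketch(store: SimpleReadable, known: Map<string, Bytes>) {
  return {
    async get(key: string): Promise<Bytes | undefined> {
      const cached = known.get(key);
      if (cached) return cached;
      const bytes = await store.get(key);
      if (bytes && isMetaKey(key)) known.set(key, bytes);
      return bytes;
    },
    contents(): NodeEntry[] {
      return listNodes(known);
    },
  };
}

async function demo() {
  let fetches = 0;
  const emptyJson = new Uint8Array([123, 125]); // the bytes of "{}"
  const backing = new Map<string, Bytes>([
    ["/.zgroup", emptyJson],
    ["/foo/.zarray", emptyJson],
    ["/foo/0.0", new Uint8Array([1, 2, 3])],
  ]);
  // Counts every request that reaches the underlying store.
  const counting: SimpleReadable = {
    async get(key) {
      fetches += 1;
      return backing.get(key);
    },
  };
  // Pretend only the root group was listed in the consolidated metadata.
  const known = new Map<string, Bytes>([["/.zgroup", emptyJson]]);
  const store = withConsolidatedSketch(counting, known);

  await store.get("/.zgroup");     // served from memory, no fetch
  await store.get("/foo/.zarray"); // fetched once, then remembered
  await store.get("/foo/.zarray"); // served from memory
  await store.get("/foo/0.0");     // chunk: fetched, never cached or listed
  return { fetches, contents: store.contents() };
}

demo().then((result) => console.log(result));
```

In this run only two requests reach the underlying store (the first `/foo/.zarray` read and the chunk read), and `contents()` lists the root group and `/foo` but not the chunk key.
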

@manzt
Owner Author

manzt commented Sep 27, 2023

I am experimenting with the second option.

@manzt
Owner Author

manzt commented Jan 14, 2024

Added in #119

@manzt manzt closed this as completed Jan 14, 2024