Skip to content

Allow Stores to opt out of consolidated metadata. #3119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

paraseba
Copy link
Contributor

@paraseba paraseba commented Jun 6, 2025

Some Stores don't benefit from Zarr's consolidated metadata mechanism. These Stores usually implement their own consolidation mechanism, or provide good performance for metadata retrieval out of the box.

These Stores can now implement the supports_consolidated_metadata property returning False. In this situation, Zarr will silently ignore any requests to consolidate the metadata.

Some Stores don't benefit from Zarr's consolidated metadata mechanism.
These Stores usually implement their own consolidation mechanism, or
provide good performance for metadata retrieval out of the box.

These Stores can now implement the `supports_consolidated_metadata`
property returning `False`. In this situation, Zarr will silently ignore
any requests to consolidate the metadata.
@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jun 6, 2025
Copy link
Contributor

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give some context in to why this is useful? What's an example of a store that would want to put a blanket ban on consolidating metadata?

Can you also explain what the benefit of this approach is over stores overriding consolidate_metadata to raise an error or warning + noop?

If this is implemented this way, I don't think trying to consolidate in this case should silently do nothing, because that provides confusion for a user who is trying to consolidate but nothing is happening. It should raise an error, or at least a warning, so the user knows what is going on, and can correct their code to remove attempts at consolidation.

@paraseba
Copy link
Contributor Author

paraseba commented Jun 7, 2025

@dstansby:

Can you give some context in to why this is useful? What's an example of a store that would want to put a blanket ban on consolidating metadata?

Icechunk is an example of such a Store and my main motivation for this PR. Currently we are having issues with consolidated metadata. It's the default in XArray but it produces inconsistencies. Icechunk users have very concurrent workloads, and the whole point of Icechunk is to maintain consistency. Consolidated metadata breaks consistency before information gets to the Store, so it's too late to fix it.

Can you also explain what the benefit of this approach is over stores overriding consolidate_metadata to raise an error or warning + noop?

Our first approach In Icechunk was to error, see this PR for example. But we checked that it breaks a lot of code. From XArray backend tests (that Icechunk uses) to, more importantly, most code out there that is not passing consolidated=False to XArray. We also analyzed the possibility of silently (or with a waning) ignore the metadata key and not store it. But that would break probably the most important invariant for a store: set(key, x); get(key) == x. A lot of test and user code depend on that invariant being maintained.

If this is implemented this way, I don't think trying to consolidate in this case should silently do nothing, because that provides confusion for a user who is trying to consolidate but nothing is happening. It should raise an error, or at least a warning, so the user knows what is going on, and can correct their code to remove attempts at consolidation.

I like the idea of warning in consolidate_metadata. That would definitely provide useful information and help the user improve their code. I'll add it. Thank you!

@paraseba paraseba requested a review from dstansby June 7, 2025 12:56
@TomAugspurger
Copy link
Contributor

What's the proposed behavior of open_consolidated or open_group(..., consolidated=True)? IMO that should error, since the user has explicitly asked that consolidated metadata be used.

From XArray backend tests (that Icechunk uses) to, more importantly, most code out there that is not passing consolidated=False to XArray

Would it be fair to say that this is a case where the Store has a better idea of whether consolidated metadata is appropriate? If so, it sounds like we should update xarray to defer to the store for what the default should be.

Note that the top-level Zarr APIs all have a use_consolidated=None parameter (except for open_consolidated), which means "use consolidated if it happens to be there. Does this sound like what icechunk wants out of xarray?

These Stores usually implement their own consolidation mechanism, or provide good performance for metadata retrieval out of the box

I see two benefits from consolidated metadata:

  1. Reducing the number of reads / HTTP requests to load a bunch of related nodes under a group.
  2. Providing an API to navigate a hierarchy that you know is I/O free; Without consolidated metadata, getting the metadata off a child node of a Group requires I/O with the store. With consolidated metadata, there's no additional I/O (at the cost of your snapshot potentially not reflecting changes to the store behind your back).

Does that sound right to you @paraseba. Does one or the other provide particular challenges for icechunk?

If icechunk has its own consolidated metadata mechanism, is there any reason not to use it to populate the in-memory Group.metadata.consolidated_metadata field (doing this would require new APIs I think)? i.e. is the issue with storing consolidated_metadata in the root zarr.json, or is the issue with the behavior in zarr-python after the metadata is loaded?

@paraseba
Copy link
Contributor Author

paraseba commented Jun 7, 2025

@TomAugspurger:

What's the proposed behavior of open_consolidated or open_group(..., consolidated=True)? IMO that should error, since the user has explicitly asked that consolidated metadata be used.

I'm basically reinterpreting open_consolidated to be "open the group in the most performant and correct way the Store supports". If the store does its own consolidation that works better than Zarr's default one, it can implement supports_consolidated_metadata to return False.

Would it be fair to say that this is a case where the Store has a better idea of whether consolidated metadata is appropriate?

Exactly. Zarr is currently assuming the Store is "dumb" in terms of metadata, which is perfectly true for most stores. But smarter Stores such as Icechunk, need a way to indicate their behavior to Zarr. There is a lot more that could (and probably should) be done in this front. Zarr should be able to "reflect" on the Store capabilities to optimize its algorithms.

If so, it sounds like we should update xarray to defer to the store for what the default should be.

Yes. In the name of reducing impact I'm reinterpreting consolidated=True to "let the Store + Zarr decide what's best". The XArray user has no interest in the details of how the Store makes that happen.

Note that the top-level Zarr APIs all have a use_consolidated=None parameter (except for open_consolidated), which means "use consolidated if it happens to be there. Does this sound like what icechunk wants out of xarray?.

Kind of, but is this three-state value useful? Why XArray users would distinguish between None and True. Don't they want to just say "do what is best and possible"? As an XArray user I want to do what is faster. If tomorrow a store I'm using learns how to do better consolidation I don't want to have to change my code.

Providing an API to navigate a hierarchy that you know is I/O free; Without consolidated metadata, getting the metadata off a child node of a Group requires I/O with the store. With consolidated metadata, there's no additional I/O (at the cost of your snapshot potentially not reflecting changes to the store behind your back).

Yes. There is some risk of people depending on this behavior, but I'd argue they are depending on an implementation detail. The right way to discover hierarchy should be using the tree or members APIs. Those should provide the ability to do I/O free navigation.

If icechunk has its own consolidated metadata mechanism, is there any reason not to use it to populate the in-memory Group.metadata.consolidated_metadata field (doing this would require new APIs I think)?

We analyzed that possibility. A couple of thoughts:

  • This generates a (minor) performance impact on calls that are not interested in discovering structure. Now every get operation needs to check if it's group metadata and potentially do orders of magnitude more compute to generate the full hierarchy.
  • This would be quite surprising behavior from a store, again breaking the invariant of get after set is identity.
  • Previous point is made worse by consolidated_metadata key not being part of the spec.

i.e. is the issue with storing consolidated_metadata in the root zarr.json,

To make matters more complex, I understand the key can be stored at any level of the hierarchy, not only the root.

or is the issue with the behavior in zarr-python after the metadata is loaded

One of the main issues is there is no way to implement the current API consistently in concurrent scenarios.

@TomAugspurger
Copy link
Contributor

Kind of, but is this three-state value useful? Why XArray users would distinguish between None and True. Don't they want to just say "do what is best and possible"?

In my experience, yes it's useful for "dumb" stores like blob storage. I want to know whether my open_dataset call is going to make one HTTP or one per variable. I wouldn't want it to silently fall back to the slow path, which I gather isn't a problem for Icechunk since all paths are fast :)

I think we're saying essentially the same thing. xarray / Zarr should query the store for the "recommended" way to write stuff by default. I'm probably more bothered more than you by use_consolidated=True (silently or with a warning) meaning something other than "use consolidated metadata". And there's conceptual issue of whether consolidated metadata means "fast mode" or whether it means "write this extra bit of metadata to zarr.json.

Zarr should be able to "reflect" on the Store capabilities to optimize its algorithms.

+100. I have a WIP branch for Stores to read (uncompressed) data into a reallocated buffer without any intermediate copies. I think we'll be seeing this pattern in a few places.

The right way to discover hierarchy should be using the tree or members APIs. Those should provide the ability to do I/O free navigation.

LMK if you want to move this to a separate issue, but focusing just on zarr-python's in-memory model, if I do something like

group = await open_group(store, path)
async for member in group.members():
    ...

do you expect group.members() to always make an HTTP request? This is something I struggled with in consolidated metadata. But at the moment, zarr-python doesn't make an HTTP request as long as group.metadata.consolidated_metadata is set (from the original open_group).

This generates a (minor) performance impact on calls that are not interested in discovering structure. Now every get operation needs to check if it's group metadata and potentially do orders of magnitude more compute to generate the full hierarchy.
This would be quite surprising behavior from a store, again breaking the invariant of get after set is identity.

I didn't follow these two points, but I feel like we're getting a bit far afield from the PR and it's not clear whether this is worth spending time on. I might ping you on a separate issue if I have time to write up some thoughts.

Previous point is made worse by consolidated_metadata key not being part of the spec.

Not for lack of trying zarr-developers/zarr-specs#309


tl/dr my only qualm right now is with use_consolidated=True meaning something other than use consolidated metadata, especially when I think we can use use_consolidated=None to mean "query the store for what's appropriate".

@TomAugspurger
Copy link
Contributor

I just skimmed through earth-mover/icechunk#962 and pydata/xarray#10122 and I think I agree with pydata/xarray#10122 (comment):

  1. Implement Store.supports_consolidated_metadata here in zarr-python
  2. Update xarray to query that before trying to consolidate metadata

@paraseba
Copy link
Contributor Author

paraseba commented Jun 7, 2025

@TomAugspurger:

I'm probably more bothered more than you by use_consolidated=True (silently or with a warning) meaning something other than "use consolidated metadata".

This is a very valid point. I think things still work for Icechunk, if I change open to fail on use_consolidated=True on a non-supporting Store. As long as I can preserve the None behavior of asking the Store what it wants. Would that work for you?

do you expect group.members() to always make an HTTP request? This is something I struggled with in consolidated metadata. But at the moment, zarr-python doesn't make an HTTP request as long as group.metadata.consolidated_metadata is set (from the original open_group).

Consistency issues aside, I think this is the right behavior. And this is why I think depending on the consolidated_metadata metadata key as a user is not necessary.

@TomAugspurger
Copy link
Contributor

Icechunk, if I change open to fail on use_consolidated=True on a non-supporting Store.

Yep, I think that'll be perfect.

@paraseba
Copy link
Contributor Author

paraseba commented Jun 8, 2025

@TomAugspurger letting the Store select behavior when use_consolidated=None has been implemented. I like this version more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs release notes Automatically applied to PRs which haven't added release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants