Read Zarr with consolidated metadata #3066
base: main
@DennisHeimbigner Failures in the code preventing compilation aside, I'd be interested in your thoughts on this, particularly in advance of our scheduled conversation with @mannreis and Flo re: consolidated metadata. Thanks!
@mannreis I will take a look at the compilation failures in the next couple of days and pitch in where I can. I'm going to convert this to a draft PR for the time being, until we have the compilation and tests passing. Thanks!
This PR is motivated by #2987 and is a follow-up to the closed PR2992. It infers whether the dataset is consolidated and acts accordingly. The implementation was inspired by the developments in Zarr3 support (by @DennisHeimbigner), which could simplify adding the same feature in the next version.
In short, this PR adds a layer (`NCZMD`, for NetCDF ZarrMetaData) that implements operations on the Zarr metadata objects (`.zattrs`, `.zgroup`, `.zarray`).

This layer would be extended in the same way for writing (updating the internal consolidated JSON and syncing it on closure).
Depending on whether `/.zmetadata` exists, the operations above are either served from its content or performed directly on the storage via `zmap`.

This allows the S3 client implementation to be used against vanilla HTTP servers when authentication is not required. That is only possible because of 0832d450d207223fe43a9ee619bb722f9a29bff8, which avoids the S3 ListObjects call.
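The `/.zmetadata` dispatch described above can be sketched roughly as follows. This is a standard-library-only illustration against a local directory store, not the C code in this PR; the helper name `read_metadata` is hypothetical, and the `"metadata"` key follows the Zarr v2 consolidated-metadata layout.

```python
import json
import os


def read_metadata(root, key):
    """Return the parsed JSON for `key` (e.g. "var/.zarray") under `root`.

    Hypothetical helper: prefer the consolidated .zmetadata object and
    fall back to reading the individual object directly from storage.
    """
    consolidated = os.path.join(root, ".zmetadata")
    if os.path.exists(consolidated):
        # Consolidated case: every metadata object lives in one JSON document.
        with open(consolidated) as f:
            entries = json.load(f)["metadata"]
        return entries.get(key)  # None if the key is not listed

    # Non-consolidated case: go to the storage for the individual object.
    path = os.path.join(root, *key.split("/"))
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)
```

In the consolidated case a single read of `.zmetadata` answers every lookup, so no listing of the store is needed, which is presumably what makes plain HTTP servers workable.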
As an example of how to produce a consolidated dataset in Python:
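A minimal sketch, using only the standard library, of what consolidation amounts to for a Zarr v2 directory store. With zarr-python one would typically call `zarr.consolidate_metadata` instead; the names `consolidate` and `make_example`, the array `var`, and its attributes are hypothetical and only illustrate the resulting `.zmetadata` layout.

```python
import json
import os

METADATA_NAMES = {".zgroup", ".zarray", ".zattrs"}


def consolidate(root):
    """Collect every Zarr v2 metadata object under `root` into root/.zmetadata."""
    metadata = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name not in METADATA_NAMES:
                continue
            full = os.path.join(dirpath, name)
            # Keys are store-relative paths with "/" separators, e.g. "var/.zarray".
            key = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full) as f:
                metadata[key] = json.load(f)
    doc = {"zarr_consolidated_format": 1, "metadata": metadata}
    with open(os.path.join(root, ".zmetadata"), "w") as f:
        json.dump(doc, f, indent=4, sort_keys=True)
    return doc


def make_example(root):
    """Write a tiny metadata-only Zarr v2 store (assumed layout) and consolidate it."""
    objects = {
        ".zgroup": {"zarr_format": 2},
        ".zattrs": {"title": "example"},
        "var/.zarray": {
            "zarr_format": 2,
            "shape": [4],
            "chunks": [2],
            "dtype": "<f4",
            "compressor": None,
            "fill_value": 0.0,
            "filters": None,
            "order": "C",
        },
    }
    for key, value in objects.items():
        path = os.path.join(root, *key.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            json.dump(value, f)
    return consolidate(root)
```

Calling `make_example("example.zarr")` leaves the individual metadata objects in place next to `.zmetadata`, so the same store can be read with or without the consolidated file.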
This can be used to check whether the read output remains the same after moving or removing `.zmetadata`.
Something similar is done in 6346e91, taking into account the `zip` and `file` modes. Integration tests exercising S3 are limited on my side (I'll try to add some here). However, I have used it against my own endpoints and it seems to be functional.