Add new API to write multiscales metadata #149

sbesson · 2021-12-16T07:14:10Z

While investigating the testing requirements for #148, I found that my primary need is for API helper methods that write NGFF metadata (e.g. write_plate_metadata, write_well_metadata) and generate test data.

As a first step towards these APIs, this PR refactors the existing multiscales generation code and extracts the multiscales metadata addition logic into a separate API, write_multiscales_metadata.
A direct application of this method is the scenario where a library would have its own implementation of the data writing but would still like to write the metadata using the core API. An example is omero-cli-zarr where https://github.com/ome/omero-cli-zarr/blob/7589cc47920b912480b3e6daa282c3d773c2bfdf/src/omero_zarr/raw_pixels.py#L276-L289 could be updated to consume this API.
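
As a rough illustration of that use case, the sketch below keeps pixel writing and metadata writing apart. It does not call ome-zarr-py: `write_multiscales_metadata_sketch` and the plain `attrs` dict standing in for zarr group attributes are hypothetical, and the layout of the `multiscales` block (a `version`, a list of `datasets` each with a `path`, and `axes` from 0.3 onwards) is an assumption based on the NGFF 0.1 to 0.3 specifications rather than on the code in this PR.

```python
from typing import Any, Dict, List, Optional


def write_multiscales_metadata_sketch(
    attrs: Dict[str, Any],
    paths: List[str],
    version: str = "0.3",
    axes: Optional[List[str]] = None,
) -> None:
    """Store a multiscales block in `attrs` (a stand-in for group attributes)."""
    multiscales: Dict[str, Any] = {
        "version": version,
        "datasets": [{"path": str(path)} for path in paths],
    }
    # Assumption: axes are only recorded from 0.3 onwards; 0.1/0.2 ignore them.
    if version not in ("0.1", "0.2") and axes is not None:
        multiscales["axes"] = list(axes)
    attrs["multiscales"] = [multiscales]


# A library that wrote resolution levels "0", "1", "2" by its own means
# only has to hand over the dataset paths and the axes.
group_attrs: Dict[str, Any] = {}
write_multiscales_metadata_sketch(group_attrs, ["0", "1", "2"], axes=list("tczyx"))
```

The point of the split is that the caller never touches the structure of the `multiscales` block directly, so format changes stay in one place.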

In addition to the new API, the existing internal axes validation method is split into two: one method validates the axes against the array dimensions and infers axes values for the 2D/5D cases, and the other strictly validates the axes against the allowed values.

A set of unit tests is also added to cover the different scenarios.

- Split the axes validation logic from the axes guessing logic so it can be re-used
- Add tests covering the scenarios of format 0.1/0.2 as well as invalid axes

codecov · 2021-12-16T07:20:18Z

Codecov Report

Merging #149 (a2195e5) into master (5c43ff5) will increase coverage by 0.24%.
The diff coverage is 95.83%.

@@            Coverage Diff             @@
##           master     #149      +/-   ##
==========================================
+ Coverage   71.04%   71.28%   +0.24%     
==========================================
  Files          11       11              
  Lines        1126     1132       +6     
==========================================
+ Hits          800      807       +7     
+ Misses        326      325       -1

Impacted Files	Coverage Δ
ome_zarr/writer.py	`95.00% <95.83%> (+1.75%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

will-moore · 2021-12-16T10:43:13Z

Looks good. I don't know if I'd guess the different behaviour of _validate_axes() and _validate_axes_names() from their names, but naming things is hard.
All of this will change soon for v0.4 anyway. I imagined that we might use _validate_axes(axes: List[Dict]) to validate axes objects (in v0.4) and _validate_axes_names(axes: List[str]) to validate their names, but I don't know if it will make sense to separate the logic in the same way for v0.4.

sbesson · 2021-12-16T10:58:07Z

@will-moore thanks for picking that up. Yes the name is not the greatest and I was also pondering how this would evolve with the upcoming axes specification.

As we decided to keep this API private, we can certainly iterate on names etc. At least for 0.3, I identified three different behaviors:

- validate the values of the axes, i.e. are they a subset of TCZYX?
- validate the consistency of the axes in relation to the data, i.e. are the dimensions consistent?
- infer unspecified axes in some simple scenarios, e.g. for 2D or 5D data
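
The three behaviors listed above can be sketched as separate functions. This is a minimal illustration, not the code from the PR: the function names, the `ALLOWED_AXES` constant and the error messages are hypothetical, and the value check is limited to "known, non-repeated names" as an assumption about what "subset of TCZYX" means.

```python
from typing import List, Optional, Sequence

ALLOWED_AXES = ("t", "c", "z", "y", "x")  # assumed set of valid names for 0.3


def validate_axes_values(axes: Sequence[str]) -> None:
    """Behavior 1: every axis must be a known name and appear only once."""
    unknown = [axis for axis in axes if axis not in ALLOWED_AXES]
    if unknown:
        raise ValueError(f"unknown axes: {unknown}")
    if len(set(axes)) != len(axes):
        raise ValueError(f"duplicate axes in {list(axes)}")


def infer_axes(ndim: int) -> List[str]:
    """Behavior 3: guess axes only for the unambiguous 2D and 5D cases."""
    if ndim == 2:
        return ["y", "x"]
    if ndim == 5:
        return list(ALLOWED_AXES)
    raise ValueError(f"axes must be given explicitly for {ndim}D data")


def validate_axes_for_data(
    ndim: int, axes: Optional[Sequence[str]] = None
) -> List[str]:
    """Behavior 2: return axes whose length matches the data dimensions."""
    resolved = infer_axes(ndim) if axes is None else list(axes)
    if len(resolved) != ndim:
        raise ValueError(f"{len(resolved)} axes given for {ndim}D data")
    validate_axes_values(resolved)
    return resolved
```

Keeping the value check free of any reference to the data is what allows it to be reused by callers that only have the axes at hand.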

I think the spirit of 0c71fbe is to separate these different functionalities.

How do you expect the changes in 0.4 will affect this API? One thought is that the axis guessing based on the dimensionality might become irrelevant. But are the other behaviors that different? Or is it only the input type which will primarily vary?

will-moore · 2021-12-16T11:05:34Z

Yes, as this is a private API, it's OK to go with what we have here and update for 0.4.
I think for v0.4 we could still guess axes for 2D, maybe not for 5D.
The input type will certainly vary, at least for _validate_axes() since axes will be Dicts.
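
One way the change of input type could be absorbed is sketched below, under the assumption that each v0.4 axis is a dict carrying at least a `name` key (the v0.4 axes specification was still in progress during this discussion). `axes_to_names` is a hypothetical helper, not part of ome-zarr-py; it reduces either form to a list of names so that name-based checks can stay unchanged.

```python
from typing import Dict, List, Sequence, Union

# Either a plain name (0.3 style) or a dict with a "name" key (assumed 0.4 style).
Axis = Union[str, Dict[str, str]]


def axes_to_names(axes: Sequence[Axis]) -> List[str]:
    """Extract axis names from string axes or dict axes."""
    names: List[str] = []
    for axis in axes:
        if isinstance(axis, str):
            names.append(axis)
        elif isinstance(axis, dict) and "name" in axis:
            names.append(axis["name"])
        else:
            raise ValueError(f"axis {axis!r} has no name")
    return names
```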

sbesson · 2021-12-17T17:35:33Z

Thanks both. Unless you feel a particular need for a patch release, I'd propose to get this into the main development branch and bump the version to 0.3.0.dev0.

I have started working on similar APIs for plate/well metadata. My goal is to help construct the unit tests needed to test #148 and to build the framework for supporting ome/ngff#24.

In other words, I am starting to see a 0.3.0 roadmap emerging with a Jan 2022 timeline including:

- new metadata writing API (multiscales, plate, well)
- the fix for HCS with <5D data
- anything else?

jburel · 2021-12-18T10:15:06Z

Since the ome_zarr_test_suite is triggered on merge, we should wait for the test suite to finish before tagging in case there is an issue.
It is green (see https://github.com/ome/ome_zarr_test_suite/actions/runs/1594165903), but you never know.

sbesson added 2 commits December 16, 2021 06:41

Split method writing the multiscales metadata

2d581d1

Include axes validation in write_multiscales_metadata

0c71fbe

- Split the axes validation logic from the axes guessing logic so it can be re-used
- Add tests covering the scenarios of format 0.1/0.2 as well as invalid axes

Improve docstrings

a2195e5

joshmoore approved these changes Dec 17, 2021

sbesson merged commit 7916a95 into ome:master Dec 17, 2021

sbesson deleted the write_multiscales_metadata branch December 17, 2021 21:31

sbesson mentioned this pull request Jan 5, 2022

Add API for writing HCS metadata #153

Merged

This was referenced Feb 3, 2022

write_multiscales_metadata no_transformations #162

Merged

Multiscales metadata API: only support datasets as lists of dictionaries #165

Merged
