Add API for writing HCS metadata #153

sbesson · 2022-01-05T11:44:50Z

Follow-up of #149, this adds two new top-level APIs to the ome_zarr.writer module: write_plate_metadata and write_well_metadata for creating plate and well specifications.
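To make the target of these two APIs concrete, here is a minimal sketch of the kind of plate and well attributes they are meant to produce. The helpers `build_plate_attrs` and `build_well_attrs` are hypothetical names used for illustration only; they are not the `ome_zarr.writer` functions, and the layout assumes the 0.3 plate/well specification with wells given as `row/column` paths.

```python
from typing import Dict, List, Union


def build_plate_attrs(
    rows: List[str], columns: List[str], wells: List[str], version: str = "0.3"
) -> Dict[str, dict]:
    """Hypothetical helper: assemble a 'plate' attribute block."""
    return {
        "plate": {
            "columns": [{"name": str(c)} for c in columns],
            "rows": [{"name": str(r)} for r in rows],
            "wells": [{"path": str(w)} for w in wells],
            "version": version,
        }
    }


def build_well_attrs(
    images: List[Union[str, dict]], version: str = "0.3"
) -> Dict[str, dict]:
    """Hypothetical helper: assemble a 'well' attribute block."""
    # Plain strings are promoted to {"path": ...}; dictionaries are copied as-is.
    normalized = [
        {"path": img} if isinstance(img, str) else dict(img) for img in images
    ]
    return {"well": {"images": normalized, "version": version}}


plate_attrs = build_plate_attrs(["A", "B"], ["1", "2"], ["A/1", "B/2"])
well_attrs = build_well_attrs(["0", {"path": "1", "acquisition": 1}])
```

In this sketch the resulting dictionaries would be stored as the attributes of the plate group and of each well group respectively.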

Unit tests are added to cover the combinations of parameters both in writing mode (test_writer) as well as in writing/reading mode (test_node/test_reader). This should significantly increase the coverage of the HCS codepath especially in reader.py.

The test_plate_2D5D unit test reproduces the issue raised in #145 and is currently marked as xfail. As the bug should be addressed in #148, I would propose to prioritize the review and integration of this PR first and remove the test marker in #148 together with the fix.

Thinking of the upcoming specification changes, the biggest potential implication for this new API is the change proposed in ome/ngff#24 to the wells attribute of the plate specification, with the addition of rowIndex and columnIndex attributes. A proposal would be to preemptively change the type of the wells argument to List[dict] and add an internal API similar to _validate_plate_acquisitions to inspect the required fields depending on the format version.

codecov · 2022-01-05T11:49:54Z

Codecov Report

Merging #153 (ff197a0) into master (1c71531) will increase coverage by 9.20%.
The diff coverage is 90.69%.

@@            Coverage Diff             @@
##           master     #153      +/-   ##
==========================================
+ Coverage   71.28%   80.49%   +9.20%     
==========================================
  Files          11       11              
  Lines        1132     1174      +42     
==========================================
+ Hits          807      945     +138     
+ Misses        325      229      -96

Impacted Files	Coverage Δ
ome_zarr/format.py	`94.52% <ø> (ø)`
ome_zarr/writer.py	`93.44% <90.69%> (-1.56%)`	⬇️
ome_zarr/reader.py	`83.66% <0.00%> (+24.75%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sbesson · 2022-01-05T14:45:12Z

After a quick investigation of https://github.com/ome/ome-zarr-py/runs/4714215369?check_suite_focus=true, the failure turns out to be unrelated to the Zarr development branch: the tests were simply failing on Ubuntu.

Reproducing them in a Docker environment, I found that the issue comes from the key normalization, which leads to the creation of Zarr group a for row A and causes an obvious mismatch between the hierarchy and the .zattrs metadata. f468a0a, which I also found to be necessary in #148, addresses this issue.
d1311aa also re-enables Ubuntu tests without the Zarr development branch.
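The mismatch described above can be reproduced without Zarr at all. The sketch below is only a simulation: `normalize_key` is a hypothetical stand-in for a store option that lowercases keys (as the name `normalize_keys` in f468a0a suggests), and the plate layout is a made-up two-row example.

```python
def normalize_key(key: str) -> str:
    """Stand-in for a store option that lowercases keys (an assumption)."""
    return key.lower()


rows = ["A", "B"]
plate_metadata = {"plate": {"wells": [{"path": f"{r}/1"} for r in rows]}}

# Groups as they would end up in the hierarchy if every key were lowercased.
groups_on_disk = {normalize_key(f"{r}/1") for r in rows}

# Well paths listed in the metadata that no longer match any group.
missing = [
    w["path"]
    for w in plate_metadata["plate"]["wells"]
    if w["path"] not in groups_on_disk
]
print(missing)  # prints ['A/1', 'B/1']
```

Every well path declared in the metadata points at a group that does not exist under that exact name, which is why a reader following the metadata fails on a case-sensitive file system.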

joshmoore

👍 but one "future warning" so to speak.

joshmoore · 2022-01-08T15:44:28Z

ome_zarr/writer.py

+            images[index] = {"path": str(image)}
+        elif isinstance(image, dict):
+            if not all(e in VALID_KEYS for e in image.keys()):
+                raise ValueError(f"{image} contains invalid keys")


Has the spec been updated to say there "MUST NOT" be other keys? Eventually, that could be problematic. Perhaps the prefix mechanism that @will-moore found in json-schema, which leaves such keys unchecked, could eventually be introduced here.

sbesson · 2022-01-08

Thanks for picking up on this. I have not found a clear statement regarding the validity of additional keys in the ngff spec. This strict implementation is probably derived from my reading of multiscales, where I assumed metadata was the single key meant to capture extra arguments.

That being said, I totally see that this MUST NOT interpretation is and will be limiting, both for the extension of the specification itself and for supporting external metadata not covered by the spec.

The well metadata is a perfect example as ome/ngff#24 defines more keys. Assuming someone wanted to write a version 0.3 with these new keys populated, is it legit and desired to let the writer implementation write this metadata? If so, is it worth a logging statement at WARN, INFO or DEBUG level? Or should this writer only care about the keys defined in the spec and ignore anything extra?

will-moore · 2022-01-08

I think it was salad schema where you could use a prefix: to avoid validation, but I think for json-schema you don't need to do that and it won't fail for unrecognised attributes.

sbesson · 2022-01-09T21:50:39Z

Pushed 12b341d in response to the concerns expressed in #153 (comment) about unspecified keys. With this commit, the writer leniently handles unspecified keys present in lists of dictionaries (acquisitions, images). The unit tests are adjusted to ensure the group creation is successful with all keys being written.
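As an illustration of the lenient approach, a validator along these lines would log unspecified keys instead of rejecting them. This is a sketch under assumptions, not the code from 12b341d: the function name, the logger name, the contents of `VALID_KEYS`, and the choice of the DEBUG level are all hypothetical.

```python
import logging
from typing import List, Union

LOGGER = logging.getLogger("hcs_sketch")  # hypothetical logger name
VALID_KEYS = ("acquisition", "path")  # assumed contents, for illustration


def validate_well_images(images: List[Union[str, dict]]) -> List[dict]:
    """Normalize images to dictionaries, tolerating unspecified keys."""
    validated = []
    for image in images:
        if isinstance(image, str):
            validated.append({"path": image})
        elif isinstance(image, dict):
            # Extra keys are reported but still written.
            if any(key not in VALID_KEYS for key in image):
                LOGGER.debug("%s contains unspecified keys", image)
            # The required key is still enforced.
            if "path" not in image:
                raise ValueError(f"{image} must contain a path key")
            validated.append(dict(image))
        else:
            raise ValueError(f"Unrecognized type for {image!r}")
    return validated
```

Only the required key remains a hard error here, so metadata from a newer specification or from an external tool passes through unchanged.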

If we agree with this approach, I think it makes sense to implement immediately the proposal made at the end of this PR description i.e. update the write_plate_metadata signature to support wells either as Union[List[str], List[dict]] or as List[dict] and handle unspecified keys similarly to acquisitions and images.

will-moore · 2022-01-10T09:30:35Z

@sbesson That commit 12b341d replaces not all with not any in one case and any in the other. Is that right?

sbesson · 2022-01-10T09:38:23Z

Good catch. ff197a0 should unify both behaviors. Since the behavior is a simple logging statement, that's not something that the unit tests picked up.
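The slip is easy to make because the two correct spellings are equivalent while the third is their negation. The sketch below shows one plausible reading of the difference; the exact expressions in the commits are not reproduced here, and `VALID_KEYS` and the function names are assumptions.

```python
VALID_KEYS = ("acquisition", "path")  # assumed contents, for illustration


def has_extra_keys_v1(image: dict) -> bool:
    # "Not all keys are valid" ...
    return not all(key in VALID_KEYS for key in image)


def has_extra_keys_v2(image: dict) -> bool:
    # ... is the same as "some key is not valid" (De Morgan).
    return any(key not in VALID_KEYS for key in image)


def inverted_check(image: dict) -> bool:
    # A stray "not" flips the meaning: True only when there are NO extra keys.
    return not any(key not in VALID_KEYS for key in image)


clean = {"path": "0"}
extra = {"path": "0", "custom": "x"}
```

Because the check only guards a log call, the inverted form changes nothing about the written metadata, which matches the remark that the unit tests did not catch it.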

will-moore · 2022-01-11T10:43:23Z

@sbesson Thanks, the PR looks good. Updating write_plate_metadata to support List[Dict] makes sense. In this PR?

sbesson · 2022-01-11T10:53:36Z

In this PR?

Originally, that's what I thought, but with different fires coming up, I feel this will get pushed back. If you and @joshmoore are happy with the state of the latest proposed API, I'd propose to get this merged so that the new tests and the re-activated Ubuntu tests land in the mainline. I'll handle the new wells type in a follow-up PR, which should be self-contained and easier to review.

sbesson added 2 commits January 5, 2022 11:26

Add new API for writing plate and well metadata

f16f154

Define two new methods write_plate_metadata and write_well_metadata including internal validation methods

Update node, reader and writer unit tests to cover plate/well specification

fab102a

sbesson added 3 commits January 5, 2022 11:55

Specify types for plate dictionary using type annotation

7105153

ome_zarr.format: do not normalize_keys

f468a0a

For writing HCS datasets where group names are case-sensitive (e.g. A), this option alters the name and creates a mismatch with the plate metadata.

Re-enable Ubuntu tests using various Python versions

d1311aa

joshmoore approved these changes Jan 8, 2022

Be lenient about unspecified keys in ome_zarr.writer

12b341d

Remove unexpected not

ff197a0

jburel reviewed Jan 12, 2022

jburel mentioned this pull request Jan 12, 2022

Rename parameter: fmt -> format #156

Closed

sbesson merged commit 2ed4426 into ome:master Jan 12, 2022

sbesson deleted the write_plate_metadata branch January 12, 2022 09:57

This was referenced Jan 12, 2022

Fix remaining assumptions on 5D dimensions #148

Merged

Add support for passing wells as List[dict] in write_plate_metadata #157

Merged
