write_multiscales_metadata no_transformations #162
Conversation
Codecov Report

|          | master | #162   | +/-    |
|----------|--------|--------|--------|
| Coverage | 83.11% | 83.53% | +0.41% |
| Files    | 12     | 12     |        |
| Lines    | 1285   | 1354   | +69    |
| Hits     | 1068   | 1131   | +63    |
| Misses   | 217    | 223    | +6     |

Continue to review the full report at Codecov.
Overall looks good and in line with the `plate.wells` handling. Can a few unit tests be added to `test_writer` to cover `datasets` as dictionaries, as well as the `None` and `[]` scenarios?

I suspect the next step will be to add some code to validate optional `transformations` elements on write. This can be captured as an issue for now.
With `transformations` being proposed to be renamed as …
@sbesson The … But I can also add more tests for …
A few API questions, primarily related to defining and validating transformations.
```python
if coordinateTransformations is not None:
    for dataset, transform in zip(datasets, coordinateTransformations):
        dataset["coordinateTransformations"] = transform
```
See also the discussion in https://github.com/ome/ngff/pull/85/files#r794486756 alongside the ability to define global vs dataset/level coordinate transformations. If both forms are allowed, I suspect we will need a way for the API to handle both variants?
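For context, a rough illustration of the two forms under discussion in the 0.4 draft: per-dataset `coordinateTransformations` inside each `datasets` entry, plus an optional top-level `multiscales` transformation applied to every level. The values below are made up for illustration only:

```python
multiscales_entry = {
    "axes": [{"name": "y", "type": "space"}, {"name": "x", "type": "space"}],
    "datasets": [
        {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 1.0]}]},
        {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [2.0, 2.0]}]},
    ],
    # optional global transformation, applied after each dataset's own transformations
    "coordinateTransformations": [{"type": "scale", "scale": [0.5, 0.5]}],
}
```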
```python
    raise ValueError("Empty datasets list")
for dataset in datasets:
    if isinstance(dataset, str):
        validated_datasets.append({"path": dataset})
```
Similar to the question below, should a `scale` transformation be auto-inferred if only lists of strings are passed?
Should the default `scale` automatically include the aspect ratios between the different multi-scale datasets? In that case, this is not information we will be able to derive from a list of paths. Should this API only support a list of dictionaries as input?
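As a hedged sketch of what deriving those aspect ratios could look like (the helper name is illustrative, not part of the ome-zarr API), the array shapes rather than just the paths would be needed:

```python
def scales_from_shapes(shapes):
    """Per-axis scale factors of each level relative to the full-resolution dataset,
    e.g. [(1, 512, 512), (1, 256, 256)] -> [[1.0, 1.0, 1.0], [1.0, 2.0, 2.0]]."""
    base = shapes[0]
    return [[full / level for full, level in zip(base, shape)] for shape in shapes]
```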
In omero-cli-zarr I use this for generating a set of transformations for a pyramid: `def marshal_transformations(image, levels=1, multiscales_zoom=2.0)`. But I realise that this is quite omero-cli-zarr specific (e.g. no down-sampling in Z). So yes, I guess this is required now.
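For reference, a minimal sketch of what a helper along those lines might produce; only the signature above comes from the comment, the body here is an assumption (down-sampling X and Y only, by `multiscales_zoom` per level):

```python
from typing import Dict, List


def marshal_transformations_sketch(
    ndim: int = 5, levels: int = 1, multiscales_zoom: float = 2.0
) -> List[List[Dict]]:
    """One 'scale' transformation per pyramid level, leaving all axes except
    the last two (Y and X) unscaled."""
    transformations = []
    for level in range(levels):
        factor = multiscales_zoom ** level
        scale = [1.0] * (ndim - 2) + [factor, factor]
        transformations.append([{"type": "scale", "scale": scale}])
    return transformations
```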
A few questions
ome_zarr/format.py (outdated)
```python
def validate_coordinate_transformations(
    self,
    shapes: List[tuple],
```
Feels strange to have this API depend on shapes, especially as only the length of `shapes` is used in the rest. Given the known assumptions on datasets, I wonder if `len(axes)` would be sufficient?
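A minimal sketch of what such an axes-based check could look like, assuming the validator only needs the dimensionality and the number of levels (function and argument names here are hypothetical):

```python
def validate_transformations_sketch(ndim, nlevels, coordinate_transformations):
    """Check one transformation list per level and that each 'scale' has ndim values."""
    if len(coordinate_transformations) != nlevels:
        raise ValueError("Need one list of transformations per dataset")
    for transformations in coordinate_transformations:
        for transformation in transformations:
            if transformation.get("type") == "scale" and len(transformation["scale"]) != ndim:
                raise ValueError("'scale' length must match the number of axes")
```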
Yes, I think that was left over from earlier, when `validate_coordinate_transformations()` included the functionality from `generate_coordinate_transformations(shapes)`.

However, it still validates that `len(scale) == len(shape)` for each level, so we at least need a `[len(data.shape) for data in pyramid]`.

I don't think we have any validation that `len(shape)` is the same for all datasets in a pyramid. In fact, I don't even see that this is required by the spec?!

Also, we don't have any validation that the first dataset is the largest, which we could do with `shapes`, but I guess that won't happen in `validate_coordinate_transformations()`, so we don't really need `shapes` there.
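A sketch of the two missing checks mentioned above, kept outside `validate_coordinate_transformations()` (the helper name is hypothetical):

```python
def check_pyramid_shapes(shapes):
    """Validate that all levels share a dimensionality and that level 0 is the largest."""
    if len({len(shape) for shape in shapes}) != 1:
        raise ValueError("All datasets in a pyramid must have the same number of dimensions")
    largest = shapes[0]
    for shape in shapes[1:]:
        if any(dim > full for dim, full in zip(shape, largest)):
            raise ValueError("The first dataset must be the full-resolution (largest) one")
```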
You're correct, there is no explicit statement about the lengths of the dimension arrays. Probably the closest is this sentence about `axes` in `multiscales`: "The number of values MUST be the same as the number of dimensions of the arrays corresponding to this image." I am implicitly reading that as requiring all the arrays for each `dataset` to have identical dimensionality. Am I over-interpreting anything @joshmoore @constantinpape?
```python
if datasets is None or len(datasets) == 0:
    raise ValueError("Empty datasets list")
for dataset in datasets:
    if isinstance(dataset, str):
```
I assume this block can be completely removed now and the `datasets == str` case should be handled by the `ValueError` below.
That didn't quite work, because we still support a `List[str]` list of paths for < 0.4. Lots of tests failed when I removed it.
Given that this API was introduced in #149 and never made available in a public release, I would personally be fine with completely dropping this form and updating all the failing tests to use the new `datasets` form. The alternative would be to add some form of check on the format version here and fail hard if `fmt.version` is not 0.3 or lower.

I think the biggest issue I have reading the code is that it looks like there is a path for using this API and creating a 0.4 version of the `multiscales` with `datasets` only composed of `path`, which is invalid. I can look into that in a follow-up PR.
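A sketch of the suggested version guard, assuming `fmt.version` is available at this point; the helper name and the exact version tuple are illustrative:

```python
def check_paths_allowed(datasets, fmt):
    """Fail hard if a plain list of path strings is used with a 0.4+ format."""
    if datasets and isinstance(datasets[0], str):
        if fmt.version not in ("0.1", "0.2", "0.3"):
            raise ValueError(
                "Lists of path strings are only supported for format versions <= 0.3"
            )
```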
Overall looks good. One minor question around the redundant call to the transformations validation.

I think the unit tests should cover most of the validation logic for the transforms. I propose to review the code coverage, as well as #162 (comment), in a follow-up PR so that we can start testing this API.

Two general API thoughts which are probably outside the scope of this PR:

- I can foresee the need to add support for writing both `datasets:coordinateTransformations` and `multiscales:coordinateTransformations`. A fairly crude idea would be to use the current API but support `coordinate_transformations` lists of length `len(datasets) + 1` and use the first or last item as the shared transformation (a sketch of this idea follows below).
- The additional logic in `write_multiscales_metadata` points at the fact that `axes` and `datasets` are now more tightly coupled to each other. Moving forward, I wonder whether `_validate_datasets` should simply become `_validate_multiscales` and internalize all the checks between fields. Longer term, I would hope this API can simply be replaced by the usage of a generic validation framework/constraints.

In the interim, proposing to merge this so that we can test it together with omero-cli-zarr. Is it worth cutting a new pre-release?
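A crude sketch of the first idea above (hypothetical, not implemented in this PR): accept `len(datasets) + 1` transformation lists and treat the extra entry as the shared `multiscales`-level transformation:

```python
def split_transformations(coordinate_transformations, datasets):
    """Hypothetical: if one extra entry is given, treat the last one as the shared
    multiscales-level transformation; otherwise everything is per-dataset."""
    if len(coordinate_transformations) == len(datasets) + 1:
        *per_dataset, shared = coordinate_transformations
        return per_dataset, shared
    return coordinate_transformations, None
```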
```python
shapes = [data.shape for data in pyramid]
if coordinate_transformations is None:
    coordinate_transformations = fmt.generate_coordinate_transformations(shapes)
fmt.validate_coordinate_transformations(
```
Although this does not hurt, is that call now redundant with the one made later during the validation of the `datasets` element?
Not quite! If you pass in a list of transformations that is longer than the number of pyramid levels, this will fail validation at this point. BUT, if we remove the validation here it won't fail later because the transformations are zipped onto the Datasets, so the lengths will match (and the extra transformations will be ignored).
It is probably better to fail than to ignore the mismatch.
👍 Semi-related: I discovered that Python 3.10 introduced a `strict` argument to `zip` that will eventually allow similar code to fail on mismatching list lengths rather than truncating. Let's keep things as they are for now, but maybe add an inline comment explaining this logic?
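For illustration, the Python 3.10+ behaviour referred to above (values are made up):

```python
levels = ["0", "1"]
transforms = [[{"type": "scale"}], [{"type": "scale"}], [{"type": "scale"}]]  # one too many

list(zip(levels, transforms))  # silently drops the extra transform
try:
    list(zip(levels, transforms, strict=True))  # Python 3.10+
except ValueError as err:
    print(err)  # zip() argument 2 is longer than argument 1
```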
Done in b081342
@will-moore https://pypi.org/project/ome-zarr/0.3a2/ is now available as a pre-release with this API. Do you want to bump the requirement in …
As discussed at ome/omero-cli-zarr#93 (comment), this allows the `paths` argument of `write_multiscales_metadata()` to take a list of `datasets` dicts.

EDIT: Also updated to validate the latest 0.4 spec with `coordinateTransformations` of `scale` and `translation`.
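For completeness, a minimal usage sketch of the form described above; the exact call signature, the store setup, and the values here are assumptions and should be checked against the released API:

```python
import zarr
from ome_zarr.io import parse_url
from ome_zarr.writer import write_multiscales_metadata

store = parse_url("image.zarr", mode="w").store
group = zarr.group(store=store)

# One dict per pyramid level: a "path" plus its coordinateTransformations
# ("scale" is required by the 0.4 spec, "translation" is optional).
datasets = [
    {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 1.0, 1.0]}]},
    {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 2.0, 2.0]}]},
]

write_multiscales_metadata(group, datasets, axes="zyx")
```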