Add --split-planes option to write one plane per file #108

Merged: 2 commits, Jan 18, 2024

melissalinkert

Fixes #107.

Default and --split behavior should remain unchanged. Using --split-planes writes each plane's pyramid to a separate file. If both --split-planes and --split are used, --split-planes takes precedence.
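The precedence rule described above can be sketched in a few lines. This is an illustrative Python sketch only, not the converter's actual implementation; the `Layout` enum and `resolve_layout` function are hypothetical names introduced here.

```python
from enum import Enum


class Layout(Enum):
    """Hypothetical names for the three output layouts."""
    SINGLE_FILE = "single file"          # default
    FILE_PER_SERIES = "file per series"  # --split
    FILE_PER_PLANE = "file per plane"    # --split-planes


def resolve_layout(split: bool = False, split_planes: bool = False) -> Layout:
    """Pick the output layout; --split-planes wins when both flags are set."""
    if split_planes:
        return Layout.FILE_PER_PLANE
    if split:
        return Layout.FILE_PER_SERIES
    return Layout.SINGLE_FILE
```

Checking `split_planes` first is what encodes the precedence: the value of `split` is never consulted once `--split-planes` is present.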

As mentioned in #107, I'd really prefer not to add more flexibility than this to the OME-TIFF layout; as we've seen with bfconvert, even seemingly simple configurability can add a lot of complexity. With this change, though, there are now 3 options:

  • everything in one big OME-TIFF file (default)
  • each series/Image pyramid in one file (--split)
  • each plane pyramid in one file (--split-planes)

e.g. for a 96 well x 4 fields x 3 channels plate, these options would result in 1, 384, and 1152 files respectively.
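The arithmetic behind these numbers can be written out as a small sketch. The function name and parameters are hypothetical; it assumes one plane per channel/Z/T combination and models only the example above, so it does not account for RGB planes or any additional files a real conversion may write.

```python
def expected_file_counts(wells: int, fields: int, channels: int,
                         z_sections: int = 1, timepoints: int = 1) -> dict:
    """Estimate the number of output files for each layout option."""
    series = wells * fields                           # one Image per field
    planes_per_series = channels * z_sections * timepoints
    return {
        "default": 1,                                 # everything in one file
        "--split": series,                            # one file per series
        "--split-planes": series * planes_per_series, # one file per plane
    }


print(expected_file_counts(wells=96, fields=4, channels=3))
# {'default': 1, '--split': 384, '--split-planes': 1152}
```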

@sbesson sbesson self-requested a review August 14, 2023 07:44
@sbesson sbesson left a comment

Tested similarly to #80 using the three Leica SCN samples, the QPTIFF and the BBBC017 plate. All original data was initially converted to Zarr using bioformats2raw 0.7.0.

Each Zarr was then converted three consecutive times into OME-TIFF using raw2ometiff. The following tables summarize the number of files, the overall size of the generated OME-TIFF, and the wall-clock time for each conversion.

Using raw2ometiff 0.5.0 with no conversion options

Sample             File(s)  Size  Conversion times
Leica-1            1        3.9G  0m57.827s, 0m35.777s, 0m36.379s
Leica-2            1        12G   1m48.608s, 1m15.134s, 1m13.050s
Leica-3            1        15G   3m11.052s, 1m47.011s, 1m47.685s
LuCa-7color_Scan1  1        2.1G  1m8.457s, 0m41.065s, 0m38.528s
NIRHTa+00          1        2.7G  54m34.824s, 50m50.328s, 53m1.895s

Using raw2ometiff built from this PR with no conversion options

Sample             File(s)  Size  Conversion times
Leica-1            1        3.9G  0m59.106s, 0m37.695s, 0m34.907s
Leica-2            1        12G   1m37.866s, 1m20.039s, 1m15.268s
Leica-3            1        15G   3m10.691s, 1m58.566s, 1m59.207s
LuCa-7color_Scan1  1        2.1G  1m7.541s, 0m40.483s, 0m38.335s
NIRHTa+00          1        2.7G  51m42.813s, 51m15.920s, 51m40.206s

Using raw2ometiff built from this PR with --split

Sample             File(s)  Size  Conversion times
Leica-1            3        3.9G  0m55.614s, 0m36.873s, 0m35.161s
Leica-2            6        12G   1m42.694s, 1m17.741s, 1m26.296s
Leica-3            10       15G   3m2.816s, 1m54.452s, 1m47.574s
LuCa-7color_Scan1  5        2.1G  0m57.902s, 0m38.675s, 0m35.457s
NIRHTa+00          2305     2.7G  52m19.851s, 51m20.621s, 50m1.911s

Using raw2ometiff built from this PR with --split-planes

Sample             File(s)  Size  Conversion times
Leica-1            7        3.9G  0m59.232s, 0m34.384s, 0m35.741s
Leica-2            16       12G   1m35.939s, 1m17.764s, 1m27.106s
Leica-3            28       15G   3m5.254s, 1m48.620s, 1m56.691s
LuCa-7color_Scan1  15       2.1G  1m4.195s, 0m37.367s, 0m39.583s
NIRHTa+00          6913     2.7G  54m58.601s, 51m49.380s, 52m40.072s

The integrity of the binary data was confirmed by running the test-equivalent target from Bio-Formats and comparing each fileset generated by this PR to the one generated by raw2ometiff 0.5.0. All OME-TIFF filesets were additionally imported into OMERO for another round of visual inspection.

In conclusion:

  • conversion times are largely identical for all options
  • the size of the generated OME-TIFF filesets is identical for all options
  • the number of files varies as expected with the specified option
  • the binary data is identical for all options
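To compare timings across options numerically, the wall-clock strings in the tables can be converted to seconds and averaged per cell. The following is a minimal sketch; the helper names `to_seconds` and `mean_seconds` are hypothetical, and the sample rows are the Leica-1 entries copied from the tables above.

```python
import re
from statistics import mean

# Matches `time`-style output such as "1m48.608s".
_TIME_RE = re.compile(r"^(\d+)m(\d+(?:\.\d+)?)s$")


def to_seconds(wall_clock: str) -> float:
    """Convert a string such as '1m48.608s' to seconds."""
    match = _TIME_RE.match(wall_clock.strip())
    if match is None:
        raise ValueError(f"unrecognised time: {wall_clock!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def mean_seconds(cell: str) -> float:
    """Average the comma-separated runs in one 'Conversion times' cell."""
    return mean(to_seconds(run) for run in cell.split(","))


# Leica-1 rows from the four tables above.
leica_1 = {
    "0.5.0": "0m57.827s, 0m35.777s, 0m36.379s",
    "PR": "0m59.106s, 0m37.695s, 0m34.907s",
    "PR --split": "0m55.614s, 0m36.873s, 0m35.161s",
    "PR --split-planes": "0m59.232s, 0m34.384s, 0m35.741s",
}
for label, cell in leica_1.items():
    print(f"{label:>18}: {mean_seconds(cell):.1f} s")
```

With only three runs per configuration, a plain mean is a rough summary at best; it is meant for eyeballing differences, not for statistical claims.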

Seconding @melissalinkert's statement, I think the three available conversion layouts (single file, multi-file with one TIFF per image, and multi-file with one TIFF per plane) cover the majority of use cases. The README could be updated to describe the different options available to the end user.

No objections to seeing this included in the next release of raw2ometiff, but we should first confirm that the new layout addresses the original issue, i.e. compatibility with CellProfiler expectations.

@melissalinkert

Following discussion in the PR review meeting, at this point we need confirmation from @DavidStirling that the output with --split-planes is compatible with CellProfiler.

@DavidStirling DavidStirling left a comment

Tested a few results from --split-planes with CellProfiler and they all seem to read without issues.

@sbesson

sbesson commented Dec 13, 2023

Capturing a former discussion with @melissalinkert and @muhanadz, the rationale behind #108 (comment) is to re-evaluate specifically the interplay of this PR with the recent work on RGB (#113).
Although the scenarios in #108 (review) should be minimally retested, the goal is to focus on brightfield digital pathology examples containing one or more whole slide images and/or multiple focal planes, using the --rgb flag and JPEG/JPEG-2000 compression together with the --split and --split-planes options.

@sbesson sbesson left a comment

The series of tests performed in #108 (review) was supplemented by focusing on a few sample brightfield RGB images:

  • the three Leica SCN files from OpenSlide test data containing various numbers of whole slide images (1, 4 and 8 respectively) in addition to a macro and label image
  • the Hamamatsu sample file from OME public images where the whole slide image is acquired alongside 3 focal planes

Each image was converted using:

  • raw2ometiff 0.6.0 with the --rgb --compression JPEG flags
  • raw2ometiff built from this PR with the --rgb --compression JPEG flags and optionally the --split or --split-planes flag
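The command variants used in this round of testing can be assembled as follows. This is a sketch for illustration: the `build_command` helper is hypothetical, the input and output paths are placeholders, and passing the Zarr directory followed by the output path as positional arguments is an assumption about the CLI rather than something stated in this thread. Only the flags themselves are taken from the list above.

```python
import shlex
from typing import List, Optional


def build_command(binary: str, zarr_dir: str, output: str,
                  layout_flag: Optional[str] = None) -> List[str]:
    """Build an argument list for one RGB/JPEG conversion run."""
    if layout_flag not in (None, "--split", "--split-planes"):
        raise ValueError(f"unsupported layout flag: {layout_flag!r}")
    cmd = [binary, "--rgb", "--compression", "JPEG"]
    if layout_flag is not None:
        cmd.append(layout_flag)
    # Assumed positional order: input Zarr directory, then output path.
    cmd += [zarr_dir, output]
    return cmd


# Placeholder paths; the binary locations mirror those named in the review.
for binary, flag in [
    ("raw2ometiff-0.6.0/bin/raw2ometiff", None),
    ("raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff", None),
    ("raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff", "--split"),
    ("raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff", "--split-planes"),
]:
    print(shlex.join(build_command(binary, "sample.zarr", "sample.ome.tiff", flag)))
```

Returning a list rather than a single string keeps the result safe to hand to `subprocess.run` without shell quoting issues.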

The tables below give the conversion results for each scenario:

raw2ometiff-0.6.0/bin/raw2ometiff --rgb --compression JPEG

Sample       File(s)  Size  Conversion times
Leica-1      1        310M  1m25.443s, 0m45.139s, 0m55.417s
Leica-2      1        573M  2m58.437s, 2m18.694s, 1m48.608s
Leica-3      1        766M  4m10.469s, 2m44.894s, 2m33.178s
SR1274-908A  1        3.8G  30m21.554s, 25m37.569s, 25m2.360s

raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff --rgb --compression JPEG

Sample       File(s)  Size  Conversion times
Leica-1      1        310M  1m24.202s, 0m55.281s, 0m58.128s
Leica-2      1        573M  3m1.468s, 2m19.334s, 2m17.852s
Leica-3      1        766M  3m58.731s, 2m33.553s, 2m34.230s
SR1274-908A  1        3.8G  25m10.570s, 25m17.012s, 25m36.347s

raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff --rgb --compression JPEG --split

Sample       File(s)  Size  Conversion times
Leica-1      3        310M  1m10.697s, 0m47.250s, 0m47.932s
Leica-2      6        573M  3m0.624s, 1m57.246s, 2m2.941s
Leica-3      10       766M  4m13.484s, 2m39.012s, 3m11.566s
SR1274-908A  4        3.8G  25m58.071s, 26m21.676s, 26m3.323s

raw2ometiff-0.7.0-SNAPSHOT/bin/raw2ometiff --rgb --compression JPEG --split-planes

Sample       File(s)  Size  Conversion times
Leica-1      3        310M  1m7.061s, 0m56.130s, 0m45.598s
Leica-2      6        573M  2m58.462s, 2m19.262s, 1m51.896s
Leica-3      10       766M  4m1.697s, 3m0.748s, 3m5.106s
SR1274-908A  6        3.8G  25m40.832s, 31m31.718s, 25m40.051s

As before, the test-equivalent Bio-Formats target was executed for each sample produced with this PR against the equivalent data generated using raw2ometiff 0.6.0. Images were also validated manually by loading them into OMERO Plus and using PathViewer GRID for visual inspection.

Overall:

  • the conversion times are equivalent with or without this PR
  • the size of the data is consistent and independent of the presence of the --split/--split-planes flags
  • the number of files in the OME-TIFF fileset matches the expectation from the data dimensionality and the --split/--split-planes flags
  • the pixel data is identical with and without this PR, independently of the presence of the --split/--split-planes flags

Overall, the functionality is working across the set of tested modalities:

  • bright field vs fluorescence
  • whole slide images vs high-content screening

As nothing significant will happen over the next 2 weeks given the time of the year, I expect the reasonable next steps are to come back to this in early January 2024 and decide on the roadmap for inclusion in an upcoming raw2ometiff release and possibly in the upcoming NGFF-Converter 2 release.

@chris-allan chris-allan merged commit e4fda1f into glencoesoftware:master Jan 18, 2024
Successfully merging this pull request may close these issues.

Split channel timepoints slice