Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Universal API implementations for Micro-Manager OME-TIFF and NDTiff #185

Merged
merged 79 commits into from
Mar 4, 2024

Conversation

ziw-liu
Copy link
Collaborator

@ziw-liu ziw-liu commented Sep 2, 2023

Implements universal API (UAPI) for MMStack (Micro-Manager OME-TIFF) and NDTiff (written by MM and pycro-manager). This PR contains many breaking changes and is geared towards the v0.2.0 release.

Unresolved issues

User-facing changes

API

CLI

  • Removed support for converting single-page TIFFs
  • Removed arguments from the convert command
    • --format is always auto-inferred
    • --scale-voxels is always TRUE
    • --check-image is always FALSE
  • The chunk size argument --chunks now works for both NDTiff and OME-TIFF
    • The default chunk size is ZYX for all input formats
  • Avoid using the root logger and set the default logging level to INFO (better logging #188)
    • The logging level can be changed with an environment variable, e.g. IOHUB_LOG_LEVEL=DEBUG

Documentation

  • Updated sphinx theme
  • TODO: add version switcher for old releases

Technical changes

Notes

Many issues are linked to be closed by this PR.
Reviewers: please check the boxes and leave a sign-off when an issue has been verified to be fixed.

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Sep 2, 2023

@JoOkuma Not sure if this is going to work for the DaXi data since I don't have 1024x2048x2048 volumes to test with...
Also if you don't like the Dask conversion, the TiffZarrStore (in acquisition axes ordering) is available through MMStack._store.

@ziw-liu ziw-liu changed the base branch from main to unified-api September 2, 2023 21:11
@ziw-liu ziw-liu changed the base branch from unified-api to main September 2, 2023 21:13
@ziw-liu ziw-liu changed the base branch from main to unified-api September 2, 2023 21:13
@ziw-liu ziw-liu added enhancement New feature or request μManager Micro-Manager files and metadata labels Sep 2, 2023
@JoOkuma
Copy link
Member

JoOkuma commented Sep 5, 2023

@ziw-liu, thank you. I'll give it a try soon

@ziw-liu ziw-liu changed the title MMStack universal API reader Universal API implementations for Micro-Manager OME-TIFF and NDTiff Feb 6, 2024
@ziw-liu ziw-liu added this to the 0.2.0 milestone Feb 24, 2024
@mattersoflight
Copy link
Collaborator

mattersoflight commented Feb 25, 2024

@ziw-liu fantastic work across all the layers of iohub. @ieivanov great suggestions!
It looks like some of the open issues (#71 , #207 , #208 ) are quick to handle. Please test/fix the code if the original authors are busy and close them.

I suggest merging when these features are implemented and tested with real data:

  • iohub convert can chunk by z (looks like that was done in another PR).
  • Unified API is documented on README so that power users/reviewers begin to switch and can provide feedback aligned with future development.

I think the following can be future/smaller PRs that quickly follow this:

  • Proper metadata parsing across the formats. How we handle metadata depends on how we will consume metadata - in the viewer, through API, etc. One PR per format would make sense to iterate through the metadata handling systematically.
  • User friendly examples.

I don't have input on when it is best to rename methods. Since many users are now finding this PR valuable and will provide input as they use the code, it is smart not to delay the merge.

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Feb 28, 2024

@ieivanov Is there any blocking issues left before we are ready to merge this?

@ieivanov
Copy link
Contributor

One question / request before I approve - when are the docs rebuilt? It looks like only on a push to main? It will be very helpful to have docs built for this PR (or the ometiff-uapi branch) so that users can have easy access to the updated API.

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Feb 28, 2024

One question / request before I approve - when are the docs rebuilt? It looks like only on a push to main? It will be very helpful to have docs built for this PR (or the ometiff-uapi branch) so that users can have easy access to the updated API.

Currently it is only built when pushing to main. I can attempt to also build for this the unified-api branch, but we won't know if it is going to work (before we actually merge this).

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Feb 28, 2024

@ieivanov I propose that we handle the docs preview in a separate PR so we can easily revert the change after the transition.

Copy link
Contributor

@ieivanov ieivanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this PR merged. A few outstanding non-blocking issues from my point of view based on our discussion so far:

@ieivanov
Copy link
Contributor

I think there is a bug / confusion around the root property of reader classes. BaseFOV defines a root property and BaseFOVMapping does not. MicroManagerFOV then defines

  @property
  def root(self) -> Path:
      return self.parent.root

but its parent MicroManagerFOVMapping does not have a root property. Later on, both MMStack and NDTiffDataset define a root property, even though it's not required in the base MicroManagerFOVMapping class.

Should root be a property of a FOV or a Stack/Dataset?

I think there may be a similar issue with zyx_scale and t_scale - should these be properties of the FOV or Stack/Dataset? Would we allow a Stack/Dataset to contain FOVs with different scale?

@JoOkuma
Copy link
Member

JoOkuma commented Feb 28, 2024

stack could be replaced with dataset, if you prefer. One more iteration could be:

I like @ieivanov's suggestion; I think dataset would be more appropriate because stack is a "well-defined" data structure, which we do not use for these classes.

Copy link
Member

@JoOkuma JoOkuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again, I would like to thank @ziw-liu for this PR. It's a huge change and will improve our implementation of downstream image processing/analysis routines.

It closes the Zarr and file issues we raised before.

My concerns are:

  • The primary usage of xdata on the new FOV classes, which deviates from the common API. From what I chatted with @ziw-liu, this is a requirement for the TIFF data (so it has array-like behavior? I don't remember quite well).
    I'm worried some routines start depending on it and, therefore, are not universal, which is the main goal of this PR.
    This method could be added to BaseFOV so every class must implement its own xdata method; the downside is BaseFOV starts deviating slightly from being an ArrayLike object.
  • The convert function being TIFF specific. The original vision of the universal API was that the conversion function would be BaseFOVMapping -> BaseFOVMapping --- despite, in practice, the output type being a subset of all supported types because some are read-only.
    This leads me to think that the API isn't universal yet.
    I understand that MicroManager-specific features are not available in other classes, but I think these could be wrapped into existing BaseFOVMapping methods; if that's not the case, the BaseFOVMapping could be improved.

iohub/convert.py Show resolved Hide resolved
@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Feb 28, 2024

Should root be a property of a FOV or a Stack/Dataset?

I think there may be a similar issue with zyx_scale and t_scale - should these be properties of the FOV or Stack/Dataset? Would we allow a Stack/Dataset to contain FOVs with different scale?

For MM formats, both FOV and Mapping classes should have the root and zyx_scale attributes and they should be the same value (I added them to MicroManagerFOVMapping). Requiring these attributes for BaseFOV is to make it work in the same way for single-FOV formats (e.g. CC).

For NDTIFF, I couldn't find a way to get the intended time interval, so the t_scale implementation is specific to MMOmeTiffFOV for now.

For scaling, I think we should assume homogeneous mappings but allow it to be used for expressing heterogeneous datasets.

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Mar 2, 2024

@JoOkuma Can #85 be closed by this?

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Mar 2, 2024

The primary usage of xdata on the new FOV classes, which deviates from the common API. From what I chatted with @ziw-liu, this is a requirement for the TIFF data (so it has array-like behavior? I don't remember quite well).
I'm worried some routines start depending on it and, therefore, are not universal, which is the main goal of this PR.
This method could be added to BaseFOV so every class must implement its own xdata method; the downside is BaseFOV starts deviating slightly from being an ArrayLike object.

xarray.DataArray is also ArrayLike. In the case of TIFF readers it is only to label the axes of the underlying dask.Array. I think the spirit of UAPI was that one can call napari.view_image with any BaseFOV child.
Edit: IIRC we decided that the full array interface is too broad to implement.

@JoOkuma
Copy link
Member

JoOkuma commented Mar 4, 2024

@JoOkuma Can #85 be closed by this?

@ziw-liu , I think so.

@JoOkuma
Copy link
Member

JoOkuma commented Mar 4, 2024

The primary usage of xdata on the new FOV classes, which deviates from the common API. From what I chatted with @ziw-liu, this is a requirement for the TIFF data (so it has array-like behavior? I don't remember quite well).
I'm worried some routines start depending on it and, therefore, are not universal, which is the main goal of this PR.
This method could be added to BaseFOV so every class must implement its own xdata method; the downside is BaseFOV starts deviating slightly from being an ArrayLike object.

xarray.DataArray is also ArrayLike. In the case of TIFF readers it is only to label the axes of the underlying dask.Array. I think the spirit of UAPI was that one can call napari.view_image with any BaseFOV child. Edit: IIRC we decided that the full array interface is too broad to implement.

Ok, now that I applied closer attention I understand that you're mainly using xdata inside the __getitem__ methods.
I was a bit concerned because of

to_zarr(fov.xdata.data.rechunk(self.chunks), zarr_img)

But there isn't an easy way around it.

@ziw-liu , I'm very happy with the PR, and for me it's ready to merge and remaining tasks could be converted to issues.

@ziw-liu
Copy link
Collaborator Author

ziw-liu commented Mar 4, 2024

@ieivanov @JoOkuma @talonchandler Thanks for the reviews!

@ziw-liu ziw-liu merged commit 3ab89ba into unified-api Mar 4, 2024
7 checks passed
@ziw-liu ziw-liu deleted the ometiff-uapi branch March 4, 2024 19:02
@ziw-liu ziw-liu mentioned this pull request Mar 4, 2024
2 tasks
ieivanov added a commit that referenced this pull request Apr 15, 2024
commit fac2c13
Author: Ivan Ivanov <ivan.ivanov@czbiohub.org>
Date:   Tue Apr 9 11:25:36 2024 -0700

    Fix bug reading dragonfly acquisitions (#215)

    * fix bug reading dragonfly acquisitions

    * fix typo

    * style

    * bugfix

commit 0c6984e
Author: Ivan Ivanov <ivan.ivanov@czbiohub.org>
Date:   Mon Mar 11 12:35:51 2024 -0700

    Fix bug determining number of rows and cols (#214)

    * fix bug determining number of rows and cols

    * add another XY Stage variation

    * add docs and fix style

commit 3ab89ba
Author: Ziwen Liu <67518483+ziw-liu@users.noreply.github.com>
Date:   Mon Mar 4 11:02:49 2024 -0800

    Universal API implementations for Micro-Manager OME-TIFF and NDTiff (#185)

    * wip: draft mmstack ome-tiff fov

    * MM FOV base class

    * tests

    * bump tifffile

    * comment

    * fix indent after rebase

    * use get default

    * test pixel indexing

    * set MM metadata

    * style

    * update dependencies

    * add xarray

    * move old readers to the `_deprecated` namespace

    * uapi for ndtiff

    * refactor test setup to parametrize by dataset
    use globals instead of fixtures since parametrization happens before fixture evaluation

    * convert mmstack

    * fix and test chunking

    * fix metadata conversion and test ndtiff

    * update cli

    * fix scaling

    * test 1.4 and incomplete ome-tiffs

    * move reader tests

    * deprecate reader tests

    * update deprecated tests

    * update ngff tests

    * isort

    * update black target to 3.10

    * lint

    * fix download paths

    * update docs references and theme

    * untrack autogenerated file

    * ignore execution time file

    * add github icon

    * update docstring

    * update docstring

    * show channel names and chunk size in info

    * print plate chunk size if verbose

    * fallback for pixel size

    * remove log level setting

    * do not filter logs and warnings in reader

    * avoid root logger

    * isort

    * set default logging level to INFO

    * format docstring

    * improve conversion messages

    * black

    * fix ome-tiff channel name indexing

    * fix ndtiff channel name indexing

    * update converter test

    * remove use of os.path in `reader`

    * expand _check_ndtiff checks

    * fix iteration

    * fix python 3.10

    using `Path.glob(*/)` to get subdirs was added in 3.11

    * bump zarr version to include resizing fix
    zarr-developers/zarr-python#1540

    * fix cli default

    * set log level with an environment variable

    * fix unset

    * catch non-existent page

    * implement fallback for incomplete channel names
    workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written

    * add debug logs

    * handle virtual frames

    * try reading pages from TiffFile directly

    * filter error logs about ImageJ metadata being broken
    this is a known MM limitation when writing OME-TIFFs

    * fix regex

    * remove use of os.path in `convert.py`

    * better channel indexing in `_get_summary_metadata`

    * style

    * safer NoneType check

    * private default axis names for NDTiff

    * update documentation to reflect new entry point

    * add repr to MM FOV and dataset types

    * rename mm_meta and expose summary metadata

    * add MicroManagerFOVMapping.root

    * add MicroManagerFOVMapping.zyx_scale

    * add warning log for failed position grid

    * fix grid layout

    * suppress hypothesis flakiness

    * different health check suppression

    ---------

    Co-authored-by: Ivan Ivanov <ivan.ivanov@czbiohub.org>
ieivanov added a commit that referenced this pull request Apr 15, 2024
* wip: draft mmstack ome-tiff fov

* MM FOV base class

* tests

* bump tifffile

* comment

* fix indent after rebase

* use get default

* test pixel indexing

* set MM metadata

* style

* update dependencies

* add xarray

* move old readers to the `_deprecated` namespace

* uapi for ndtiff

* refactor test setup to parametrize by dataset
use globals instead of fixtures since parametrization happens before fixture evaluation

* convert mmstack

* fix and test chunking

* fix metadata conversion and test ndtiff

* update cli

* fix scaling

* test 1.4 and incomplete ome-tiffs

* move reader tests

* deprecate reader tests

* update deprecated tests

* update ngff tests

* isort

* update black target to 3.10

* lint

* fix download paths

* update docs references and theme

* untrack autogenerated file

* ignore execution time file

* add github icon

* update docstring

* update docstring

* show channel names and chunk size in info

* print plate chunk size if verbose

* fallback for pixel size

* remove log level setting

* do not filter logs and warnings in reader

* avoid root logger

* isort

* set default logging level to INFO

* format docstring

* improve conversion messages

* black

* fix ome-tiff channel name indexing

* fix ndtiff channel name indexing

* update converter test

* remove use of os.path in `reader`

* expand _check_ndtiff checks

* fix iteration

* fix python 3.10

using `Path.glob(*/)` to get subdirs was added in 3.11

* bump zarr version to include resizing fix
zarr-developers/zarr-python#1540

* fix cli default

* set log level with an environment variable

* fix unset

* catch non-existent page

* implement fallback for incomplete channel names
workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written

* add debug logs

* handle virtual frames

* try reading pages from TiffFile directly

* filter error logs about ImageJ metadata being broken
this is a known MM limitation when writing OME-TIFFs

* fix regex

* remove use of os.path in `convert.py`

* better channel indexing in `_get_summary_metadata`

* style

* safer NoneType check

* private default axis names for NDTiff

* sort metadata keys

* Squashed commit of the following:

commit fac2c13
Author: Ivan Ivanov <ivan.ivanov@czbiohub.org>
Date:   Tue Apr 9 11:25:36 2024 -0700

    Fix bug reading dragonfly acquisitions (#215)

    * fix bug reading dragonfly acquisitions

    * fix typo

    * style

    * bugfix

commit 0c6984e
Author: Ivan Ivanov <ivan.ivanov@czbiohub.org>
Date:   Mon Mar 11 12:35:51 2024 -0700

    Fix bug determining number of rows and cols (#214)

    * fix bug determining number of rows and cols

    * add another XY Stage variation

    * add docs and fix style

commit 3ab89ba
Author: Ziwen Liu <67518483+ziw-liu@users.noreply.github.com>
Date:   Mon Mar 4 11:02:49 2024 -0800

    Universal API implementations for Micro-Manager OME-TIFF and NDTiff (#185)

    * wip: draft mmstack ome-tiff fov

    * MM FOV base class

    * tests

    * bump tifffile

    * comment

    * fix indent after rebase

    * use get default

    * test pixel indexing

    * set MM metadata

    * style

    * update dependencies

    * add xarray

    * move old readers to the `_deprecated` namespace

    * uapi for ndtiff

    * refactor test setup to parametrize by dataset
    use globals instead of fixtures since parametrization happens before fixture evaluation

    * convert mmstack

    * fix and test chunking

    * fix metadata conversion and test ndtiff

    * update cli

    * fix scaling

    * test 1.4 and incomplete ome-tiffs

    * move reader tests

    * deprecate reader tests

    * update deprecated tests

    * update ngff tests

    * isort

    * update black target to 3.10

    * lint

    * fix download paths

    * update docs references and theme

    * untrack autogenerated file

    * ignore execution time file

    * add github icon

    * update docstring

    * update docstring

    * show channel names and chunk size in info

    * print plate chunk size if verbose

    * fallback for pixel size

    * remove log level setting

    * do not filter logs and warnings in reader

    * avoid root logger

    * isort

    * set default logging level to INFO

    * format docstring

    * improve conversion messages

    * black

    * fix ome-tiff channel name indexing

    * fix ndtiff channel name indexing

    * update converter test

    * remove use of os.path in `reader`

    * expand _check_ndtiff checks

    * fix iteration

    * fix python 3.10

    using `Path.glob(*/)` to get subdirs was added in 3.11

    * bump zarr version to include resizing fix
    zarr-developers/zarr-python#1540

    * fix cli default

    * set log level with an environment variable

    * fix unset

    * catch non-existent page

    * implement fallback for incomplete channel names
    workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written

    * add debug logs

    * handle virtual frames

    * try reading pages from TiffFile directly

    * filter error logs about ImageJ metadata being broken
    this is a known MM limitation when writing OME-TIFFs

    * fix regex

    * remove use of os.path in `convert.py`

    * better channel indexing in `_get_summary_metadata`

    * style

    * safer NoneType check

    * private default axis names for NDTiff

    * update documentation to reflect new entry point

    * add repr to MM FOV and dataset types

    * rename mm_meta and expose summary metadata

    * add MicroManagerFOVMapping.root

    * add MicroManagerFOVMapping.zyx_scale

    * add warning log for failed position grid

    * fix grid layout

    * suppress hypothesis flakiness

    * different health check suppression

    ---------

    Co-authored-by: Ivan Ivanov <ivan.ivanov@czbiohub.org>

* black

* bugfix

---------

Co-authored-by: Ziwen Liu <67518483+ziw-liu@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous integration documentation Improvements or additions to documentation enhancement New feature or request maintenance Maintenance work μManager Micro-Manager files and metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants