2D image pyramids (WIP) #211

Closed · wants to merge 93 commits

Conversation

sofroniewn (Contributor)

Description

This PR adds support for massive 2D images that need to be accessed in a tiled fashion, including image pyramids. It does this by creating a new layer type called pyramid that inherits from the image layer. It takes a list of images that form the pyramid, with the base level at index 0. Right now I need to improve the switching between pyramid levels (it assumes a 2x pyramid). It also has a max_tile_shape that prevents overly large images from being sent to the GPU. There is no explicit tiling, but passing dask arrays or zarr arrays that are chunked in the 2D dimensions allows for very large images that cannot fit in RAM. Using dask opportunistic caching can prevent unnecessarily reloading often-requested data. I have not explored computing image pyramids on the fly with dask coarsen, but I can investigate that too. Some of this was discussed in #103.
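
For reference, a minimal sketch of the on-the-fly approach mentioned above: building a lazy 2x pyramid with dask coarsen and registering dask's opportunistic cache. The array shape, chunk sizes, and 2 GB cache budget are illustrative assumptions, and the cache needs the optional cachey dependency.

```python
import dask.array as da
import numpy as np
from dask.cache import Cache  # requires the optional cachey dependency

# Hypothetical chunked base level (could equally come from zarr or hdf5).
base = da.random.random((16384, 16384), chunks=(512, 512))

# Build a 2x pyramid lazily with dask coarsen; nothing is computed up front.
pyramid = [base]
while min(pyramid[-1].shape) > 512:
    pyramid.append(da.coarsen(np.mean, pyramid[-1], {0: 2, 1: 2}))

# Opportunistic caching keeps often-requested chunks in RAM (2 GB here),
# so re-displaying the same region doesn't trigger a reload or recompute.
cache = Cache(2e9)
cache.register()
```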

See below for an example gif with data from the CAMELYON16 pathology challenge. I converted the raw data to a precomputed image pyramid stored as a zarr file with 10 levels, with the following dimensions:
[(97792, 221184), (48896, 110592), (24448, 55296), (12224, 27648), (6112, 13824), (3056, 6912), (1528, 3456), (764, 1728), (382, 864), (191, 432)].
The base of the pyramid is about 16GB. Each level was chunked with a (300, 300) chunking.
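
As a rough illustration (not the actual conversion script), this is how such a precomputed zarr pyramid could be written and read back with dask. The random array is just a stand-in at the real dimensions, so actually running the write would produce on the order of 20 GB of data.

```python
import dask.array as da
import numpy as np

# Hypothetical base level standing in for the converted CAMELYON16 slide.
level = da.random.randint(0, 256, size=(97792, 221184), dtype='uint8',
                          chunks=(300, 300))

for i in range(10):
    # Each pyramid level is stored as its own component, chunked (300, 300).
    da.to_zarr(level, 'camelyon16_pyramid.zarr', component=str(i),
               overwrite=True)
    # 2x downsample for the next level, then restore the (300, 300) chunking.
    level = da.coarsen(np.mean, level, {0: 2, 1: 2},
                       trim_excess=True).astype('uint8').rechunk((300, 300))

# Reading it back gives the list of lazy arrays, base level first.
pyramid = [da.from_zarr('camelyon16_pyramid.zarr', component=str(i))
           for i in range(10)]
```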

[GIF: image_pyramid9]

Note that drawing shapes / markers just works: it occurs in the coordinate system of the base level of the pyramid regardless of which tile or level is displayed. If you wanted the shapes in the coordinates of one of the levels you'd just need to divide by the appropriate scale factor.
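
A small sketch of that coordinate relationship, assuming a fixed 2x downsampling per level (the helper name base_to_level is mine, not part of the PR):

```python
import numpy as np

def base_to_level(coords, level, downsample=2):
    """Convert (row, col) coordinates from the base (level 0) frame into
    the frame of a given pyramid level, assuming a fixed per-level
    downsampling factor (2x here, matching the pyramid above)."""
    return np.asarray(coords, dtype=float) / downsample ** level

# A shape vertex drawn at base resolution, expressed at pyramid level 3.
print(base_to_level([(12000, 54000)], level=3))  # [[1500. 6750.]]
```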

Right now I think there are some parts of this that might assume 2D data, and we should make sure it supports nD too before a merge. I'm also open to the idea that this functionality gets incorporated into the image layer directly so we don't need a new layer type, but maybe just a flag based on whether a list of images is passed.

Type of change

  • New feature (non-breaking change which adds functionality)

How has this been tested?

  • There is no example yet :-( I need to add one with small data just to show proof of concept

Final checklist:

  • My PR is the minimum possible work for the desired functionality
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@henrypinkard

@sofroniewn this is super cool and potentially extremely useful. I would fall squarely in categories 1 and 3 described in #141. That is, I would want to use napari for visualizing and annotating 2D and 3D image datasets (often with time-lapses and many channels), at first on my laptop for testing, and then remotely with datasets on a server that are too big to fit in memory at once. In light of that, and with regard to this functionality, I wonder how easy it would be to provide a custom backend that supplies its own multi-resolution data to napari? I don't think this is mutually exclusive with converting to a specific pyramid format using zarr to achieve optimal performance, but it could save an extra conversion step in some cases, which can be a real pain for huge datasets.

@sofroniewn (Contributor, Author)

Hi @henrypinkard, in general napari right now supports any image data that you can call np.asarray() on, so this includes arrays from numpy, dask, zarr, and hdf5. The advantage of using things like zarr and dask is the dynamic loading: if your array is small enough to fit in memory then this isn't a problem, but if your array is large then these really help.
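
To make that concrete, a minimal sketch of passing lazily backed arrays straight to the viewer; the file names and the 'data' dataset key are placeholders:

```python
import dask.array as da
import h5py
import napari

viewer = napari.Viewer()

# A chunked zarr store read lazily through dask; only the chunks that are
# actually displayed get pulled into memory.
viewer.add_image(da.from_zarr('big_image.zarr'))

# An hdf5 dataset works the same way: napari just calls np.asarray() on it.
viewer.add_image(h5py.File('big_image.h5', 'r')['data'])
```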

Also, right now napari itself doesn't do any file I/O (i.e. there is no drop-down menu with File > Open etc.). Instead you need to load your data in a Python script or Jupyter notebook and then pass it to napari.

For the multiresolution data the planned input format will just be a list of objects that you can call np.asarray() on and an optional list of shapes describing the size of those arrays. Would that meet your needs?
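
A minimal sketch of what a custom backend would need to provide for that input format: one object per level that np.asarray() understands, i.e. anything implementing the __array__ protocol. The LazyLevel wrapper and its backend methods here are hypothetical, not part of this PR:

```python
import numpy as np

class LazyLevel:
    """Hypothetical wrapper around a custom multi-resolution backend.
    np.asarray(LazyLevel(...)) triggers __array__, which is where the
    backend would fetch and assemble the pixels for that level."""

    def __init__(self, backend, level):
        self.backend = backend
        self.level = level
        self.shape = backend.level_shape(level)   # assumed backend method

    def __array__(self, dtype=None):
        data = self.backend.read_level(self.level)  # assumed backend method
        return np.asarray(data, dtype=dtype)

# The planned pyramid input: a list of such objects, base level first,
# optionally accompanied by a list of their shapes.
# pyramid = [LazyLevel(my_backend, i) for i in range(n_levels)]
# shapes = [lvl.shape for lvl in pyramid]
```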

@henrypinkard

Yes, I think that is perfect actually. It sounds like it might be very easy to make my datasets compatible with Napari viewing. Would love to try this out once you have made an example to copy.

@sofroniewn (Contributor, Author)

@henrypinkard that's great. If you want to share an example of your large images with me then I'm happy to make such an example for you. You can email me a Google Drive / Box link.

I'm also going to work on getting this merged into master, most likely by closing this PR and superseding it with a new one.

@sofroniewn mentioned this pull request on May 26, 2019
@sofroniewn (Contributor, Author)

This is superseded by #295

@sofroniewn closed this on May 26, 2019
Labels: feature (New feature or request)