
dask and dask-image to improve ImageContainer #296

Closed
giovp opened this issue Feb 25, 2021 · 9 comments · Fixed by #324
Labels
enhancement ✨ New feature or request image 🔬
Milestone

Comments

@giovp
Member

giovp commented Feb 25, 2021

@GenevieveBuckley thanks a lot for the discussion today! Pinging @hspitzer, key points that I got (please add more if missing):

@giovp giovp added the enhancement ✨ New feature or request label Feb 25, 2021
@giovp giovp assigned giovp and unassigned giovp Feb 25, 2021
@GenevieveBuckley

I was able to open the large tiff file with dask-image==0.4.0, pims, and tifffile.

I tried this in a new conda environment

conda create -n squidpy-test python=3.8 pip ipython
conda activate squidpy-test
python -m pip install dask-image==0.4.0
python -m pip install tifffile

and then in ipython:

In [1]: from dask_image.imread import imread

In [2]: img = imread('V1_Adult_Mouse_Brain_Coronal_Section_2_image.tif')

In [3]: img
Out[3]: dask.array<concatenate, shape=(3, 24240, 24240), dtype=uint16, chunksize=(1, 24240, 24240), chunktype=numpy.ndarray>

In [4]: img.shape
Out[4]: (3, 24240, 24240)

I can see all three fluorescence channels (I also checked this by loading it into the napari viewer, and it looks fine to me).

By default, you get one chunk in the dask array for each fluorescence channel, but those are pretty big. It's probably more sensible to control the dask array chunksize yourself. You can do that with something like:

img = imread('V1_Adult_Mouse_Brain_Coronal_Section_2_image.tif', chunks=(1, 1000, 1000))
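If the chunks keyword is unavailable, an alternative is to re-chunk after loading. A minimal sketch, using a synthetic dask array in place of the actual TIFF (the real array is (3, 24240, 24240) uint16):

```python
import dask.array as da

# Synthetic stand-in for the loaded image; da.zeros is lazy, nothing is materialized
img = da.zeros((3, 24240, 24240), dtype="uint16", chunks=(1, 24240, 24240))

# Re-chunking only rewrites the task graph; no data is read or computed yet
img = img.rechunk((1, 1000, 1000))
print(img.chunksize)  # (1, 1000, 1000)
```

The chunk sizes here are illustrative; in practice you would pick them to balance memory use against scheduling overhead.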

@GenevieveBuckley

BTW, I recommend using dask_image==0.4.0 for now. We've identified a bug in the latest version (0.5.0), and while it probably won't affect you, it's simpler to steer clear of it.

@giovp
Member Author

giovp commented Feb 28, 2021

thanks again @GenevieveBuckley ! We'll check it out soon and let you know!

@giovp
Member Author

giovp commented Mar 9, 2021

Hi @GenevieveBuckley,
I finally got around to testing the lines you suggested. Unfortunately, I have encountered some issues:
the chunks argument does not seem to work:

from dask_image.imread import imread
temp = imread("/Users/giovanni.palla/.cache/squidpy/visium_hne_image.tiff", chunks=(1, 1000, 1000))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-625049f6a946> in <module>
----> 1 temp = imread("/Users/giovanni.palla/.cache/squidpy/visium_hne_image.tiff", chunks=(1, 1000, 1000))

TypeError: imread() got an unexpected keyword argument 'chunks'

After having a look at the source, it seems that the chunks are set via nframes, which is constrained by the first shape dimension:
https://github.com/dask/dask-image/blob/63543bf2f6553a8150f45289492bf614e1945ac0/dask_image/imread/__init__.py#L55
In this case shape=(1, 11757, 11291, 3). Indeed, if I set nframes=10, I get:

/Users/giovanni.palla/miniconda3/envs/spatial/lib/python3.8/site-packages/dask_image/imread/__init__.py:62: RuntimeWarning: `nframes` larger than number of frames in file. Will truncate to number of frames in file.
  warnings.warn(

Finally, I might be missing something, but why is shape[0] == 1?

For reference, this is the output of our custom read using rasterio
[screenshot of the array output]

https://github.com/theislab/squidpy/blob/d3532fce65c57afaf8cbed61e82435319c63ca54/squidpy/im/_container.py#L271

which we then reshape to (y, x, channels).
I'm on dask-image 0.4.0.
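The reshape mentioned above can be sketched as dropping the singleton leading axis of the (1, y, x, channels) array. A minimal illustration with a small synthetic dask array (sizes scaled down from the real image; not squidpy's actual code):

```python
import dask.array as da

# Small synthetic stand-in for the rasterio result, shape (1, y, x, channels)
arr = da.zeros((1, 1175, 1129, 3), dtype="uint8", chunks=(1, 500, 500, 3))

# Drop the singleton leading axis to get (y, x, channels)
img = arr[0]
print(img.shape)  # (1175, 1129, 3)
```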

thank you in advance for the help!

@GenevieveBuckley

Sorry for the mix-up! Let me set aside some time in the next couple of days to write a response for you.

@michalk8 michalk8 added this to the 1.1 release milestone Mar 24, 2021
@michalk8
Collaborator

michalk8 commented Apr 1, 2021

I'd postpone the imread integration until dask/dask-image#181 gets resolved.
As for other cases where dask/dask-image could be used, I think we can start with the processing functions and then move on to segmentation.

@giovp
Member Author

giovp commented Apr 27, 2021

@michalk8 I think we need to prioritize this and move towards full integration with dask_image.
I followed the discussion in dask/dask-image#181 and it seems there is a much bigger effort underway that is likely to take a long time.

Also, from @GenevieveBuckley's comment here: dask/dask-image#181 (comment), it seems that

For array creation, since dask.array.image.imread reads the first tp from disk for determining image shape and dask_image.imread.imread uses pims to determine shape, the latter can be faster in the case of a few huge files. Otherwise dask.array.image.imread is much faster.

"Large images" is exactly our case (we don't really have examples with many z-stacked images), so this should do the job.

The chunking still seems to be an issue (I can't really pass it via the nframes argument, see #296 (comment)).

The pure-dask approach from here: dask/dask-image#181 (comment)

also seems to work and could be an option. Do you mind having a look?
In short, if we can use dask_image to do everything, I think that's better, even if it might not be the fastest implementation for reading images (but again, in our case I think we are working at a much smaller scale than the one described in the issue).
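A pure-dask lazy read in the spirit of the approach linked above could look like the following sketch: read only the metadata with tifffile to learn shape and dtype, then defer the pixel read with dask.delayed. This is an assumption about the approach, not squidpy's actual implementation; the tiny TIFF is written here only to make the snippet self-contained.

```python
import os
import tempfile

import numpy as np
import tifffile
from dask import delayed
import dask.array as da

# Write a tiny TIFF so the snippet is self-contained (stand-in for the real image)
path = os.path.join(tempfile.mkdtemp(), "example.tif")
tifffile.imwrite(path, np.zeros((3, 64, 64), dtype="uint16"))

# Read only the metadata to learn shape and dtype (no pixel data yet)
with tifffile.TiffFile(path) as tif:
    shape = tif.series[0].shape
    dtype = tif.series[0].dtype

# Defer the actual pixel read, wrap it as a dask array, and re-chunk
lazy = delayed(tifffile.imread)(path)
img = da.from_delayed(lazy, shape=shape, dtype=dtype).rechunk((1, 32, 32))
print(img.shape, img.dtype)  # (3, 64, 64) uint16
```

One caveat of this pattern: the whole file is still read in one go when any chunk is computed, so it trades fine-grained I/O for simplicity.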

@michalk8
Collaborator

michalk8 commented Apr 27, 2021

Ok, I will start by refactoring the loading to use dask/dask-image, plus the pre-processing. For the latter, I think we can get rid of the size argument, i.e. for sq.im.process and sq.im.segment, or maybe modify it so that we can specify chunks.
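Chunk-wise processing of this kind could be sketched with dask's map_overlap, which shares a halo of pixels between neighbouring chunks so filters don't produce seams at chunk boundaries. This is an illustrative sketch (not squidpy's implementation); the filter, sigma, and depth values are arbitrary, and scipy is assumed to be available.

```python
import dask.array as da
from scipy import ndimage

# Synthetic image, processed chunk by chunk
img = da.random.random((2000, 2000), chunks=(500, 500))

# map_overlap passes `depth` extra pixels from neighbouring chunks to the
# filter, avoiding edge artifacts at chunk boundaries
smoothed = img.map_overlap(ndimage.gaussian_filter, depth=8, sigma=2)
print(smoothed.shape)  # (2000, 2000)
```

Choosing depth at least a few times sigma keeps the overlap large enough that the Gaussian kernel's support is covered.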

@michalk8
Collaborator

closed via #324
