
dask and dask-image to improve ImageContainer #296

Closed
giovp opened this issue Feb 25, 2021 · 9 comments · Fixed by #324
Labels
enhancement ✨ New feature or request image 🔬
Milestone

Comments

@giovp
Member

giovp commented Feb 25, 2021

@GenevieveBuckley thanks a lot for the discussion today! Pinging @hspitzer, key points that I got (please add more if missing):

@giovp giovp added the enhancement ✨ New feature or request label Feb 25, 2021
@giovp giovp assigned giovp and unassigned giovp Feb 25, 2021
@GenevieveBuckley

I was able to open the large tiff file with dask-image==0.4.0, pims, and tifffile.

I tried this in a new conda environment

conda create -n squidpy-test python=3.8 pip ipython
conda activate squidpy-test
python -m pip install dask-image==0.4.0
python -m pip install tifffile

and then in ipython:

In [1]: from dask_image.imread import imread

In [2]: img = imread('V1_Adult_Mouse_Brain_Coronal_Section_2_image.tif')

In [3]: img
Out[3]: dask.array<concatenate, shape=(3, 24240, 24240), dtype=uint16, chunksize=(1, 24240, 24240), chunktype=numpy.ndarray>

In [4]: img.shape
Out[4]: (3, 24240, 24240)

I can see all three fluorescence channels (I also checked this by loading it into the napari viewer, and it looks fine to me).

By default, you get one chunk in the dask array for each fluorescence channel, but those are pretty big. It's probably more sensible to control the dask array chunksize yourself. You can do that with something like:

img = imread('V1_Adult_Mouse_Brain_Coronal_Section_2_image.tif', chunks=(1, 1000, 1000))
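If the chunks keyword is unavailable, an alternative is to re-chunk after loading. A minimal sketch, using a synthetic dask array in place of the actual TIFF (the real array is (3, 24240, 24240) uint16):

```python
import dask.array as da

# Synthetic stand-in for the loaded image; da.zeros is lazy, nothing is materialized
img = da.zeros((3, 24240, 24240), dtype="uint16", chunks=(1, 24240, 24240))

# Re-chunking only rewrites the task graph; no data is read or computed yet
img = img.rechunk((1, 1000, 1000))
print(img.chunksize)  # (1, 1000, 1000)
```

The chunk sizes here are illustrative; in practice you would pick them to balance memory use against scheduling overhead.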

@GenevieveBuckley

BTW, I recommend using dask_image==0.4.0 for now. We've identified a bug in the latest version (0.5.0), and while it probably won't affect you, it's simpler to steer clear of it.

@giovp
Member Author

giovp commented Feb 28, 2021

thanks again @GenevieveBuckley ! We'll check it out soon and let you know!

@giovp
Member Author

giovp commented Mar 9, 2021

Hi @GenevieveBuckley,
I finally got around to testing the lines you suggested. Unfortunately, I have encountered some issues:
the chunks argument does not seem to work:

from dask_image.imread import imread
temp = imread("/Users/giovanni.palla/.cache/squidpy/visium_hne_image.tiff", chunks=(1, 1000, 1000))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-625049f6a946> in <module>
----> 1 temp = imread("/Users/giovanni.palla/.cache/squidpy/visium_hne_image.tiff", chunks=(1, 1000, 1000))

TypeError: imread() got an unexpected keyword argument 'chunks'

After having a look at the source, it seems that the chunks are set via nframes, which is constrained by the first shape dimension:
https://github.com/dask/dask-image/blob/63543bf2f6553a8150f45289492bf614e1945ac0/dask_image/imread/__init__.py#L55
In this case shape=(1, 11757, 11291, 3). Indeed, if I set nframes=10, I get:

/Users/giovanni.palla/miniconda3/envs/spatial/lib/python3.8/site-packages/dask_image/imread/__init__.py:62: RuntimeWarning: `nframes` larger than number of frames in file. Will truncate to number of frames in file.
  warnings.warn(

Finally, I might be missing something, but why is shape[0] == 1?

For reference, this is the output of our custom read using rasterio
[screenshot of the array output]

https://github.com/theislab/squidpy/blob/d3532fce65c57afaf8cbed61e82435319c63ca54/squidpy/im/_container.py#L271

which we then reshape to (y, x, channels).
I'm on dask-image 0.4.0.
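The reshape mentioned above can be sketched as dropping the singleton leading axis of the (1, y, x, channels) array. A minimal illustration with a small synthetic dask array (sizes scaled down from the real image; not squidpy's actual code):

```python
import dask.array as da

# Small synthetic stand-in for the rasterio result, shape (1, y, x, channels)
arr = da.zeros((1, 1175, 1129, 3), dtype="uint8", chunks=(1, 500, 500, 3))

# Drop the singleton leading axis to get (y, x, channels)
img = arr[0]
print(img.shape)  # (1175, 1129, 3)
```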

thank you in advance for the help!

@GenevieveBuckley

Sorry for the mix-up! Let me set aside some time in the next couple of days to write a response for you.

@michalk8 michalk8 added this to the 1.1 release milestone Mar 24, 2021
@michalk8
Collaborator

michalk8 commented Apr 1, 2021

I'd postpone the imread integration until dask/dask-image#181 gets resolved.
As for other cases where dask/dask-image could be used, I think we can start with the processing functions and then move on to segmentation.

@giovp
Member Author

giovp commented Apr 27, 2021

@michalk8 I think we need to prioritize this and move towards full integration with dask_image.
I followed the discussion in dask/dask-image#181 and it seems there is a much bigger effort underway that is likely to take a long time.

Also, from @GenevieveBuckley's comment here: dask/dask-image#181 (comment), it seems that

For array creation, since dask.array.image.imread reads the first tp from disk for determining image shape and dask_image.imread.imread uses pims to determine shape, the latter can be faster in the case of a few huge files. Otherwise dask.array.image.imread is much faster.

"Large images" is exactly our case (we don't really have examples with many z-stacked images), so this should do the job.

The chunking still seems to be an issue (I can't really pass it via the nframes argument, see #296 (comment)).

The pure-dask approach from here: dask/dask-image#181 (comment)

also seems to work and could be an option. Do you mind having a look?
In short, if we can use dask_image to do everything, I think that's better, even if it might not be the fastest implementation for reading images (but again, in our case I think we are working at a much smaller scale than the one described in the issue).
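A pure-dask lazy read in the spirit of the approach linked above could look like the following sketch: read only the metadata with tifffile to learn shape and dtype, then defer the pixel read with dask.delayed. This is an assumption about the approach, not squidpy's actual implementation; the tiny TIFF is written here only to make the snippet self-contained.

```python
import os
import tempfile

import numpy as np
import tifffile
from dask import delayed
import dask.array as da

# Write a tiny TIFF so the snippet is self-contained (stand-in for the real image)
path = os.path.join(tempfile.mkdtemp(), "example.tif")
tifffile.imwrite(path, np.zeros((3, 64, 64), dtype="uint16"))

# Read only the metadata to learn shape and dtype (no pixel data yet)
with tifffile.TiffFile(path) as tif:
    shape = tif.series[0].shape
    dtype = tif.series[0].dtype

# Defer the actual pixel read, wrap it as a dask array, and re-chunk
lazy = delayed(tifffile.imread)(path)
img = da.from_delayed(lazy, shape=shape, dtype=dtype).rechunk((1, 32, 32))
print(img.shape, img.dtype)  # (3, 64, 64) uint16
```

One caveat of this pattern: the whole file is still read in one go when any chunk is computed, so it trades fine-grained I/O for simplicity.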

@michalk8
Collaborator

michalk8 commented Apr 27, 2021

Ok, I will start by refactoring the loading to use dask/dask-image, plus the pre-processing. For the latter, I think we can get rid of the size argument, i.e. for sq.im.process and sq.im.segment, or maybe modify it so that we can specify chunks.
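Chunk-wise processing of this kind could be sketched with dask's map_overlap, which shares a halo of pixels between neighbouring chunks so filters don't produce seams at chunk boundaries. This is an illustrative sketch (not squidpy's implementation); the filter, sigma, and depth values are arbitrary, and scipy is assumed to be available.

```python
import dask.array as da
from scipy import ndimage

# Synthetic image, processed chunk by chunk
img = da.random.random((2000, 2000), chunks=(500, 500))

# map_overlap passes `depth` extra pixels from neighbouring chunks to the
# filter, avoiding edge artifacts at chunk boundaries
smoothed = img.map_overlap(ndimage.gaussian_filter, depth=8, sigma=2)
print(smoothed.shape)  # (2000, 2000)
```

Choosing depth at least a few times sigma keeps the overlap large enough that the Gaussian kernel's support is covered.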

@michalk8
Collaborator

closed via #324
