Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-series and multi-channel nd2 files are loaded incompletely without error/warning #364

Open
StefanHMueller opened this issue May 8, 2024 · 2 comments

Comments

@StefanHMueller
Copy link

Issue

I use the imread function to load nd2 files recorded on Nikon microscopes that contain multiple series (i.e. multiple fields of view are imaged simultaneously) and multiple color channels as well as multiple frames. The imread function just returns all frames at the first series and the first color channel, i.e. a 5D image with x=512, y=512, c=2, t=5, v=3 is reduced to a dask array with shape (5,512,512) without error or warning. Test data in .nd2 format to reproduce can be found here: https://github.com/Single-molecule-Biophysics-UOW/TIRF_tools/blob/main/test_data/multi_series_multiC_002.nd2

from dask_image import imread
import pims
path = r'Z:\In_vitro_replication\Stefan\test\test/multi_series_multiC_001.nd2'
im_dask = imread.imread(path)
im_pims = pims.open(path)

print('dask:',im_dask.shape)
print('pims:',im_pims)

This returns:

dask: (5, 512, 512)
pims: <FramesSequenceND>
Axes: 5
Axis 'x' size: 512
Axis 'y' size: 512
Axis 'c' size: 2
Axis 't' size: 5
Axis 'v' size: 3
Pixel Datatype: <class 'numpy.float64'>

This behavior depends on which reader pims uses internally (options are nd2reader or bioformats), but either way dask_image does not handle the output correctly.

Cause

The reason for this behavior is the following:
In the imread function the shape of the image is determined using len(im) which only returns the number of frames regardless of other dimensions:

    with pims.open(sfname) as imgs:
        shape = (len(imgs),) + imgs.frame_shape
        dtype = np.dtype(imgs.pixel_type)

Workaround

This behavior can be fixed by changing the iteration axis and bundling of axes for the pims reader as follows.
I also added some quick and dirty string shuffling to make sure xy are the two last dimensions:

with pims.open(path) as imgs:    
    shape = (len(imgs),) + imgs.frame_shape
    print(shape)
    #change iteration axis
    imgs.iter_axes = 'v'
    ax_bundle = ''.join(imgs.axes).replace('v','')
    ax_bundle_sorted = ax_bundle.replace('x','').replace('y','') + 'xy'
    imgs.bundle_axes = ax_bundle_sorted
    #get the hsape gain
    shape = (len(imgs),) + imgs.frame_shape
    print(shape)

This returns:

(5, 512, 512)
(3, 2, 5, 512, 512)

As expected the shape as imread determines it is now correct.
To actually use this work around I made my own version of the imread function as well as the _read_frame function and inserted those lines just after the pims.open(... statements.
This works well, but probably is very dependent on the exact file type used and the reader used by pims.
Perhaps it would be feasible to implement a warning if imread returns an array with less dimensions than the pims imageSequenceND has axes though?

@StefanHMueller StefanHMueller changed the title Multi-series and multi-channel nd2 files are not loaded incompletely without error/warning Multi-series and multi-channel nd2 files are loaded incompletely without error/warning May 8, 2024
@m-albert
Copy link
Collaborator

m-albert commented May 8, 2024

@StefanHMueller thanks for reporting this issue and for detailing your workaround 🙏

There are several problems with the current dask_image.imread implementation. Here's an issue tracking some of them and also the roadmaps towards improving the situation.

I'm glad you've found a workaround that works for you. Also consider using the readers mentioned here.

@StefanHMueller
Copy link
Author

Thanks @m-albert . I hadn't see the AICSImageIO readers before.
It produces dask arrays with the right shape and does so quickly.
This at least solves my problem, even though it would be nice not to depend on yet another package one day.
Reading nd2 files of any dimension to dask works for me simply like this:

from aicsimageio import AICSImage
path = 'path_to_file/.nd2'
img = AICSImage(path)
dask_array = img.dask_data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants