Fix ome-tif to zarr converter for acquisitions with micromanager beta #123

Merged: talonchandler merged 2 commits into main from zarr-converter-position-bug on Jul 8, 2022

Conversation

talonchandler
Collaborator

@talonchandler talonchandler commented Jun 28, 2022

This PR and a parallel waveorder PR address issue #121.

@JohannaRahm was having difficulty converting a specific .ome.tif file to .zarr, and I traced the issue to a problem with the reader for data acquired with micromanager-2.0.0-beta. The fix requires small changes to recOrder and waveorder.

@JohannaRahm, I have successfully converted the dataset you requested and placed the result at /hpc/projects/comp_micro/sandbox/Talon/. Could you visit that location and do the following:

  • verify that the .zarr has been converted appropriately for your downstream processing
  • follow the detailed steps in README.txt to recreate the conversion. (Briefly, you'll create a new conda environment, check out the appropriate recOrder and waveorder branches, then run the conversion.)
  • perform more conversions in this set (and ideally others) to improve our confidence that this fix is working
  • update here with any successes or failures

If this works for your tests over the next couple days, I will merge both of these PRs and update the shared recOrder environment.

@deprecated-napari-hub-preview-bot

deprecated-napari-hub-preview-bot bot commented Jun 28, 2022

Preview page for your plugin is ready here:
https://preview.napari-hub.org/mehta-lab/recOrder/123
Updated: 2022-06-29T16:39:45.242233

@JohannaRahm

Hi @talonchandler,

thanks for having a look at this issue so quickly!!
The converted zarr store at /hpc/projects/comp_micro/sandbox/Talon/ is as expected. However, I am having trouble matching the ome.tif files to the zarr movies. For example, the first few ome.tif files are equivalent to the first few columns in the zarr store, but the last ome.tif movie (idx 383) does not match zarr column 383. I am only interested in specific positions of this dataset. How can I find the columns for specific positions (= specific ome.tif files) in the zarr store? I tried to create a zarr store from a directory containing only a fraction of the ome.tif files (= all_3_fraction), but this gives a KeyError.

recOrder.convert --input ./all_3_fraction --output ./all_3_fraction_JR.zarr --data_type ometiff --replace_pos_name False

(recorder_dev) [johanna.rahm@gpu-a-001 Talon]$ recOrder.convert --input ./all_3_fraction --output ./all_3_fraction_JR.zarr --data_type ometiff --replace_pos_name False
Initializing Data...
Finished initializing data
Found Dataset all_3_fraction_JR.zarr w/ dimensions (P, T, C, Z, Y, X): (316, 1, 4, 81, 1024, 1024)
Creating new zarr store at ./all_3_fraction_JR.zarr
Running Conversion...
Setting up zarr
Traceback (most recent call last):
  File "/home/johanna.rahm/.local/bin/recOrder.convert", line 33, in <module>
    sys.exit(load_entry_point('recOrder-napari', 'console_scripts', 'recOrder.convert')())
  File "/home/johanna.rahm/repo/recOrder/recOrder/scripts/convert_tiff_to_zarr.py", line 28, in main
    converter.run_conversion()
  File "/home/johanna.rahm/repo/recOrder/recOrder/io/zarr_converter.py", line 368, in run_conversion
    self.init_zarr_structure()
  File "/home/johanna.rahm/repo/recOrder/recOrder/io/zarr_converter.py", line 339, in init_zarr_structure
    clims = self.get_channel_clims(pos)
  File "/home/johanna.rahm/repo/recOrder/recOrder/io/zarr_converter.py", line 316, in get_channel_clims
    img = self.get_image_array(pos, t=0, c=chan, z=self.focus_z)
  File "/home/johanna.rahm/repo/recOrder/recOrder/io/zarr_converter.py", line 300, in get_image_array
    return np.asarray(self.reader.get_image(p, t, c, z))
  File "/home/johanna.rahm/repo/recOrder/waveorder/waveorder/io/reader.py", line 163, in get_image
    return self.reader.get_image(p, t, c, z)
  File "/home/johanna.rahm/repo/recOrder/waveorder/waveorder/io/multipagetiff.py", line 325, in get_image
    coord = self.coord_map[coord_key] # (file, page, offset)
KeyError: (0, 0, 0, 40)

Furthermore, I followed the instructions in the README.txt and was able to convert ome.tif files from multiple measurements (all saved at /hpc/projects/comp_micro/sandbox/Talon/). The results are the same as the previously converted files, which is reassuring evidence that the fix is working.

Best,
Johanna

@JohannaRahm JohannaRahm left a comment

Perhaps catch only except KeyError: so that nothing else slips through here.
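
The review suggestion above can be sketched as follows. This is a minimal illustration, not the code from this PR: lookup_coord and the example coord_map are hypothetical, loosely modelled on the coord_map lookup that appears in the traceback. The point is that a narrow except KeyError: handles a missing coordinate while other errors still surface.

```python
def lookup_coord(coord_map, coord_key):
    """Return the (file, page, offset) entry for coord_key, or None if absent.

    Hypothetical helper: only KeyError is caught, so unrelated problems
    (for example an unhashable key) are not silently swallowed.
    """
    try:
        return coord_map[coord_key]
    except KeyError:
        return None


# Toy coordinate map keyed by (p, t, c, z); the values are made up.
coord_map = {(0, 0, 0, 0): ("example.ome.tif", 0, 8)}

print(lookup_coord(coord_map, (0, 0, 0, 0)))   # entry is present
print(lookup_coord(coord_map, (0, 0, 0, 40)))  # missing key is handled

try:
    lookup_coord(coord_map, [0, 0, 0, 0])      # a list is not hashable
except TypeError as err:
    print("not swallowed:", type(err).__name__)
```

With a bare except:, the last call would also have returned None and hidden the real mistake.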

@talonchandler
Collaborator Author

Great!

Can you give me more details about where/how you're seeing the mismatch between the original ome.tif and the converted zarr?

I just ran the following (admittedly simple) tests to check that the original matches the converted:

$ python
>>> from waveorder.io import WaveorderReader
>>> original = WaveorderReader('./all_3')
>>> converted = WaveorderReader('./all_3.zarr')
>>> original.reader.get_num_positions()
384
>>> original.reader.get_num_positions() == converted.reader.get_num_positions()
True
>>> import numpy as np
>>> np.array_equal(original.get_array(0), converted.get_array(0))
True
>>> np.array_equal(original.get_array(100), converted.get_array(100))
True
>>> np.array_equal(original.get_array(383), converted.get_array(383))
True

Are you seeing a mismatch at a specific position? Or is this pointing to an issue in a reader/viewer elsewhere?

P.S. I also tried opening a subset and received the same error. I think extra logic is required to open incomplete micromanager datasets, so as far as I know this is "expected" behavior even though it's suboptimal.
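
To illustrate why a subset fails: the traceback suggests the reader keeps a coordinate map keyed by (p, t, c, z), and the metadata still announces positions whose files are no longer there. The sketch below is an assumption-laden toy, not waveorder code; missing_coords and the example map are hypothetical. It lists every coordinate that the declared dimensions promise but the map does not contain.

```python
from itertools import product


def missing_coords(coord_map, shape):
    """List (p, t, c, z) keys implied by shape that are absent from coord_map.

    shape is the (P, T, C, Z) size the metadata claims. Hypothetical helper
    for illustration only.
    """
    expected = product(*(range(n) for n in shape))
    return [key for key in expected if key not in coord_map]


# Metadata claims 2 positions, 1 timepoint, 2 channels, 3 slices,
# but only the files for position 1 are still on disk.
shape = (2, 1, 2, 3)
coord_map = {
    (1, 0, c, z): ("pos1.ome.tif", c * 3 + z, 0)
    for c in range(2)
    for z in range(3)
}

missing = missing_coords(coord_map, shape)
print(len(missing))  # 6
print(missing[0])    # (0, 0, 0, 0)
```

A check like this, run before conversion, would turn the KeyError into a readable report of which positions are absent.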

@JohannaRahm

For example, I am interested in the columns in the zarr store that had D6-Site in their names in the ome.tif files. If I open the last ome.tif file in the all_3 folder and the last column, Col_383, in the zarr store, they are not the same position. I also printed the files in all_3 sorted with sorted(os.listdir('./all_3')); in this case the last file name is all_3_MMStack_D6-Site_9.ome.tif, which is also not the same position as Col_383. Accessing positions by index as in your code example works, but how does WaveorderReader index the ome.tif files, and how can I find out the name of the ome.tif file for Col_383, etc.? I hope this explanation is clear!

RE: P.S. I also tried opening a subset and received the same error. I think extra logic is required to open incomplete micromanager datasets, so as far as I know this is "expected" behavior even though it's suboptimal.
I agree that this is suboptimal, because from a user's point of view I would expect that it is possible to convert a subset of recorded data. It can happen quite frequently that parts of the experiment fail and that one would like to sort out these positions at quite an early point in analysis. Perhaps this is something to add to the to-do list.

@talonchandler
Collaborator Author

Aha! I understand now. Thanks for clarifying.

Try this:

$ python
>>> from waveorder.io import WaveorderReader
>>> original = WaveorderReader('./all_3')
>>> [original.stage_positions[x]['Label'] for x in range(original.get_num_positions())]

which shows that D1-Site_15 is the last position. It seems like the data was collected in "zig-zag" order across the plate: i.e. A1-6, B6-1, C1-6, D6-1.

To find a specific index, append .index('D1-Site_10') to the list comprehension. Is this what you're after?
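
Here is a small sketch of why the acquisition index and the sorted file names disagree. It uses synthetic labels rather than the real dataset: the 4 rows, 6 columns, and 16 sites per well are assumptions chosen to be consistent with the 384 positions and the labels mentioned in this thread, and serpentine_labels is a hypothetical helper.

```python
def serpentine_labels(rows="ABCD", n_cols=6, n_sites=16):
    """Build position labels in zig-zag order: A1-6, B6-1, C1-6, D6-1.

    The plate layout is an assumption for illustration only.
    """
    labels = []
    for r, row in enumerate(rows):
        cols = range(1, n_cols + 1)
        if r % 2 == 1:            # every second row is traversed backwards
            cols = reversed(cols)
        for col in cols:
            for site in range(n_sites):
                labels.append(f"{row}{col}-Site_{site}")
    return labels


labels = serpentine_labels()
index_of = {label: i for i, label in enumerate(labels)}  # name -> index

print(len(labels))            # 384
print(labels[-1])             # D1-Site_15 (last acquired position)
print(sorted(labels)[-1])     # D6-Site_9  (last name alphabetically)
print(index_of["D6-Site_9"])  # 297
```

Building the dictionary once makes repeated name-to-index lookups cheap, compared with calling .index() on the list each time.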


> I would expect that it is possible to convert a subset of recorded data. It can happen quite frequently that parts of the experiment fail and that one would like to sort out these positions at quite an early point in analysis. Perhaps this is something to add to the todo-list.

Note that you can convert a failed/incomplete experiment because the micromanager metadata will match the data. Converting a subset is currently only an issue when you manually delete/move part of the dataset and don't update the metadata.

@talonchandler
Collaborator Author

For your specific case:

>>> # Build the label list once instead of once per site
>>> labels = [original.stage_positions[x]['Label'] for x in range(original.get_num_positions())]
>>> d6_indices = [labels.index('D6-Site_' + str(y)) for y in range(15)]
>>> d6_indices
[288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302]

are the indices you're looking for.
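
An alternative sketch, in case the number of sites per well is not known in advance: match on the label prefix instead of building each site name. The labels below are a made-up short list, and indices_for_well is a hypothetical helper, not part of waveorder.

```python
def indices_for_well(labels, well):
    """Return the indices of all labels of the form '<well>-Site_<n>'.

    Hypothetical helper; it assumes the 'D6-Site_3' naming pattern
    seen in this thread.
    """
    prefix = f"{well}-Site_"
    return [i for i, label in enumerate(labels) if label.startswith(prefix)]


# Toy label list standing in for the reader's stage position labels.
labels = [
    "C6-Site_0", "C6-Site_1",
    "D6-Site_0", "D6-Site_1", "D6-Site_10",
    "D5-Site_0",
]

print(indices_for_well(labels, "D6"))  # [2, 3, 4]
print(indices_for_well(labels, "D5"))  # [5]
```

Including the "-Site_" part in the prefix keeps a well such as D6 from also matching a hypothetical D60.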

@JohannaRahm JohannaRahm left a comment

looks good!

@JohannaRahm

JohannaRahm commented Jul 8, 2022

$ python
>>> from waveorder.io import WaveorderReader
>>> original = WaveorderReader('./all_3')
>>> [original.stage_positions[x]['Label'] for x in range(original.get_num_positions())]

This is exactly what I was looking for. Thank you! :)

@talonchandler talonchandler merged commit 32edad9 into main Jul 8, 2022
@talonchandler talonchandler deleted the zarr-converter-position-bug branch July 8, 2022 20:00
Successfully merging this pull request may close these issues.

[BUG] recOrder convert key error
2 participants