parallelized deskew and phase deconvolution #13

Closed
mattersoflight opened this issue Apr 14, 2023 · 4 comments · Fixed by #47
Assignees
Labels
analysis Software development needed for data analysis enhancement New feature or request

Comments

@mattersoflight
Collaborator

mattersoflight commented Apr 14, 2023

I suggest building the parallelized deskew and phase deconvolution module in the following iterations:

  1. Use the current version of recOrder/waveOrder to write a parallelized reconstruction script that parses mantis label-free datasets and writes the output in the target format (documented here).

@edyoshikun can an example from your slurm scripts be adapted for parallelized deskew, deconvolution, and zarr conversion?

  2. The future phase deconvolution CLI can rely on recOrder 0.4.0's revised CLI (Design for the recOrder-waveorder interface mehta-lab/recOrder#341).

Looks like @edyoshikun can implement (1) and @talonchandler can review.

@mattersoflight mattersoflight changed the title phase deconvolution module parallelized deskew and phase deconvolution module Apr 14, 2023
@mattersoflight mattersoflight changed the title parallelized deskew and phase deconvolution module parallelized deskew and phase deconvolution Apr 14, 2023
@edyoshikun
Contributor

Yes, the slurm scripts can easily be adapted for parallelized deskew or deconvolution. Since the scripts split the datasets per position, one can iterate through individual timepoints, channels, or z-slices. I would just have to add these lines from waveorder/example/fluorescence_deconv.ipynb. Perhaps it would be good to make the decon a simple function if one doesn't exist yet?
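
A minimal sketch of how the per-position work could be broken into independent (timepoint, channel) tasks and divided among jobs. The function names `enumerate_tasks` and `split_tasks` are hypothetical and are not taken from the actual slurm scripts:

```python
from itertools import product


def enumerate_tasks(n_timepoints, n_channels):
    """Return one (t, c) index pair per independent unit of work."""
    return list(product(range(n_timepoints), range(n_channels)))


def split_tasks(tasks, n_jobs):
    """Distribute tasks round-robin into n_jobs roughly equal chunks."""
    return [tasks[i::n_jobs] for i in range(n_jobs)]
```

Each chunk could then be handed to one worker or one slurm job; chunk sizes differ by at most one task.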

The zarr conversion works like a charm with the implemented TiffConverter here, which is also currently used in recOrder through the CLI thanks to @ziw-liu and @talonchandler. In my experience, the datasets have to be converted from tiff to zarr before any multiprocessing.

@ieivanov
Collaborator

I am working through parallelizing the deskewing, starting with @talonchandler's PR.

I think it's better to implement parallel processing with the multiprocessing module rather than slurm. If I remember correctly, slurm is more powerful than multiprocessing only if you need to use more than 64 cores. Talon can correct me here; I couldn't find his notes on this right away. Slurm comes with extra overhead, while multiprocessing can also be used locally: the mantis PC does have 32 dual cores that we can take advantage of for quick local data processing.
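
A minimal sketch of the multiprocessing approach, assuming the work can be split per timepoint. `process_timepoint` is a hypothetical stand-in for the real deskewing call, and `run_parallel` is an illustrative name, not code from the PR:

```python
import multiprocessing as mp


def process_timepoint(t):
    """Hypothetical stand-in for deskewing one timepoint; returns (t, result)."""
    # A real worker would read the volume for timepoint t, deskew it,
    # and write the output; here we only do a trivial computation.
    return t, t * t


def run_parallel(timepoints, n_workers):
    """Map process_timepoint over timepoints with n_workers processes."""
    if n_workers <= 1:
        # Serial fallback: handy for debugging without subprocesses.
        return [process_timepoint(t) for t in timepoints]
    with mp.Pool(n_workers) as pool:
        # pool.map preserves input order in the returned list.
        return pool.map(process_timepoint, timepoints)
```

When using more than one worker, call `run_parallel` from under an `if __name__ == "__main__":` guard, which is needed on platforms that start workers by spawning a fresh interpreter.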

@edyoshikun
Contributor

Yes, I agree with implementing and prototyping simple scripts using multiprocessing. I think the slurm scripts are useful once the pipelines are somewhat established, since they basically help run the same code across multiple nodes if needed. The slurm scripts run the python files as CLI scripts, and these should work independently using multiprocessing.

One thing to consider: if you need more GPU memory or RAM, slurm comes in handy because it allocates the resources you request, so you do not saturate the machine.
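
A rough sketch of what such a CLI entry point could look like, so that the same python file runs locally or from a slurm script. All argument names below are illustrative assumptions, not the interface of the existing scripts:

```python
import argparse


def build_parser():
    """Build a CLI parser; all argument names here are illustrative."""
    parser = argparse.ArgumentParser(description="Process one dataset.")
    parser.add_argument("input_path", help="path to the input dataset")
    parser.add_argument("output_path", help="path for the processed output")
    parser.add_argument(
        "-j", "--num-processes", type=int, default=1,
        help="number of worker processes to use",
    )
    return parser
```

A slurm script would then only need to invoke the file with the desired paths and `-j` value.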

@mattersoflight
Collaborator Author

mattersoflight commented Apr 15, 2023

> If I remember correctly, slurm is more powerful than multiprocessing only if you need to use more than 64 cores. Talon can correct me here; I couldn't find his notes on this right away. Slurm comes with extra overhead, while multiprocessing can also be used locally: the mantis PC does have 32 dual cores that we can take advantage of for quick local data processing.

Yes. It does make sense to start with multiprocessing and use SLURM scripts when the time lapses grow large or we have to re-analyze multiple time lapses.

@ieivanov If your multiprocessing code is written to account for the environment it is running within (local environment or cluster), it should be possible to run it via slurm scripts too.

For example,

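import multiprocessing as mp
import os

# Use the SLURM allocation when running on the cluster, else all local cores.
# .get() avoids a KeyError when the variable is not set (e.g. on a local PC).
slurm_cpus = os.environ.get('SLURM_CPUS_PER_TASK')
number_of_cores = int(slurm_cpus) if slurm_cpus else mp.cpu_count()

with mp.Pool(number_of_cores) as pool:
    # scatter the computation over the workers and gather the results,
    # e.g. results = pool.map(process_position, positions)
    ...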
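
The same environment check can be wrapped in a small function so that it can be exercised without a cluster. This is a sketch; the name `available_cores` and the optional `environ` argument are illustrative assumptions:

```python
import multiprocessing as mp
import os


def available_cores(environ=None):
    """Return the SLURM CPU allocation if set, else the local core count."""
    environ = os.environ if environ is None else environ
    # SLURM sets SLURM_CPUS_PER_TASK only when --cpus-per-task is requested,
    # so the variable may be missing even inside a slurm job.
    value = environ.get("SLURM_CPUS_PER_TASK", "")
    if value.strip().isdigit() and int(value) > 0:
        return int(value)
    return mp.cpu_count()
```

Passing a plain dict as `environ` makes it easy to check both the cluster and the local branch.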

@ieivanov ieivanov added enhancement New feature or request analysis Software development needed for data analysis labels Apr 25, 2023