Operate on data as imgs and lists of imgs throughout package #251

tsalo · 2019-04-05T14:35:20Z

Summary

We currently use confusing masking and unmasking methods to keep our data in arrays throughout the workflow. Most functions working on data don't support img-like objects. In my opinion, this makes it harder for users (and even devs) to work with individual functions outside the context of the overall workflow. I propose that we shift to operating on imgs and lists of imgs throughout the package.

Additional Detail

I originally started working on this in a PR (#70), but this was closed without merging in part because it became overcomplicated. The problem was that I was coercing our data into five-dimensional (X x Y x Z x E x T) Nifti1Images, but unfortunately 5D images are not supported for anything beyond their initial creation. If we instead choose to operate on lists of 4D images, we'll be able to use standard masking/unmasking functions throughout the package, instead of having to write our own.

The main benefit is that users will be able to interact with specific functions within tedana without a deep understanding of package-specific masking, loading, and file writing functions.

The only con I can see is that this will make some steps take slightly longer, but I don't foresee it having an appreciable impact on speed.


rmarkello · 2019-04-05T19:48:08Z

Just wanted to chime in that I think this is a good idea! This was more-or-less how the package originally operated and we removed it because we were thinking of accepting surface-projected data for a period of time. Since that is off the table at this point (largely due to the difficulties in detecting spatial artifacts) I would definitely 👍 switching back to using img-like objects. Sorry for the difficulty of having to re-refactor!

As for your con: I agree that things shouldn't take too much longer. Once the data have been loaded with nibabel, they'll be cached in memory, so subsequent loads will be approximately as quick as querying a numpy array.

tsalo · 2019-04-06T20:52:44Z

Awesome! This is going to be fairly straightforward, but will involve a pretty thorough refactor, so I'm going to hold off on trying to do it until some of the other extensive PRs (e.g., #247 and #152) are handled. But it's definitely on my to-do list.

jbteves · 2019-05-02T22:20:50Z

@rmarkello @emdupre do you happen to know if this is contained in a certain commit, or a group of commits, that could be reverted? Even if there are many merge conflicts, it would help us get a good handle on which places need to change to undo it.

jbteves · 2019-06-01T02:52:58Z

@tsalo you had mentioned in the video conference that this was involved. Is there anything I can do to help?

tsalo · 2019-06-01T16:10:17Z

I think it's just a matter of putting the time in. We can come up with a standard way of handling it and can split up the functions if you want.

Here's my basic summary of how I think it should work:

- All public functions should operate on imgs and/or lists of imgs. I'm not sure if we should also support numpy arrays or if that would make things too complicated.
- I think it's reasonable for private functions to still work on arrays. This refactor is mostly about ease of use, so private functions seem like they should work in the most efficient way possible, not the most usable.
- Multi-echo data can be stored as lists of imgs.
- Adaptive masks can be stored as lists of imgs, unless we think it's necessary to write a specific function for applying adaptive masks to lists of imgs.
- We'll need to do something substantial with the eimasking used in TEDPCA. I think that we can build a new adaptive mask matching the same procedure and then stack the echo-specific masked data. I'm not sure if that requires a separate function or if it can just be done in the tedpca function.
- We can drop io.filewrite and the ref_img variable.
- Pretty much every function will require a mask argument (although most already have it, so this isn't that big a deal).

jbteves · 2019-06-03T16:24:21Z

Okay. That all looks reasonable. I would think the simplest way to handle it is for me to add your repository as a remote and open PRs to your repository so we can hash out differences there, then you can open the PR when we're completely done. Thoughts?

tsalo · 2019-10-04T12:41:50Z

Recently we've pivoted to the idea of operating on files, rather than arrays or even image objects, per #394 and our monthly tedana developers call. As such, I'm comfortable closing this. Does that sound good to everyone?

jbteves · 2019-10-04T13:06:20Z

@tsalo sounds good to me. One more down ;-)

tsalo added the discussion (issues that still need to be discussed) and refactoring (issues proposing/requesting changes to the code which do not impact behavior) labels Apr 5, 2019

emdupre mentioned this issue Apr 11, 2019

Move masking/unmasking functions into new masking module #250

Closed

tsalo mentioned this issue May 23, 2019

[ENH] Make outputs BIDS derivatives-compatible #152

Closed

jbteves added this to the method extensions & improvements milestone May 24, 2019

tsalo mentioned this issue Sep 15, 2019

Nipype-ify workflow and functions #394

Closed

tsalo closed this as completed Oct 4, 2019

tsalo mentioned this issue Nov 7, 2019

Only operate on masked data in array format and keep masks as images #425

Open
