
Should we have divergent approaches to optimal combination based on the goal? #617

Closed
tsalo opened this issue Oct 20, 2020 · 22 comments
Labels: discussion (issues that still need to be discussed), question (issues detailing questions about the project or its direction)

Comments

@tsalo (Member)

tsalo commented Oct 20, 2020

Summary

In #600, @cjl2007 and @mvdoc expressed concerns with our new approach to optimal combination, which incorporates echo signal quality information into the process. @emdupre, @rmarkello, and I just had a short call about this, and I think we've come up with a good decision point to bring to the rest of the community.

Namely, the use-cases for the optimally combined data appear to be somewhat different for the t2smap and tedana workflows. In the t2smap workflow, the goal seems to be to use multi-echo fMRI to increase coverage, with a slight increase in SNR. In the tedana workflow, we want to limit the optimally combined data to just good data going into the denoising process, in which case coverage will suffer in the OC data, but SNR should increase more than if we incorporated "bad" echoes.

Additional Detail

Here is what we were thinking.

For tedana, we keep the optimal combination approach the way it is.

For t2smap, there are three possibilities:

  1. Revert to the old way, where we run optimal combination using all echoes, regardless of whether some echoes are "bad" or not.
  2. Use the echoes with good signal from each voxel, but drop the threshold we use. So instead of only combining voxels with >=3 good echoes and masking out the others, we also combine 2 good echoes and keep values from the first echo for the rest of the voxels.
  3. Like number 2, but for voxels with only one good echo, we use the first two echoes, like we do with the "full" T2*/S0 maps.
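As a rough illustration of how options 2 and 3 differ per voxel, here is a minimal numpy sketch. This is a hypothetical helper, not tedana's actual implementation; the T2* estimate for low-quality voxels is assumed to come from the "full" T2*/S0 maps mentioned above.

```python
import numpy as np

def combine_voxel(echo_data, tes, t2star, n_good):
    """Combine one voxel's echoes based on its adaptive-mask value.

    echo_data : (n_echoes,) signal at each TE for this voxel
    tes       : (n_echoes,) echo times (ms)
    t2star    : T2* estimate for this voxel (ms); for voxels with few
                good echoes, assumed to come from the "full" T2* map
    n_good    : number of echoes with good signal (adaptive mask value)

    Hypothetical sketch of options 2 and 3 discussed above, not the
    actual tedana code.
    """
    if n_good >= 2:
        # Option 2: T2*-weighted combination over the good echoes only
        te = tes[:n_good]
        w = te * np.exp(-te / t2star)
        w /= w.sum()
        return float(np.dot(w, echo_data[:n_good]))
    # Option 3: with <2 good echoes, fall back to the first two echoes,
    # mirroring the behavior used for the "full" T2*/S0 maps
    te = tes[:2]
    w = te * np.exp(-te / t2star)
    w /= w.sum()
    return float(np.dot(w, echo_data[:2]))
```

Under option 2 alone, voxels with fewer than 2 good echoes would instead keep the first-echo value directly.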

Next Steps

  1. Discuss.
  2. Add an argument to the optimal combination function that toggles this behavior.
@tsalo added the question and discussion labels on Oct 20, 2020
@emdupre (Member)

emdupre commented Oct 20, 2020

I'd agree with splitting the method used depending on the end-goal of the application.

Personally, I think that (3) is perhaps the most principled (in line with the reasoning we use for T2*/S0 maps), but the benefit I see to (1) is that it's likely what users who are familiar with ME-ICA are expecting.

@jbteves (Collaborator)

jbteves commented Oct 20, 2020

I'm against option (1) if we're pretty sure that the bad echoes are bad, though I also see the merit in including it for legacy reasons. If we're not as sure then we should probably discuss that as well.

Curious what @handwerkerd and @CesarCaballeroGaudes in particular think about this.

@mvdoc (Contributor)

mvdoc commented Oct 20, 2020

Hi @tsalo and all, thanks for starting this issue and for the simulation you showed on the other issue. I want to start by saying that I am not an expert in ME, this is my first ME dataset that I'm playing with, and I'm fairly sure some of my choices are suboptimal (for example, I don't plan to strictly use tedana for denoising—yet—and I don't see myself getting more than three echoes because I need a short TR).

That said, I do have some opinions about t2smap. The issue is tricky because it's not strictly related to tedana; it's more related to fMRIPrep adopting the t2smap workflow. Personally, I don't like the software deciding for me to discard voxels and mask my data too much. I understand the reasons for masking out those voxels when running the full tedana pipeline, but for combining TEs I personally don't think it's good. Of course, this is your decision as the software developers, but as a user of fMRIPrep (and of neuroimaging software in general) I don't like it when the software decides for me without letting me choose other options. In my opinion this goes against the "glassbox" philosophy of fMRIPrep.

Especially for a voxelwise operation such as t2smap, I don't see any reason to mask aggressively (but I might be missing some crucial information).

I think there could be two solutions for the t2smap workflow (and only that; I'm not considering the full tedana pipeline):

  1. just have a flag to use adaptive masking or not prior to combining echoes
  2. do not mask at all, but provide a mask with good voxels as an output, together with the optimally combined volumes

P.S. I also want to thank you guys for tedana :) I remember when this project started at one of the hackathons (IIRC) and it's really exciting to see what has become of that project!

EDIT: I guess I can summarize what I wrote by saying: I'd like t2smap to allow me to do what I want, even if it's wrong (fitting a decay with two echoes, using bad voxels), and then I'd like it to give me metrics to guide my decision on whether what I did was optimal or not. The current operation is more like "we decide for you", which I have trouble with :)

@emdupre (Member)

emdupre commented Oct 21, 2020

I'd like t2smap to allow me to do what I want, even if it's wrong (fitting a decay with two echoes, using bad voxels), and then I'd like it to give me metrics to guide my decision on whether what I did was optimal or not

I'm very much a fan of this philosophy, but it's a little difficult to see how we could implement it here. We'd need to give you the calculated adaptive mask (showing how many echoes have reliable signal at each voxel) and then you could combine them as you see fit. But that almost certainly wouldn't integrate into pipelines like fMRIPrep, or really most user workflows.

Personally, I think it's reasonable for tedana to "take a stance" and have a sensible (well-documented) default. The problem is that our current default is too conservative for this use case -- and I think it's an important use case !!

@handwerkerd (Member)

My brief answer is:

  1. We never want to give an end user processed data based on bad underlying data
  2. Bad is subjective
  3. We should set a default definition for bad and give end users some flexibility to change that definition. I see no reason not to allow someone to get the weighted average of just two echoes (There's a whole world of dual-echo pulse sequences). From the code point of view, maximizing flexibility might be a bit ugly so we'd need more discussion of what flexibility to expose through function calls and what flexibility to allow through a command line interface.
  4. Short TRs are overrated. Getting 4 echoes really might not be worthwhile in your situation, but I'm always happy to push back against the drive to shorter TRs (There are specific cases where they really are beneficial, but those aren't most of the situations where short TRs are pushed.)

@mvdoc (Contributor)

mvdoc commented Oct 21, 2020

@emdupre agreed. I think it's perfectly reasonable for tedana to take a stance. Changing the default, adding documentation, and perhaps granting a little bit of flexibility to the user would be a great solution.

@handwerkerd I feel like point 4. can be a whole new, interesting conversation ;)

@handwerkerd (Member)

@mjversluis I'd like to know a bit more about your data. I just read through #600 & saw the picture of your data. Given you are aiming for a short TR, I assume your 3 echo times are all relatively short (i.e. <40ms). I'd expect the third echo to drop out over a large area when the TE is non-trivially longer than 40ms. I would expect more drop out for TE=40 vs 30ms, and that drop out should affect the regions you care about, but it shouldn't be a huge change. I'd like to get a better sense of what your data looks like. If you look at the 3rd echo volume in these regions, is it clear that you've lost most of your signal? In that case this is a data issue and the tedana solution should focus on finding ways to let people get data with only 2 usable echoes. If some of these 3rd echo voxels look reasonable then that might mean we should think about making our default masking threshold less aggressive.

Assuming you might do Gallant lab style MVPA searchlight analyses, I'd need to give some thought to whether there may be issues with including 2- and 3-echo voxels in a single data set. If one only collected 2-echo data then the BOLD weighting at each voxel would be roughly similar. With a mix of 2 and 3, I'd expect non-trivially more BOLD weighting in the 3-echo data. I'm not sure how methods that look for clusters of voxels with better fits to a model would handle this level of variable data quality. Given that voxel quality varies spatially quite a bit even with single-echo data, this may be an issue with these analyses with or without multi-echo. I'm bringing this up here because figuring out a reasonable solution may depend on better understanding the overall goal.

@effigies (Contributor)

Apologies for an under-informed opinion on multi-echo, but this will be from the perspective of trying to produce as consistent-as-possible results for fMRIPrep. For single-echo data, we produce unmasked reference images and BOLD time series, and provide the masks separately. If at all possible, I would prefer that for the time series estimated by tedana.

What I would suggest in terms of outputs would be:

  • *_desc-preproc_bold.nii.gz - unmasked
  • *_desc-brain_mask.nii.gz - mask of voxels in brain (old style mask?)
  • *_desc-tedana_mask.nii.gz - (or some other desc) mask of "good" data (new style mask?)

I suspect that trying to model responses sufficiently far outside the brain produces complete garbage, so it may be that asking for a completely unmasked *_desc-preproc_bold.nii.gz is absurd. Still, would it work to use the least-restrictive plausible mask to return the data and separately generate a binary mask indicating which voxels can be trusted for proper ME analysis?

@handwerkerd (Member)

@effigies The challenge with this is that the "optimally combined" data is just a weighted average of the echoes, where the weights are defined by the estimated T2*. Voxels where it's impossible to estimate T2* don't have reasonable weightings. We could create a combined volume that is the unweighted average of the echoes, or that uses some other default weighting, in voxels without a good T2* estimate. That might be useful for data quality checks, but it could be problematic for analyses.
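For context, the standard T2*-dependent weights used for optimal combination (the Posse-style formulation) can be sketched in a few lines of numpy. This is a minimal illustration, not tedana's exact code:

```python
import numpy as np

def t2star_weights(tes, t2star):
    """T2*-dependent weights for optimally combining echoes.

    w_i = TE_i * exp(-TE_i / T2*) / sum_j [ TE_j * exp(-TE_j / T2*) ]

    Where T2* can't be estimated, these weights are undefined, which is
    why voxels without a good T2* fit are problematic for combination.
    """
    tes = np.asarray(tes, dtype=float)
    w = tes * np.exp(-tes / t2star)
    return w / w.sum()

# Example: three echoes, assumed T2* of 30 ms
w = t2star_weights([13.0, 31.0, 49.0], 30.0)
```

The weights sum to one and are largest for the echo whose TE is closest to the voxel's T2*, which is what makes the combination "optimal" for BOLD sensitivity.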

@CesarCaballeroGaudes (Contributor)

I agree with @handwerkerd that the T2*-weighted optimal combination does not make any sense in voxels outside the brain. Perhaps we could simply average the echoes for outside-brain voxels, or copy the first echo, which should have little BOLD contrast, in order to reduce the chances of false positives outside the brain.

In general, my preference is also "set a default and give end users some flexibility to change that definition", and create documentation that explains the rationale for the different approaches. For example, we could do:

  • option 1: T2*-weighted OC regardless of whether the echoes are good or bad
  • option 2: T2*-weighted OC when there are at least 3 good echoes, and a non-weighted average for voxels with 2 good echoes
  • option 3: non-weighted average, regardless of the number of echoes and whether they are good or bad
    and so on

As masks occupy little memory, we could create a mask of voxels in the brain, and a mask showing the number of echoes that are good at each voxel (i.e. not a binary mask, but ranging from 0 to the number of echoes).

Regarding the MVPA-related question, I guess the issue is similar to the fact that different brain regions exhibit different SNR/CNR. In the case of multi-echo, voxels with 3-echo weighting should ideally have better SNR than those with 2-echo weighting, assuming of course that the echoes are good.

@mvdoc I also had a look at #600. Could you please let us know the parameters of your acquisition: TEs, TR, use of acceleration (multiband or GRAPPA), and voxel size? I also experienced that voxels in the OFC were masked with the default masks. That's why, in my preprocessing, I usually compute the brain mask from the 1st echo with a skull-stripping algorithm, as if it were an anatomical image (e.g. 3dSkullStrip).

@mvdoc (Contributor)

mvdoc commented Oct 21, 2020

@handwerkerd You are right, I should have described my objective first ;) We are testing out different whole-brain sequences with the goal of increasing coverage to the temporal pole. One of these attempts was to use multi-echo sequences. We are still very much in the process (pending covid), so this whole conversation is very helpful for me (thanks!).

We do have constraints regarding the TR (<= 2s), as we've found that shorter TRs give us benefits in SNR. (Maybe these benefits will be comparable to denoising performed by tedana/me-ica, so we might want to test this too with a 2s TR.) One of the sequences that we tried (possibly the one I took a picture of and posted in #600, but I need to double-check) had TEs of 12.8, 33.05, 53.3 ms. We collected short resting-state sequences to compute tSNR with GRAPPA 2, MB 2, 3, 4 with TR ~ 2s or sub-2s, PF7/8, and 2.6 mm isotropic voxel size. (This should also answer @CesarCaballeroGaudes. I'm not a fan of grappa + MB but we are very much in an exploratory phase. I'm not sure I can share this data for testing, but I will check.)

As for the analyses, yes you are correct I will do Gallant lab style analyses, but Gallant lab analyses are not MVPA searchlight :)
I want to be mindful not to bring this conversation OT, but briefly, our models are fit voxelwise, independently for each voxel (thus, it's essentially a univariate analysis). If I understand your point correctly, then I don't see a problem with weighting either 2 or 3 echoes for different voxels if models are fit separately for each voxel. Differences between those voxels will be attributed to differences in SNR (in line with what @CesarCaballeroGaudes pointed out), and we already have methods to estimate the level of noise with our voxelwise modeling framework. (I'm curious, though, whether you think this different weighting would be a concern for other methods that use voxel covariances, like functional connectivity.)

@CesarCaballeroGaudes (Contributor)

CesarCaballeroGaudes commented Oct 22, 2020

@mvdoc if your goal is TR < 2 seconds, it is perfectly viable that you acquire 3 or 4 echoes at 2.6 mm isotropic resolution with MB factors of 3 or 4. The use of GRAPPA 2 is not mandatory, but recommended in order to shorten the first echo time as much as possible. We have implemented protocols around these parameters in our PrismaFit scanner with the CMMR multiband sequence. Which scanner and sequence are you using?

Regarding the coverage in inferior temporal cortex, temporal pole and orbitofrontal cortex, you might consider using 3dSkullStrip (or other skull-stripping algorithm tailored for anatomical images) in the first echo image. Are you collecting single-band reference (aka SBref) images too? If so, I would recommend using that image to create the mask.

@mvdoc (Contributor)

mvdoc commented Oct 22, 2020

@mvdoc if your goal is TR < 2 seconds, it is perfectly viable that you acquire 3 or 4 echoes at 2.6 mm isotropic resolution with MB factors of 3 or 4. The use of GRAPPA 2 is not mandatory, but recommended in order to shorten the first echo time as much as possible. We have implemented protocols around these parameters in our PrismaFit scanner with the CMMR multiband sequence. Which scanner and sequence are you using?

@CesarCaballeroGaudes, this sounds very interesting and relevant to what I'm doing! We are running these tests on a Siemens Prisma fit as well, using the CMMR multiband sequence. I'm concerned about GRAPPA 2 because I read it can be more sensitive to subject motion, and I expect some of my subjects to have increased motion artifacts. Unfortunately, I stopped testing this because of the pandemic, I hope to resume testing soon. Would it be possible for you to share the parameters of some of these sequences? Thanks!

Regarding the coverage in inferior temporal cortex, temporal pole and orbitofrontal cortex, you might consider using 3dSkullStrip (or other skull-stripping algorithm tailored for anatomical images) in the first echo image. Are you collecting single-band reference (aka SBref) images too? If so, I would recommend using that image to create the mask.

Yes, I'm collecting SBref images too. Your suggestion might be relevant for changing parts of the preprocessing pipeline in fMRIPrep for multiecho (ping @effigies). At the moment I used tedana (and t2smap) mostly within the fMRIPrep workflow.

@mvdoc (Contributor)

mvdoc commented Oct 22, 2020

Also for reference, this is how aggressive the adaptive mask is on my dataset (see nipreps/fmriprep#2309 (comment)).

I asked, and I could share this small resting-state dataset privately, if it can be useful to test different strategies for the mask.

@CesarCaballeroGaudes (Contributor)

Attached is the MRI protocol of the ME-fMRI sequence we have employed in the following study (credit to @smoia):
https://www.biorxiv.org/content/10.1101/2020.08.18.256479v2
MEfMRIsequence_BCBL_5echoes_TR1500_MB4_2.4x2.4x3mm.pdf

@mvdoc (Contributor)

mvdoc commented Oct 27, 2020

Awesome, thanks @CesarCaballeroGaudes and @smoia :)

@tsalo (Member, Author)

tsalo commented Nov 20, 2020

To follow up from today's call: what if we completely respect the explicit mask when provided? That way, when fMRIPrep calls the workflow, it will provide a brain mask and we can use lower-quality approaches for "bad" voxels in that mask instead of dropping them. The lower-quality approaches would be (1) an unweighted average of two echoes and (2) using the first echo directly when there's only one good echo.

@handwerkerd (Member)

handwerkerd commented Nov 20, 2020

I wanted to move my notes from today's dev call discussion #620 to here.

  • If there is even one echo with usable data, that data should be possible to include in the output.
  • The one challenge is, if there are <3 echoes then you don’t get a T2* estimate to do a weighted average.
  • It is possible to set up different rules for <3 echoes, and there is agreement that this should be an option. We could either do an unweighted average of the echoes, or use either a default weighting or a weighting based on neighboring voxels when there are 2 good echoes.
  • The raw magnitude scaling for voxels with 1-2 echoes may be different. This may be a good thing since it would make clear to end-users that something different is happening at these locations.

Current plan

  • Make an option to include voxels with 1-2 good echoes, but don’t make that the default.
  • Consider keeping those voxels differently scaled
  • We may want to make keeping data with <3 echoes the default eventually, but we'll first set it up as a non-default option and see what it looks like in practice, with the goal of considering making it the default in the future.
  • Another option is to have a way to output both the optimally combined data and data combined with a more liberal threshold that includes voxels with <3 good echoes (such an image shouldn't be called "optimally combined").
  • Specifically for fMRIprep, we definitely want to have this as a flag that can be passed. Several people suggested that the version that keeps voxels with <3 good echoes should be the default for fMRIprep, since the common fMRIprep use case doesn't include the denoising steps where this becomes a bigger issue. If this is the default for fMRIprep, we'd definitely want to also automatically write out the adaptive mask so fMRIprep users can know which voxels have <3 good echoes.
  • As @tsalo wrote in the previous comment, we can also accept the fMRIprep mask and output data at all those voxels as long as there is at least 1 good echo.

@tsalo (Member, Author)

tsalo commented Dec 16, 2020

For right now, what do folks think of adding a new argument to make_adaptive_mask named "threshold"? We could hardcode "threshold" to be 1 for t2smap_workflow and 3 for tedana_workflow.

This would retain the old behavior of using the first two echoes for optimal combination when only one good echo is available, but that would only be triggered for the t2smap_workflow.
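The thresholding step itself is simple to picture. Here is a hypothetical sketch of a counts-based adaptive mask binarized at a workflow-specific threshold; the real make_adaptive_mask in tedana has a different signature and does more work (it also computes the per-voxel good-echo counts):

```python
import numpy as np

def threshold_adaptive_mask(good_echo_counts, threshold=3):
    """Binarize an adaptive mask at a workflow-specific threshold.

    good_echo_counts : array of per-voxel good-echo counts (0..n_echoes)
    threshold        : minimum number of good echoes to keep a voxel,
                       e.g. 1 for t2smap_workflow, 3 for tedana_workflow

    Hypothetical helper for illustration; not tedana's actual function.
    """
    counts = np.asarray(good_echo_counts)
    return counts >= threshold

counts = np.array([0, 1, 2, 3, 4])
mask_t2smap = threshold_adaptive_mask(counts, threshold=1)  # keeps 4 voxels
mask_tedana = threshold_adaptive_mask(counts, threshold=3)  # keeps 2 voxels
```

With threshold=1, any voxel with at least one good echo survives into the combined output; with threshold=3, only voxels supporting a proper T2* fit are retained.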

@emdupre (Member)

emdupre commented Jan 28, 2021

Hi everyone (especially @mvdoc and @cjl2007) ! We've just merged in #635, which now differently thresholds the adaptive mask depending on workflow. Essentially, the answer to the question posed in the issue title is now 'yes.'

This should better align with expectations of increased signal coverage with the t2smap workflow, but of course we'd love it if you could test the current code on your own data! We're planning to cut a release in the next week or so, so for now you would need to update directly from the master branch. But if you have any feedback, we'd be happy to incorporate it before we release!

@emdupre (Member)

emdupre commented Feb 5, 2021

Should we close this since we'll be releasing today ?

Your feedback is still appreciated, of course ! But hopefully the release will make it a bit easier to integrate into eg fMRIPrep -- and apply to your data 😸

@jbteves (Collaborator)

jbteves commented Feb 5, 2021

I agree with closing on release. Speaking of, who is actually doing the release?
