Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix diffusion mask in voxels outside the FOV #129

Merged
merged 19 commits into from
Oct 29, 2019

Conversation

MichielCottaar
Copy link
Contributor

The voxels inside the FOV at the first b0 might not be included in some other volumes due to movement. In eddy the signal these voxels, which are not included in all the volumes, are set to zero. In the HCP pipeline the output of eddy is resampled to T1 space. During this resampling, these voxels become non-zero due to the spline interpolation and are included within the final nodif_brain_mask.

This patch fixes this by:

  1. Creating a mask which includes only those voxels consistently within the FOV across all volumes as determined by eddy (called fov_mask). This mask is created from the eddy (or eddy_combine) output before resampling.
  2. Transforming this mask to T1-weighted space using trilinear interpolation. This is then thresholded at a value of 1, so that any voxel which are at the edge of the FOV are excluded.
  3. Setting to zero any data outside this transformed FOV mask

fix_mask

This image shows the change in the final mask for subject 101309, which was the first subject where I noticed this issue, when looking at DTI fits. Note the very low range of the colour bar when plotting the B0 on the left. This is to visualise the striping caused by the spline interpolation.

@coalsont
Copy link
Member

coalsont commented Sep 9, 2019

If the mask's maximum value is 1 before resampling, thresholding at exactly 1 makes me nervous, as you are expecting the 8 interpolation factors to always sum to exactly 1 and never 0.999999. Thresholding at 0.999 should be functionally equivalent to what is desired, while allowing plenty of room for floating point rounding error.

Of course, if fsl's thresholding already has a baked-in rounding error margin, or their trilinear interpolation implementation is very careful, and they will never change them, 1 may be fine as-is. I did include a tiny rounding margin in wb_command -*-math's equality tests, though I'm not sure it would be big enough for the worst cases of multiply and add rounding in some possible implementations of trilinear interpolation.

@mharms
Copy link
Contributor

mharms commented Sep 19, 2019

Can you maybe include an image of fov_mask itself? (I'm assuming that the image above is of the final nodif_brain_mask?)

Also, just to clarify, eddy only zeros out voxels that moved outside the FOV in that particular volume/frame, right? (i.e., it doesn't automatically zero out that voxel across all frames as well? Because if it did, I'm wondering if we could simplify and just use the Diffusion/data/nodif file to generate the fov_mask and then warp that to T1w space.)

@mharms
Copy link
Contributor

mharms commented Sep 19, 2019

And one other thought: It would be great if we could quantify the fraction of voxels that were zeroed-out due to movement, and put that into a text file to promote a simple QC check.

@MichielCottaar
Copy link
Contributor Author

Eddy zeros out all the volumes if one of them is out of the FOV. So we should be able to use the nodif to generate the FOV mask.

For reference in this discussion is an image of the FOV mask:

image

#Register diffusion data to T1w space. Account for gradient nonlinearities if requested
if [ ${GdcorrectionFlag} -eq 1 ]; then
echo "Correcting Diffusion data for gradient nonlinearities and registering to structural space"
${FSLDIR}/bin/convertwarp --rel --relout --warp1="$DataDirectory"/warped/fullWarp --postmat="$WorkingDirectory"/diff2str.mat --ref="$T1wRestoreImage"_${DiffRes} --out="$WorkingDirectory"/grad_unwarp_diff2str
${FSLDIR}/bin/applywarp --rel -i "$DataDirectory"/warped/data_warped -r "$T1wRestoreImage"_${DiffRes} -w "$WorkingDirectory"/grad_unwarp_diff2str --interp=spline -o "$T1wOutputDirectory"/data
${CARET7DIR}/wb_command -volume_dilate $DataDirectory/warped/data_warped $DILATE_DISTANCE NEAREST $DataDirectory/warped/data_warped_dilated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command is -volume-dilate, no underscores. I suggest at least trying the weighted method with -exponent 7 when you test this, for comparison. However, both methods are somewhat slow in 1.3.2, so I would suggest taking a somewhat-dilated brain mask, finding its intersection with the fov-introduced zeros, and using that as the -bad-voxel-roi argument, to speed things up a bit (by not dilating unnecessary locations).

Also, wb_command requires the complete filename, including extension, so add .nii.gz to them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching the typos. The computation speed is indeed a major problem. I tested both versions of the dilation (resulting b0 attached) and it took 1 hour for NEAREST and 5 hours for WEIGHTED with exponent of 7. The weighted version does look a bit better, but it does not seem worth the extra 4 hours of computation time as the effect on the spline interpolation will probably be very small (I have not tested that yet).

I thought $DILATE_DISTANCE would essentially do what you suggest to do with the -bad-voxel-roi, i.e., by setting it to 12 * 1.25 I only compute the dilation for voxels withing 12 voxels of the FOV. I'll test if the run time is shortened by using -bad-voxel-roi.

I'm still trying to figure out how to set the extent of the dilation. There is no hard boundary in how far sharp boundaries are propogated in the spline. 12 voxels is probably overkill, but I'm not sure at all how far down I can safely go.

b0_dilation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a look at the code for -volume-dilate and it appears to me that for this specific application it is very inefficient, because it iterates over all volumes independently. However, in this case the -bad-voxel-roi and -data-roi are the same for all volumes, so once it has determined for a voxel the nearest neighbour or weights in the kernel for a single volume, it could apply it to all volumes.

I guess if we're really worried about runtime we could (1) assign each voxel in -data-roi an index, (2) run -volume-dilate with NEAREST flag, (3) assign diffusion data to each voxel based on its index. That should be a lot faster than running -volume-dilate on the whole diffusion dataset at once, but would not work with the WEIGHTED flag.

I might just be overthinking it and should just compute the dilation for a more limited -bad-voxel-roi as you suggested and ignore the inefficient runtime...

Copy link
Member

@coalsont coalsont Sep 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The smaller the dilation distance, the faster the command will be. Even 1 voxel would be an improvement over the previous results, but 4 might be enough to get most of the possible benefits. I suggested intersecting the negation of the FoV with a somewhat dilated brain mask specifically because it should pinpoint only the voxels contributing to this strong localized ringing effect (and the fewer voxels that -volume-dilate has to fill in, the faster it runs).

The latest code in github is already faster than the 1.3.2 release (it now uses an octtree of only the valid source voxels and uses a kernel cutoff distance based on the closest voxel, 1.3.2 used full-size stencils everywhere and checked against the ROIs repeatedly). You can try the dev_latest here if you want to compare the timing to 1.3.2:

http://brainvis.wustl.edu/workbench/

Cache locality effects might prevent the "each voxel across all volumes" approach from being noticeably faster, and unfortunately the case without a -bad-voxel-roi or -data-roi can have a different set of valid voxels in every frame, so the code would need to be more complicated to support this optimization (it would be better to save the set of source voxels for each output voxel and still go frame by frame).

If you aren't concerned about the values created by the dilation, you might as well just use fsl's dilation, if it is fast. Weighted dilation will always be slower, since it takes data from more than one source voxel for every output voxel. There is also the possibility of -volume-smoothing with the -fix-zeros option (because gaussian smoothing in an orthogonal grid decomposes into 1D smoothings, it is faster - the weighted dilation is trickier because it doesn't use a fixed-size gaussian, but instead a kernel that is invariant to scaling), though that smoothing method assumes that the data can't have real zeros in it.

@glasserm
Copy link
Contributor

I'm a bit lost here, why are we dilating?

@MichielCottaar
Copy link
Contributor Author

Because in the diffusion MRI data from eddy there is a sharp edge at the edge of the field of view (FoV). Voxels which in any volume are outside of the FoV are set to zero. The spline interpolation used in the resampling to T1w space is very sensitive to these type of sharp edges, which leads to striping artifacts. Dilation will move such striping artifacts outwards, which minimizes their effect on those voxels fully within the FoV (which we want to keep).

@glasserm
Copy link
Contributor

So a few iterations of nearest neighbor should be fine if you will be remasking anyway?

@MichielCottaar
Copy link
Contributor Author

Yeah, that would probably be enough. Nearest neighbour across 4 voxels (using -volume-dilate) took a bit less that 10 minutes, so that seems reasonable.

${FSLDIR}/bin/flirt -in "$DataDirectory"/data_dilated -ref "$T1wRestoreImage"_${DiffRes} -applyxfm -init "$WorkingDirectory"/diff2str.mat -interp spline -out "$T1wOutputDirectory"/data

#Create a mask covering voxels within the field of view for all volumes
${FSLDIR}/bin/fslmaths "$DataDirectory"/nodif -abs -bin "$DataDirectory"/fov_mask
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I suggested using nodif as a proxy for the internal fov mask within eddy (which, ideally, would write out the fov mask as an explicit output). But do we need to be concerned about the possibility of randomly getting a "true zero" within the first frame that wasn't actually part of the eddy internal fov mask?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed this issue by actually checking for the -Tmax after taking the absolute value in the data. So we will only exclude voxels containing only zeroes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe my concern about this possibility was misplaced, since getting a true zero is very unlikely. The problem with the -Tmax approach is what if there are zeros in say half the frames for a given voxel, then we don't want to use that voxel. I guess that scenario shouldn't happen if eddy itself is enforcing a common fov mask across all frames. But I think it is better if we write this code to be robust to whatever eddy chooses to do in that regard. Which would argue for not using nodif as a proxy for the internal eddy fov mask, and using -Tmin instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-Tmin means if even a single frame has an exact zero there, it is considered bad, so it seems worse than the original simpler tactic of looking at just the first frame.

If we want a more robust solution, what if we thresholded to find all exact zeros, counted the number of frames a voxel is zero with a sum operation, and then used a second threshold (say, at least 10% of frames) on that?

However, if we can in fact model the FoV as a mask directly without depending on exact zeros in the output, that would resolve all such concerns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, unless we think eddy is likely to get changed, I don't think we should be coding for what it might be changed to do, so if it does change to per-volume masking, we should expect to have to rewrite this code anyway (and the correct thing to do then would probably be separate ROIs per frame).

@mharms
Copy link
Contributor

mharms commented Oct 4, 2019

@MichielCottaar See #138 for some files that I'm creating to allow us to quickly assess the spatial coverage issue with the fMRI data. Ideally, the changes you are working on would incorporate something similar for quickly assessing the coverage in the dMRI data. I'd like to get that PR wrapped up early next week, and it would be nice if we could get this PR done at the same time as well.

@MichielCottaar
Copy link
Contributor Author

I've rebased onto master, hence all the repeat of the commits.

@mharms Sure, the latest commit should add such a text file.

I'm just running the new pipeline with and without gradient non-linearities (see https://git.fmrib.ox.ac.uk/ndcn0236/mask_striping). If that works, I'm happy for this to be merged.

@MichielCottaar
Copy link
Contributor Author

Ok, this version runs on jalapeno. It produces a summary of the coverage, which for my test subject contains:

PctCoverage, NvoxFinalMask, NvoxBrainMask
98.2133, 790150, 804524

@mharms
Copy link
Contributor

mharms commented Oct 16, 2019

-abs -Tmin -bin, with a -fillh thrown in as a back-up to ensure no holes, seems reasonable to me.

However, while looking through the code, I thought of three other things:

  1. We are leaving the same sort of striping problem present in the GDC case in the Diffusion/data space data. Seems like we should derive the actual Diffusion space fov_mask in eddy_postproc.sh and apply it as part of that script. (Which also makes sense conceptually, because the derivation of the Diffusion space fov_mask doesn't have anything to do with the conversion to T1w structural space per se). Then, in DiffusionToStructural.sh we just resample the fov_mask into the "$T1wOutputDirectory".
  2. In the no GDC condition, determining the fov_mask in DiffusionToStructural.sh is problematic, due to the fslmaths ${datadir}/data -thr 0 ${datadir}/data operation at the end of eddy_postproc.sh. This is another reason for moving the determination of fov_mask into eddy_postproc.sh (so that the fov_mask can be determined accurately in the no GDC condition).
  3. Separate from above, but related, once we apply the fov_mask in Diffusion space, I would argue that we should comment out (with a date provided) the fslmaths ${datadir}/data -thr 0 ${datadir}/data operation in eddy_postproc.sh, so that any negative values from eddy itself are preserved and not artifically truncated at zero.

@mharms
Copy link
Contributor

mharms commented Oct 16, 2019

Expanding on my point 3 above, I see that we also do the same fslmaths -thr 0 operation toward the end of DiffusionToStructural.sh. So maybe that particular operation we should leave alone in both eddy_postproc.sh and DiffusionToStructural.sh for consistency considerations. We just need to be careful about deriving any masks after that operation.

@MichielCottaar
Copy link
Contributor Author

I have moved the FoV mask calculation to post_eddy and added a dilation step when interpolating the diffusion data in post_eddy (points 1 and 2 from Mike). I'll need to rerun the code on jalapeno to see if any new bugs have arisen.

Copy link
Contributor

@mharms mharms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just what I was thinking, but there is a bug in one line that is going to prevent the $GdCoeffs='NONE' condition from working in eddy_postproc.sh. Once that is fixed, and both the GDC and no GDC conditions are confirmed to work, this is ready to go in my opinion.

@mharms mharms self-requested a review October 17, 2019 21:51
Copy link
Contributor

@mharms mharms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we fix the lingering bug, and the GDC and no GDC conditions are both confirmed to work, this looks ready to me.

@MichielCottaar
Copy link
Contributor Author

With these latest fixes both the GDC and no GDC conditions have produced T1-aligned diffusion data with the FoV mask applied.

@glasserm glasserm merged commit 36c6beb into Washington-University:master Oct 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants