
[FIX] Calculate Kappa and Rho on full F-statistic maps #714

Merged
merged 4 commits into ME-ICA:main
Apr 15, 2021

Conversation

tsalo
Member

@tsalo tsalo commented Apr 8, 2021

Closes #713, which was probably introduced in #358.

Changes proposed in this pull request:

  • Calculate Rho and Kappa from the F-statistic maps compiled across adaptive mask values, instead of from maps in which only voxels with adaptive mask == number of echoes are filled in.
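The effect of the change can be sketched numerically. This is a minimal illustration, not tedana's actual kappa computation (the array sizes, weight map, and threshold of 3 good echoes here are all assumed for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, n_echoes = 100, 5

# Hypothetical adaptive mask: number of "good" echoes per voxel (3..5)
adaptive_mask = rng.integers(3, n_echoes + 1, size=n_voxels)
f_r2 = rng.uniform(1, 50, size=n_voxels)    # voxelwise F-statistics
weights = rng.uniform(0, 1, size=n_voxels)  # per-voxel component weights

# Old behavior: only voxels where adaptive_mask == n_echoes carry values;
# all other voxels contribute zeros to the weighted average.
full = adaptive_mask == n_echoes
kappa_old = np.sum(weights[full] * f_r2[full]) / np.sum(weights)

# Fixed behavior: every voxel in the adaptive mask contributes its F-statistic.
kappa_new = np.sum(weights * f_r2) / np.sum(weights)

print(kappa_old, kappa_new)  # kappa_old is deflated relative to kappa_new
```

Because the zeroed voxels still count toward the denominator in the old behavior, the old weighted average is systematically deflated whenever any voxel has fewer than the full number of echoes.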

@codecov

codecov bot commented Apr 9, 2021

Codecov Report

Merging #714 (a9e02ea) into main (167231b) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #714   +/-   ##
=======================================
  Coverage   93.09%   93.09%           
=======================================
  Files          25       25           
  Lines        1782     1782           
=======================================
  Hits         1659     1659           
  Misses        123      123           
Impacted Files Coverage Δ
tedana/metrics/kundu_fit.py 95.58% <100.00%> (ø)


jbteves
jbteves previously approved these changes Apr 9, 2021
Collaborator

@jbteves jbteves left a comment


LGTM, thanks for fixing @tsalo !

@jbteves jbteves requested a review from handwerkerd April 9, 2021 16:36
@handwerkerd
Member

This generally looks good, but could you line it up with the current main? Without #712 merged into this PR you'd be thresholding F_R2 with f_max and then discarding the thresholded map before it influences anything.

@handwerkerd
Member

Also, we've been lax about using it, but since this will change the output, shouldn't the PR title also include the [BRK] tag? There are fixes that wouldn't change the output; this one probably does.

@tsalo
Member Author

tsalo commented Apr 12, 2021

It seems reasonable to identify any PR that changes outputs, although I think a label would be better than a prefix. I will create a new breaking label and apply it here. We can also go through past PRs that have changed the integration test outputs and label those. That should help out with #622 and #699.

EDIT: Never mind, the breaking change label already exists. But applying the label to past PRs would be great.
EDIT 2: 🤦 we already had an output change label. I've added both to this PR just to be safe, but the other PRs flagged in #622 already have it.

@tsalo tsalo added the breaking change and output-change labels Apr 12, 2021
@jbteves
Collaborator

jbteves commented Apr 13, 2021

To clarify an earlier comment by @handwerkerd: it's because the parent commit is holding the outdated version and there were no new commits since the merge. When we merge, it will be lined up properly with the current main.

@handwerkerd
Member

I just ran tedana locally and looked at the outputs for the 5-echo test data set. They are non-trivially different.
For main, component 0 has kappa=181.73 and rho=30.78; for this PR, kappa=317.62 and rho=30.9. The kappa/rho scatter plots look similar, but the scaling is different, particularly for the kappa values.
Knowing what this change does, the new code should be more accurate, and thus better, but the new version rejects component 0 (i.e., the main visual cortex task response). This component is in the top right corner of the scatter plots below.

Others should run and look at this on their own to confirm that I didn't just mess something up.

If you're seeing this as well, my suspicion is that this specific change might be OK, but the changed scaling may be interacting with decision tree thresholds that were tuned for the older, flawed code. Another possibility is that, with the recent adaptive mask changes, the voxel-wise scaling of weights for the kappa & rho values wasn't done correctly.

main: [kappa/rho scatter plot]

#714: [kappa/rho scatter plot]

@tsalo
Member Author

tsalo commented Apr 13, 2021

@handwerkerd thanks for testing the changes. Given that this fixes a bug introduced in #358, I think this may merit a separate issue about the validity of the current thresholds in the decision tree post-#358.

@handwerkerd
Member

@tsalo, the trouble is that if fixing one bug creates a seriously misclassified component, we can't just merge this. I'm not sure I have time today, but I may want to go back to older versions of the code to see whether the scaling was this different before adaptive masking was added. If this PR specifically is causing a big scaling change, that makes me think there might be another bug we need to catch before merging.
Put another way, before adaptive masking we'd only include voxels with 5 good echoes. If also including voxels with only 3 good echoes scales the kappa values by nearly 2X, something else is wrong with the code.

@tsalo
Member Author

tsalo commented Apr 13, 2021

@tsalo, the trouble is that if fixing one bug creates a seriously misclassified component, we can't just merge this.

I would argue that the introduction of a bad misclassification is an issue that must be resolved before a release, but the scope of this PR is fixing a bug in the code, rather than an overarching problem in the decision tree. We definitely can't release 0.0.10 before the decision tree has been investigated, but I don't think that should block this.

Put another way, before adaptive masking we'd only include voxels with 5 good echoes. If also including voxels with only 3 good echoes scales the kappa values by nearly 2X, something else is wrong with the code.

If there are as many 3- and 4-echo voxels (together) as 5-echo voxels, and the F-statistics are roughly the same, then that seems like a reasonable scaling. To reproduce:

import numpy as np

# 3 voxels for each value in adaptive_mask
adaptive_mask = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5])

# voxels with >= 3 echoes have actual values in the averaging
arr_3andup = np.array([0, 0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 5, 5, 5, 5])
np.mean(arr_3andup)  # 3.0

# voxels with only 5 echoes have actual values in the averaging
# other voxels are incorrectly set to zero
arr_5only = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5])
np.mean(arr_5only)  # 1.0

@handwerkerd
Member

handwerkerd commented Apr 13, 2021

I'm trying to figure out what's going on. ~23000 voxels have 5 good echoes vs. ~900 voxels with only 3 or 4 good echoes. There's no way a 2X scaling can come from zeroing out just those 900 voxels. I've been digging into this section of code line by line and, as far as I can tell, this 2X scaling difference does not occur in the area where you've changed the code (i.e., through line 203). The kappa values are different, but more like 117.1 vs. 117.3.
Either I messed something up when I ran the full script, or these changes are altering something else downstream.
I'll try to look into this a bit more tomorrow.
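The voxel-count argument above can be checked with a back-of-the-envelope calculation (the counts are the approximate ones quoted in the thread, and the equal average F-statistic across groups is an assumption made for the sketch):

```python
# Approximate voxel counts reported for the 5-echo test data set
n_full, n_partial = 23000, 900

# Assume both groups have a similar average F-statistic
f_mean = 10.0

# Old code: the ~900 partial-coverage voxels are zeroed but still counted
mean_old = (n_full * f_mean) / (n_full + n_partial)
mean_new = f_mean  # fixed code: all voxels carry real values

ratio = mean_new / mean_old
print(ratio)  # ~1.04, i.e. a roughly 4% shift, nowhere near 2X
```

Under these assumptions, zeroing ~900 of ~23900 voxels can only deflate the mean by their share of the total, so the observed 2X difference would have to come from somewhere else.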

@handwerkerd
Member

handwerkerd commented Apr 15, 2021

I've been looking into this a bit and couldn't figure out where the big change in kappa & rho values came from. I reran this PR and don't see the same badly scaled result. My best guess is that I somehow ran a non-updated version of this PR last time.
Interestingly, the difference in results between this PR and main is that the kappa values are nearly the same, but the rho values are mostly slightly larger (still a <0.5 difference). This makes sense: the voxels where echoes 4 & 5 were bad are likely to have more dropout, and thus more rho weighting. By no longer accidentally excluding these voxels, the fix increases the rho values for a bunch of components.
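The direction of the rho shift can be mimicked with a toy calculation. All counts and F-values here are illustrative (loosely based on the voxel counts mentioned earlier in the thread), not taken from the actual data:

```python
import numpy as np

# Toy setup: most voxels keep all 5 echoes; a small set of dropout voxels
# (where echoes 4 & 5 were bad) carries higher S0-model F-statistics.
f_s0_good = np.full(23000, 5.0)
f_s0_dropout = np.full(900, 20.0)

rho_old = f_s0_good.mean()  # dropout voxels accidentally excluded
rho_new = np.concatenate([f_s0_good, f_s0_dropout]).mean()

print(rho_new - rho_old)  # small positive shift in rho
```

Because the newly included voxels are few but skew toward S0-like (dropout) signal, the fix nudges rho up slightly for many components while leaving kappa nearly unchanged.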

Collaborator

@jbteves jbteves left a comment


LGTM, thanks again @tsalo !

@jbteves jbteves merged commit 2363b4f into ME-ICA:main Apr 15, 2021
@tsalo tsalo deleted the fix-f-maps branch April 16, 2021 15:17
jbteves added a commit that referenced this pull request May 4, 2021
* Initial work on a class.

* Clean up.

* More work.

* Revert mistake.

* [FIX] Adds f_max to right place (#712)

Moves f_max thresholding to the step where the R2/S0 maps are written out, rather than just before kappa/rho calculation.

* [FIX] Calculate Kappa and Rho on full F-statistic maps (#714)

* Calculate rho and kappa on full maps.

* Add missing test outputs.

* Revert changes to outputs.

* Fill out docstrings for file-writing class.

* More work.

* [MAINT] Drop 3.5 support and begin 3.8 and 3.9 support (#721)

* Drop 3.5 support and add 3.8/3.9 support.

* Drop 3.5 tests and add 3.8/3.9 tests.

* Update installation instructions.

* Progress on refactor

* Fix some errors

* Fix style

* Switch to fstrings where possible

* Update tedana/io.py

Co-authored-by: Taylor Salo <tsalo006@fiu.edu>

* Update tedana/io.py

Co-authored-by: Taylor Salo <tsalo006@fiu.edu>

* Address @handwerkerd docstring review

* Replace generator with io_generator

Co-authored-by: Joshua Teves <jbtevespro@gmail.com>
Co-authored-by: Joshua Teves <joshua.teves@nih.gov>
Labels
breaking change: Will make a non-trivial change to outputs

Successfully merging this pull request may close these issues.

Kappa and Rho are only calculated from "perfect" voxels