BUG: Fix bugs with corrmap computation #11858
Conversation
@jona-sassenhagen any chance you have time (& inclination) to review this one?
Interesting. I have no recollection of how I originally implemented this. The code looks horrible -- the names are so hard to read! I think taking the median was probably intentional; the problem with normalization was probably just a bug that I never discovered because the results were looking sensible.
The original EEGLAB/MATLAB code is this, btw:

```matlab
%%%%%%%%%%%%%%%%%%%%root mean square normalization %%%%%%%%%%%%%%%%%%%%%%
rms=sqrt( mean( ( comp{ind_c(g)}(:,corr_dec_abs(g,2)) ).^2 ) );
sel_ic(:,g)=(comp{ind_c(g)}(:,corr_dec_abs(g,2)))./rms;
ttt=find(-corr_dec_abs(g,1)==all_info(:,1));
```
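For readers less at home in MATLAB, here is a rough NumPy transliteration of that normalization step (the variables are hypothetical stand-ins, and indices are 0-based here rather than MATLAB's 1-based):

```python
import numpy as np

# Stand-ins for the EEGLAB variables: one dataset's (n_channels,
# n_components) component maps and the best-matching component's index.
comp_g = np.random.randn(32, 20)   # comp{ind_c(g)} in the MATLAB code
best = 4                           # corr_dec_abs(g, 2) in the MATLAB code

col = comp_g[:, best]              # the selected component map
rms = np.sqrt(np.mean(col ** 2))   # root mean square of the map
sel_ic_g = col / rms               # sel_ic(:, g): RMS-normalized map
```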
Thanks for the fix, Eric.
```python
# Which is this (rms = Frobenius norm = np.linalg.norm):
newtarget = np.mean(
    [(m * p) / np.linalg.norm(m) for m, p in zip(maxmaps, polarities)], axis=0
)
```
Can't this be written in vectorized form that would also be more readable? Like

```python
(np.array(maxmaps) * np.array(polarities) / np.linalg.norm(np.array(maxmaps))).mean()
```

or something like that.
I think that might work if you pass the right axis (and maybe `keepdims=True`) to `np.linalg.norm`.
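For reference, a sketch of what that vectorized version might look like, assuming `maxmaps` can be stacked into a regular 2-D array (which, as the next comments note, may not hold):

```python
import numpy as np

# Hypothetical shapes for illustration: 5 subjects' best-matching maps,
# 32 channels each, with a +/-1 polarity per map.
maxmaps = np.random.randn(5, 32)
polarities = np.sign(np.random.randn(5))

# Normalize each map by its own L2 (Frobenius) norm and flip polarity,
# then average across maps; keepdims=True makes the norms broadcast.
norms = np.linalg.norm(maxmaps, axis=1, keepdims=True)  # shape (5, 1)
newtarget = (maxmaps * polarities[:, np.newaxis] / norms).mean(axis=0)
```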
Maybe with the right axis manipulations... I'll try it and you can see if it's better
... actually I'm not sure this will work unless we're guaranteed non-raggedness of the `np.array`, which we might not be if there are different numbers of components (which I think could happen if the number of PCA vectors differs across subjects, for example). But I can unify the iteration step at least to a single one, which I'll push.
Yeah, we cannot assume non-raggedness (IIRC raggedness even happens in our tutorial docs on `corrmap`).
In that case I think what's here is about as readable as you can get with as few temporaries in memory (just one for the current sum plus one that is being added).
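A minimal sketch of that accumulation pattern (illustrative names, not the actual PR code), assuming each map is a same-length 1-D array so the maps can be summed even though a nested, ragged list could not be stacked into one rectangular array:

```python
import numpy as np

# Illustrative flat list of maps (32 channels each) with per-map polarities;
# a plain list avoids building a possibly ragged multi-dimensional array.
maxmaps = [np.random.randn(32) for _ in range(7)]
polarities = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0, 1.0]

# Single accumulation loop: only the running sum and the map currently
# being added are alive in memory at any time.
newtarget = np.zeros_like(maxmaps[0])
for m, p in zip(maxmaps, polarities):
    newtarget += (m * p) / np.linalg.norm(m)
newtarget /= len(maxmaps)
```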
Thanks @larsoner
Pretty sure from going back to the original paper that our `corrmap` code has been (EDIT: a little bit) incorrect ever since it was added in 0.10 in #2104 (which took over #1985, which took over #1795 -- the bug is present in all). We used the mean of some weirdly transformed maps instead of just the mean of RMS-normalized maps as noted in the paper. Our code in `main` did something like a z-score, but not quite, even when fixing the incorrect usage of `std` to `mean`.

Fortunately I don't think it makes a big difference -- at least in our tutorial and tests, the `newtarget` in this PR and the `newtarget` on `main` have a correlation coefficient > 0.99. But it should be better to use the correct / reference implementation, I think.

Looking at the paper, one difference still remains: we use the median correlation (across ICs) rather than a mean with a Fisher z-transformation. I think that part is okay to stay as is -- especially since some of our correlations can be near 1, which is a problem when subjected to a Fisher z-transformation. So I think the existing use of `median` is a justified departure for us.

Closes #11858
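A quick numerical illustration of that Fisher z concern (just standard NumPy; `arctanh` is the Fisher z-transform):

```python
import numpy as np

# arctanh (the Fisher z-transform) diverges as |r| -> 1, so a z-averaged
# correlation gets dominated by near-1 values; the median stays robust.
r = np.array([0.5, 0.9, 0.999, 0.999999])
print(np.arctanh(r))                   # grows without bound as r -> 1
print(np.tanh(np.arctanh(r).mean()))   # mean via Fisher z, back-transformed
print(np.median(r))                    # the robust alternative used here
```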