
[ENH] Adjusted SSD algorithm to support non-full rank data #11458

Merged
merged 11 commits into mne-tools:main on Feb 13, 2023

Conversation

tsbinns
Contributor

@tsbinns tsbinns commented Feb 7, 2023

Reference issue

Spatio-spectral decomposition (SSD) implemented in mne.decoding.SSD can fail to fit data that is not full rank, even when the rank argument is used to specify an appropriate number of components for reducing the dimensionality of the data. This is because the dimensionality reduction performed when computing covariance produces matrices which are not positive definite, causing the subsequent eigenvalue decomposition to fail with a LinAlgError:
[screenshot: LinAlgError traceback raised when fitting SSD to non-full rank data]
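The failure can be reproduced outside of MNE in a few lines (a hypothetical sketch, not MNE's code: a flat channel makes the covariance singular, and the generalized eigendecomposition then rejects it as not positive definite):

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical sketch: 5-channel data with one flat (all-zero) channel,
# so the covariance matrices have rank 4 rather than 5
rng = np.random.default_rng(0)
data = np.vstack([rng.standard_normal((4, 1000)), np.zeros((1, 1000))])
cov_signal = data @ data.T / data.shape[1]
cov_noise = cov_signal.copy()

try:
    # the generalized eigendecomposition needs cov_noise positive definite
    eigh(cov_signal, cov_noise)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print(raised)  # True: the singular covariance raises LinAlgError
```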

What does this implement/fix?

This change adds a new private function, mne.decoding.ssd._dimensionality_reduction() (called by the fit() method of mne.decoding.SSD), which performs the dimensionality reduction after the covariance matrices have been computed, allowing SSD to fit both full and non-full rank data. The rank argument of SSD keeps the same behaviour as before; however, it is now used in _dimensionality_reduction() rather than being passed when computing covariance.

The updated algorithm is based on the one implemented and used by the original authors of the method (see e.g. https://github.com/svendaehne/matlab_SSD/blob/master/ssd.m). Compared to the current algorithm, the updated approach is also closer to that employed in the literature for related dimensionality reduction and reconstruction problems (see e.g. Ewald et al., 2012, NeuroImage).
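For illustration, the kind of rank-subspace projection described above can be sketched as follows (my own minimal sketch, not the PR's code; the name reduce_to_rank_subspace is hypothetical, and I assume here that the projection is built from the eigendecomposition of the summed covariances):

```python
import numpy as np
from scipy.linalg import eigh

def reduce_to_rank_subspace(cov_signal, cov_noise, rank):
    """Project both covariances onto a rank-dimensional whitened subspace."""
    # eigendecompose the summed covariance (an assumption of this sketch)
    eigvals, eigvects = eigh(cov_signal + cov_noise)
    # eigh returns eigenvalues in ascending order; keep the `rank` largest
    eigvals, eigvects = eigvals[::-1], eigvects[:, ::-1]
    # scale the retained eigenvectors to whiten the projected space
    proj = eigvects[:, :rank] * eigvals[:rank] ** -0.5
    cov_signal_red = proj.T @ cov_signal @ proj
    cov_noise_red = proj.T @ cov_noise @ proj
    return cov_signal_red, cov_noise_red, proj
```

The reduced matrices are positive definite on the retained subspace, so the generalized eigendecomposition no longer fails, and the resulting filters can be mapped back to channel space through the projection matrix.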

Alongside the updated algorithm, a new test (test_non_full_rank_data()) has been added to the test suite (mne.decoding.tests.test_ssd) which checks that the new approach does in fact work for non-full rank data (the existing tests already show the approach continues to work for full rank data).

Additional information

No changes to the API or documentation have been made, and the ultimate function of SSD, enhancing the contrast between a set of 'signal' and 'noise' frequencies, has been preserved; the only change is that the updated algorithm generalises to a wider range of data. In line with the unchanged API, when using SSD with full rank data or rank="full", the results will be identical to those of the current algorithm.

The argument could be made that this is an unnecessary change, as end users could instead first perform their own dimensionality reduction on non-full rank data and then fit SSD to it. However: 1) this places an additional burden on the end user; and 2) end users interested in the spatial topographies of the SSD components (SSD.patterns_) would have to transform the patterns themselves from the dimensionality-reduced space back to their original channel space. Ultimately, users less familiar with such methods may be dissuaded from using this SSD implementation, and may feel forced to rely on e.g. the MATLAB implementation.

This issue is not unique to non-full rank data (e.g. the same problem can arise when full rank data is used but a rank less than the true rank of the data is specified, such as EEG data with rank 30 and rank={'eeg':25}, raising the same LinAlgError), however this is less of an issue, as SSD can be computed on the full rank data and then the desired number of components taken. On the other hand, for non-full rank data, there is no possibility of using SSD without prior manipulations of the data.

I am of course happy to discuss the suggested changes further!
Thomas

@welcome

welcome bot commented Feb 7, 2023

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴🏽‍♂️

# License: BSD-3-Clause

import numpy as np
import scipy as sp
Member

Suggested change
import scipy as sp
from scipy import linalg

and then use linalg.XXX below

Contributor Author

Apologies, fixed now!

Contributor Author

Hi @agramfort, it seems from scipy import linalg causes test_import_nesting.py to fail. How should I handle this?

Member

nest the import inside the function that needs linalg like

def whatever(...):
    from scipy import linalg
    ...
    u, s, v = linalg.svd(...)

@agramfort
Member

@vpeterson do you have time to look? 🙏

thanks @tsbinns

@agramfort
Member

@tsbinns can you touch the file ssd_spatial_filters.py in the doc so we can see if this changes any result visually?

@vpeterson
Contributor

@vpeterson do you have time to look? 🙏

thanks @tsbinns
I will give it a look as soon as I can. Thank you @tsbinns for raising this. I will check the changes and let you know.

@tsbinns
Contributor Author

tsbinns commented Feb 9, 2023

@tsbinns can you touch the file ssd_spatial_filters.py in the doc so we can see if this changes any result visually?

@agramfort @vpeterson here is a look at the differences between the old and new implementations, using the example data of ssd_spatial_filters.py:

PSD
[figure: power spectra from the old vs. new implementations]
No differences are seen for the power spectra. This is expected, as the data used here is full rank, so no dimensionality reduction is performed (i.e. it is the same process as the existing implementation, just with some extra multiplications of the covariance and spatial filters by the identity matrix in place of any dimensionality reduction transformations).

Topographies
[figure: spatial pattern topographies from the old vs. new implementations]
Likewise, the spatial patterns of the filters are also identical. I have seen three ways of computing the spatial patterns: inverting the spatial filters, W (the method currently used for SSD), multiplying the noise covariance, C_n, with W, or the matrix division/solving of the covariance and spatial filters (given in the MATLAB code, and which I have put in the new implementation). These approaches give identical results for both full rank data and non-full rank data where W is no longer a square matrix (non-full rank data topographies shown below).
[figure: spatial pattern topographies for the non-full rank data]
As you may notice, the patterns from the second component onward look like an inverted form of those obtained when no dimensionality reduction is performed. Although the full and non-full rank data are different, I simulated the non-full rank data by changing the signal of only a single channel of the full rank data, which I would not expect to cause such a shift in the topographies. Fig. 5 of Haufe et al. (2014) suggests that large negative pattern values also indicate strong source activity, so perhaps the absolute value is considered more informative for source strength than the sign. In any case, the spatial filters for these components still produce high SNRs in the power spectra, which are also broadly similar to before:
[figure: power spectra and SNRs for the non-full rank data]
Thinking on it, I would anticipate the matrix multiplication approach to be the most computationally efficient option for recovering spatial patterns; however, if you would prefer to stick with the filter inversion approach as currently employed, you can see the results are equivalent.
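The equivalence of the filter inversion and covariance multiplication approaches can be checked in a few lines (a self-contained sketch with random full rank data, not the MNE code; it relies on scipy's eigh normalizing the eigenvectors such that W.T @ cov_n @ W = I):

```python
import numpy as np
from scipy.linalg import eigh, pinv

rng = np.random.default_rng(42)
X = rng.standard_normal((6, 2000))          # hypothetical 'signal' data
Y = X + 0.5 * rng.standard_normal(X.shape)  # hypothetical 'noise' data
cov_s = X @ X.T / X.shape[1]
cov_n = Y @ Y.T / Y.shape[1]

# columns of W are the spatial filters; scipy normalizes them such that
# W.T @ cov_n @ W = I, which makes the two pattern formulas agree exactly
_, W = eigh(cov_s, cov_n)

patterns_inv = pinv(W).T  # 1) inverting the spatial filters
patterns_cov = cov_n @ W  # 2) multiplying the noise covariance with W
print(np.allclose(patterns_inv, patterns_cov))  # True
```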

Spectral ratios
[figure: spectral ratios from the old vs. new implementations]
Finally, you can see that the spectral ratios are identical across implementations.

@agramfort
Member

thx @tsbinns for the detailed report.

+1 for MRG when green.

@tsbinns
Contributor Author

tsbinns commented Feb 13, 2023

Thanks for the approval @agramfort!

I just reverted to the original method of recovering patterns from filters by taking inv(filters), since this is the method stated in your example file and, as I show above, the results are equivalent (it should also be faster than solving for the patterns).

Unfortunately, I am really struggling to understand why make html_dev-noplot is failing. The error in the output, make: *** [Makefile:70: html_dev-noplot] Error 1, is really obscure to me, and I am not sure how to debug it. Is it something anyone is familiar with? Apologies for bothering you with such a trivial matter.

@agramfort
Member

should be ok now @tsbinns

@agramfort agramfort merged commit ae7ecf8 into mne-tools:main Feb 13, 2023
@welcome

welcome bot commented Feb 13, 2023

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪

@agramfort
Member

thx @tsbinns

Contributor

@vpeterson vpeterson left a comment

I leave some comments on the proposed implementation.

rank="full", info=self.info)

cov_signal_red, cov_noise_red, dim_red = (_dimensionality_reduction(
    cov_signal, cov_noise, self.info, self.rank))
Contributor

is there a need to change the name of the cov matrices?

Contributor Author

Good point. Before when I was solving to find the spatial patterns this required the original covariance matrices, however this is no longer the case since I reverted to taking the inverse of the spatial filters.
This could be worth changing.

dim_red = np.matmul(
    eigvects[:, :rank], np.eye(rank) * (eigvals[:rank]**-0.5))
logger.info(
    "Reducing covariance rank from %i -> %i" % (n_channels, rank))
Contributor

The rank of the covariance is not reduced, that is given by your matrix itself. What it is being done here is more a subspace projection. I would recommend changing the info msg.

Contributor Author

I am not sure I understand entirely. Applying dim_red to the covariance matrices later in the function does reduce their rank, right?

cov_signal_red = np.matmul(dim_red.T, np.matmul(cov_signal, dim_red))
cov_noise_red = np.matmul(dim_red.T, np.matmul(cov_noise, dim_red))

Contributor

Well, you are estimating the rank of the two matrices and taking the smallest value (BTW, is there any practical case in which rank(cov_signal) != rank(cov_noise)?).

So, personally, I don't see this step as a dimensionality reduction of "the rank" (as you have stated in the log message) but as a projection to the rank subspace of the data. In these lines, you are in fact making a projection, so dim_red is a projection matrix to the rank subspace.

Contributor Author

BTW, is there any practical case in which rank(cov_signal) != rank(cov_noise)?

I could not think of anything, but maybe there is some super strange edge case where the ranks are not equal. Since np.min of two values is so computationally cheap, I thought it could be better safe than sorry to check.

So, personally, I don't see this step as a dimensionality reduction of "the rank" (as you have stated in the log message) but as a projection to the rank subspace of the data. In these lines, you are in fact making a projection, so dim_red is a projection matrix to the rank subspace.

Okay, I think I understand. So something like "Projecting covariance of X channels to Y rank subspace" would be more appropriate?


cov_signal_red = np.matmul(dim_red.T, np.matmul(cov_signal, dim_red))
cov_noise_red = np.matmul(dim_red.T, np.matmul(cov_noise, dim_red))
return cov_signal_red, cov_noise_red, dim_red
Contributor

consider changing the name of the projection matrix from dim_red to proj_rank ? or something more accurate.

Contributor Author

@tsbinns tsbinns Feb 13, 2023

Sure, it is a bit obtuse; perhaps I could change this as well.

@@ -172,17 +177,22 @@ def fit(self, X, y=None):

 cov_signal = _regularized_covariance(
     X_signal, reg=self.reg, method_params=self.cov_method_params,
-    rank=self.rank, info=self.info)
+    rank="full", info=self.info)
Contributor

if rank will always be "full" then it should be removed from the Parameters of the SSD class.

Contributor Author

Here rank is specified as "full" to ensure the transformations of the covariance matrices are not performed when they are computed, but rather afterwards in _dimensionality_reduction, to which rank is passed; so it is still in use.

@vpeterson
Contributor

It seems I was too slow. I have added some comments on the changes to the ssd.py class; maybe they are still helpful @tsbinns

As a general comment, although I know that the projection to the rank subspace is used for estimating the covariance matrices in the literature, I am worried about how this implementation fits with other existing spatial filter methods based on eigenvalue decomposition of covariance matrices, such as CSP or SPoC. We made the estimation of the covariance matrices follow other decoding implementations, which consider low-rank data. Should the _regularized_covariance function not also work for non-full rank data? @agramfort Maybe I am wrong and misunderstanding how the implementation of _compute_covariance_auto works.

@tsbinns
Contributor Author

tsbinns commented Feb 13, 2023

Hi @vpeterson, yes, they are still helpful! I will have a look through them.

As for the compatibility with non-full rank data, I kept encountering the issue that _compute_covariance_auto would reduce the rank of the covariance matrix to an appropriate value, but the matrix would not be positive definite, so the generalised eigendecomposition of the covariance matrices would fail. The method Vadim Nikulin uses (which I added here) ensures that the covariance matrices have an appropriate rank whilst remaining positive definite.

Just to reiterate, as I mention in the first post, _compute_covariance_auto producing non-positive definite matrices is not an issue specific to non-full rank data, but more a product of the matrix transformation method used; e.g. you can see the same issue if you specify an inappropriate rank for full rank data:

This issue is not unique to non-full rank data (e.g. the same problem can arise when full rank data is used but a rank less than the true rank of the data is specified, such as EEG data with rank 30 and rank={'eeg':25}, raising the same LinAlgError), however this is less of an issue, as SSD can be computed on the full rank data and then the desired number of components taken. On the other hand, for non-full rank data, there is no possibility of using SSD without prior manipulations of the data.

@tsbinns
Contributor Author

tsbinns commented Feb 13, 2023

Hi @agramfort,
@vpeterson raised good points for some internal variable renaming. I had also planned to make a separate PR for some minor changes to the documentation, e.g.:

  • setting sort_by_spectral_ratio : bool (default False) to sort_by_spectral_ratio : bool (default True).
    def __init__(self, info, filt_params_signal, filt_params_noise,
                 reg=None, n_components=None, picks=None,
                 sort_by_spectral_ratio=True, return_filtered=False,
                 n_fft=None, cov_method_params=None, rank=None):
        """Initialize instance."""

  • rewording this to read "If n_components is None, a number of components equal to the rank of the data is returned.", since I felt using the term dimensionality reduction here is now somewhat confusing, given that the rank argument is now tied to the available number of components.
    n_components : int | None (default None)
        The number of components to extract from the signal.
        If n_components is None, no dimensionality reduction is applied.

  • similarly, making it more explicit that the rank argument determines the degree of dimensionality reduction performed.
    rank : None | dict | 'info' | 'full'
        As in :class:`mne.decoding.SPoC`
        This controls the rank computation that can be read from the
        measurement info or estimated from the data.
        See Notes of :func:`mne.compute_rank` for details.
        We recommend to use 'full' when working with epoched data.

@agramfort if you agree with these or Victoria's suggested changes, I could open a new PR for the minor changes.

@agramfort
Member

agramfort commented Feb 13, 2023 via email

larsoner added a commit to drammock/mne-python that referenced this pull request Feb 22, 2023
* upstream/main: (46 commits)
  Fix docstrings by replacing str with path-like and fix double backticks for formatting (mne-tools#11499)
  Use pathlib.Path instead of os.path to handle files and folders [circle deploy] (mne-tools#11473)
  MAINT: Fix Circle [circle deploy] (mne-tools#11497)
  MAINT: Use mamba in CIs (mne-tools#11471)
  Updating documentation to clarify full vs half-bandwidth and defaults in time_frequency.multitaper.py (mne-tools#11479)
  Fix typo in tutorial (mne-tools#11498)
  Typo fix and added colons before code (mne-tools#11496)
  [MRG] ENH/DOC: demo custom spectrum creation (mne-tools#11493)
  Accept only left-clicks for adding annotations (mne-tools#11491)
  [BUG, MRG] Fix pial surface loading, logging in locate_ieeg (mne-tools#11489)
  [ENH] Added unit_role to add non-breaking space between magnitude  and units (mne-tools#11469)
  MAINT: Fix CircleCI build (mne-tools#11488)
  [DOC] Updated decoding.SSD documentation and internal variable naming (mne-tools#11475)
  Typo fix (mne-tools#11485)
  [MRG] Forward argument axes from plot_sensors to DigMontage.plot (mne-tools#11470)
  [MRG] Improve error message raised on channels missing from DigMontage (mne-tools#11472)
  MAINT: Deal with pkg_resources usage bugs (mne-tools#11478)
  Add object array support and docstring (mne-tools#11465)
  [ENH] Adjusted SSD algorithm to support non-full rank data (mne-tools#11458)
  [BUG] fix nibabel reference (mne-tools#11467)
  ...