Utility for testing EEG data-cleaning pipelines? #193
Comments
Hi, thanks for contacting us. PyPREP is a thrilling project, and an automatic cleaning process for EEG is a brilliant idea. If you want to use BCI as a proxy for more generic EEG and see how your pipelines perform with and without the cleaning, MOABB is definitely a very good match! You could use the predefined pipelines for processing BCI datasets, and it is possible to modify those pipelines to include your PyPREP cleaning stage. A pipeline should follow the scikit-learn format and can take NumPy arrays or MNE epochs as input. We could discuss this by videoconference during office hours (see #191), on Gitter, or in this issue.
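As a rough illustration of what "scikit-learn format" means here, the sketch below puts a cleaning stage in front of a classifier as an ordinary transformer. It is only a sketch under stated assumptions: `ChannelDemean` and `LogVariance` are hypothetical names invented for this example, the demeaning step is a trivial stand-in rather than PyPREP (which operates on MNE `Raw` objects, not on epoch arrays), and the data are synthetic arrays of shape `(n_trials, n_channels, n_times)`.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import Pipeline


class ChannelDemean(BaseEstimator, TransformerMixin):
    """Hypothetical cleaning stage: remove each channel's temporal mean."""

    def fit(self, X, y=None):
        return self  # stateless, nothing to learn

    def transform(self, X):
        # X has shape (n_trials, n_channels, n_times)
        return X - X.mean(axis=2, keepdims=True)


class LogVariance(BaseEstimator, TransformerMixin):
    """Hypothetical feature step: log of per-channel variance."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.log(X.var(axis=2))  # -> (n_trials, n_channels)


# Synthetic two-class data: class 1 has more power on channel 0
rng = np.random.default_rng(0)
y = np.tile([0, 1], 20)
X = rng.standard_normal((40, 4, 100))
X[y == 1, 0] *= 3.0

pipeline = Pipeline(steps=[
    ("clean", ChannelDemean()),
    ("features", LogVariance()),
    ("lda", LinearDiscriminantAnalysis()),
])
pipeline.fit(X, y)
print(pipeline.score(X, y))
```

Dropping or swapping the `"clean"` step and comparing scores is the kind of with/without comparison described above; the training-set score printed here is only a smoke test, not a benchmark.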
Hey @sylvchev, thanks for the quick response! I've got some other academic projects to tackle first before I can delve too far into this, but I'll get in touch when I've got some time. Very excited by the prospect of having a real-world benchmark for pipeline performance: there are quite a few areas where the original PREP makes some odd computation choices, and we've been trying to decide whether to replicate its behaviour or use a method that makes more sense to us. A tool like this would make answering those questions much easier.
We discussed this during office hours, as you can read in the wiki. This could be really interesting, as @jsosulski made a script that generates plots for visual sanity checks of every subject's data across all datasets. Right now it only covers P300, and there are already some interesting findings regarding data quality, as shown in #184. We will be happy to help you use MOABB for benchmarking the computation choices of PyPREP. Do not hesitate to ping us when you are ready to work on this, or to come and say hi during office hours (#191).
@sylvchev Thanks again for meeting with me today! I've written a basic test script to get PyPREP working with moabb, but I'm running into a strange error relating to the CSP process. This happens for subjects 1, 3, and 4, but not subject 2 for some reason. Here is the script:

import warnings
import numpy as np
import mne
from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.pipeline import Pipeline
from pyprep import PrepPipeline
from moabb import set_log_level
from moabb.paradigms import LeftRightImagery
from moabb.evaluations import WithinSessionEvaluation
from moabb.datasets import PhysionetMI
set_log_level("error")
warnings.filterwarnings("ignore")
# Initialize PREP Pipeline setup
bci2000 = PhysionetMI()
montage = mne.channels.make_standard_montage("standard_1020")
raw_info = bci2000.get_data(subjects=[1])[1]['session_0']['run_4'].info
eeg_index = mne.pick_types(raw_info, eeg=True, eog=False, meg=False)
ch_names_eeg = list(np.asarray(raw_info["ch_names"])[eeg_index])
sample_rate = raw_info["sfreq"]
powerline_hz = 60
prep_params = {
    "ref_chs": ch_names_eeg,
    "reref_chs": ch_names_eeg,
    "line_freqs": np.arange(powerline_hz, sample_rate / 2, powerline_hz),
}
# Initialize custom class for PyPREP integration
class PhysionetPREP(PhysionetMI):
    def _load_one_run(self, subject, run, preload=True):
        raw = PhysionetMI._load_one_run(self, subject, run, preload)
        msg = "\n\n### Running PREP for Subject {0}, Run {1} ###\n"
        print(msg.format(subject, run))
        prep = PrepPipeline(raw, prep_params, montage, random_state=42)
        prep.fit()
        prep.raw_eeg.info['bads'] = []  # Clear any remaining bads
        return prep.raw_eeg
# Initialize dataset/paradigm
bci2000 = PhysionetMI()
bci2000.subject_list = list(range(1, 9))
bci2000_prep = PhysionetPREP()
bci2000_prep.subject_list = list(range(1, 9))
bci2000_prep.code = "PREP Physionet Motor Imagery"
datasets = [bci2000_prep, bci2000]
paradigm = LeftRightImagery()
# Initialize pipeline evaluation
evaluation = WithinSessionEvaluation(
    paradigm=paradigm,
    datasets=datasets,
    overwrite=True
)
# Initialize BCI pipelines
pipelines = {
    'csp+lda': Pipeline(steps=[
        ('csp', CSP(n_components=8)),
        ('lda', LDA())
    ])
}
# Actually run the pipeline
results = evaluation.process(pipelines)
breakpoint()

This was done using the latest GitHub versions of PyPREP and MNE. I also get the error when running this built-in MNE CSP example with PyPREP-processed Raw objects, so I know it's not a moabb-specific issue. Any ideas what might be going on here?

(Note: if you try to run this script yourself, please use the latest GitHub version of PyPREP instead of the one on PyPI, which is missing a lot of major bugfixes and improvements.)

EDIT: After a bit more reading, I'm starting to suspect that PyPREP's interpolation of bad channels is causing the problem. Since an interpolated channel is going to be highly predictable by the values of the other channels, that makes the covariance essentially zero and throws a wrench in NumPy's
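The suspicion in the EDIT can be checked numerically. The following is a minimal NumPy sketch on synthetic data, not PyPREP's actual interpolation: it assumes, purely for illustration, that the interpolated channel is an exact weighted sum of the remaining channels, and the weights are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_times = 8, 1000

# Synthetic full-rank "EEG": independent noise on every channel
data = rng.standard_normal((n_channels, n_times))

# Assumption: interpolation replaces channel 0 with a weighted sum of the others
weights = rng.random(n_channels - 1)
weights /= weights.sum()
interpolated = data.copy()
interpolated[0] = weights @ data[1:]

cov_before = np.cov(data)
cov_after = np.cov(interpolated)

# Explicit tolerance so the rank estimate does not hinge on rounding noise
rank_before = np.linalg.matrix_rank(cov_before, tol=1e-10)
rank_after = np.linalg.matrix_rank(cov_after, tol=1e-10)
smallest_eig = np.linalg.eigvalsh(cov_after).min()

print("rank before interpolation:", rank_before)
print("rank after interpolation: ", rank_after)
print("smallest eigenvalue after:", smallest_eig)
```

In this toy setting the covariance loses one rank per interpolated channel and its smallest eigenvalue drops to rounding-error level, which is the condition under which covariance-based steps tend to fail.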
Hi @a-hurst, I did not have time to run the code, but considering the error message, the issue is probably that your data is no longer full rank after using pyprep. This can happen, for example, when you perform an ICA decomposition and remove ICA components that you think are artifacts. If you then project back into the sensor space, the data no longer has full rank. The best solution would be to perform CSP+LDA etc. in the ICA space. A quick fix / check could be to add some minor noise to the data or, instead of removing the artifactual components, to scale them down to 0.0001 or something.
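The "add some minor noise" check could look like the sketch below. It uses synthetic data, and `noise_scale` is a made-up value chosen for the example, not a recommendation; for real EEG the amplitude would have to be chosen relative to the signal.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.standard_normal((6, 2000))

# Make channel 2 the mean of its two neighbours -> rank-deficient data
data[2] = 0.5 * (data[1] + data[3])


def covariance_rank(x, tol=1e-10):
    """Rank of the channel covariance, using an explicit tolerance."""
    return int(np.linalg.matrix_rank(np.cov(x), tol=tol))


noise_scale = 1e-3  # hypothetical value, for illustration only
noisy = data + noise_scale * rng.standard_normal(data.shape)

rank_deficient = covariance_rank(data)
rank_restored = covariance_rank(noisy)
cond_restored = np.linalg.cond(np.cov(noisy))

print("rank without noise:", rank_deficient)
print("rank with noise:   ", rank_restored)
print("condition number with noise:", cond_restored)
```

The noise makes the covariance invertible again, but the condition number stays large, so this is better treated as a diagnostic that confirms the rank hypothesis than as a lasting fix.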
I agree with your edit and with @jsosulski: the interpolation of bad/missing channels relies on a linear combination of the other channels, which makes the covariance matrix rank deficient, so covariance-based pipelines cannot be applied. I'm working on an estimator of the mean that could handle such situations: http://www.acml-conf.org/2020/video/paper/yger20a but we don't have a pipeline ready yet.
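To spell out why a linear combination implies rank deficiency, here is a short derivation under the simplifying assumption that channel $k$ is replaced by an exact linear combination of the other channels. The symbols ($x$, $w_j$, $a$, $C$) are introduced only for this illustration and do not come from PyPREP or MOABB.

```latex
x_k(t) = \sum_{j \neq k} w_j \, x_j(t)
\quad\Longrightarrow\quad
a^\top x(t) = 0 \ \text{for all } t,
\qquad a_k = 1,\; a_j = -w_j \ (j \neq k).

\text{With } C = \operatorname{Cov}[x]:\qquad
a^\top C \, a = \operatorname{Var}\!\left[a^\top x\right] = 0
\quad\Longrightarrow\quad
C a = 0 \ \ (\text{since } C \succeq 0),
\qquad
\operatorname{rank}(C) \le n_{\text{channels}} - 1 .
```

So $C$ has a zero eigenvalue along $a$, and any step that needs $C^{-1}$ or a matrix logarithm of $C$ is no longer well defined.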
Hi there,
Just stumbled upon this project: looks super-useful for reproducible science! I'm currently a collaborator on PyPREP (an MNE-Python reimplementation of the MATLAB PREP pipeline), and am wondering how useful moabb would be for evaluating generalized (i.e. BCI-unrelated) EEG preprocessing pipelines?
It would be highly useful for us to have a tool that benchmarks how well a given filtering method or noisy channel detection method improves a dataset's SNR, and it seems like comparing the effects of our preprocessing on BCI classification accuracy would be a good way of doing that. Is this kind of workflow something moabb is designed to support?
Thanks in advance!