An awesome toolbox for performing rapid machine learning analysis on your EEG data
- Python v3.8 and up
This should help you get the WAVi toolbox set up on your machine. To get a local copy up and running, follow these simple example steps.
- Python v3.8 or higher
- The following pip-installable packages:
- Clone the repo
git clone https://github.com/canlab/WAViMedEEG
- Install dependencies
pip3 install -r requirements.txt
- (Optional) Edit EEG headset, task names, and base directory setup in config.py
vim config.py
Before starting, you'll need to organize your dataset into a 'study folder'. If a single dataset contains more than one class, it may be useful to store them in separate folders.
If you're exporting your data using the WAVi Desktop software, do so using the 'MATLAB/csv' export option.
Whether or not your data are already anonymized, the toolbox hashes the filenames again and stores the original filenames in a new text file (the translator file). Back up this translator file as soon as possible.
Your 'study folder' should be organized as follows:
+-- /path/on/my/computer
| +-- /myStudy_controls
| | +-- /raw
| | | +-- John_Doe_P300.art
| | | +-- John_Doe_P300.eeg
| | | +-- John_Doe_P300.evt
| | | +-- Jane_Smith_P300.art
| | | +-- Jane_Smith_P300.eeg
| | | +-- Jane_Smith_P300.evt
| | | +-- ...
| +-- /myStudy_disease123
| | +-- /raw
| | | +-- Jim_Disease_P300.art
| | | +-- Jim_Disease_P300.eeg
| | | +-- Jim_Disease_P300.evt
| | | +-- Jenny_Sick_P300.art
| | | +-- Jenny_Sick_P300.eeg
| | | +-- Jenny_Sick_P300.evt
| | | +-- ...
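Before cleaning, it can help to confirm that every recording in a raw folder has all three files. The following is a minimal sketch and not part of the toolbox: `find_incomplete_recordings` is a hypothetical helper, and it assumes only the layout shown above (a `raw` subfolder whose `.art`, `.eeg`, and `.evt` files share a filename stem).

```python
from pathlib import Path

# The three file types each recording is expected to have (from the tree above).
REQUIRED_SUFFIXES = {".art", ".eeg", ".evt"}


def find_incomplete_recordings(study_folder):
    """Return {recording stem: sorted missing suffixes} for <study_folder>/raw.

    Hypothetical helper for illustration; the toolbox does not provide it.
    """
    raw = Path(study_folder) / "raw"
    found = {}
    for f in raw.iterdir():
        if f.is_file() and f.suffix in REQUIRED_SUFFIXES:
            found.setdefault(f.stem, set()).add(f.suffix)
    # Keep only recordings that are missing at least one file type.
    return {
        stem: sorted(REQUIRED_SUFFIXES - suffixes)
        for stem, suffixes in found.items()
        if suffixes != REQUIRED_SUFFIXES
    }
```

An empty dictionary means every recording found in the folder is complete.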
Next, we will use the Clean module to anonymize and standardize our data.
Ex_clean.py
python3 Ex_clean.py --help
argument | type | default | description |
---|---|---|---|
studies_folder | str | config.myStudies | Path to parent folder containing study folders |
study_name | str | config.studyDirectory | Study folder containing dataset |
group_num | int | 1 | Group number to be assigned to dataset |
See the cleaning jupyter notebook file for example code.
- Instantiate a 'StudyFolder' object
myStudy = Clean.StudyFolder('/path/on/my/computer/myStudy')
Parameters:
- path: path to study folder
- Standardize and Anonymize Study Data Automatically:
myStudy.autoclean()
For each task defined in config.tasks, runs StudyFolder.standardize and StudyFolder.anon. Together these standardize task names and file structure and anonymize subject headers. The original filenames are kept in a translator file stored at /myStudy/_translator.txt
Note: you may need to manually update your subject numbers. The first number of a subject code indicates the subject's group number, moving forward.
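To illustrate the idea of a subject code whose first digit is the group number, here is a minimal sketch. It is an assumption-laden illustration, not the toolbox's scheme: `build_translator` is a hypothetical helper, it uses a running index rather than a hash, and the code format (group digit plus a three-digit index) is invented for the example.

```python
def build_translator(filenames, group_num):
    """Map each original subject name to a code that starts with group_num.

    Assumed, illustrative format: group digit followed by a 3-digit index.
    Filenames are assumed to look like '<subject>_<task>.<ext>'.
    """
    # Drop the trailing '_<task>.<ext>' part to recover the subject name.
    subjects = sorted({name.rsplit("_", 1)[0] for name in filenames})
    return {
        subject: f"{group_num}{index:03d}"
        for index, subject in enumerate(subjects, start=1)
    }
```

With group number 1, the first subject in alphabetical order would receive the code 1001, so the leading digit can be read back as the group.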
Next, we will use the Prep module to structure our data, and remove bad timepoints (such as blinks, motion, etc.). If you'd like to create new data that has been filtered to specific frequency ranges, see Standard and inspect the BandFilter class.
Ex_bandpass.py
python3 Ex_bandpass.py --help
argument | type | default | description |
---|---|---|---|
studies_folder | str | config.myStudies | Path to parent folder containing study folders |
study_name | str | None | Study folder containing dataset. If None, performs on each folder in studies_folder |
task | str | P300 | Task to use, from options in config.tasks |
type | str | bandpass | Which band filter method should be applied: lowpass, highpass, bandstop, bandpass |
band | str | delta | Frequency band used for band ranges: delta, theta, alpha, beta, gamma |
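As a rough picture of what the four filter types do to a named band, here is a crude FFT-masking sketch. It is not the BandFilter class and is likely not the method the toolbox uses. The band edges in `BANDS` are conventional values assumed for illustration, and the reading of lowpass (keep everything up to the band's upper edge) and highpass (keep everything from the band's lower edge) is also an assumption.

```python
import numpy as np

# Conventional band edges in Hz, assumed here; the toolbox may define its own.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 100.0),
}


def fft_band_filter(signal, band, kind="bandpass", sample_rate=250):
    """Zero out FFT bins outside the requested range (brick-wall filter)."""
    low, high = BANDS[band]
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(signal.shape[0], d=1.0 / sample_rate)
    keep = {
        "bandpass": (freqs >= low) & (freqs <= high),
        "bandstop": (freqs < low) | (freqs > high),
        "lowpass": freqs <= high,
        "highpass": freqs >= low,
    }[kind]
    spectrum = np.fft.rfft(signal, axis=0)
    spectrum[~keep] = 0  # discard every bin outside the kept range
    return np.fft.irfft(spectrum, n=signal.shape[0], axis=0)
```

A brick-wall mask like this can introduce ringing on real recordings, which is one reason a designed filter is usually preferred in practice.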
Ex_contigs.py
python3 Ex_contigs.py --help
argument | type | default | description |
---|---|---|---|
length | int | (required) | Duration of input data, in number of samples @ 250 Hz (or other, specified in config.sample_rate) |
artifact | int | 0 | Strictness of artifacting algorithm to be used: 0 - strict, 1 - some, 2 - raw |
studies_folder | str | config.myStudies | Path to parent folder containing study folders |
study_name | str | None | Study folder containing dataset. If None, performs on each folder in studies_folder |
task | str | P300 | Task to use, from options in config.tasks |
spectra | bool | True | Whether spectra should automatically be generated and written to file after making contigs |
channels | str | 1111111111111111111 | Binary string of EEG channels to be included in analysis, in order of config.channels |
filter_band | str | nofilter | Bandfilter to be used in analysis steps, such as 'noalpha', 'delta', or 'hialpha' |
erp | bool | False | If True, then only contigs falling immediately after a '1' or '2' in the corresponding .evt file will be processed |
erp_degree | int | 1 | Lowest number in .evt files which will be accepted as an erp event |
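The channels argument is a binary string with one character per channel. A minimal sketch of how such a mask can be decoded follows. The channel order in `CHANNELS` is a hypothetical 19-channel list used only for illustration (the toolbox reads its own order from config.py), and `select_channels` is not a toolbox function.

```python
# Hypothetical 19-channel order for illustration only.
CHANNELS = [
    "Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8", "T3", "C3", "Cz",
    "C4", "T4", "T5", "P3", "Pz", "P4", "T6", "O1", "O2",
]


def select_channels(mask, channel_names=CHANNELS):
    """Return the channel names whose position in mask holds a '1'."""
    if len(mask) != len(channel_names) or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a string of 0s and 1s, one per channel")
    return [name for bit, name in zip(mask, channel_names) if bit == "1"]
```

The default of nineteen 1s therefore selects every channel, and replacing a 1 with a 0 drops the channel at that position.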
See the contigs jupyter notebook file for example code.
- Instantiate a 'TaskData' object
myTask = TaskData('/myStudy/task_name')
Parameters:
- path: path to task folder
Contigs are equal-length snippets cut from the raw data. A range of criteria can be used to decide which snippets count as usable data:
- strict (artDegree=0): require all .art datapoints to be 0 within the timeframe
- medium (artDegree=1): require all .art datapoints to be 0 or 1 within the timeframe
- loose (artDegree=2): allow any datapoint to pass
- ERP (Event-Related Potential): use only datapoints following a stimulus, in which case a .event attribute is attached to the Prep.Contig object
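The strict, medium, and loose criteria above can be sketched as a single threshold on the artifact mask. This is a minimal illustration under stated assumptions, not the Prep module's code: `contig_starts` is a hypothetical helper, the mask is assumed to hold one integer from 0 to 2 per sample, and windows are assumed not to overlap.

```python
def contig_starts(art_mask, contig_length, art_degree=0):
    """Return start indices of non-overlapping windows that pass the mask.

    A window passes when every mask value inside it is <= art_degree.
    """
    starts = []
    last_start = len(art_mask) - contig_length
    for start in range(0, last_start + 1, contig_length):
        window = art_mask[start:start + contig_length]
        if max(window) <= art_degree:
            starts.append(start)
    return starts


# Toy mask: 12 samples split into three windows of 4 samples each.
mask = [0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0]
strict = contig_starts(mask, 4, art_degree=0)
medium = contig_starts(mask, 4, art_degree=1)
loose = contig_starts(mask, 4, art_degree=2)
```

At 250 Hz, a contig length of 250 samples corresponds to one second of data, so raising the length makes it harder for a window to be entirely clean.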
- Generate contigs (these are going to be stored in RAM temporarily as 'Contig' objects)
myTask.gen_contigs(contigLength)
Generates Contig objects for every file possible in TaskData.path, appending each to TaskData.contigs
Parameters:
- contigLength: length in samples (@ 250 Hz or config.sampleRate)
- network_channels: default config.network_channels
- artDegree: (int) default 0, highest value in the .art mask that still passes as a "clean" contig
- ERP: (bool) default False, if True then only contigs falling immediately after a "1" or a "2" in the corresponding .evt file will be accepted, i.e. only evoked responses
- (Optional) Write contigs to file
myTask.write_contigs()
Writes TaskData.contigs objects to file, under TaskData.path / contigs or TaskData.path / erps
Spectra can be created automatically using the --spectra flag in Ex_contigs.py. A dedicated script is planned.
Parameters:
- path: path to task folder
Spectra are simply Fourier-transformed contig files. They can be generated from saved contig files.
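To make the transform concrete, here is a minimal NumPy sketch of turning one contig into an amplitude spectrum. It is not the toolbox's implementation: `amplitude_spectrum` is a hypothetical helper, the contig is assumed to be an array of shape (samples, channels), and no windowing or scaling is applied.

```python
import numpy as np

SAMPLE_RATE = 250  # Hz, the default sample rate mentioned in this README


def amplitude_spectrum(contig, sample_rate=SAMPLE_RATE):
    """Return (frequencies in Hz, amplitude per channel) for one contig."""
    contig = np.asarray(contig, dtype=float)
    # Real-input FFT along the time axis; one column per channel.
    freqs = np.fft.rfftfreq(contig.shape[0], d=1.0 / sample_rate)
    amplitudes = np.abs(np.fft.rfft(contig, axis=0))
    return freqs, amplitudes
```

For a 250-sample contig at 250 Hz, the frequency bins are spaced 1 Hz apart and run from 0 Hz to the Nyquist frequency of 125 Hz.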
- Generate spectra (these are going to be stored in RAM temporarily as 'Spectra' objects)
myTask.gen_spectra(contigLength)
Generates Spectra objects for every file possible in TaskData.path, according to TaskData.contigs
Parameters:
- contigLength: length in samples (@ 250 Hz or config.sampleRate)
- network_channels: default config.network_channels
- artDegree: (int) default 0, highest value in the .art mask that still passes as a "clean" contig
- ERP: (bool) default False, if True then only contigs falling immediately after a "1" or a "2" in the corresponding .evt file will be accepted, i.e. only evoked responses
- (Optional) Write spectra to file
myTask.write_spectra()
Writes TaskData.spectra objects to file, under TaskData.path / spectra
Next, we will use the ML module, which offers a variety of machine-learning tools, to create archetypes for our patient groups. Whichever method you choose, the data are loaded in the same way.
Class object to which we can load our data before differentiating using various ML methods.
argument | type | default | description |
---|---|---|---|
type | str | (required) | Which datatype the model should expect: "spectra", "erps", "contigs" |
network_channels | str | 1111111111111111111 | Binary string of channel names to be included in analysis |
Loads one data file at a time, appending it to the ML.Classifier.data attribute.
argument | type | default | description |
---|---|---|---|
path | str | (required) | Path to file (spectra, contig, or erp) |
Knowing that reference groups are named as follows:
- ref 24-30
- ref 31-40
- ref 81+
- ...
Balances the classes of a dataset such that Classifier.data contains an equal number of control and condition-positive Spectra or Contig objects. New data are added with Classifier.LoadData.
argument | type | default | description |
---|---|---|---|
datapath | str | (required) | Parent path of reference folders listed above |
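The balancing idea can be sketched as working out how many reference files are needed and picking that many. This is only an illustration under assumptions: `pick_reference_files` is a hypothetical helper, reference data are assumed to count as controls, and random sampling with a fixed seed is an invented selection rule rather than the toolbox's.

```python
import random


def pick_reference_files(n_controls, n_condition, reference_files, seed=0):
    """Choose reference files to add so that controls match the condition count."""
    shortfall = n_condition - n_controls
    if shortfall <= 0:
        return []  # controls already match or outnumber the condition group
    if shortfall > len(reference_files):
        raise ValueError("not enough reference files to balance the classes")
    # Seeded generator so the same files are chosen on every run.
    return random.Random(seed).sample(reference_files, shortfall)
```

Each selected path could then be loaded one at a time, in the same way as any other data file.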
A command-line script does not yet exist for this function.
Documentation to be added soon.
A command-line script does not yet exist for this function.
Documentation to be added soon.
Ex_cnn.py
python3 Ex_cnn.py --help
argument | type | default | description |
---|---|---|---|
data_type | str | (required) | Input data type: contigs, erps, or spectra |
studies_folder | str | config.myStudies | Path to parent folder containing study folders |
study_name | str | config.studyDirectory | Study folder containing dataset. If None, performs on each folder in studies_folder |
task | str | P300 | Task to use, from options in config.tasks |
length | int | 250 | Duration of input data, in number of samples @ 250 Hz (or as otherwise specified in config.sample_rate) |
channels | str | 1111111111111111111 | Binary string specifying which of the EEG channels listed in config.channel_names will be included in analysis |
artifact | int | 0 | Strictness of artifacting algorithm to be used: 0 - strict, 1 - some, 2 - raw |
erp_degree | int | None | Lowest number in .evt files which will be accepted as an erp event (only relevant if data_type == 'erps'). |
epochs | int | 100 | Number of training iterations to be run |
plot_ROC | bool | False | Plot sensitivity-specificity curve using validation dataset (group number == 0) |
tt_split | float | 0.33 | Ratio of test samples to train samples |
normalize | str | None | Which normalization technique to use: standard, minmax, None |
learning_rate | float | 0.01 | CNN step size |
lr_decay | bool | False | Whether learning rate should decay adhering to a 0.96 / step decay rate schedule |
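Two of the options above, normalize and lr_decay, reduce to short formulas. The sketch below is a plain-Python illustration and not the script's code: both function names are hypothetical, the decay is assumed to be exponential with the rate applied once per `decay_steps` steps, and `decay_steps` itself is an assumed parameter.

```python
import statistics


def normalize(values, method=None):
    """Rescale a list of numbers with 'standard', 'minmax', or no scaling.

    Constant input would divide by zero for either method; not handled here.
    """
    if method is None:
        return list(values)
    if method == "minmax":
        low, high = min(values), max(values)
        return [(v - low) / (high - low) for v in values]
    if method == "standard":
        mean = statistics.fmean(values)
        spread = statistics.pstdev(values)
        return [(v - mean) / spread for v in values]
    raise ValueError(f"unknown normalization method: {method!r}")


def decayed_learning_rate(initial_rate, step, decay_rate=0.96, decay_steps=1):
    """Exponential decay: multiply by decay_rate once every decay_steps steps."""
    return initial_rate * decay_rate ** (step / decay_steps)
```

Under these assumptions, the default learning rate of 0.01 would fall to 0.01 * 0.96 ** 2 after two decay periods.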
See the CNN jupyter notebook file for example code.
See the open issues for a list of proposed features (and known issues)
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the project
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a pull request
Distributed under the MIT License. See LICENSE for more information.
Clayton Schneider - @clayton_goob - claytonjschneider@gmail.com
Project link: https://github.com/canlab/WAViMedEEG
- Lyanna Kessler
- Will Stritzel
- Francesca Arese
- Ambarish Jash
- David Joffe
- Ryan Layer
- WAVi Medical