
add MEGnet to make MNE-ICALabel work on MEG data #207

Open · colehank wants to merge 33 commits into main
Conversation

@colehank commented Oct 18, 2024

Closes #134

PR Description

MEGnet Integration

Hi everyone,

I integrated MEGnet into mne_icalabel/megnet. Here’s a summary of the main work:

  1. ONNX Conversion: Transformed the original Keras model into ONNX format and confirmed that the prediction outputs remain consistent.

  2. Feature Simulation: Added megnet/features.py to simulate the BrainStorm ICA topomap used in MEGnet training.

  3. Prediction Function Simplification: Refined MEGnet prediction logic in megnet/label_components.py for simplicity. Also, the outputs now align with the current EEG label_components, using the keys:

  • label
  • y_pred_proba

You can run the following commands to check out the prediction performance of MEGnet on MNE’s sample data:

python megnet/features.py
python megnet/label_components.py

However, to fully integrate this into mne-icalabel, changes are needed in mne-icalabel/label_components. Given my limited understanding of the project, I was worried about breaking something, so all of my work is contained within mne_icalabel/megnet.

Merge checklist

Maintainer, please confirm the following before merging:

  • All comments resolved
  • All CIs are happy
  • Changelog entry and contributor name updated in whats_new.rst
  • PR description includes phrase "closes <#issue-number>"

@mscheltienne (Member)

Thanks for the PR, it looks like a very good start!
If I understand correctly this model is provided within the BrainStorm toolbox?

Would it be possible to take a known input, load it in MNE, load it in BrainStorm; then run the pipeline on MNE (this PR) and on BrainStorm, export the output from BrainStorm and compare it to the output from MNE (this PR)?
This is the idea of a 'unit test', where the goal is to write a test which makes sure that the code does what we want. If you can write this comparison, we can easily turn it into a test in the tests folders.

@colehank (Author)

This MEGnet is an open-source model, independent of BrainStorm, built on a Keras-based architecture. The input includes the ICA topomap from BrainStorm, a 120x120 RGB array, along with the corresponding ICA time series array. The output provides the probabilities that the component is classified as brain/other, eye blink, eye movement, or heart.

I performed “Feature Simulation” because the model was trained on ICA topomap RGB arrays from BrainStorm. When I used MNE ICA topomaps to generate RGB images, the classification performance was not good. To improve this, I ran simulations in features.py. Below is the result using the MNE sample MEG data:

[figure: MEGnet classification results on the MNE sample MEG data]

These simulated features yielded acceptable classification accuracy.

To fully follow the MEGnet pipeline for comparison, we could start with BrainStorm, going from raw ➡️ ICA ➡️ topomaps & time series, and then use the original Keras model alongside the method in this PR for comparison. However, I have never used BrainStorm before. Perhaps I can write a comparison using the feature extraction method from this PR (features.py) for both models and check whether the classification results are consistent between ONNX and Keras?

@mscheltienne (Member)

Thanks for the clarification. I'm not worried about a difference between Keras and ONNX, but indeed we can test this:

  • Take a given input X
  • Run it through both Keras and ONNX version
  • Check for an exact match between the outputs

The good thing is that this is easy as the input X can be absolutely anything with the correct input shape and doesn't need to correspond to actual brain data.
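This exact-match check can be sketched end to end. Here the two predict functions are NumPy stand-ins for the real Keras model and ONNX session (which would be loaded from the actual model files), sharing one set of weights so the comparison machinery runs as-is; only the comparison pattern itself is the point.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the trained weights; in the real test these live inside the
# Keras model and the ONNX file, which this sketch does not load.
W = rng.standard_normal((120 * 120 * 3, 4))


def _softmax(logits: np.ndarray) -> np.ndarray:
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def keras_predict(x: np.ndarray) -> np.ndarray:
    """Stand-in for model.predict(x) on the original Keras model."""
    return _softmax(x.reshape(len(x), -1) @ W)


def onnx_predict(x: np.ndarray) -> np.ndarray:
    """Stand-in for session.run(...) on the converted ONNX model."""
    return _softmax(x.reshape(len(x), -1) @ W)


# Any input with the correct shape works -- no need for real brain data.
X = rng.standard_normal((5, 120, 120, 3)).astype(np.float32)
out_keras = keras_predict(X)
out_onnx = onnx_predict(X)
np.testing.assert_allclose(out_keras, out_onnx, rtol=1e-6)
```

With the real models in place, only the two predict functions change; the random input and the assert_allclose comparison stay the same.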


Next, if this model was trained on topographic maps from BrainStorm, I would expect it to behave differently if you feed it slightly different topographic maps, as you comment. In this sense, we need to replicate the topographic map from BrainStorm, starting from an MNE object. This is what is done for the ICLabel model, where the functions in https://github.com/mne-tools/mne-icalabel/blob/main/mne_icalabel/features/topomap.py replicate the topographic map from EEGLab. To test the replication, we extracted topographic maps from EEGLab (in the end it's just a .mat file) and compared them with what we computed with the functions in topomap.py.
It would be great to include the same kind of comparison between a topographic map output by BrainStorm and one output by the feature code in your PR.

@colehank (Author)

Oh! I see, this will be a more systematic technical validation. I will follow these two steps to verify next. Thanks!

@colehank (Author)

colehank commented Oct 23, 2024

Hi! I recently had some time to complete the two-step verification process. You can see the details in a temp repo. The main result is shown in the figure below: the horizontal axis demonstrates the consistency between Keras and ONNX, while the vertical axis shows the generalizability of MEGnet across BrainStorm and MNE.

Additionally, I changed the original method for plotting the topomap by removing the contours, which seems to have improved MEGnet’s recognition accuracy. This is likely because the BrainStorm topomaps used to train MEGnet had very thin contour lines, which were white and almost invisible (as shown in the figure below).

[figure: BrainStorm training topomaps with thin, nearly invisible white contour lines]

There are some differences between the topomaps in MNE and BrainStorm. This is because, although I independently performed the same preprocessing and ICA steps on both platforms, I couldn't find a way to set the random_seed for ICA in BrainStorm (as I'm not very familiar with it). I'm not sure if this is important for our validation.

@colehank (Author)

Apologies for the multiple pushes😢. This is my first time creating a PR, and I’m still not very familiar with Git.

@adam2392 self-requested a review on October 27, 2024
@adam2392 (Member)

Thanks for this work @colehank!

If I understand the code in your temp repo: Presumably the main difference is in the computation of the ICA, and corresponding topomaps within BrainStorm GUI?

Is it possible to also compare the output of the MEGNet within BrainStorm and the MEGNet within mne-icalabel (this PR) on the same topomaps? E.g. using either the topomaps you computed from BrainStorm, or the topomaps from mne-icalabel?

This would verify that the network is fully the same. One thing you can do is save 1 input topomap, and 1 output from the network, and we can just use this "small test data" as a unit-test within this repo. For example, we do this here for ICLabel.

Now, the next, more difficult question is how to proceed if the topomaps are different. My questions would be: i) are there substantial differences that will lead to performance changes between the MNE approach vs the BrainStorm approach? and ii) if not, do we care about matching BrainStorm exactly?

@mscheltienne (Member)

Apologies for the multiple pushes😢. This is my first time creating a PR, and I’m still not very familiar with Git.

Don't worry about it :)

I started a thorough review on Thursday, before I got interrupted by faults on our MEG system 😢
I'll try to complete it this week.

@adam2392 That's one part I did not get at the beginning either, but the model/network is not available in BrainStorm. The people who developed it made a strange choice: create topographic maps in BrainStorm, export them as PNGs of shape (120, 120, 3) along with the ICA traces, then train a network in Python on the colored images and the traces. Thus, the network is already in Python, not much to compare here ;)

@mscheltienne (Member) left a comment

I'll post those initial comments from last Thursday, you could already start tackling them :)

@colehank (Author)

Thanks for your suggestions! @mscheltienne @adam2392

There are indeed differences in the implementation of infomax between BrainStorm and MNE when we plot components, but from my experience, this difference doesn't seem to affect MEGnet's classification performance, as demonstrated in the figure above.

Based on @mscheltienne's comments, I've made an updated iteration, and I hope it meets the required functionalities.

@mscheltienne (Member)

Thanks for tackling my comments, I'll try to continue the review as soon as possible!

@adam2392 (Member) left a comment

Left a few comments. Overall looking good and thanks for answering questions!

Comment on lines 6 to 17
def cart2sph(x, y, z):
    xy = np.sqrt(x * x + y * y)
    r = np.sqrt(x * x + y * y + z * z)
    theta = np.arctan2(y, x)
    phi = np.arctan2(z, xy)
    return r, theta, phi


def pol2cart(rho, phi):
    x = rho * np.cos(phi)
    y = rho * np.sin(phi)
    return x, y
Member:

These functions are also defined for ICLabel, so I wonder if we can pull them out into general utility functions related to geometry.

Member:

pol2cart is indeed a duplicate; cart2sph returns the elements in a different order, and it's a bit annoying to change the order. We can keep code de-duplication for a future PR.

return x, y


def make_head_outlines(sphere, pos, outlines, clip_origin):
Member:

Can you write a docstring for the function?

mne_icalabel/megnet/features.py Outdated Show resolved Hide resolved
mne_icalabel/megnet/features.py Outdated Show resolved Hide resolved
Comment on lines 282 to 284
def _check_notch(
    raw: BaseRaw, picks: str, neighbor_width: float = 2.0, threshold_factor: float = 3
) -> bool:
Member:

I'm wondering why we don't just notch filter for them at the line_noise and its harmonics?

Would this impact the results?

I guess I can imagine a user constructing the Raw object via RawArray, which wouldn't need them to specify the line noise frequency, so maybe this wouldn't work all the time.
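For reference, filtering at the line frequency and all its harmonics is straightforward once the frequency is known. The sfreq and line_freq values below are made-up example numbers, and the commented call sketches how the result would feed into MNE's Raw.notch_filter.

```python
import numpy as np

sfreq = 1000.0    # sampling frequency in Hz (example value)
line_freq = 60.0  # would come from raw.info["line_freq"] when available

# The line frequency and every harmonic below the Nyquist frequency.
freqs = np.arange(line_freq, sfreq / 2, line_freq)
print(freqs)  # 60, 120, ..., 480 Hz

# With an actual MNE Raw object this would then be:
# raw.notch_filter(freqs=freqs)
```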

Member:

Elekta/MEGIN systems do store the line noise frequency, but I don't know if this is the case for other systems. I would keep it as it is now, with the responsibility to apply the notch filter left to the user.

@mscheltienne (Member)

mscheltienne commented Nov 7, 2024

@colehank Hi! Apologies for the slow review pace, my bandwidth is limited, I pushed 4 commits which:

  • Clean-up a couple of docstrings/imports
  • Remove the model_path argument which is unnecessary, use importlib.resources.files instead of os.path to find the ONNX file, and include the ONNX file in the distribution (line added in pyproject.toml)
  • Simplified the _check_line_noise function, with the idea that the docstring already indicates that line noise should be removed thus (1) if raw.info["line_freq"] is absent, the check is skipped, (2) vectorize the computations/comparison using numpy
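The vectorized check described in the last bullet can be sketched in plain NumPy. This is illustrative only, not the actual _check_line_noise implementation; the function name, band widths, and threshold logic are assumptions.

```python
import numpy as np


def check_line_noise_sketch(
    data: np.ndarray,  # shape (n_channels, n_times)
    sfreq: float,
    line_freq: float,
    neighbor_width: float = 2.0,
    threshold_factor: float = 3.0,
) -> bool:
    """Return True if power at line_freq stands out from its spectral neighborhood."""
    n_times = data.shape[-1]
    freqs = np.fft.rfftfreq(n_times, d=1.0 / sfreq)
    psd = np.abs(np.fft.rfft(data, axis=-1)) ** 2
    dist = np.abs(freqs - line_freq)
    # Average power in a narrow band at the line frequency vs. its neighbors,
    # computed for all channels at once (no Python loop over channels).
    at_line = psd[..., dist <= 0.5].mean()
    neighbors = psd[..., (dist > 0.5) & (dist <= neighbor_width)].mean()
    return bool(at_line > threshold_factor * neighbors)


# Toy data: a 50 Hz sinusoid plus noise should trigger the check.
rng = np.random.default_rng(0)
times = np.arange(0, 10, 1 / 500)  # 10 s at 500 Hz
signal = np.sin(2 * np.pi * 50 * times) + 0.1 * rng.standard_normal(times.size)
print(check_line_noise_sketch(signal[np.newaxis], sfreq=500.0, line_freq=50.0))  # True
```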

But more importantly, you'll see that I added a file mne_icalabel/megnet/tests/test_features.py which contains a function test_check_line_noise. This is called a unit test, and all the functions starting with test_ in tests folders (modules) are automatically run when you run the command pytest mne_icalabel from the root of the repository. This is also what is done by the different pytest / ... workflows that you can see running on this PR. The goal of unit tests is to catch bugs before they make it into the code. Any error or warning issued during a unit test will be caught!

In this case, for the _check_line_noise function:

  • I create a toy dataset with 2 channels with a known frequency content (peaks at 10 and 30 Hz, peaks at 30 and 50 Hz)
  • I run _check_line_noise on this toy dataset in different conditions (with raw.info["line_freq"] set or not) and assert that I get the correct output.

Could you try to write such test for:

  • mne_icalabel.megnet.features:get_megnet_features -> the typical name for the test function would be test_get_megnet_features in mne_icalabel/megnet/tests/test_features.py. You could test that (1) it actually runs, (2) a RuntimeError is raised if no MEG channels are present, (3) a RuntimeError is raised if the number of samples is too low, etc. A couple of tips:
# create a toy Raw object as a `pytest.fixture`
# a fixture is a function to create data/prepare resources for a unit test 
# a fixture is passed as an argument of the test function

@pytest.fixture
def raw_ica_valid():
    """Create a valid Raw and ICA for MEGNET."""
    raw = ...
    ica = ...
    return raw, ica


def test_get_megnet_features(raw_ica_valid):
    """Test that 'get_megnet_features' works on a valid raw."""
    time_series, topomaps = get_megnet_features(*raw_ica_valid)
    # assert that what is returned is correct
    assert time_series.shape == ...
    assert topomaps.shape == ...


@pytest.fixture
def raw_invalid1():
    """Create an invalid Raw for MEGNET."""
    raw = ...
    return raw


@pytest.fixture
def raw_invalid2():
    """Create an invalid Raw for MEGNET."""
    raw = ...
    return raw


def test_get_megnet_features_invalid(raw_invalid1, raw_invalid2):
    """Test that the warnings/errors are correctly issued during validation."""
    with pytest.raises(RuntimeError, match="Could not find MEG channels"):
        get_megnet_features(raw_invalid1, ...)
    with pytest.warns(RuntimeWarning, match="..."):
        get_megnet_features(...)

You could also test more precisely _get_topomaps_data and _get_topomaps, with the idea that if you know what it should output for input X, you run it and assert that you do get what you expect. But I would prefer you focus on get_megnet_features if possible.

  • mne_icalabel.megnet.label_components:megnet_label_components -> typical name for the test function would be test_megnet_label_components in mne_icalabel.megnet.tests.test_label_components.py.

Again, you could test more precisely _chunk_predicting and _get_chunk_start, but let's focus on the public functions which will anyway run those private functions.


Finally, at the moment megnet_label_components returns a dictionary:

    dict
        Dictionary with the following keys:
            - 'y_pred_proba' : list of float
                The predicted probabilities for each component.
            - 'labels' : list of str
                The predicted labels for each component.

Could you rework the output to match the output from ICLabel for consistency across the API? ICLabel returns:

    labels_pred_proba : numpy.ndarray of shape (n_components, n_classes)
        The estimated corresponding predicted probabilities of output classes
        for each independent component. Columns are ordered with 'brain',
        'muscle artifact', 'eye blink', 'heart beat', 'line noise',
        'channel noise', 'other'.
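For illustration, the reshaped output could be built like this. The class names follow MEGnet's four classes and the probability rows are made-up numbers; only the shape convention is the point.

```python
import numpy as np

_CLASSES = ["brain/other", "eye movement", "heart beat", "eye blink"]

# Hypothetical softmax outputs of the network, one row per component.
labels_pred_proba = np.array(
    [
        [0.90, 0.05, 0.03, 0.02],
        [0.10, 0.75, 0.10, 0.05],
        [0.20, 0.10, 0.60, 0.10],
    ]
)
assert labels_pred_proba.shape == (3, len(_CLASSES))  # (n_components, n_classes)

# A human-readable label per component is then just the argmax per row.
labels = [_CLASSES[i] for i in labels_pred_proba.argmax(axis=1)]
print(labels)  # ['brain/other', 'eye movement', 'heart beat']
```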

Happy to help/provide more details or resources!

@colehank (Author)

Thank you all for your helpful suggestions, and apologies for the delay in my response as I’ve been resting due to illness.

I’ve just submitted the unit test script for the main functions in label_components and features.

Additionally, I made the following updates:

  • The megnet_label_components now outputs an array with shape (n_components, n_classes). However, please note that the classes for MEGNET are ['brain/other', 'eye movement', 'heart beat', 'eye blink'], which do not exactly match the classes used in ICLabel.

  • I added a docstring for the _make_head_outlines function.

  • The pol2cart function has been updated to use _pol2cart from mne_icalabel.iclabel._utils.

@colehank (Author)

This is my first time writing pytest, so I might have made some silly mistakes, but I did my best to ensure that there are no failures in the megnet directory locally. However, when I submitted it to pre-commit.ci, I encountered the following error:

_MODEL_PATH: str = files("mne_icalabel.megnet") / "assets" / "megnet.onnx"
TypeError: expected str, bytes or os.PathLike object, not NoneType

I'm not quite sure why this is happening... It seems that the model path is not being recognized correctly. Could it be because my branch is not in sync with the main branch?

@mscheltienne (Member)

mscheltienne commented Nov 14, 2024

Thanks, I'll try to have a look tomorrow!

the classes for MEGNET are ['brain/other', 'eye movement', 'heart beat', 'eye blink'], which do not exactly match the classes used in ICLabel.

That's fine and expected.

Looking briefly at the test I see:

  • Failure in pre-commit.ci:
codespell................................................................Failed
- hook id: codespell
- exit code: 65

mne_icalabel/megnet/features.py:34: hould ==> hold, should

You'll find on line 34 of this file a spelling error; the suggestions are hold or should.
If this is not a spelling error, we can tell codespell to ignore it.

  • Failure in pytest:
     def test_megnet_label_components(raw_ica):
        """Test whether the function returns the correct artifact index."""
        real_atrifact_idx = [0, 3, 5]  # heart beat, eye movement, heart beat
        prob = megnet_label_components(*raw_ica)
        this_atrifact_idx = list(np.nonzero(prob.argmax(axis=1))[0])
>       assert this_atrifact_idx == real_atrifact_idx
E       assert [np.int64(0),..., np.int64(5)] == [0, 3, 5]
E         
E         At index 2 diff: np.int64(4) != 5
E         Left contains one more item: np.int64(5)
E         
E         Full diff:
E           [
E         -     0,
E         -     3,
E         -     5,
E         +     np.int64(0),
E         +     np.int64(3),
E         +     np.int64(4),
E         +     np.int64(5),
E           ]

mne_icalabel/megnet/tests/test_label_components.py:38: AssertionError

Which you can probably reproduce locally by running:

pytest mne_icalabel -k test_megnet_label_components

@colehank (Author)

codespell................................................................Failed
- hook id: codespell
- exit code: 65

mne_icalabel/megnet/features.py:34: hould ==> hold, should

Sorry, this was a small mistake I made. I have already corrected it in the latest submission.

Regarding the previously mentioned issue: TypeError: expected str, bytes or os.PathLike object, not NoneType

I think it was caused by the resource "mne_icalabel.megnet" not being recognized in my environment. When I changed _MODEL_PATH: str = files("mne_icalabel.megnet") / "assets" / "megnet.onnx" to my local absolute path, everything worked as expected. Following is my pytest result:

(ica) zgh@zghdeMacBook-Air wokingdir % pytest mne-icalabel/mne_icalabel/megnet
====================================================== test session starts ======================================================
platform darwin -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /Users/zgh/anaconda3/envs/ica/bin/python
cachedir: .pytest_cache
rootdir: /Users/zgh/Desktop/wokingdir/mne-icalabel
configfile: pyproject.toml
collected 6 items                                                                                                               

mne-icalabel/mne_icalabel/megnet/tests/test_features.py::test_check_line_noise PASSED                                     [ 16%]
mne-icalabel/mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features PASSED                                  [ 33%]
mne-icalabel/mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features_invalid PASSED                          [ 50%]
mne-icalabel/mne_icalabel/megnet/tests/test_label_components.py::test_megnet_label_components PASSED                      [ 66%]
mne-icalabel/mne_icalabel/megnet/tests/test_label_components.py::test_get_chunk_start PASSED                              [ 83%]
mne-icalabel/mne_icalabel/megnet/tests/test_label_components.py::test_chunk_predicting PASSED                             [100%]

------------------------------ generated xml file: /Users/zgh/Desktop/wokingdir/junit-results.xml -------------------------------
===================================================== slowest 20 durations ======================================================
8.00s setup    mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features_invalid
5.98s call     mne_icalabel/megnet/tests/test_label_components.py::test_megnet_label_components
5.00s setup    mne_icalabel/megnet/tests/test_label_components.py::test_megnet_label_components
2.01s setup    mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features
1.70s call     mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features_invalid
0.58s call     mne_icalabel/megnet/tests/test_features.py::test_get_megnet_features
0.19s call     mne_icalabel/megnet/tests/test_label_components.py::test_chunk_predicting

(11 durations < 0.005s hidden.  Use -vv to show these durations.)
====================================================== 6 passed in 23.51s =======================================================

But when I try pytest mne-icalabel -k test_megnet_label_components, it says ImportError: cannot import name 'picard' from 'picard', as follows. These seem to be silly environment issues, but even after reinstalling the environment, the problem persists... Sorry about that.

(ica) zgh@zghdeMacBook-Air wokingdir % pytest mne-icalabel -k test_megnet_label_components
====================================================== test session starts ======================================================
platform darwin -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /Users/zgh/anaconda3/envs/ica/bin/python
cachedir: .pytest_cache
rootdir: /Users/zgh/Desktop/wokingdir/mne-icalabel
configfile: pyproject.toml
collected 85 items / 1 error / 84 deselected / 1 selected                                                                       

============================================================ ERRORS =============================================================
_________________________________ ERROR collecting mne_icalabel/features/tests/test_topomap.py __________________________________
ImportError while importing test module '/Users/zgh/Desktop/wokingdir/mne-icalabel/mne_icalabel/features/tests/test_topomap.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../anaconda3/envs/ica/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
mne-icalabel/mne_icalabel/features/tests/test_topomap.py:23: in <module>
    ica.fit(raw)
../../anaconda3/envs/ica/lib/python3.9/site-packages/mne/preprocessing/ica.py:736: in fit
    self._fit_raw(
../../anaconda3/envs/ica/lib/python3.9/site-packages/mne/preprocessing/ica.py:813: in _fit_raw
    self._fit(data, "raw")
../../anaconda3/envs/ica/lib/python3.9/site-packages/mne/preprocessing/ica.py:975: in _fit
    from picard import picard
E   ImportError: cannot import name 'picard' from 'picard' (/Users/zgh/anaconda3/envs/ica/lib/python3.9/site-packages/picard/__init__.py)
------------------------------ generated xml file: /Users/zgh/Desktop/wokingdir/junit-results.xml -------------------------------
==================================================== short test summary info ====================================================
ERROR mne-icalabel/mne_icalabel/features/tests/test_topomap.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================ 84 deselected, 1 error in 2.58s ================================================

@mscheltienne (Member)

No need to apologize, that's fine, and it does seem to be an environment issue. The online CIs don't suffer from those, so they will tell us if pytest is passing or not. I just restarted the workflows, let's see!

@adam2392 (Member) left a comment

I think it's looking good. Just have to wrangle with our testing framework.

Thanks for the hard efforts @colehank!


Parameters
----------
raw : Raw.
Member:

Suggested change
raw : Raw.
raw : Raw


from .features import get_megnet_features

_MODEL_PATH: str = files("mne_icalabel.megnet") / "assets" / "megnet.onnx"
Member:

I think this line causes issues.

See:

network_file = files("mne_icalabel.iclabel.network") / "assets" / "ICLabelNet.onnx"

Let us know if that doesn't work.

Member:

I don't think he modified anything from iclabel, that should not be failing.

Member:

Yes nvm. I misread the error.

Hmm it seems the file is somehow not found.

Author:

Oh, I found the issue! After I added the __init__.py file in the mne_icalabel/megnet directory, it was able to correctly find megnet.onnx. (But I remember that Python (at least v3.8) doesn't require an __init__.py to treat a folder as a package.)

"""Test whether the function returns the correct artifact index."""
real_atrifact_idx = [0, 3, 5] # heart beat, eye movement, heart beat
prob = megnet_label_components(*raw_ica)
this_atrifact_idx = list(np.nonzero(prob.argmax(axis=1))[0])
Author:

It's a bit strange. When I ran it yesterday, the model detection results were 0, 3, 5; but today they became 0, 3, 4, 5, which caused the pytest to fail. Could it be due to randomness in the ICA algorithm (even though I set the random_state)?

Author:

Maybe we can assert set(real_atrifact_idx).issubset(set(this_atrifact_idx)) to ensure that the true labels are a subset of the model's predicted labels. This is a debugging-specific approach, though, and might not be entirely reasonable...

Member:

If your ICA has a fixed random state, and if the model is deterministic (which I think it is?), you should get the exact same results.

To debug you can try:

  • run the ICA part 2-3 times and compare the mixing/unmixing matrices with assert_allclose; they should be identical.
  • run the model part 2-3 times and compare the model outputs (not the index, but the probabilities per class); they should be identical.

@colehank (Author) commented Nov 15, 2024

Thanks! I followed your debugging method, and it turns out that both ICA with a fixed random state and MEGNet are deterministic.

To investigate further, I ran tests with both the pytest script in the PR and a new script I created locally.

  1. the model outputs were consistent across both setups.
  2. in the model's output, specifically in the 4th row [0.5, 0.5, 0.0, 0.0], the values for brain/other and eye movement are the same, 0.5. Logically, when using output.argmax(axis=1), the returned array[4] should be 0, as it is the index of the first maximum value. My local script behaves as expected, returning array[4] = 0. However, in the PR's tests, array[4] is 1.

This seems to be the source of the issue: despite using the same argmax method, the pytest runs in the PR are not following the expected behavior of numpy.argmax?

In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.
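That tie-breaking rule is easy to verify directly, which suggests the discrepancy comes from the two environments producing values that only look identical when printed, rather than from argmax itself:

```python
import numpy as np

probs = np.array([[0.5, 0.5, 0.0, 0.0]])  # tie between class 0 and class 1
print(probs.argmax(axis=1))  # [0] -- the first occurrence of the maximum wins

# Two values that both print as 0.5 can still differ in the last bits,
# which silently breaks the tie the other way:
eps = np.nextafter(0.5, 1.0)  # smallest float64 strictly above 0.5
print(np.argmax(np.array([0.5, eps, 0.0, 0.0])))  # 1
```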

Successfully merging this pull request may close these issues: Evaluate and support MEGNet.