
How to Handle Unequal Number of Trials Across Different Classes in MVPA Light? #48

Open
darianyao opened this issue Jul 12, 2024 · 7 comments


@darianyao

Hi Mr. Treder, I am currently working with MVPA-Light and have run into an issue: the number of trials differs across classes. Could you please advise on best practices or methods within MVPA-Light for handling this imbalance? Any guidance or examples would be greatly appreciated. Thank you very much!

@darianyao
Author

Additionally, I noticed that there are examples for analyzing MEEG data; these have been tremendously beneficial to me. Could you also provide a similar example for classifying fMRI data with MVPA-Light? This would be extremely helpful for beginners. Thank you again!

@treder
Owner

treder commented Jul 24, 2024

Hi @darianyao! You can use the preprocessing pipeline to either oversample the minority class or undersample the majority class; see the preprocessing examples.
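MVPA-Light is a MATLAB toolbox, so in practice this goes through its preprocessing pipeline, but the idea behind undersampling is language-agnostic. Here is a minimal Python sketch (function name and toy data are illustrative, not part of MVPA-Light) of random undersampling, i.e. keeping only as many trials per class as the smallest class has:

```python
import random

def undersample(X, y, seed=0):
    """Randomly undersample so every class keeps as many trials as the
    smallest class. X is a list of trials, y the class label per trial."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original trial order
    return [X[i] for i in keep], [y[i] for i in keep]

# 5 trials of class 1 vs. 2 trials of class 2 -> keep 2 of each
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [1.1], [1.2]]
y = [1, 1, 1, 1, 1, 2, 2]
Xb, yb = undersample(X, y)
```

Oversampling is the mirror image (duplicating minority-class trials); either way, the resampling should happen inside the cross-validation loop on training data only, which is what the toolbox's preprocessing pipeline takes care of.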

Regarding fMRI data, it would indeed be nice to have concrete examples in the toolbox examples here. For now, you can refer to the example I mentioned in the MVPA-Light paper, namely the analysis of the Haxby dataset. The code is here. I hope this gives you a useful starting point.

Let me know whether this answers your queries.

@darianyao
Author

Thank you for your reply. I have resolved the data imbalance issue by studying the examples in the toolbox. However, I seem to have encountered a bug: when I set cfg.cv = 'leaveout' and cfg.metric = 'auc', the result returned by the mv_classify function is always 0.

@darianyao
Author

Additionally, I have another issue. When using PCA for preprocessing together with neighbourhoods (a searchlight over the time dimension), the mv_classify function throws an error. I suspect this is because the searchlight matrix is defined on the original data, but PCA removes some of the redundant features of the original data. Perhaps it would be simpler for users to pass parameters directly rather than defining a matrix, for example cfg.neighborhoods = 3 for a window containing three points, and cfg.neighborhood_dim = 'channel' or cfg.neighborhood_dim = 'time' to select the dimension. Thank you :).

@treder
Owner

treder commented Jul 27, 2024

AUC is the area under the ROC curve; with one sample in the test set, the ROC curve is essentially a line, so the area is 0. The way it is calculated in MVPA-Light, you need both negative and positive examples (classes 1 and 2) in the test set. You could use leaveout, collect the outputs (e.g., dvals), and then run mv_calculate_performance manually on the collected data.
But I will try to see whether there is a reasonable hack for this situation just for convenience.
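To illustrate why pooling works (MVPA-Light itself is MATLAB; this Python sketch is only a conceptual illustration, and the toy decision values are made up): AUC is a ranking statistic over positive/negative pairs, so it is undefined for a single held-out trial but perfectly well-defined once the decision values from all leave-one-out folds are pooled:

```python
def auc(dvals, labels, pos=1):
    """Rank-based AUC: the fraction of (positive, negative) trial pairs in
    which the positive trial receives the higher decision value
    (ties count as 0.5). Equivalent to the area under the ROC curve."""
    pos_d = [d for d, l in zip(dvals, labels) if l == pos]
    neg_d = [d for d, l in zip(dvals, labels) if l != pos]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_d for n in neg_d)
    return wins / (len(pos_d) * len(neg_d))

# Pretend these decision values were collected across leave-one-out folds,
# one held-out trial per fold; AUC is then computed once on the pool.
pooled_dvals  = [0.9, 0.3, 0.2, 0.7]
pooled_labels = [1,   1,   0,   0]
auc(pooled_dvals, pooled_labels)  # -> 0.75 (3 of 4 pairs ranked correctly)
```

A per-fold average of this quantity is meaningless for leave-one-out, since each fold contains only one class, which is why pooling first is the right order of operations.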

Re: neighbours
I am not sure I understand the problem exactly. If you run a PCA (say, on your voxels), you lose the notion of a neighbourhood structure (e.g., PC1 is not really a neighbour of PC2, since all PCs are linear combinations of voxels). Is your goal to use, say, PC1-PC3, PC2-PC4, PC3-PC5, and so on in a sliding window? You could fix the number of PCs, or calculate the PCA beforehand (not inside the classification loop) and then define the neighbourhood matrix according to the PCs. Would this work?
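For the sliding-window case, the neighbourhood matrix has a simple banded structure. This Python sketch (MVPA-Light expects an equivalent MATLAB matrix; the function name here is illustrative) builds a binary matrix in which two features are neighbours when they are at most `halfwidth` positions apart:

```python
def sliding_window_neighbours(n_features, halfwidth=1):
    """Binary neighbourhood matrix for a 1-D sliding window: entry (i, j)
    is 1 when features i and j are within `halfwidth` positions of each
    other, giving a window of 2 * halfwidth + 1 points per searchlight."""
    return [[1 if abs(i - j) <= halfwidth else 0 for j in range(n_features)]
            for i in range(n_features)]

# If the number of PCs is fixed beforehand (say 5), the matrix can be
# built over PC indices instead of the original features:
nb = sliding_window_neighbours(5)
```

The key point is that once PCA runs outside the classification loop with a fixed number of components, the matrix dimensions are known in advance and no longer conflict with the searchlight definition.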

@darianyao
Author

Thank you very much for your reply.

Re: AUC

"Thank you for helping me understand AUC better. I'm not from a computer science background, so I asked ChatGPT how to solve the issue of calculating AUC with leave-one-out cross-validation (LOO-CV) since it cannot be calculated in a single iteration. Here is ChatGPT's response: 'To correctly calculate the AUC using leave-one-out cross-validation (LOO-CV), we need to aggregate all the predicted results from each iteration and then calculate the ROC curve and AUC as a whole.' I'm not sure if this will be helpful to you."

Re: neighbourhoods

"Regarding the neighbourhoods issue, my solution is not to run PCA in the preprocessing stage:). I can't think of other methods. I will try the fixed PC numbers method you mentioned, but I'm not sure if it will resolve the errors that occur when using searchlight and PCA together."

@treder
Owner

treder commented Jul 27, 2024

Glad I could help, good luck!

Re: AUC
Yes, this is exactly what I suggested above. For now you have to do this "by hand"; MVPA-Light does not do it automatically because metrics are calculated on each test set and then averaged.
