PCA is applying an inappropriate transformation #654
Comments
Would transposing impact the decomposition? I know that the number of components is limited by both T and N, but I assume one orientation would give a spatial decomposition and the other a temporal one...
In theory, transposing shouldn't change the PCA results; the most significant difference would be a swap between the U and Vt matrices. However, since sklearn internally normalizes the data along the first dimension, transposing also changes which dimension gets normalized, so the output could differ. For ICA, I'm not sure. We could just try it on dummy data and check whether there is a difference. The current ICA in tedana also looks a bit strange to me because it doesn't use [...]. How did the original meica code store the data? Was it NxT or TxN? (N = number of voxels, T = number of TRs)
I'm not sure I'm following you @notZaki. What PCA are we talking about? maPCA? Also, as you mention, transposing the data before the PCA would only swap U and Vt. I'm not familiar with sklearn's internal normalization.
@eurunuela I am focusing on sklearn's implementation of PCA and ICA. This is partially relevant to maPCA because it uses PCA at the very end, once the number of components is estimated.
Gotcha! We could simply transpose the input matrix and the resulting eigenvalues, then swap U and Vt. |
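For illustration, here is a minimal sketch (not tedana code; the array shape is made up) showing that transposing the input to a plain SVD only swaps the roles of U and Vt, which is why the concern above is specifically about sklearn's internal centering rather than the decomposition itself:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((100, 20))   # hypothetical N voxels x T time points

# SVD of the matrix and of its transpose
u1, s1, vt1 = np.linalg.svd(data, full_matrices=False)
u2, s2, vt2 = np.linalg.svd(data.T, full_matrices=False)

# Singular values are identical, and U of one call matches Vt.T of the other
# (individual components may differ by a sign flip, hence the abs comparison).
assert np.allclose(s1, s2)
assert np.allclose(np.abs(u1), np.abs(vt2.T))
```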
Summary
The PCA implementation in sklearn normalizes along the voxel dimension, which isn't an appropriate strategy for fMRI data. We should switch to a different PCA implementation or transpose the data before PCA.
Additional Detail
The PCA implementation in sklearn centers the input data along the first dimension before decomposition. For tedana, the data is an NxT matrix and the appropriate normalization should be along the second dimension (#636).
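As a rough sketch of that mismatch (illustrative arrays, not tedana's actual variables): sklearn's PCA subtracts each column mean before its SVD, so for an NxT voxels-by-time matrix it centers across voxels at each time point rather than along each voxel's time series:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 50))  # hypothetical N voxels x T time points

# What sklearn's PCA effectively does before its SVD: center each column,
# i.e. subtract the mean across voxels at every time point (axis=0).
centered_across_voxels = data - data.mean(axis=0, keepdims=True)

# The normalization argued for here: center each voxel's time series (axis=1).
centered_across_time = data - data.mean(axis=1, keepdims=True)
```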
The simplest change would be to define our own PCA function. This actually already exists in decomposition.ma_pca._icatb_svd.
Alternatively, we could transpose the data so that its dimensions are TxN. Then we could continue using sklearn; however, this would require heavier refactoring because the left/right sides of the decomposition would be swapped.
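A hypothetical sketch of the "own PCA function" option is below. This is not the actual _icatb_svd implementation, just an illustration of a PCA that centers each voxel's time series (axis=1) before the SVD:

```python
import numpy as np

def pca_over_time(data, n_components):
    """PCA of an N x T (voxels x time) array, centering along time (axis=1)."""
    demeaned = data - data.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(demeaned, full_matrices=False)
    # u columns: spatial maps; s: singular values; vt rows: component time series
    return u[:, :n_components], s[:n_components], vt[:n_components]

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 50))  # hypothetical voxels x time array
maps, svals, timeseries = pca_over_time(data, n_components=5)
```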
This normalization could be one of the reasons the ICA decomposition is inconsistent (#629). A possible explanation is that the PCA step normalizes across voxels, which distorts the time series and makes ICA's job harder.
I did a quick test on the three-echo test data, and the variability across Python versions, CPUs, and single/multi-threading vanished after swapping sklearn's PCA with my own definition. [log]