Add Pauthenet 2017 style preprocessing for multiple features. #27
Comments
hi @sdat2
The 2nd point is somewhat tricky internally, I think, and we have to keep performance in mind: computing the PCA on the concatenated field will take much longer than computing it on each feature separately. Would you like to work on this?
hi @gmaze I think Etienne thought that the spline basis projection step doesn't make much difference to the output of the PCA (assuming the inputs are clean), so we can probably not implement that step for now. A |
ok ! |
It would require to take out of |
Currently each feature (say salt and temperature) is fitted with a separate set of PCs
e.g.
will have fitted two PCs to SALT and two PCs to THETA.
In Pauthenet et al. (2017), the authors first transform SALT and THETA onto a spline basis, to scale them, and then concatenate these elements into one long vector on which they perform PCA, resulting in three thermohaline PCs.
https://doi.org/10.1175/JPO-D-16-0083.1
We wouldn't need to worry about transforming onto a spline basis (it doesn't seem to make much difference), but we could add a keyword argument like:
m = pcm(K=5, features=features_pcm, maxvar=2, join=True)
so that there is a concatenation before the principal component step.
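To make the difference concrete, here is a rough NumPy sketch of the two orderings. It is not the library's implementation: `standardize` and `pca_scores` are hypothetical helpers, the THETA and SALT arrays are synthetic, the spline step is skipped as suggested above, and the per-feature scores are stacked only so the two outputs can be compared side by side.

```python
import numpy as np


def standardize(x):
    """Centre each depth level and scale it to unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def pca_scores(x, n_components):
    """Project the rows of x onto its leading principal components (via SVD)."""
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
n_profiles, n_levels = 200, 30
# Synthetic stand-ins for THETA and SALT (rows: profiles, columns: depth levels).
theta = rng.normal(size=(n_profiles, n_levels))
salt = 0.5 * theta + 0.1 * rng.normal(size=(n_profiles, n_levels))

# Per-feature ordering: one PCA per feature, two PCs each.
separate = np.hstack([pca_scores(standardize(f), 2) for f in (theta, salt)])

# Joined ordering (what a join=True option might do): scale, concatenate, then one PCA.
joined = np.hstack([standardize(theta), standardize(salt)])
joint = pca_scores(joined, 3)

print(separate.shape)  # (200, 4)
print(joint.shape)     # (200, 3)
```

In the joined case the PCs describe covariance between the two variables as well as within each, which is the point of the thermohaline modes; the cost is a single decomposition on a matrix twice as wide.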