hello,

thanks for releasing this library! I really like the design of the API.

I was wondering if you had any ideas to help me figure out some performance issues I am facing. I am using Pca but experiencing relatively long runtimes and high memory usage. I saved the data I was training on as a .npy file and ran it through sklearn's PCA, and it was very quick (< 5 s, versus minutes here). I also tried the linfa Rust implementation and got good performance.

The size of my (f64) data is (17105, 900), which shouldn't be too bad, I wouldn't think. I am using the openblas feature (and no others: default-features = false).

I'm not mentioning the other libraries to criticize this one; it just confused me, because I compared your code with the others and didn't see anything that would explain such a big difference. Do you have any instinct for what could be going on?
Could you try RandomizedPca? sklearn uses the full SVD (the algorithm Pca implements) for a small input (< 500x500), but uses a randomized truncated SVD, like RandomizedPca, for larger inputs.
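For reference, sklearn exposes the same choice through its svd_solver parameter, so the two code paths can be compared directly. This is a minimal sketch (assuming scikit-learn and NumPy are installed; the matrix is shrunk from (17105, 900) for speed) showing that the randomized truncated SVD recovers essentially the same leading components as the full SVD while only computing the requested few:

```python
# Compare the full SVD (what Pca implements) with a randomized truncated SVD
# (what RandomizedPca implements) using sklearn's two solvers.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 300))

# Full SVD: computes the entire 300-component spectrum even though only 10 are kept.
full = PCA(n_components=10, svd_solver="full").fit(x)

# Randomized truncated SVD: approximates only the top 10 components.
rand = PCA(n_components=10, svd_solver="randomized", random_state=0).fit(x)

# The leading explained-variance ratios agree closely between the two solvers.
print(np.allclose(full.explained_variance_ratio_,
                  rand.explained_variance_ratio_, atol=1e-2))
```

The cost difference grows with the number of columns: the full SVD scales with the whole spectrum of the 900x900 covariance-sized problem, while the randomized solver scales with the handful of components actually requested.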
A somewhat related performance question: I was trying to modify FastIca to compute only n_components components instead of ncols, but I could not figure it out. Do you have any sense of whether that would be an easy or a hard fix in the current codebase? That is, is it simply a matter of slicing the arrays in all the right places, or is a more substantial refactor required?
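As a point of comparison (not a claim about this crate's internals), sklearn's FastICA handles this by folding the reduction into the whitening step: the data is projected down to n_components before the fixed-point iteration starts, so the unmixing matrix being iterated is only (n_components, n_components). A small sketch, assuming scikit-learn is available, confirming the resulting shapes:

```python
# sklearn FastICA: whitening reduces the working dimension to n_components,
# so the iteration never touches the full ncols-sized arrays.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
x = rng.laplace(size=(500, 20))  # non-Gaussian data, 20 original features

ica = FastICA(n_components=5, whiten="unit-variance",
              random_state=0, max_iter=500).fit(x)

print(ica.components_.shape)   # (5, 20): unmixing mapped back to original features
print(ica.transform(x).shape)  # (500, 5): one column per estimated source
```

If the codebase follows the same structure, the change may reduce to slicing the whitening matrix to its top n_components rows; if the iteration is written against full-width arrays throughout, a larger refactor could be needed.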