Feature request: sparse truncated SVD #106
Comments
This should be easy with a wrapper around |
I should clarify that the version being requested is the rank-truncated singular value decomposition, returning the k largest singular values and the corresponding singular vectors. (Issue description updated.) I recall that some time ago we had an eigs-based form that computed the singular values as the eigenvalues of [0 A; A' 0], but it's numerically unstable. |
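For reference, the relation behind that eigs-based form can be written out as follows. This is a standard identity rather than anything specific to the old implementation. The symbols $\sigma_i$, $u_i$, $v_i$ (a singular triplet of an $m \times n$ matrix $A$) are notation introduced here for illustration.

```latex
A v_i = \sigma_i u_i, \qquad A^{T} u_i = \sigma_i v_i
\quad\Longrightarrow\quad
\begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix}
\begin{bmatrix} u_i \\ \pm v_i \end{bmatrix}
= \pm\sigma_i
\begin{bmatrix} u_i \\ \pm v_i \end{bmatrix}
```

So each nonzero singular value appears as a pair of eigenvalues $\pm\sigma_i$ of the $(m+n) \times (m+n)$ symmetric matrix, and the remaining eigenvalues are zero.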
I'm trying to see if there is a more modern algorithm, but doi:10.1137/0911029 is probably the reference to begin with. The first step is to compute a rank-revealing QR factorization which may or may not be provided by SuiteSparseQR. The second step is to compute the TSVD using Algorithm TSOL. |
I'll take a look at what octave and others do for |
+1 BTW, this would be great for our xdata project. I'm sorely missing some practical implementation of |
What octave does is exactly what I am proposing we do for our How large are your sparse matrices, and on what kind of machines do you run? One thing we can easily do is parallelize the matvecs, just for speed. Cc: @alanedelman |
@jiahao knows about the problem details. Here's a short summary: We're working on a large web-graph decomposition problem that is on the order of 42M dimensions with 623M non-zero entries in the adjacency matrix. Asking for a rank 3 reduction via |
@ViralBShah I did a little digging into the literature and it seems like Golub-Kahan-Lanczos is still pretty much state of the art for large and very sparse truncated SVDs. This survey (pdf) reviews its adaptation into several iterative algorithms in Section 4.3. Not directly related but still interesting is the literature on incrementally updated truncated SVDs: if the rank of the desired factorization is known beforehand, you can define updates to an existing SVD as more columns or rows of the matrix are added. This paper seems to be one of the more comprehensive ones in terms of numerical methods, and has something nontrivial to say about missing data. I'm not sure if these methods are faster than the iterative ones. |
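To make the Golub-Kahan-Lanczos idea concrete, here is a minimal sketch in current Julia syntax. It is not the algorithm from the survey or from any existing package. The function name `gkl_svdvals`, the fixed start vector, and the use of full reorthogonalization are assumptions made for illustration, and the sketch returns singular values only.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper (not part of Base or any package): run `steps` iterations of
# Golub-Kahan-Lanczos bidiagonalization, so that A*V = U*B with B upper bidiagonal,
# then return the k largest singular values of B as approximations to those of A.
function gkl_svdvals(A, k::Integer; steps::Integer = min(size(A)...))
    m, n = size(A)
    U = zeros(m, steps)
    V = zeros(n, steps)
    α = zeros(steps)        # diagonal of B
    β = zeros(steps - 1)    # superdiagonal of B

    V[:, 1] .= 1 / sqrt(n)  # deterministic unit start vector (an assumption)
    u = A * V[:, 1]
    α[1] = norm(u)
    U[:, 1] = u / α[1]

    for j in 1:steps-1
        # Next right vector: w = A'u_j - α_j v_j, reorthogonalized against V
        w = A' * U[:, j] - α[j] * V[:, j]
        w -= V[:, 1:j] * (V[:, 1:j]' * w)
        β[j] = norm(w)
        β[j] > eps() || error("breakdown: invariant subspace found at step $j")
        V[:, j+1] = w / β[j]

        # Next left vector: u = A v_{j+1} - β_j u_j, reorthogonalized against U
        u = A * V[:, j+1] - β[j] * U[:, j]
        u -= U[:, 1:j] * (U[:, 1:j]' * u)
        α[j+1] = norm(u)
        α[j+1] > eps() || error("breakdown: zero direction at step $(j + 1)")
        U[:, j+1] = u / α[j+1]
    end

    σ = svdvals(Bidiagonal(α, β, :U))   # sorted in descending order
    return σ[1:k]
end

A = sparse(Diagonal([3.0, 2.0, 1.0]))
σ = gkl_svdvals(A, 2)
```

Only products with `A` and `A'` are needed, which is why the method suits large sparse matrices. A practical version would add restarting and a convergence test, and would avoid storing all of `U` and `V`.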
Don't you have an implementation in |
Have you tried the proposed method in ARPACK, i.e. calculating the values and |
@ViralBShah I have only the naive method which computes all the singular values and no singular vectors. I'll see what I can do to get some of these other ones implemented. @andreasnoackjensen That is essentially what all these methods do, but I have no idea what ARPACK does under the hood. I dread having to look. |
We used to use the OP as you suggest earlier but the implementation was buggy. I will try it again now that the whole arpack calling is much more stable. |
I thought that the earlier implementation used the MATLAB/Octave operator |
I don't recollect that. That was the approach I was considering. |
I've found the old one, so just for the record it is here and you are right that it used |
I do not recollect what broke, but we can start with reinstating it and adding tests. |
Has there been any progress on If not, then what about adding something like the following as a placeholder:
type SymX <: AbstractArray{Float64, 2}
X
end
import Base.size
import Base.*
import Base.issym
*(s::SymX, v::Vector{Float64}) = s.X' * (s.X * v)
size(s::SymX) = size(s.X, 2), size(s.X, 2)
issym(s::SymX) = true
function svds(X; args...)
ex = eigs(SymX(X), I; args...)
## calculating left-side singular vectors
left_sv = X * ex[2]
return left_sv ./ sqrt(sum(left_sv.^2, 1)), sqrt(ex[1]), ex[2], ex[3], ex[4], ex[5], ex[6]
end
Example usage:
F1 = diagm( [3, 2, 1] )
a1 = svds(F1, nev = 2)
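The placeholder above is written for the Julia of its time (`type`, `issym`, `sum(x, 1)`). As a rough sketch of the same idea in current syntax: the singular values of X are the square roots of the eigenvalues of X'X, and the left vectors follow from u = X v / σ. The name `svds_normal` is hypothetical, and a dense `eigen` call stands in for the iterative `eigs` solve, so this only illustrates the math on small matrices.

```julia
using LinearAlgebra

# Hypothetical helper: k largest singular triplets of X via the eigenpairs of X'X.
# A dense eigen-decomposition replaces the iterative eigs call of the placeholder.
function svds_normal(X::AbstractMatrix, k::Integer)
    G = Symmetric(Matrix{Float64}(X' * X))   # n-by-n Gram matrix
    F = eigen(G)                             # eigenvalues in ascending order
    n = length(F.values)
    idx = n:-1:(n - k + 1)                   # indices of the k largest eigenvalues
    s = sqrt.(max.(F.values[idx], 0.0))      # singular values, clamped at zero
    V = F.vectors[:, idx]                    # right singular vectors
    U = (X * V) ./ s'                        # left vectors; assumes every s[i] > 0
    return U, s, V
end

F1 = Matrix(Diagonal([3.0, 2.0, 1.0]))
U, s, V = svds_normal(F1, 2)
```

Forming X'X squares the condition number of the problem, so small singular values lose accuracy with this approach.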
|
Here you go. Pull request JuliaLang/julia#9425. |
Fixed by JuliaLang/julia#9425 |
There is a request for the rank-truncated singular value decomposition of large sparse matrices returning k largest singular values and the corresponding singular vectors.