Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eigs takes too long to converge #29

Closed
stevengj opened this issue Oct 10, 2013 · 38 comments
Closed

eigs takes too long to converge #29

stevengj opened this issue Oct 10, 2013 · 38 comments

Comments

@stevengj
Copy link
Member

The code below, which constructs a sparse discrete lapacian (or –laplacian, actually) inside a cylinder and evaluates the smallest eigenvalue using eigs, dies with an ARPACKException for me:

Changing to Nx=Ny=100 works fine. However, Nx=Ny=400 is "only" a 100000x100000 (positive-definite real-symmetric) sparse matrix from a 2d grid (hence the sparse-direct solver should be efficient), and similar code works fine in Matlab.

What does this error mean? cc: @ViralBShah

# construct the (M+1)xM matrix D, not including the 1/dx factor
diff1(M) = [ [1.0 zeros(1,M-1)]; diagm(ones(M-1),1) - eye(M) ]

# sparse version (the lazy way):
sdiff1(M) = sparse(diff1(M))

# make the discrete -Laplacian in 2d, with Dirichlet boundaries
function Laplacian(Nx, Ny, Lx, Ly)
   dx = Lx / (Nx+1)
   dy = Ly / (Ny+1)
   Dx = sdiff1(Nx) / dx
   Dy = sdiff1(Ny) / dy
   Ax = Dx' * Dx
   Ay = Dy' * Dy
   return kron(speye(Ny), Ax) + kron(Ay, speye(Nx))
end

# Define mesh for the cylinder
# Code adapted from lecture notes
Lx = 1
Ly = 1
Nx = 400
Ny = 400
x = linspace(-Lx,Lx,Nx+2)[2:end-1]   # a column vector
y = linspace(-Ly,Ly,Ny+2)[2:end-1]'  # a row vector
r = sqrt(x.^2 .+ y.^2)   # use broadcasting (.+) to make Nx x Ny matrix of radii
i = find(r .< 1)         # and get indices of points inside the cylinder
A = Laplacian(Nx,Ny,2Lx,2Ly)
Ai = A[i,i]              # to make a submatrix for the Laplacian inside the cylinder
λi, Ui= eigs(Ai, nev=1, which="SM")

[ViralBShah: The title was ARPACKException needs better error message, which is fixed.]

@stevengj
Copy link
Member Author

Setting maxiter=10000 in eigs fixes the problem, so I guess it is just converging slowly. But it would be good to have a clearer error message, at least.

@stevengj
Copy link
Member Author

(It would also probably converge faster if there were a way to exploit the fact that my matrix is symmetric positive definite. Related to #30.)

@ViralBShah
Copy link
Member

We should certainly give a better message. That is easy to fix.

I don't think there is a way to exploit spd property in arpack - only the fact that it is symmetric.

@stevengj
Copy link
Member Author

If you are doing inverse iteration then you can use Cholesky factorization for SPD matrices.

A basic problem here is that eigs is over-typed. We should do duck-typing: support passing any object (not just an AbstractArray) that supports the necessary operations (size, \, * ?). That way we could pass factorization objects or other abstract linear operations.

@andreasnoack
Copy link
Member

It is tempting to think of AbstractMatrix as the objects that support \ and *, cf. Jutho's post at the mailing list. At least as long as it is not clearly defined what AbstractMatrix is.

By the way. Not directly related to this issue but as a consequence of the example of this issue: eigs seems pretty slow. With the laplacian example I get

@time eigs(Ai,nev=1,which="SM")
elapsed time: 457.095567119 seconds (18300352424 bytes allocated)

and with a pure Julia implementation

@time Arnoldi.eigs(Ai, 1, 2000, eps())
elapsed time: 10.058114537 seconds (6387928 bytes allocated)
1-element Array{Float64,1}:
 5.76436

@ViralBShah
Copy link
Member

I wonder what is happening. Can you post your implementation? What happens in matlab or octave? @vtjnash fixed some Fortran calling issues which could potentially explain this.

@ViralBShah
Copy link
Member

@stevengj we should be able to exploit the spd property for free due to the use of . The polyalgorithm may need some tweaking.

@stevengj
Copy link
Member Author

The memory usage is pretty insane too... 17GiB!

@andreasnoack
Copy link
Member

The implementation is here https://gist.github.com/andreasnoackjensen/6963157. I only did it to learn how the Lanczos method works so it is very simple. I have defined A_mul_B! for sparse matrices to be allocation free like gemv which explains the big difference in allocated memory, but not the whole time difference. I tweaked the arnoldi code to use my A_mul_B! instead of * and now I get

julia> @time eigs(Ai,nev=1,which="SM")[1]
elapsed time: 382.070997098 seconds (85137664 bytes allocated)
1-element Array{Float64,1}:
 5.76436

so much of the reallocation is avoided but the timing is still not good. I think I have made Ai to be identical in MATLAB and there I get

>> tic;eigs(Ai,1,'sm');toc
Elapsed time is 3.371547 seconds.

so the difference is huge.

@ViralBShah
Copy link
Member

579a005c995df57cd92f0dc90e6bda2b25158de6 gives us a better error message.

@ViralBShah
Copy link
Member

@JeffBezanson I have a suspicion that type inference is not working as well as it ought to here, which might be slowing down eigs, by taking a lot of time in the sparse matvec.

@JeffBezanson
Copy link
Member

Can you give me an example call that you think runs too slowly?

@andreasnoack
Copy link
Member

@stevengj's example above is one. There eigs takes 457 seconds and a Julia implementation of the same calculation takes 5.76 second.

@JeffBezanson
Copy link
Member

To narrow it down a bit, should I focus on the product of Ai and a random dense vector?

@JeffBezanson
Copy link
Member

Ok, there aren't any type inference problems in sparse matvec. I tried hoisting the accesses to A.colptr etc. but that didn't do very much. (I'm looking at linalg/sparse.jl:14)
We do need some codegen improvements around 1-d arrays, but I don't expect any more than a factor of 2 from that (probably not even that much).

@stevengj
Copy link
Member Author

It would be good to check the number of iterations that eigs is taking to converge versus the version in Matlab or in Julia by @andreasnoackjensen. If the increase in time is proportional to that, then the problem is probably not codegen, it is some screwup of the Arnoldi algorithm.

@ViralBShah
Copy link
Member

I wrote up this example in matlab, and it seems that matlab's eigs finishes in a handful of iterations (<10). There doesn't seem to be a way to get the number of iterations performed by eigs in matlab.

% constructs a sparse discrete lapacian (or –laplacian, actually) inside a cylinder
% Typically, call genmat with n = 400

function Ai = genmat(n)
% Define mesh for the cylinder
% Code adapted from lecture notes
Lx = 1;
Ly = 1;
Nx = n;
Ny = n;
x = linspace(-Lx,Lx,Nx+2);
x = x(2:end-1)';     % a column vector
y = linspace(-Ly,Ly,Ny+2);
y = y([2:end-1]');  % a row vector
r = sqrt(bsxfun(@plus, x.^2, y.^2));   % use broadcasting (.+) to make Nx x Ny matrix of radii
i = find(r < 1);         % and get indices of points inside the cylinder
A = Laplacian(Nx,Ny,2*Lx,2*Ly);
Ai = A(i,i);              % to make a submatrix for the Laplacian inside the cylinder

end

% make the discrete -Laplacian in 2d, with Dirichlet boundaries
function x = Laplacian(Nx, Ny, Lx, Ly)
   dx = Lx / (Nx+1);
   dy = Ly / (Ny+1);
   Dx = sdiff1(Nx) / dx;
   Dy = sdiff1(Ny) / dy;
   Ax = Dx' * Dx;
   Ay = Dy' * Dy;
   x = kron(speye(Ny), Ax) + kron(Ay, speye(Nx));
end

% construct the (M+1)xM matrix D, not including the 1/dx factor
function x = diff1(M)
  x = [ [1.0 zeros(1,M-1)]; diag(ones(M-1,1),1) - eye(M) ];
end

% sparse version (the lazy way)
function x = sdiff1(M) 
  x = sparse(diff1(M));
end

@ViralBShah
Copy link
Member

@andreasnoackjensen Your code does not always seem to work with @stevengj 's problem, but when it does work, it seems to come back in 4 iterations for Nx=Ny=100 and 11 iterations for Nx=Ny=400. So, clearly something's up with our ARPACK implementation.

@jiahao
Copy link
Member

jiahao commented Oct 26, 2013

can you tell matlab to run just one iteration at a time?

@stevengj
Copy link
Member Author

In matlab, you can pass a function (that takes a vector and returns A*vector) instead of a matrix. Then just include a print statement in your function.

@ViralBShah
Copy link
Member

While @alanedelman and I were exploring this, we took a look at the Golub-Kahan-Lanczos algorithm for the svd, which is about as simple as @andreasnoackjensen 's Arnoldi implementation. It may be best for us to have our own implementations of eigs and svds - even if we do the implicitly restarting versions later.

This is @alanedelman 's implementation of svds:

(m,n)=size(A)
αs=βs=zeros(0)
v=randn(n);v/=norm(v)
u=A*v

for k=1:100
  α=norm(u);αs=[αs; α];u/=α
  v=A'*u-α*v

  β=norm(v);βs=[βs; β];v/=β
  u=A*v-β*u
end
 println(round(svdvals(Bidiagonal(αs,βs[1:end-1],true))[1:5]',3))

@alanedelman
Copy link
Contributor

Some notes on the above:

This is the basic, no bells and whistles Golub-Kahan-Lanczos. We certainly should remove the
magic numbers 100 and 5, and incorporate implicit restarting and perhaps shift and invert strategies .

The algorithm, is mathematically the same as Householder reduction when the starting vector
is eye(n,1) or something, but is suitable for matrices that are not dense.

I think the key point here is that ARPACK has served well for many years, but if we have a
Julia implementation, we can probably look at these methods less as a block box not to be touched,
and more like an organic algorithm to be improved upon as we go.

The same would apply for Lanczos tridiagonalization of course.

@ViralBShah
Copy link
Member

Perhaps we should start an Arnoldi.jl package, which can move into base when it becomes ready. Seems like we have enough momentum.

@alanedelman
Copy link
Contributor

Maybe should be called Lanczos.jl :-)
or ArnoldiLanczos.jl

On Mon, Oct 28, 2013 at 7:48 AM, Viral B. Shah notifications@github.comwrote:

Perhaps we should start an Arnoldi.jl package, which can move into base
when it becomes ready. Seems like we have enough momentum.


Reply to this email directly or view it on GitHubhttps://github.com//issues/29
.

@andreasnoack
Copy link
Member

I would like to contribute to such a package.

@jiahao
Copy link
Member

jiahao commented Oct 28, 2013

+1

@ViralBShah
Copy link
Member

I just created this and you guys should all have commit access to it.

https://github.com/JuliaLang/ArnoldiLanczos.jl

@jiahao
Copy link
Member

jiahao commented Oct 28, 2013

I think IterativeSolvers.jl would be a much more descriptive and inclusive term. Arnoldi and Lanczos are but two common examples of a much broader family.

Unless anyone has better references, I would recommend van der Vorst's Iterative Krylov Methods for Large Linear Systems, and the last couple of chapters in Trefethen and Bau.

@StefanKarpinski
Copy link
Member

That does seem like a much better generic name.

@ViralBShah
Copy link
Member

Done. https://github.com/JuliaLang/IterativeSolvers.jl

We could start off with @andreasnoackjensen 's gist above, @alanedelman 's code for GKL, and other iterative solvers that people started putting together. Perhaps future discussion can be had over in the issues for IterativeSolvers.

@stevengj
Copy link
Member Author

Well, the Templates books (Linear Systems and Eigenproblems) are also decent references. Although they don't include Sleijpen's BiCGSTAB(L) algorithm, which is one of the better ones for large sparse nonsymmetric systems (don't waste your time with any of the other BiCG variants, this subsumes them).

@acroy
Copy link

acroy commented Feb 5, 2014

Apparently MATLAB's eigs uses shift-and-invert if 'sm' is specified while we also have to have sigma!=0 to use it. Thus the comparison is not entirely fair (for us). With 35690fd and shift-and-invert I get

julia> @time (d, v, nconv, niter, nmult, resid) = eigs(Ai,nev=1,which="LM",sigma=0.1)
elapsed time: 4.506958406 seconds (192699384 bytes allocated)
<snip output of eigs>
julia> d
1-element Array{Float64,1}:
 5.76436
julia> niter
1
julia> nmult
20

which is much better. Without the patch the matrix is always converted to a dense matrix when calling lufact (in shift-and-invert mode), which is expensive in this example. Without shift-and-invert, I get

@time (d, v, nconv, niter, nmult, resid) = eigs(Ai,nev=1,which="SM",sigma=0)
elapsed time: 330.53247192 seconds (17535572788 bytes allocated)
<snip output of eigs>
julia> d
1-element Array{Float64,1}:
 5.76436
julia> niter
863
julia> nmult
8640

@stevengj
Copy link
Member Author

stevengj commented Feb 5, 2014

It seems like we should use shift-and-invert for sigma != 0 and inverse iteration (i.e. shift-and-invert without the shift) for which="sm".

@ViralBShah
Copy link
Member

@acroy Thanks for tracking this down. I kept thinking that we have some bug in our ARPACK interface, but this makes sense now.

@ViralBShah
Copy link
Member

Can someone submit a PR? Unfortunately I am quite taken up for a few weeks and don't want to do this in haste.

@JeffBezanson
Copy link
Member

Do we imagine merging the PR for this for 0.3?

@ViralBShah
Copy link
Member

Yes. There is already an open PR, and should be possible to merge for 0.3

@jiahao
Copy link
Member

jiahao commented Mar 6, 2014

Closed by JuliaLang/julia#6053, with continuing work in IterativeSolvers.jl.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants