Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong eigenvalues on first run #118

Open
david-pl opened this issue Dec 29, 2020 · 12 comments
Open

Wrong eigenvalues on first run #118

david-pl opened this issue Dec 29, 2020 · 12 comments
Labels
bug Something isn't working upstream

Comments

@david-pl
Copy link

david-pl commented Dec 29, 2020

Hi!

When updating the version of Arpack in a package I noticed tests started to fail some of the time. The following code reproduces the issue. The test at the end fails on Arpack v0.5, but only on the first run. On later runs, it works. I apologize for the somewhat convoluted construction of matrices here, but whenever I tried to simplify the error no longer occurred.

using Arpack
using LinearAlgebra, SparseArrays, Random, Test

Random.seed!(0)

ωc = 1.2
ωa = 0.9
γ = 0.5
κ = 1.1

sz = sparse(ComplexF64[1 0; 0 -1])
sp = sparse(ComplexF64[0 1; 0 0])
sm = sparse(collect(sp'))
ids = one(sz)

a = sparse(diagm(1 => ComplexF64[sqrt(i) for i=1:10]))
ida = one(a)

Ha = kron(ida, 0.5*ωa*sz)
Hc = kron(ωc*a'*a, ids)
Hint = sparse(kron(a', sm) + kron(a, sp))
H = Ha + Hc + Hint

Ja = kron(ida, sqrt(γ)*sm)
Jc = kron(sqrt(κ)*a, ids)
J = sqrt(2) .* [Ja, Jc]
Jdagger = adjoint.(J);
rates = 0.5 .* ones(length(J))

spre(x) = kron(one(x), x)
spost(x) = kron(permutedims(x), one(x))

L = spre(-1im*H) + spost(1im*H)
for i=1:length(J)
    jdagger_j = rates[i]/2*Jdagger[i]*J[i]
    global L -= spre(jdagger_j) + spost(jdagger_j)
    global L += spre(rates[i]*J[i]) * spost(Jdagger[i])
end
d, rest = eigs(L, nev=2, which=:LR)

@test abs(d[1]) < 1e-9

This worked fine on v0.4. Note that when using eigen(Matrix(L)) the eigenvalue with the largest real part is 0.

Version info:

Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
@andreasnoack
Copy link
Member

I can reproduce this issue and it goes away if you downgrade to version 3.5 of Arpack_jll, i.e. to version 3.5. of arpack-ng. I noticed something similar some time ago when I tried to upgrade to version 3.7, see #76 (comment). It seems like the arpack-ng developers are making changes to the actual algorithm and not just the building/packaging.

@andreasnoack andreasnoack added bug Something isn't working upstream labels Dec 30, 2020
@ViralBShah
Copy link
Collaborator

I will tag a new release that reverts the Arpack_jll update. We should also add this example to our tests and file the issue upstream.

@ViralBShah
Copy link
Collaborator

They seem to have made small tiny bugfixes in the last couple of years, and I wonder if those affect us. It would be nice if someone can investigate.

@ViralBShah
Copy link
Collaborator

In the above test, if I ask for 3 eigenvalues or more, then the test passes.

julia> d, rest = eigs(L, nev=3, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Passed

@ViralBShah
Copy link
Collaborator

ViralBShah commented Dec 31, 2020

Also, as reported, running it repeatedly makes it pass sometimes.

julia> @test abs(d[1]) < 1e-9
Test Failed at REPL[60]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.0933319522234535 < 1.0e-9
ERROR: There was an error during testing

julia> d, rest = eigs(L, nev=2, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Failed at REPL[62]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.093331952223471 < 1.0e-9
ERROR: There was an error during testing

julia> d, rest = eigs(L, nev=2, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Passed

While running it repeatedly, I also found this error in one of the runs:

julia> d, rest = eigs(L, nev=2, which=:LR); @test abs(d[1]) < 1e-9
ERROR: 
┌ Error: XYAUPD_Exception: Please check XYAUPD error codes in the ARPACK manual.
│   info = 1
└ @ Arpack ~/.julia/dev/Arpack/src/libarpack.jl:24

Stacktrace:
 [1] aupd_wrapper(::Type{T} where T, ::Arpack.var"#18#28"{SparseMatrixCSC{Complex{Float64},Int64}}, ::Arpack.var"#19#29", ::Arpack.var"#20#30", ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int64, ::String, ::Float64, ::Int64, ::Int64, ::Array{Complex{Float64},1}) at /Users/viral/.julia/dev/Arpack/src/libarpack.jl:92
 [2] _eigs(::SparseMatrixCSC{Complex{Float64},Int64}, ::UniformScaling{Bool}; nev::Int64, ncv::Int64, which::Symbol, tol::Float64, maxiter::Int64, sigma::Nothing, v0::Array{Complex{Float64},1}, ritzvec::Bool, explicittransform::Symbol) at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:235
 [3] #eigs#10 at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:47 [inlined]
 [4] #eigs#9 at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:46 [inlined]
 [5] top-level scope at REPL[65]:1

@ViralBShah
Copy link
Collaborator

ViralBShah commented Nov 17, 2021

I bisected and found that the failure is introduced between opencollab/arpack-ng@0e7d01d (last known good commit) and opencollab/arpack-ng@7fc42e5.

Unfortunately, many of the shared library builds fail in that range making it difficult to pinpoint, and some of the successful builds complain about not enough space to build the factorization (which feels like a bug since it is working fine in earlier versions).

@ViralBShah
Copy link
Collaborator

If I have done my analysis right - the failure is introduced in https://github.com/opencollab/arpack-ng/compare/0e7d01d..7fc42e5

@ViralBShah
Copy link
Collaborator

ViralBShah commented Oct 21, 2022

@andreasnoack This test fails on ARPACK 3.5 that we ship as well, and I can reproduce it with Arpack.jl 0.4 as well. The error itself appears to be due to different sorting of eigenvalues.

julia> for i=1:10000;  d, rest = eigs(L, nev=2, which=:LR); @test abs(d[1]) < 1e-9; end

Test Failed at REPL[35]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.0933319522234384 < 1.0e-9
ERROR: There was an error during testing

Thus I am not sure if my earlier analyses of where the bug was introduced is correct.

@ViralBShah ViralBShah reopened this Oct 21, 2022
@ViralBShah
Copy link
Collaborator

ViralBShah commented Oct 21, 2022

If I use a rand initialized v0, I can't trigger the failure with ARPACK 3.5.

@andreasnoack
Copy link
Member

What changed relative to #118 (comment)? Is it something about how the library is built?

@ViralBShah
Copy link
Collaborator

ViralBShah commented Oct 22, 2022

More findings for ARPACK 3.5 vs. 3.8 #138 (comment). 3.8 fails reliably even with an initialization vector - fails always.

@ViralBShah ViralBShah changed the title Wrong eigenvalues on first run with 0.5 Wrong eigenvalues on first run Oct 31, 2022
@ViralBShah
Copy link
Collaborator

Here's the matrix from #118 as a gist:

https://gist.github.com/ViralBShah/ae892fe6df0884185313f3f0e10c03cc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working upstream
Projects
None yet
Development

No branches or pull requests

3 participants