Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(H+­μI) \ x solvers for Hessenberg factorizations #31853

Merged
merged 40 commits into from
May 17, 2019

Conversation

stevengj
Copy link
Member

@stevengj stevengj commented Apr 27, 2019

As discussed in #31738, this adds a few more methods for the Hessenberg type:

  • The Hessenberg type now includes a shift μ, so that you can form H+μ*I without making a copy.
  • It also includes an O(n²) ldiv! solver for (H+μI) \ x using an algorithm by Henry (1994) that requires only O(n) auxiliary storage and does not modify H.
  • also fast det, logabsdet/logdet, and multiplication by scalars

(I wasn't entirely sure whether to store the shift as H+μI or H-μI … computationally it makes no difference, but it changes what the user gets if they look at H.μ.)

Possible to-dos:

  • One thing I would like to support is a complex shift μ even for a real H, since this shows up in many real problems and it would be nice to support efficiently (i.e. without forcing you to complexify H). Unfortunately the LAPACK *ormhr can't multiply a real Q by a complex vector, so it seems like we'd have to copy the real/imaginary parts to separate arrays to copy.
  • I didn't implement rdiv!. Should be straightforward but seems a bit tedious to rederive Henry's algorithm for this case, so I'm inclined to leave it for a hypothetical future PR. Done.
  • Would be good to pretty-print Hessenberg objects as currently the output is rather inscrutable.
  • adjoints/transposes of Hessenberg factorizations.
  • fast det(H+µI) and logabsdet(H+µI)
  • UpperHessenberg matrix type (analogous to UpperTriangular) for storing upper-Hessenberg matrices (possibly including shifts). F.H now returns an UpperHessenberg matrix.
  • specialized Hessenberg factorization for Hermitian matrices, in which case H is real SymTridiagonal.

cc @jiahao since we just chatted about this.

@stevengj stevengj added linear algebra Linear algebra needs news A NEWS entry is required for this change labels Apr 27, 2019
@stevengj stevengj changed the title WIP: (H+­μI) \ x solvers for Hessenberg factorizations (H+­μI) \ x solvers for Hessenberg factorizations Apr 27, 2019
@stevengj stevengj removed the needs news A NEWS entry is required for this change label Apr 27, 2019
Copy link
Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not really qualified to review but this is an impressively thorough PR. @jiahao, maybe you could review?

@andreasnoack
Copy link
Member

I'll try to wrap my head around this tonight.

@stevengj
Copy link
Member Author

stevengj commented Apr 29, 2019

One related change that I was considering: change Hessenberg(A) to simply assume that A is already upper Hessenberg (either calling triu!(A,-1) or simply ignoring the lower triangle), so that you need to use hessenberg(A) or hessenberg!(A) if you want to compute the Hessenberg factorization.

Rationale:

  • If you have a matrix that is already upper-Hessenberg for some reason, you might want to take advantage of solvers for things like (H+µI) \ x without going through LAPACK's factorization routine.

  • This makes the Hessenberg(A) constructor more like UpperTriangular(A) etcetera.

  • The Hessenberg(A) constructor is currently undocumented (we only documented hessenberg and hessenberg!), so it doesn't seem problematic to change.

  • Currently, Hessenberg(A) and hessenberg!(A) are redundant — if we have two functions, why not get two different functionalities?

Alternatively, it could be a new function called Hessenberg!(A) since it modifies A. (I recall that there was a related discussion about Hermitian!(A), but I can't find it…)

But that could be a separate PR.

@StefanKarpinski
Copy link
Member

I resisted the String!, Symmetric!, Hermitian! and now Hessenberg! pattern but it seems like many people have naturally converged on this same convention and it makes intuitive sense to people, so I've come around to the idea of doing this. I know that @JeffBezanson was also skeptical about this previously—have you had any further thoughts?

@stevengj
Copy link
Member Author

stevengj commented Apr 30, 2019

Probably in this case we should avoid Hessenberg! simply by having H = Hessenberg(A) ignore the lower triangle of A, similar to UpperTriangular(A).

(This could be signified in the data structure in various ways, most simply by having an empty H.τ array.)

Copy link
Member

@andreasnoack andreasnoack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great upgrade of the Hessenberg factorization.

Did you consider a separate HessenbergMatrix? I'm not sure it's worth the trouble but it would allow us to e.g. just overload ldiv! instead of introducing special solve functions. It might also be useful for the eigenvalue problem.

Probably in this case we should avoid Hessenberg! simply by having H = Hessenberg(A) ignore the lower triangle of A, similar to UpperTriangular(A).

I agree. I think it would be a good idea to make Hessenberg just wrap the input matrix with an empty τ vector.

stdlib/LinearAlgebra/src/hessenberg.jl Outdated Show resolved Hide resolved
@stevengj
Copy link
Member Author

stevengj commented Apr 30, 2019

Did you consider a separate HessenbergMatrix? I'm not sure it's worth the trouble but it would allow us to e.g. just overload ldiv! instead of introducing special solve functions. It might also be useful for the eigenvalue problem.

I didn't think about this… it's an interesting thought, but I have a couple of questions:

  • If we add Hessenberg(H) for a matrix that is already in upper-Hessenberg form (i.e. Q=I), then a separate HessenbergMatrix type seems a bit redundant? Presumably we would support HessenbergMatrix(H) instead of Hessenberg(H) with an empty τ array?

  • Having both HessenbergMatrix and Hessenberg might be a bit confusing? On the other hand, this allows us to expose the AbstractMatrix interface (getindex etc.) for HessenbergMatrix.

  • I see that https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl/blob/31d2d8012347de93611902a8579659b7a4dc480c/src/eigenGeneral.jl defines a HessenbergMatrix type, so if we defined our own and exported it it might be breaking … I guess we would have to not export it? On the other hand, it wouldn't be conflicting if we called the new type UpperHessenberg as I suggest below, as that name seems currently unused by any registered Julia package.

  • Would HessenbergMatrix also include the shift μ?

I would tend to call it UpperHessenberg in analogy with UpperTriangular, rather than HessenbergMatrix.

@stevengj
Copy link
Member Author

stevengj commented May 1, 2019

Update: added an UpperHessenberg matrix type analogous to UpperTriangular. For a F::Hessenberg factorization object, F.H now returns this type, which means that F.H is now allocation-free. For example:

julia> A = rand(6,6)
6×6 Array{Float64,2}:
 0.107756   0.181333  0.366188   0.441492   0.0903661  0.342221  
 0.236713   0.121664  0.769639   0.878645   0.774642   0.464094  
 0.204347   0.110714  0.993152   0.0257284  0.408132   0.00811852
 0.0485218  0.618574  0.856544   0.498366   0.109      0.485984  
 0.0248759  0.652369  0.0578524  0.522572   0.615913   0.655619  
 0.0747064  0.344599  0.123562   0.915485   0.732657   0.752754  

julia> F = hessenberg(A);

julia> F.H
6×6 UpperHessenberg{Float64,Array{Float64,2},Bool}:
  0.107756  -0.512071   0.394723  -0.081845   0.076756    0.23782  
 -0.326105   1.48647   -1.21715    0.341651   0.680446   -0.364656 
           -1.25486    0.833861   0.357741   0.214807    0.120456 
                      1.11037    0.478931   0.217121   -0.121404 
                               -0.31514    0.143539    0.219257 
                                         -0.0626903   0.0390469

@stevengj
Copy link
Member Author

stevengj commented May 1, 2019

I made the τ type even more generic in the Hessenberg and HessenbergQ objects. Hopefully it is generic enough now for you to use in your GenericLinearAlgebra package instead of your own HessenbergFactorization and HessenbergMatrix types (where τ is a Vector{Householder{T}}) @andreasnoack?

@stevengj
Copy link
Member Author

stevengj commented May 1, 2019

Upon reflection, I'm not happy with including a shift µ in the UpperHessenberg matrix type. Pretty soon I would like to have hessenberg(A::Hermitian) call LAPACK zhetrd etc. to produce a SymTridiagonal matrix instead of an upper-Hessenberg matrix, with corresponding efficient shifted solvers, and I don't want to add a shift to SymTridiagonal. I see two options:

  • Introduce a ShiftedMatrix{T, S<:AbstractMatrix, V} type that wraps a matrix data::S and a shift μ::V, then have specialized routines acting on this. I'm reluctant to do this because it contributes to the combinatorial explosion of matrix types that we need to handle.

  • Store the shift μ in the Hessenberg factorization type, and add optional shift keyword arguments to ldiv!, det, etcetera for the UpperHessenberg and SymTridiagonal matrix types so that Hessenberg solves can pass in the shift.

I'm inclined towards the latter, since copy-free shifts mainly seem useful in re-using Hessenberg factorizations.

@stevengj
Copy link
Member Author

stevengj commented May 6, 2019

AppVeyor failure is #29880. Travis is GitError(Code:ERROR, Class:Net, curl error: Could not resolve host: github.com

@stevengj
Copy link
Member Author

I updated it to add specialized Hessenberg factorizations for Hermitian matrices, in which case F.H is real SymTridiagonal. This seemed like a good test of generality, and forced me to make a few changes to the types.

In the SymTridiagonal case, the solves are already O(n) with O(n) allocation, so it seems like there is not too much to be gained by a specialized solver ala Henry, especially since multiplying by F.Q is still O(n²). I did add a shift keyword in a couple of places so that the LDLᵀ factorization can be formed without allocating a second copy of the matrix A+μI, and to simplify the Hessenberg methods.

@stevengj
Copy link
Member Author

Travis osx failure is unrelated Error in testset FileWatching.

@stevengj
Copy link
Member Author

@andreasnoack, I've added quite a bit since the last time you reviewed; any comments prior to merging?

@stevengj
Copy link
Member Author

Will merge at the end of the week if there are no further comments.

@stevengj
Copy link
Member Author

stevengj commented May 17, 2019

Tests are passing again except for the unrelated "failed running test Profile" on win32.

Good to merge?

@andreasnoack
Copy link
Member

@stevengj While updating GenericLinearAlgebra to work with this change I wondered why you added the sym parameter. Isn't the information already reflected in the type of the wrapped matrix, i.e. SymTridiagonal?

@stevengj
Copy link
Member Author

@andreasnoack, for hessenberg(A::Hermitian), the SymTridiagonal type is stored only in the type parameter SH of the Hessenberg object. It is not in any of the type parameters of HessenbergQ, which is why I needed a sym parameter there. (Q.factors and Q.τ are ordinary Arrays, but are interpreted differently if they came from a Hermitian factorization.)

For example:

julia> H = hessenberg(Hermitian(rand(4,4)))
Hessenberg{Float64,SymTridiagonal{Float64,Array{Float64,1}},Array{Float64,2},Array{Float64,1},Bool}
Q factor:
4×4 LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},true}:
  0.857333   0.327767  -0.396924  0.0
  0.135862  -0.88782   -0.439679  0.0
 -0.496509   0.323024  -0.805688  0.0
  0.0        0.0        0.0       1.0
H factor:
4×4 SymTridiagonal{Float64,Array{Float64,1}}:
 0.207701   0.0808081                     
 0.0808081  0.211438    0.510799           
           0.510799    1.40159   -0.86616  
                     -0.86616    0.0661149

julia> H.Q
4×4 LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},true}:
  0.857333   0.327767  -0.396924  0.0
  0.135862  -0.88782   -0.439679  0.0
 -0.496509   0.323024  -0.805688  0.0
  0.0        0.0        0.0       1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants