Block-krylov processes

JuliaSmoothOptimizers · Sep 23, 2023 · 7541d0d · 7541d0d
1 parent 1d10c94
commit 7541d0d
Show file tree

Hide file tree

Showing 12 changed files with 988 additions and 21 deletions.
diff --git a/docs/make.jl b/docs/make.jl
@@ -12,6 +12,7 @@ makedocs(
   pages = ["Home" => "index.md",
            "API" => "api.md",
            "Krylov processes" => "processes.md",
+           "Block Krylov processes" => "block_processes.md",
            "Krylov methods" => ["Hermitian positive definite linear systems" => "solvers/spd.md",
                                 "Hermitian indefinite linear systems" => "solvers/sid.md",
                                 "Non-Hermitian square linear systems" => "solvers/unsymmetric.md",

diff --git a/docs/src/block_processes.md b/docs/src/block_processes.md
@@ -0,0 +1,215 @@
+# [Block Krylov processes](@id block-krylov-processes)
+
+## [Block Hermitian Lanczos](@id block-hermitian-lanczos)
+
+If the vector $b$ in the [Hermitian Lanczos](@ref hermitian-lanczos) process is replaced by a matrix $B$ with $p$ columns, we can derive the block Hermitian Lanczos process.
+
+![block_hermitian_lanczos](./graphics/block_hermitian_lanczos.png)
+
+After $k$ iterations of the block Hermitian Lanczos process, the situation may be summarized as
+```math
+\begin{align*}
+  A V_k &= V_{k+1} T_{k+1,k}, \\
+  V_k^H V_k &= I_{pk},
+\end{align*}
+```
+where $V_k$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}_k^{\square}(A,B)$,
+```math
+T_{k+1,k} =
+\begin{bmatrix}
+  \Omega_1 & \Psi_2^H &        &          \\
+  \Psi_2   & \Omega_2 & \ddots &          \\
+           & \ddots   & \ddots & \Psi_k^H \\
+           &          & \Psi_k & \Omega_k \\
+           &          &        & \Psi_{k+1}
+\end{bmatrix}.
+```
+
+```@docs
+hermitian_lanczos(::Any, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Block Non-Hermitian Lanczos](@id block-nonhermitian-lanczos)
+
+If the vectors $b$ and $c$ in the [non-Hermitian Lanczos](@ref nonhermitian-lanczos) process are replaced by matrices $B$ and $C$ with both $p$ columns, we can derive the block non-Hermitian Lanczos process.
+
+![block_nonhermitian_lanczos](./graphics/block_nonhermitian_lanczos.png)
+
+After $k$ iterations of the block non-Hermitian Lanczos process, the situation may be summarized as
+```math
+\begin{align*}
+  A V_k &= V_{k+1} T_{k+1,k}, \\
+  A^H U_k &= U_{k+1} T_{k,k+1}^H, \\
+  V_k^H U_k &= U_k^H V_k = I_{pk},
+\end{align*}
+```
+where $V_k$ and $U_k$ are bases of the block Krylov subspaces $\mathcal{K}^{\square}_k(A,B)$ and $\mathcal{K}^{\square}_k (A^H,C)$, respectively,
+```math
+T_{k+1,k} =
+\begin{bmatrix}
+  \Omega_1 & \Phi_2   &        &          \\
+  \Psi_2   & \Omega_2 & \ddots &          \\
+           & \ddots   & \ddots & \Phi_k   \\
+           &          & \Psi_k & \Omega_k \\
+           &          &        & \Psi_{k+1}
+\end{bmatrix}
+, \qquad
+T_{k,k+1}^H =
+\begin{bmatrix}
+  \Omega_1^H & \Psi_2^H   &          &            \\
+  \Phi_2^H   & \Omega_2^H & \ddots   &            \\
+             & \ddots     & \ddots   & \Psi_k^H   \\
+             &            & \Phi_k^H & \Omega_k^H \\
+             &            &          & \Phi_{k+1}^H
+\end{bmatrix}.
+```
+
+```@docs
+nonhermitian_lanczos(::Any, ::AbstractMatrix{FC}, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Block Arnoldi](@id block-arnoldi)
+
+If the vector $b$ in the [Arnoldi](@ref arnoldi) process is replaced by a matrix $B$ with $p$ columns, we can derive the block Arnoldi process.
+
+![block_arnoldi](./graphics/block_arnoldi.png)
+
+After $k$ iterations of the block Arnoldi process, the situation may be summarized as
+```math
+\begin{align*}
+  A V_k &= V_{k+1} H_{k+1,k}, \\
+  V_k^H V_k &= I_{pk},
+\end{align*}
+```
+where $V_k$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}_k^{\square}(A,B)$,
+```math
+H_{k+1,k} =
+\begin{bmatrix}
+  \Psi_{1,1}~ & \Psi_{1,2}~ & \ldots       & \Psi_{1,k}   \\
+  \Psi_{2,1}~ & \ddots~     & \ddots       & \vdots       \\
+              & \ddots~     & \ddots       & \Psi_{k-1,k} \\
+              &             & \Psi_{k,k-1} & \Psi_{k,k}   \\
+              &             &              & \Psi_{k+1,k}
+\end{bmatrix}.
+```
+
+```@docs
+arnoldi(::Any, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Block Golub-Kahan](@id block-golub-kahan)
+
+If the vector $b$ in the [Golub-Kahan](@ref golub-kahan) process is replaced by a matrix $B$ with $p$ columns, we can derive the block Golub-Kahan process.
+
+![block_golub_kahan](./graphics/block_golub_kahan.png)
+
+After $k$ iterations of the block Golub-Kahan process, the situation may be summarized as
+```math
+\begin{align*}
+  A V_k &= U_{k+1} B_k, \\
+  A^H U_{k+1} &= V_{k+1} L_{k+1}^H, \\
+  V_k^H V_k &= U_k^H U_k = I_{pk},
+\end{align*}
+```
+where $V_k$ and $U_k$ are bases of the block Krylov subspaces $\mathcal{K}_k^{\square}(A^HA,A^HB)$ and $\mathcal{K}_k^{\square}(AA^H,B)$, respectively,
+```math
+B_k =
+\begin{bmatrix}
+  \Omega_1 &          &        &            \\
+  \Psi_2   & \Omega_2 &        &            \\
+           & \ddots   & \ddots &            \\
+           &          & \Psi_k & \Omega_k   \\
+           &          &        & \Psi_{k+1} \\
+\end{bmatrix}
+, \qquad
+L_{k+1}^H =
+\begin{bmatrix}
+  \Omega_1^H & \Psi_2^H   &        &            &                \\
+             & \Omega_2^H & \ddots &            &                \\
+             &            & \ddots & \Psi_k^H   &                \\
+             &            &        & \Omega_k^H & \Psi_{k+1}^H   \\
+             &            &        &            & \Omega_{k+1}^H \\
+\end{bmatrix}.
+```
+
+```@docs
+golub_kahan(::Any, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Block Saunders-Simon-Yip](@id block-saunders-simon-yip)
+
+If the vectors $b$ and $c$ in the [Saunders-Simon-Yip](@ref saunders-simon-yip) process are replaced by matrices $B$ and $C$ with both $p$ columns, we can derive the block Saunders-Simon-Yip process.
+
+![block_saunders_simon_yip](./graphics/block_saunders_simon_yip.png)
+
+After $k$ iterations of the block Saunders-Simon-Yip process, the situation may be summarized as
+```math
+\begin{align*}
+  A U_k &= V_{k+1} T_{k+1,k}, \\
+  A^H V_k &= U_{k+1} T_{k,k+1}^H, \\
+  V_k^H V_k &= U_k^H U_k = I_{pk},
+\end{align*}
+```
+where $\begin{bmatrix} V_k & 0 \\ 0 & U_k \end{bmatrix}$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}^{\square}_k \left(\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}, \begin{bmatrix} B & 0 \\ 0 & C \end{bmatrix}\right)$,
+```math
+T_{k+1,k} =
+\begin{bmatrix}
+  \Omega_1 & \Phi_2   &        &          \\
+  \Psi_2   & \Omega_2 & \ddots &          \\
+           & \ddots   & \ddots & \Phi_k   \\
+           &          & \Psi_k & \Omega_k \\
+           &          &        & \Psi_{k+1}
+\end{bmatrix}
+, \qquad
+T_{k,k+1}^H =
+\begin{bmatrix}
+  \Omega_1^H & \Psi_2^H   &          &            \\
+  \Phi_2^H   & \Omega_2^H & \ddots   &            \\
+             & \ddots     & \ddots   & \Psi_k^H   \\
+             &            & \Phi_k^H & \Omega_k^H \\
+             &            &          & \Phi_{k+1}^H
+\end{bmatrix}.
+```
+
+```@docs
+saunders_simon_yip(::Any, ::AbstractMatrix{FC}, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Block Montoison-Orban](@id block-montoison-orban)
+
+If the vectors $b$ and $c$ in the [Montoison-Orban](@ref montoison-orban) process are replaced by matrices $D$ and $C$ with both $p$ columns, we can derive the block Montoison-Orban process.
+
+![block_montoison_orban](./graphics/block_montoison_orban.png)
+
+After $k$ iterations of the block Montoison-Orban process, the situation may be summarized as
+```math
+\begin{align*}
+  A U_k &= V_{k+1} H_{k+1,k}, \\
+  B V_k &= U_{k+1} F_{k+1,k}, \\
+  V_k^H V_k &= U_k^H U_k = I_{pk},
+\end{align*}
+```
+where $\begin{bmatrix} V_k & 0 \\ 0 & U_k \end{bmatrix}$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}^{\square}_k \left(\begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}, \begin{bmatrix} D & 0 \\ 0 & C \end{bmatrix}\right)$,
+```math
+H_{k+1,k} =
+\begin{bmatrix}
+  \Psi_{1,1}~ & \Psi_{1,2}~ & \ldots       & \Psi_{1,k}   \\
+  \Psi_{2,1}~ & \ddots~     & \ddots       & \vdots       \\
+              & \ddots~     & \ddots       & \Psi_{k-1,k} \\
+              &             & \Psi_{k,k-1} & \Psi_{k,k}   \\
+              &             &              & \Psi_{k+1,k}
+\end{bmatrix}
+, \qquad
+F_{k+1,k} =
+\begin{bmatrix}
+  \Phi_{1,1}~ & \Phi_{1,2}~ & \ldots       & \Phi_{1,k}   \\
+  \Phi_{2,1}~ & \ddots~     & \ddots       & \vdots       \\
+              & \ddots~     & \ddots       & \Phi_{k-1,k} \\
+              &             & \Phi_{k,k-1} & \Phi_{k,k}   \\
+              &             &              & \Phi_{k+1,k}
+\end{bmatrix}.
+```
+
+```@docs
+montoison_orban(::Any, ::Any, ::AbstractMatrix{FC}, ::AbstractMatrix{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
diff --git a/docs/src/graphics/block_arnoldi.png b/docs/src/graphics/block_arnoldi.png
diff --git a/docs/src/graphics/block_golub_kahan.png b/docs/src/graphics/block_golub_kahan.png
diff --git a/docs/src/graphics/block_hermitian_lanczos.png b/docs/src/graphics/block_hermitian_lanczos.png
diff --git a/docs/src/graphics/block_montoison_orban.png b/docs/src/graphics/block_montoison_orban.png
diff --git a/docs/src/graphics/block_nonhermitian_lanczos.png b/docs/src/graphics/block_nonhermitian_lanczos.png
diff --git a/docs/src/graphics/block_saunders_simon_yip.png b/docs/src/graphics/block_saunders_simon_yip.png
diff --git a/docs/src/processes.md b/docs/src/processes.md
@@ -60,18 +60,18 @@ For matrices $C \in \mathbb{C}^{n \times n} \enspace$ and $\enspace T \in \mathb
 \left\{\sum_{i=0}^{k-1} C^i T \, \Omega_i \, \middle \vert \, \Omega_i \in \mathbb{C}^{p \times p},~0 \le i \le k-1 \right\}.
 ```
 
-## Hermitian Lanczos
+## [Hermitian Lanczos](@id hermitian-lanczos)
 
 ![hermitian_lanczos](./graphics/hermitian_lanczos.png)
 
 After $k$ iterations of the Hermitian Lanczos process, the situation may be summarized as
 ```math
 \begin{align*}
-  A V_k &= V_k T_k + \beta_{k+1,k} v_{k+1} e_k^T = V_{k+1}  T_{k+1,k}, \\
+  A V_k &= V_k T_k + \beta_{k+1,k} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\
   V_k^H V_k &= I_k,
 \end{align*}
 ```
-where $V_k$ is an orthonormal basis of the Krylov subspace $\mathcal{K}_k (A,b)$,
+where $V_k$ is an orthonormal basis of the Krylov subspace $\mathcal{K}_k(A,b)$,
 ```math
 T_k =
 \begin{bmatrix}
@@ -94,17 +94,17 @@ The function [`hermitian_lanczos`](@ref hermitian_lanczos) returns $V_{k+1}$ and
 Related methods: [`SYMMLQ`](@ref symmlq), [`CG`](@ref cg), [`CR`](@ref cr), [`CAR`](@ref car), [`MINRES`](@ref minres), [`MINRES-QLP`](@ref minres_qlp), [`MINARES`](@ref minares), [`CGLS`](@ref cgls), [`CRLS`](@ref crls), [`CGNE`](@ref cgne), [`CRMR`](@ref crmr), [`CG-LANCZOS`](@ref cg_lanczos) and [`CG-LANCZOS-SHIFT`](@ref cg_lanczos_shift).
 
 ```@docs
-hermitian_lanczos
+hermitian_lanczos(::Any, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
 ```
 
-## Non-Hermitian Lanczos
+## [Non-Hermitian Lanczos](@id nonhermitian-lanczos)
 
 ![nonhermitian_lanczos](./graphics/nonhermitian_lanczos.png)
 
 After $k$ iterations of the non-Hermitian Lanczos process (also named the Lanczos biorthogonalization process), the situation may be summarized as
 ```math
 \begin{align*}
-  A V_k &= V_k T_k   +        \beta_{k+1} v_{k+1} e_k^T = V_{k+1} T_{k+1,k},   \\
+  A V_k &= V_k T_k + \beta_{k+1} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\
   A^H U_k &= U_k T_k^H + \bar{\gamma}_{k+1} u_{k+1} e_k^T = U_{k+1} T_{k,k+1}^H, \\
   V_k^H U_k &= U_k^H V_k = I_k,
 \end{align*}
@@ -140,7 +140,7 @@ Related methods: [`BiLQ`](@ref bilq), [`QMR`](@ref qmr), [`BiLQR`](@ref bilqr),
     With these scaling factors, the non-Hermitian Lanczos process coincides with the Hermitian Lanczos process when $A = A^H$ and $b = c$.
 
 ```@docs
-nonhermitian_lanczos
+nonhermitian_lanczos(::Any, ::AbstractVector{FC}, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
 ```
 
 The non-Hermitian Lanczos process can be also implemented without $A^H$ (transpose-free variant).
@@ -169,7 +169,7 @@ Because $\alpha_k = u_k^H A v_k$ and $(\bar{\gamma}_{k+1} u_{k+1})^H (\beta_{k+1
 \end{align*}
 ```
 
-## Arnoldi
+## [Arnoldi](@id arnoldi)
 
 ![arnoldi](./graphics/arnoldi.png)
 
@@ -205,17 +205,17 @@ Related methods: [`DIOM`](@ref diom), [`FOM`](@ref fom), [`DQGMRES`](@ref dqgmre
     The Arnoldi process coincides with the Hermitian Lanczos process when $A$ is Hermitian.
 
 ```@docs
-arnoldi
+arnoldi(::Any, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
 ```
 
-## Golub-Kahan
+## [Golub-Kahan](@id golub-kahan)
 
 ![golub_kahan](./graphics/golub_kahan.png)
 
 After $k$ iterations of the Golub-Kahan bidiagonalization process, the situation may be summarized as
 ```math
 \begin{align*}
-  A V_k &= U_{k+1} B_k,   \\
+  A V_k &= U_{k+1} B_k, \\
   A^H U_{k+1} &= V_k B_k^H + \bar{\alpha}_{k+1} v_{k+1} e_{k+1}^T = V_{k+1} L_{k+1}^H, \\
   V_k^H V_k &= U_k^H U_k = I_k,
 \end{align*}
@@ -255,17 +255,17 @@ Related methods: [`LNLQ`](@ref lnlq), [`CRAIG`](@ref craig), [`CRAIGMR`](@ref cr
     It is also related to the Hermitian Lanczos process applied to $\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}$ with initial vector $\begin{bmatrix} b \\ 0 \end{bmatrix}$.
 
 ```@docs
-golub_kahan
+golub_kahan(::Any, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
 ```
 
-## Saunders-Simon-Yip
+## [Saunders-Simon-Yip](@id saunders-simon-yip)
 
 ![saunders_simon_yip](./graphics/saunders_simon_yip.png)
 
 After $k$ iterations of the Saunders-Simon-Yip process (also named the orthogonal tridiagonalization process), the situation may be summarized as
 ```math
 \begin{align*}
-  A U_k &= V_k T_k   + \beta_{k+1}  v_{k+1} e_k^T = V_{k+1} T_{k+1,k},   \\
+  A U_k &= V_k T_k + \beta_{k+1} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\
   A^H V_k &= U_k T_k^H + \bar{\gamma}_{k+1} u_{k+1} e_k^T = U_{k+1} T_{k,k+1}^H, \\
   V_k^H V_k &= U_k^H U_k = I_k,
 \end{align*}
@@ -296,14 +296,14 @@ The function [`saunders_simon_yip`](@ref saunders_simon_yip) returns $V_{k+1}$,
 
 Related methods: [`USYMLQ`](@ref usymlq), [`USYMQR`](@ref usymqr), [`TriLQR`](@ref trilqr), [`TriCG`](@ref tricg) and [`TriMR`](@ref trimr).
 
-```@docs
-saunders_simon_yip
-```
-
 !!! note
     The Saunders-Simon-Yip is equivalent to the block-Lanczos process applied to $\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}$ with initial matrix $\begin{bmatrix} b & 0 \\ 0 & c \end{bmatrix}$.
 
-## Montoison-Orban
+```@docs
+saunders_simon_yip(::Any, ::AbstractVector{FC}, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
+```
+
+## [Montoison-Orban](@id montoison-orban)
 
 ![montoison_orban](./graphics/montoison_orban.png)
 
@@ -356,5 +356,5 @@ Related methods: [`GPMR`](@ref gpmr).
     It also coincides with the Saunders-Simon-Yip process when $B = A^H$.
 
 ```@docs
-montoison_orban
+montoison_orban(::Any, ::Any, ::AbstractVector{FC}, ::AbstractVector{FC}, ::Int) where FC <: (Union{Complex{T}, T} where T <: AbstractFloat)
 ```
diff --git a/src/Krylov.jl b/src/Krylov.jl
@@ -5,7 +5,9 @@ using LinearAlgebra, SparseArrays, Printf
 include("krylov_utils.jl")
 include("krylov_stats.jl")
 include("krylov_solvers.jl")
+
 include("krylov_processes.jl")
+include("block_krylov_processes.jl")
 
 include("cg.jl")
 include("cr.jl")