`svdsolve` rrule error when not all values are converged #110
I am confused, |
Yeah, that's exactly what it's hitting. The log is a bit cryptic (blame Zygote), but this is relevant:
[ Info: CTMRG conv 4: obj = +9.192864957180e-01 err = 4.1934293618e-11 time = 0.75 sec
┌ Warning: Invariant subspace of dimension 7 (up to requested tolerance `tol = 1.0e-8`), which is smaller than the number of requested eigenvalues (i.e. `howmany == 14`); setting `howmany = 7`.
└ @ KrylovKit ~/.julia/packages/KrylovKit/xccMN/src/eigsolve/arnoldi.jl:350
┌ Warning: `svdsolve` cotangent linear problem (7) returns unexpected result
└ @ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/xccMN/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:240
┌ Warning: `svdsolve` cotangent linear problem (14) returns unexpected result
└ @ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/xccMN/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:240
Ok that is very weird. The different eigenvectors of the linear operator are supposed to be of the form with I will try to reproduce this locally and investigate. |
Apologies that I don't have a MWE; I am not sure how to construct something for which the forward algorithm passes but the reverse fails. In any case, my guess wasn't that there is necessarily something wrong with KrylovKit. I think our tolerances were just not configured properly, but I would expect KrylovKit to throw a warning and move on, rather than throwing a `BoundsError`.
These are the singular values of the forward calculation:
[0.5827127960663261, 0.013616701381118779, 0.013616700815086135, 0.00031674568674968046, 7.429999268923436e-6, 7.4299980267290364e-6, 2.0722724010027094e-7, 2.0722722219200433e-7, 2.0722720896899213e-7, 2.0540682979946611e-7, 5.767011119369518e-9, 5.767010235714964e-9, 4.053212060637025e-9, 4.053210898281218e-9]
Clearly, there are degeneracies here, likely due to the SU(2) symmetry of the Heisenberg model. As always, a Krylov method cannot be guaranteed to find these; I think it is pretty amazing that it finds these degenerate values so well. But this is why the reverse algorithm chokes: there, the degeneracies do cause the issue of the smaller invariant subspace.

I will need to think a bit about this. The fact that you don't get a "gauge dependence" warning probably means that the adjoint variables associated with the singular vectors of degenerate singular values are all linearly dependent, so that maybe you can just replace them with a single linear system to solve, and thus, in the auxiliary eigenvalue problem that you build, you could throw out the degenerate copies.
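To make the degeneracies in the list above explicit, here is a minimal sketch that groups consecutive singular values whose relative gap is below a threshold. The helper `degenerate_groups` and the threshold `rtol = 1e-6` are illustrative assumptions, not part of KrylovKit.

```julia
# Singular values quoted in the comment above (sorted in descending order).
svals = [0.5827127960663261, 0.013616701381118779, 0.013616700815086135,
         0.00031674568674968046, 7.429999268923436e-6, 7.4299980267290364e-6,
         2.0722724010027094e-7, 2.0722722219200433e-7, 2.0722720896899213e-7,
         2.0540682979946611e-7, 5.767011119369518e-9, 5.767010235714964e-9,
         4.053212060637025e-9, 4.053210898281218e-9]

# Hypothetical helper: put index i in the same group as i - 1 when the
# relative gap between the two values is at most `rtol` (an assumed threshold).
function degenerate_groups(vals::AbstractVector{<:Real}; rtol::Real=1e-6)
    groups = Vector{Vector{Int}}()
    for (i, v) in enumerate(vals)
        if i > 1 && abs(vals[i - 1] - v) <= rtol * abs(vals[i - 1])
            push!(groups[end], i)   # continue the current degenerate group
        else
            push!(groups, [i])      # start a new group
        end
    end
    return groups
end

groups = degenerate_groups(svals)
println(groups)
```

With this threshold the 14 values fall into 8 groups, including one triplet at indices 7 to 9. A looser or tighter `rtol` changes the grouping, so the number of groups should be read as indicative only.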
I assume this issue has not magically gone away yet. How pressing is this?
As far as I know we managed to work around this in PEPSKit and I have not seen it again myself.
Ran into this again by accident. It doesn't occur in any of the regular tests anymore, but I can reproduce it by starting a PEPS optimization from a product state. Not really pressing I think, but very confusing to run into.

MWE:
# using current PEPSKit.jl master...
# Pkg.add(;
# url="https://github.com/QuantumKitHub/PEPSKit.jl",
# rev="4de9e091b7f722933b28f097f1feaa8498ce5f04",
# )
using Random
using PEPSKit
using TensorKit
using OptimKit
g = 3.1
e = -1.6417 * 2
mˣ = 0.91
# initialize parameters
χbond = 2
χenv = 16
ctm_alg = SimultaneousCTMRG(; )
opt_alg = PEPSOptimize(;
boundary_alg=ctm_alg, optimizer=LBFGS(4; gradtol=1e-3, verbosity=3)
)
# initialize states
H = transverse_field_ising(InfiniteSquare(); g)
Random.seed!(72534343814)
psi_init = product_peps(2, χbond; noise_amp=2e-1)
env_init = leading_boundary(CTMRGEnv(psi_init, ComplexSpace(χenv)), psi_init, ctm_alg)
# find fixedpoint
result = fixedpoint(psi_init, H, opt_alg, env_init)

Package and version info:
(dev) pkg> status
Status `~/git/PEPSKit.jl/dev/Project.toml`
[0b1a1467] KrylovKit v0.9.3
⌅ [77e91f04] OptimKit v0.3.1
[52969e89] PEPSKit v0.3.0 `https://github.com/QuantumKitHub/PEPSKit.jl#4de9e09`
[07d1fe3e] TensorKit v0.14.3
[9a3f8284] Random

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
This keeps popping up here and there, and becomes significantly worse when working with symmetries with small blocks. Do you think there is a sensible fix for this @Jutho, or does this really mean something is going fundamentally wrong in the simulations that we should safeguard against?
* Enhance documentation of `compute_projector` function in projectors.jl
* Renormalize expanded corner tensors in sequential.jl and simultaneous.jl
* Refactor `sequential_projectors` and `simultaneous_projectors` to use `compute_projector`
* Remove unnecessary changes from previous commits

Note: Tests in heisenberg.jl and tf_ising.jl fail due to KrylovKit issue QuantumKitHub#110 (Jutho/KrylovKit.jl#110)
Changed the tolerance parameter in rrule_alg from using the variable ctmrg_tol to a hardcoded value of 1.0e-10 to address KrylovKit.jl issue QuantumKitHub#110. Ref: Jutho/KrylovKit.jl#110
In my case, I just fixed the issue by setting a small tolerance (1.0e-10) in the Arnoldi algorithm.
So it seems that at least in some cases, this issue is caused by the fact that the magnitude of the smallest remaining singular value lies below the tolerance of the What do you think @Jutho? |
Yes; I do not feel great deviating from the arguments that were entered, but I guess there is (currently) no way to replicate such dynamic behavior in any other way. Is this still in the case where you actually use the KrylovKit |
Yes, I think all the occurrences of this issue in PEPSKit.jl are within that specific context. We should be able to circumvent this soon once QuantumKitHub/PEPSKit.jl#150 is merged, but it still seems worth it to address this issue here. Currently there's just not many people using this so it's hard to tell if this issue would come up in other contexts, but in principle I don't see why it wouldn't. |
I guess in spirit it's kind of a similar issue as when If you prefer to throw an error in case the |
I am still a bit confused; given that you manually call KrylovKit's rrule, is it possible to set the tolerance after you already know |
Yes, it's definitely possible to set the tolerance from within PEPSKit.jl. Part of the question (which may not have been clear from my explanation) is whether it would be better to circumvent this from PEPSKit.jl, or address it directly here. Since it seemed to me that this problem is broader than just our use case, I personally thought it might be good to address it here. But indeed, while I keep forgetting that we never go through the actual forward computation in our use case, it is definitely much more likely for the trouble to start in the forward computation in a normal use case. Given this, it does feel a bit strange to try to fix things in the backward pass when the forward was ill-defined.

If not intervening, does it make sense to put some warnings in place? Performing a tolerance versus smallest singular value magnitude check in the rrule and throwing a warning if the former is larger is simple enough I guess and might be helpful. Will this issue be caught by any of the current warnings in the forward computation?
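The tolerance-versus-smallest-singular-value check suggested above could look roughly like the following sketch. The function name `check_tol_vs_singular_values` is hypothetical and this is not KrylovKit's rrule code; the sample values are the four smallest singular values and the two tolerances (`1.0e-8` and `1.0e-10`) quoted earlier in this thread.

```julia
# Hypothetical helper, not part of KrylovKit: returns `true` when the solver
# tolerance is strictly below the smallest retained singular value, and emits
# a warning otherwise.
function check_tol_vs_singular_values(svals::AbstractVector{<:Real}, tol::Real)
    smin = minimum(abs, svals)
    ok = tol < smin
    ok || @warn "Solver tolerance is not below the smallest singular value" tol smin
    return ok
end

# The four smallest singular values quoted earlier in the thread.
svals = [5.767011119369518e-9, 5.767010235714964e-9,
         4.053212060637025e-9, 4.053210898281218e-9]

loose = check_tol_vs_singular_values(svals, 1.0e-8)   # warns
tight = check_tol_vs_singular_values(svals, 1.0e-10)  # passes silently
```

Whether such a check belongs in the rrule or in the calling package is exactly the open design question in this comment; the sketch only shows that the check itself is cheap.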
In one of the PEPSKit test runs, we are running into an issue where, within the rrule for `svdsolve`, the eigensolver fails to converge all requested values. In this case, the output `vals` and `vecs` are not of the expected length, leading to an out-of-bounds error. The stacktrace is found here. The offending line is this line.
There is already a warning being thrown, but I guess we have to limit `n` to the size of the computed `vals` and `vecs`.
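A minimal sketch of limiting `n` to what was actually computed, assuming the solver returns plain vectors `vals` and `vecs`. The helper `take_converged` is hypothetical and only illustrates the clamping; it is not the code in KrylovKit's `svdsolve` rrule.

```julia
# Hypothetical helper: clamp the requested count `n` to the number of values
# and vectors that are actually available, warning instead of erroring.
function take_converged(vals::AbstractVector, vecs::AbstractVector, n::Integer)
    n_avail = min(n, length(vals), length(vecs))
    n_avail < n && @warn "Only $n_avail of $n requested values are available"
    return vals[1:n_avail], vecs[1:n_avail]
end

# Stand-in data: three values were computed, five were requested.
vals = [0.9, 0.5, 0.1]
vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v, w = take_converged(vals, vecs, 5)
```

Indexing with `1:n` directly would raise a `BoundsError` here, which matches the failure mode described in the report.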