Add QR algorithm #472
Conversation
```php
yield [
    'A' => [
        [3, 0, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 2, 0],
        [0, 1, 0, 1],
    ],
    'H' => [
        [3, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 2],
    ]
```
Question about Hessenberg matrices: Is there only one upper Hessenberg matrix for any matrix, or are there multiple possible Hessenberg matrices that satisfy the criteria?

I tried this test case in R and SciPy and got a slightly different answer.
```r
> A <- rbind(c(3, 0, 0, 0), c(0, 1, 0, 1), c(0, 0, 2, 0), c(0, 1, 0, 1))
> hb <- hessenberg(A)
> hb
$H
     [,1]          [,2]          [,3]          [,4]
[1,]    3  0.000000e+00  0.000000e+00  0.000000e+00
[2,]    0  1.000000e+00 -1.000000e+00  2.220446e-16
[3,]    0 -1.000000e+00  1.000000e+00 -5.551115e-16
[4,]    0  2.220446e-16 -4.440892e-16  2.000000e+00

$P
     [,1] [,2]          [,3]          [,4]
[1,]    1    0  0.000000e+00  0.000000e+00
[2,]    0    1  0.000000e+00  0.000000e+00
[3,]    0    0  2.220446e-16 -1.000000e+00
[4,]    0    0 -1.000000e+00  2.220446e-16
```
```python
In [1]: import numpy as np

In [2]: from scipy.linalg import hessenberg

In [10]: A = np.array([[3, 0, 0, 0], [0, 1, 0, 1], [0, 0, 2, 0], [0, 1, 0, 1]])

In [11]: A
Out[11]:
array([[3, 0, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 2, 0],
       [0, 1, 0, 1]])

In [12]: H = hessenberg(A)

In [13]: H
Out[13]:
array([[ 3.,  0.,  0.,  0.],
       [ 0.,  1., -1.,  0.],
       [ 0., -1.,  1.,  0.],
       [ 0.,  0.,  0.,  2.]])
```
This online calculator, however, gets the same answer as you have here.
I'd like to help with generating test cases, but I don't know enough about it so I want to make sure I understand how it works before offering help.
Thanks,
Mark
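For what it's worth, both answers appear internally consistent: `scipy.linalg.hessenberg` can also return the orthogonal factor via `calc_q=True`, and `Q H Qᵀ` reproduces `A` for SciPy's choice of signs. A quick sketch:

```python
import numpy as np
from scipy.linalg import hessenberg

A = np.array([[3, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 2, 0],
              [0, 1, 0, 1]], dtype=float)

# calc_q=True also returns the orthogonal transform Q, with A = Q @ H @ Q.T
H, Q = hessenberg(A, calc_q=True)

# the decomposition round-trips regardless of SciPy's sign choices
assert np.allclose(Q @ H @ Q.T, A)
# Hessenberg structure: everything below the first subdiagonal is ~0
assert np.allclose(np.tril(H, -2), 0)
```

So a different sign convention still yields a valid Hessenberg decomposition of the same matrix.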
I think the values are unique, but the signs are not. In addition to the upper Hessenberg form, the algorithm also computes a permutation matrix P, such that A = PHPᵀ. Wolfram returns the Hessenberg form with the added context of the permutation matrix, and multiplying accordingly gives the original matrix.

In the `NumericMatrix::upperHessenberg` method, the P matrix calculated is exactly the same as in R, Python, and Wolfram; it just has different signs:

```
[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 0, 1]
[0, 0, 1, 0]
```

Multiplying PHPᵀ gives the original matrix as well.
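That multiplication is easy to check directly. A small sketch using the H from the test case above and the P quoted here (both as I read them from this thread):

```python
import numpy as np

A = np.array([[3, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 2, 0],
              [0, 1, 0, 1]])

# H and P as quoted in this thread for NumericMatrix::upperHessenberg
H = np.array([[3, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 2]])
P = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])

# P @ H @ P.T swaps the last two rows and columns of H, recovering A
assert np.array_equal(P @ H @ P.T, A)
```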
So it might be better to return both matrices for context, rather than just the upper Hessenberg matrix.
Also, I had originally tried the other method of getting the matrix into upper Hessenberg form, using Givens rotations. It seems more widely used than the Householder method, and I think that's because it allows for parallelization and works better on sparse matrices. This particular matrix needed (I think) a planar rotation of π/2 to zero out the 1 at (4, 2), so the Givens rotation block was just:

```
[cos(π/2), -sin(π/2)]   [0, -1]
[sin(π/2),  cos(π/2)] = [1,  0]
```

All that to say, I wouldn't be surprised if the Givens method introduced negative signs to both P and H.
So the
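A quick numerical sketch of that rotation: embedding the π/2 Givens block in the (3, 4) plane and applying it as a similarity transform reproduces the sign-flipped Hessenberg form that R and SciPy return (up to ~1e-16 rounding noise):

```python
import numpy as np

A = np.array([[3, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 2, 0],
              [0, 1, 0, 1]], dtype=float)

# Givens rotation of pi/2 acting in the (3, 4) plane
theta = np.pi / 2
c, s = np.cos(theta), np.sin(theta)
G = np.eye(4)
G[2:4, 2:4] = [[c, -s],
               [s,  c]]   # = [[0, -1], [1, 0]] up to rounding

H = G @ A @ G.T

# matches the R/SciPy answer, negative signs included
expected = np.array([[3,  0,  0, 0],
                     [0,  1, -1, 0],
                     [0, -1,  1, 0],
                     [0,  0,  0, 2]])
assert np.allclose(H, expected)
```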
@Beakerboy what sort of input does the eigenvectors method expect for the
@Aweptimum According to the source, it's an array of floats and is the same size as the matrix. Duplicated eigenvalues should be listed multiple times. When you say "it complains", what is the error that is reported?
I just called
Yeah, that's what I said: I think the duplicates should be passed in, so don't call
Yeah, I meant to communicate that it was my assumption, and I'm glad to have your affirmation. I'm having a hard time writing today. I guess there's some weird edge case in the eigenvectors method when the supplied eigenvalues contain repeats. I put the failing eigenvector cases (#4 and #5 in
From scanning the code, it looks like the algorithm is supposed to return two orthogonal vectors when it sees the first “1” in the eigenvalue list. Then, when it is generating the return matrix, it will sort them so that one of them is assigned to the position of the first “1” and the other to the position of the second “1”. It’s been years since I wrote this, so no idea what might be going on.
It's ok, I found the culprit! It's this line here:

```php
$found    = \array_column($solution_array, 'eigenvalue');
$filtered = \array_filter($found, fn($v) => Arithmetic::almostEqual($v, $eigenvalue, $A->getError()));
$key      = array_key_first($filtered);
if ($key === null) {
    // ... solve
}
```

And it returns the correct solutions for the repeats. I think I'll submit this patch as a separate PR so Mark can just merge it, although `array_key_first` was introduced in PHP 7.3, so I'll have to use something else.
`array_search` seems to fail in most cases when looking for a float in an array of floats. And even if it does find a match, if the key is 0, PHP evaluates `!0` to true. To find a match, we can instead loop through and compare the numbers with `Arithmetic::almostEqual`, and then explicitly check whether `$key === false`.
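The same pitfall exists outside PHP. A small Python sketch of the idea: an exact float search misses near-equal values, and the fix is a tolerant comparison plus an explicit sentinel check (`find_close` here is a hypothetical helper for illustration, not part of math-php):

```python
import math

def find_close(values, target, rel_tol=1e-9, abs_tol=1e-12):
    """Return the index of the first element approximately equal to
    target, or None if there is no match."""
    for i, v in enumerate(values):
        if math.isclose(v, target, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    return None

vals = [0.1 + 0.2, 1.0, 2.0]
# 0.1 + 0.2 != 0.3 exactly, so an exact search misses it...
assert 0.3 not in vals
# ...but a tolerant search finds it at index 0
assert find_close(vals, 0.3) == 0
# an explicit "is None" check avoids treating the valid index 0 as falsy
assert find_close(vals, 5.0) is None
```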
Converts a matrix to Hessenberg form using Householder reflections.
The Wilkinson shift accelerates the rate of convergence of the QR algorithm; it's in the static `calcShift` helper. Additionally, moved the iteration to an inner helper function to hide the `$values` parameter from the public method.
Just gets the eigenvalues from the QR algorithm and feeds them to the eigenvectors method.
They work, so the problem must be in the eigenvectors method
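For reference, the Wilkinson shift picks the eigenvalue of the trailing 2×2 block closest to its bottom-right entry. A hedged Python sketch of the usual formula for the symmetric case (an illustration of the technique, not the PR's actual `calcShift` code):

```python
import math

def wilkinson_shift(a, b, c):
    """Eigenvalue of the symmetric trailing 2x2 block [[a, b], [b, c]]
    closest to c -- the standard Wilkinson shift for symmetric QR."""
    delta = (a - c) / 2.0
    if delta == 0.0 and b == 0.0:
        return c
    sign = 1.0 if delta >= 0 else -1.0
    # numerically stable form of c + delta - sign * sqrt(delta^2 + b^2)
    return c - sign * b * b / (abs(delta) + math.hypot(delta, b))

# block [[4, 1], [1, 2]] has eigenvalues 3 +/- sqrt(2);
# the shift is the one closest to c = 2
assert abs(wilkinson_shift(4, 1, 2) - (3 - math.sqrt(2))) < 1e-12
```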
I have found another problem: the original matrix that produces a pseudo-inverse failure also produces an RREF failure. If you take this guy and pass it to the eigenvectors method with its eigenvalues:

The first step the eigenvector method does when it loops over the given eigenvalues is to subtract the largest eigenvalue from the matrix (2828791.05765 in the first loop) and then get the RREF. The RREF returned by the matrix method on line 74 is this:

Whereas Wolfram and this random RREF calculator I found produce an identity matrix (the latter is using the scaled-down matrix in the eigenvectors method). I'll add the matrix as a test case to the `ReducedRowEchelonFormTest` and submit a separate PR with it, so anyone can look at what's going on.
Ok, it's been a bit, but I found the real problem and I'm not sure what to do about it. In the above test case, when it gets to the 3rd eigenvalue, the resulting RREF is an identity matrix. The matrix copied from above:

The eigenvalues: The offending eigenvalue,

Which I plugged into Wolfram and asked for the RREF. It also gives an identity. So nothing is wrong with the RREF implementation or the eigenvalues implementation. However, Wolfram can solve for the eigenvectors of the original matrix, so that must mean the eigenvector procedure is missing something: there is either a way to recover from the RREF not pointing to a solution, or an alternative method of getting the eigenvectors.

Edit: @Beakerboy idk if you would know anything of that, but any thoughts? Would you know a resource to look at?
Hi @Aweptimum, Thanks for your continued work on this. I'm presuming the eigenvalue calculation is correct. If that is the case, and we think the eigenvector calculation is the issue, are you able to reproduce the issue with a simpler matrix, or another matrix, which might help identify what characteristics are causing the issue? Alternatively, is there a more robust method of calculating eigenvectors that we could be using?
The eigenvalue calculation is correct, yes; I have the correct ones verified by NumPy + Wolfram + a professor who has MATLAB. I'm not sure if there's a more "robust" way, because the eigenvector algorithm is just solving (A − λI)v = 0.

The QZ algorithm is analogous to the QR algorithm, but it uses the results of the QR algorithm to form what's called the Schur decomposition, then solves the system using the decomposition rather than the original matrix, and then transforms the eigenvectors back to match the original matrix. The original paper is behind paywalls, and the only implementations I can find are in C or LAPACK (and I can read neither). However, there is a NASA paper someone wrote on extending the algorithm that might be plain enough to work from: https://ntrs.nasa.gov/citations/19730018792
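One standard way to recover an eigenvector from a known eigenvalue without relying on an exact RREF null space is inverse iteration: repeatedly solve (A − μI)v = v_prev and normalize, with μ nudged slightly so the system stays solvable. A hedged Python sketch of the general technique (not math-php code):

```python
import numpy as np

def inverse_iteration(A, mu, iters=50, eps=1e-10):
    """Eigenvector for a known (approximate) eigenvalue mu via inverse
    iteration. Solving the shifted system amplifies the component along
    the wanted eigenvector; this avoids reading a null space off an
    exact RREF, which is fragile in floating point."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # perturb the shift slightly so A - mu*I is invertible
    M = A - (mu + eps) * np.eye(n)
    v = np.ones(n)
    for _ in range(iters):
        v = np.linalg.solve(M, v)
        v /= np.linalg.norm(v)
    return v

# the 4x4 matrix from earlier in this thread has eigenvalue 3 with
# eigenvector e1 = [1, 0, 0, 0]
A = [[3, 0, 0, 0], [0, 1, 0, 1], [0, 0, 2, 0], [0, 1, 0, 1]]
v = inverse_iteration(A, 3.0)
assert np.allclose(np.abs(v), [1, 0, 0, 0], atol=1e-6)
```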
I thought I’d at least chime in to say that I have no idea how to help, but a quick scan of the NASA paper suggests there’s enough direction that a well-motivated individual could just follow the provided steps and implement it. I am not a mathematician; most of my contributions here were just implementing algorithms from Wikipedia, textbooks, or other software packages in PHP, just like this case.
Sorry it's been a while, but my previous job laid me off a few months ago. I was previously allowed to contribute to math-php on the job, but my current job no longer touches PHP (and probably never will). I can no longer work on my PRs, basically :(
Thank you for your work up till now. We appreciate it. I'll continue with what you've started. Thanks again.
In investigating the problems in #468, I decided to try implementing the QR algorithm to have an alternative way of getting the eigenvectors.

This is still a work in progress, but it adds:

- `NumericMatrix::upperHessenberg` - converts a matrix to upper Hessenberg form using Householder reflections. I only have one test case, but it implicitly passes every `Eigenvalue::qrAlgorithm` test case, so that's neat.
- `Eigenvalue::qrAlgorithm` - finds the eigenvalues of a matrix using the QR algorithm, which follows the same principle as the power method. It relies on `upperHessenberg`. I implemented it using the Wilkinson shift strategy and deflation, so when the lowest diagonal element converges it recurses using the submatrix that doesn't include that diagonal. It's pretty fast; none of the test cases take more than ~100 iterations to converge. All tests pass.
- `Eigenvector::qrAlgorithm` - this just takes the output of the above method and feeds it to `Eigenvector::eigenvectors`. The problem is that it fails in cases where the eigenvalues are correct, so I'm pretty sure something is wrong in the eigenvectors method. There is a way to directly solve for the eigenvectors using the QR algorithm, but I haven't fully understood it yet.

I'm not sure how you'd want to expose `Eigenvector::qrAlgorithm` in the public API, so I haven't worked on that. Might need a refactor similar to `Eigenvector`
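A minimal Python sketch of the strategy described above, for readers following along: Hessenberg reduction, Wilkinson-shifted QR steps, and deflation from the bottom. It assumes a real symmetric input (the PR's PHP version is more general; this only illustrates the control flow):

```python
import numpy as np
from scipy.linalg import hessenberg

def qr_eigenvalues(A, tol=1e-12, max_iter=500):
    """Eigenvalues of a real symmetric matrix via the shifted QR
    algorithm: reduce to Hessenberg form, iterate with a Wilkinson
    shift, and deflate the trailing diagonal entry once its
    subdiagonal neighbor converges."""
    H = hessenberg(np.asarray(A, dtype=float))
    n = H.shape[0]
    eigs = []
    while n > 1:
        for _ in range(max_iter):
            # Wilkinson shift from the trailing 2x2 block
            a, b, c = H[n-2, n-2], H[n-1, n-2], H[n-1, n-1]
            d = (a - c) / 2.0
            if d == 0.0 and b == 0.0:
                mu = c
            else:
                sign = 1.0 if d >= 0 else -1.0
                mu = c - sign * b * b / (abs(d) + np.hypot(d, b))
            # one shifted QR step: H <- R Q + mu I is similar to H
            Q, R = np.linalg.qr(H[:n, :n] - mu * np.eye(n))
            H[:n, :n] = R @ Q + mu * np.eye(n)
            if abs(H[n-1, n-2]) < tol:
                break
        # deflation: the last diagonal entry has converged; recurse on
        # the leading (n-1) x (n-1) submatrix
        eigs.append(H[n-1, n-1])
        n -= 1
    eigs.append(H[0, 0])
    return sorted(eigs)

# the matrix discussed throughout this thread has eigenvalues 0, 2, 2, 3
A = [[3, 0, 0, 0], [0, 1, 0, 1], [0, 0, 2, 0], [0, 1, 0, 1]]
assert np.allclose(qr_eigenvalues(A), [0, 2, 2, 3])
```

Note how the repeated eigenvalue 2 comes out of separate deflation steps, which is exactly the situation that later trips up the eigenvector method.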