dgeev: memory corruption on non-finite input #5250

Closed
AllinCottrell opened this issue Apr 30, 2025 · 1 comment · Fixed by #5251


AllinCottrell commented Apr 30, 2025

I've found that in certain cases passing a matrix that contains "inf" values to dgeev results in memory corruption. I'm attaching a minimal test case, which I compiled on Fedora 40 with

gcc -O2 -g -o eigencrash eigencrash.c -lopenblaso -lgfortran

The test program accepts a single argument. When invoked as ./eigencrash ok the non-finite input is 3 x 3 and there's no crash (the returned eigenvalues are non-finite, as expected). Without the "ok" argument the input is 4 x 4 and a crash results. The output from this case is as follows:

dgeev (1): lwork = 136, info = 0
 ** On entry to DGEBAL parameter number  3 had an illegal value
 ** On entry to DGEHRD parameter number  2 had an illegal value
 ** On entry to DHSEQR parameter number  4 had an illegal value
dgeev (2): info = -4
eigevals: real, imag:
     -nan,      -nan
     -nan,      -nan
     -nan,      -nan
     -nan,      -nan
double free or corruption (out)
Aborted (core dumped)
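
For reading the info = -4 in the output above: a minimal sketch of how a caller might interpret dgeev's return code before touching the eigenvalue arrays. It follows the documented DGEEV convention (0 is success, a negative value flags an illegal argument, a positive value i means only elements i+1..N of WR and WI hold converged eigenvalues). The names dgeev_status, classify_dgeev_info and usable_eigenvalues are made up for illustration; they are not part of LAPACK or of the attached test program.

```c
#include <assert.h>

/* Hypothetical classification of dgeev's INFO value (names are illustrative). */
typedef enum {
    DGEEV_OK,            /* info == 0: all eigenvalues computed */
    DGEEV_BAD_ARGUMENT,  /* info <  0: argument number -info was illegal */
    DGEEV_PARTIAL        /* info >  0: QR algorithm did not fully converge */
} dgeev_status;

dgeev_status classify_dgeev_info(int info)
{
    if (info == 0) {
        return DGEEV_OK;
    }
    return info < 0 ? DGEEV_BAD_ARGUMENT : DGEEV_PARTIAL;
}

/* Number of trailing entries of WR/WI a caller may read for an n x n problem.
   For info = i > 0 only elements i+1..n are documented as converged. */
int usable_eigenvalues(int info, int n)
{
    assert(n >= 0);
    assert(info <= n);
    if (info < 0) {
        return 0;        /* error exit: do not trust WR/WI at all */
    }
    return n - info;     /* equals n when info == 0 */
}
```

With the values from the report, usable_eigenvalues(-4, 4) is 0, so a caller following this convention would skip printing the arrays rather than show the -nan entries.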

I'm also attaching the valgrind log and gdb output for the crashing case.

eigencrash.c.txt

valgrind.log

gdb.txt

martin-frbg (Collaborator) commented

This looks like a genuine (Reference-)LAPACK issue (and/or undefined behavior w.r.t. non-finite inputs to LAPACK).
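
Since behavior on non-finite input may not be defined, one caller-side workaround is to scan the matrix before handing it to dgeev. The sketch below is only that: matrix_is_finite is a hypothetical helper, not part of LAPACK or OpenBLAS, and it is not the change made in #5251.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical caller-side guard (illustrative name).
   Returns 1 if all n*n entries of the square matrix are finite,
   0 if any entry is an infinity or a NaN. Storage order does not
   matter because every entry is visited. */
int matrix_is_finite(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n * n; i++) {
        if (!isfinite(a[i])) {
            return 0;
        }
    }
    return 1;
}
```

A caller would skip the dgeev call (and report the problem itself) whenever matrix_is_finite returns 0. The scan costs one pass over the input, which is small next to the eigenvalue computation.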

Error 3 is raised in DGEBAL when the input contains NaN:

*
*           Exit if NaN to avoid infinite loop
*
            IF( DISNAN( C+CA+R+RA ) ) THEN
               INFO = -3
               CALL XERBLA( 'DGEBAL', -INFO )
               RETURN
            END IF

At that point, balancing the matrix has probably failed to produce the desired result, but the algorithm in DGEEV (lapack-netlib/src/dgeev.f, around line 360) continues regardless. It finally (line 511 of dgeev.f) tries to undo a previous scaling operation, where I suspect the "INFO" used as input is not expected to be an actual error code, and all hell breaks loose.
(The usual LAPACK fun where there is exactly one integer return value to signal any kind of error - or sometimes just the number of some interesting row or column...)
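
To make the suspected failure concrete, here is a sketch of the index arithmetic only. It assumes that the undo-scaling step rescales N-INFO entries of WR starting at WR(INFO+1); that addressing pattern is an assumption for illustration and is not quoted in this thread. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical description of the WR entries touched by the rescale,
   using Fortran-style 1-based indices. */
typedef struct {
    int first;   /* index of the first entry touched */
    int count;   /* number of entries touched */
    int before;  /* how many of those lie before WR(1) */
} rescale_span;

/* ASSUMPTION: the rescale covers n - info entries starting at WR(info + 1).
   That is harmless for info >= 0 but walks off the front of the array
   when info is a negative error code. */
rescale_span undo_scaling_span(int n, int info)
{
    rescale_span s;
    assert(n >= 0);
    s.first = info + 1;
    s.count = n - info;
    s.before = s.first < 1 ? 1 - s.first : 0;
    return s;
}
```

For the reported case (n = 4, info = -4) this gives a span of 8 entries starting at index -3, so 4 entries would sit before the start of a 4-element array. Under the stated assumption, that would be consistent with the "double free or corruption (out)" abort in the report.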
