Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid overflow from division in GETF2 potentially causing NaN #3924

Merged
merged 2 commits into from
Feb 27, 2023
Merged

Avoid overflow from division in GETF2 potentially causing NaN #3924

merged 2 commits into from
Feb 27, 2023

Conversation

martin-frbg
Copy link
Collaborator

for fault seen in numpy/numpy#22025

@martin-frbg martin-frbg added this to the 0.3.22 milestone Feb 26, 2023
@martin-frbg martin-frbg merged commit 5925178 into OpenMathLib:develop Feb 27, 2023
@mattip
Copy link
Contributor

mattip commented Feb 27, 2023

Would it help to close the numpy issue if I create an openblas-lib build of the develop branch or should I wait for a 0.3.22 release?

@angsch
Copy link
Contributor

angsch commented Feb 27, 2023

Is there a reason to use fabs(temp1) > 1.e-305 rather than SFMIN as used in LAPACK? SFMIN = 2**-1022 = DBL_MIN $\approx 2.22*10^{-308}$ . I can think of a badly scaled matrix (admittedly a pathological case) where LAPACK succeeds, but OpenBLAS fails with the fix. Also, I would use >= rather than >.

@martin-frbg
Copy link
Collaborator Author

No reason except imited testing once the problem/solution was identified. Also seems possible now that the problem only exists for clang/arm64 so could be ifdef'd to skip the fabs call on other platforms.

@angsch
Copy link
Contributor

angsch commented Mar 15, 2023

@martin-frbg Here is an example that illustrates the problem.

A = [ 0  a
      a  0 ], where |a| is a value below 1.e-305

The expected LU factorization with pivoting is

L = [ 1 0 ]  U = [ a 0 ] P = [ 0 1 ]
    [ 0 1 ],     [ 0 a ],    [ 1 0 ]

The expected memory representation is ipiv = [ 2, 2], A = [ a 0 0 a ].

With the patch, the computed result is

L = [ 1 0 ]  U = [ 0 0 ] P = [ 0 1 ]
    [ a 1 ],     [ 0 a ],    [ 1 0 ]

I consider this a bug. It will not be possible to solve a system with that LU factorization because U(1,1) == 0.0 will incur a division by zero.

Copy & paste reproducer:

// gcc test-getf2.c /path/to/libopenblas_skylakexp-r0.3.21.dev.a
#include <stdio.h>
#include <float.h>

extern void dgetf2_(int *m, int *n, double *A, int *ldA, int *ipiv, int *info);

int main() {
  int m = 2, n = 2, ldA = 2, info = 0;
  double a = DBL_MIN; // smallest positive normal number
  int ipiv[2];
  double A[4] = {0.0, a,
                 a  , 0.0};

  dgetf2_(&m, &n, A, &ldA, ipiv, &info);

  printf("U = [%.6e %.6e\n     %.6e %.6e]\n", A[0], A[2], 0.0, A[3]);
  printf("L = [%.6e %.6e\n     %.6e %.6e]\n", 1.0, 0.0, A[1], 1.0);
  printf("ipiv %d, %d", ipiv[0], ipiv[1]);
  return 0;
}

@martin-frbg
Copy link
Collaborator Author

@angsch thanks for reminding me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants