OpenBLAS crash on PENRYN with complex matrix operations #140

Closed
franko opened this issue Sep 13, 2012 · 10 comments

franko commented Sep 13, 2012

I've discovered what seems to be a bug in the OpenBLAS library affecting complex matrix operations on dual-core PENRYN CPUs. The bug causes the program to crash.

I was able to reproduce the bug with a very simple program that uses the BLAS library indirectly through the GSL library. When linked with OpenBLAS 0.2.3 the program crashes, while it terminates correctly when linked with the GSL CBLAS library.

The bug happens in the GSL function:

gsl_linalg_complex_LU_solve

If I run the program under a debugger, it gets a segmentation fault at this point:

Program received signal SIGSEGV, Segmentation fault.
0x6f6183e8 in zcopy_k_PENRYN ()
from c:\fra\downloads\gsl-shell-2.2.0-beta1-openblas\libgsl-0.dll

The OpenBLAS library was compiled on Windows XP using an up-to-date MinGW installation with gcc 4.7.0.

The library was compiled with:

make DYNAMIC_ARCH=1 FC=gfortran

Then, in turn, the GSL library was linked to the OpenBLAS library.

Here is the C program that reproduces the bug:

#include <stdio.h>

#include <gsl/gsl_matrix.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_linalg.h>

int main()
{
    // 4x4 complex matrix stored as interleaved (re, im) pairs, all imaginary parts zero:
    // {{1,1,0,0},{0,0,2.3631587e+083,0},{1,1,1,1},{0,1,0,0}}
    double m_data[32] = {1, 0, 1, 0, 0, 0, 0, 0,
                         0, 0, 0, 0, 2.3631587e+083, 0, 0, 0,
                         1, 0, 1, 0, 1, 0, 1, 0,
                         0, 0, 1, 0, 0, 0, 0, 0 };
    double b_data[8] = {0, 0, 0, 0, 0, 0, 1, 0};
    gsl_matrix_complex_view m_v = gsl_matrix_complex_view_array(m_data, 4, 4);
    gsl_vector_complex_view b_v = gsl_vector_complex_view_array(b_data, 4);
    gsl_matrix_complex *m = &m_v.matrix;
    gsl_vector_complex *b = &b_v.vector, *x;
    gsl_matrix_complex *lu;
    int signum[1];
    size_t n = 4;
    int k;

    lu = gsl_matrix_complex_alloc(4, 4);
    gsl_matrix_complex_memcpy(lu, m);

    gsl_permutation *p = gsl_permutation_alloc(n);

    gsl_linalg_complex_LU_decomp(lu, p, signum);

    x = gsl_vector_complex_alloc(4);
    gsl_linalg_complex_LU_solve(lu, p, b, x);

    for (k = 0; k < 4; k++)
        printf("x[%i] = %g + i %g\n", k, x->data[2*k], x->data[2*k+1]);

    gsl_vector_complex_free(x);
    gsl_permutation_free(p);
    gsl_matrix_complex_free(lu);

    return 0;
}
Collaborator

xianyi commented Sep 14, 2012

Hi @franko ,

Is it 32-bit or 64-bit?

Zhang Xianyi

Author

franko commented Sep 14, 2012

Sorry, I forgot to specify: it is a 32-bit system.

Collaborator

xianyi commented Sep 17, 2012

Hi @franko ,

Could you try the static libraries of GSL and OpenBLAS? Or could you build without setting "DYNAMIC_ARCH=1"?

Xianyi

Author

franko commented Sep 22, 2012

Hi @xianyi ,

As you suggested, I've tried building the test case above using only the static libraries. I linked against libgsl.a and openblas.lib (renamed to .dll.a). The OpenBLAS library was built with DYNAMIC_ARCH=1 and FC=gfortran.

The resulting executable is statically linked to both GSL and OpenBLAS and is quite big (~14 MB), but it still crashes as before.

Building without DYNAMIC_ARCH is not useful to me because I'm trying to produce a binary package that can select the optimal CPU architecture at runtime on any 32-bit Windows system.

Please let me know if you need more details and thank you very much for your help.

Francesco

Collaborator

xianyi commented Sep 22, 2012

Hi @franko ,

Because I don't have a PENRYN or Windows XP test box, I tested your code on my Intel Sandy Bridge Win7 32-bit PC.
I built GSL from source. Then I built OpenBLAS with TARGET=PENRYN. It worked fine. So far, I cannot reproduce this SEGFAULT.

Could you build OpenBLAS with DEBUG=1?

To @zchothia , any comments?

Thank you

Zhang Xianyi

Author

franko commented Sep 23, 2012

Hi @xianyi ,

I have built the library without DYNAMIC_ARCH and without multithreading support to narrow down the test case. So I've used:

make USE_THREAD=0 DEBUG=1

Then I've used the GSL library as a static library with DEBUG enabled.

The example above still crashes, but now I'm able to give you more information on where the crash happens. Here is the backtrace just before the crash:

(gdb) bt
#0 cblas_ztrsv (order=CblasRowMajor, Uplo=CblasLower, TransA=CblasNoTrans,
Diag=CblasUnit, n=4, a=0x3d2470, lda=4, x=0x3d25d8, incx=1) at ztrsv.c:207
#1 0x0042e541 in gsl_blas_ztrsv (Uplo=CblasLower, TransA=CblasNoTrans,
Diag=CblasUnit, A=0x3d3f78, X=0x3d2588) at blas.c:968
#2 0x0041699d in gsl_linalg_complex_LU_svx (LU=0x3d3f78, p=0x3d3fa8,
x=0x3d2588) at luc.c:202
#3 0x00416864 in gsl_linalg_complex_LU_solve (LU=0x3d3f78, p=0x3d3fa8,
b=0x22fdb4, x=0x3d2588) at luc.c:168
#4 0x0040151a in main () at blas-bug-test.c:31

The line executed just before the crash is therefore:

207 (trsv[(trans<<2) | (uplo<<1) | unit])(n, a, lda, x, incx, buffer);

When I step into this function (it is actually ztrsv_TUU), that may be where the problem happens. The function ztrsv_TUU executes normally, but when it reaches the final instruction:

return 0;

in ztrsv_L.c:169 the program gets lost (corrupted stack?) and I get a SEGFAULT.

My guess is that there is a buffer overflow in ztrsv_TUU and the stack gets corrupted, so that "return" fails to return to the caller's stack frame.

I hope this helps you identify the bug. On Windows this bug is fully reproducible and I get it with any build options for OpenBLAS.

Francesco

Collaborator

xianyi commented Sep 24, 2012

Hi @franko ,

I used MinGW gcc 4.6.
I will upgrade gcc to version 4.7 and test again.

Xianyi

Collaborator

xianyi commented Sep 24, 2012

Hi @franko ,

I have reproduced this bug with gcc 4.7 on Win 7 32-bit.

Because I am not familiar with Windows and MinGW, I need a few days to investigate this bug.

Thank you

Zhang Xianyi

xianyi added a commit that referenced this issue Sep 24, 2012
GCC 4.7 uses MSVC ABI on Win 32. This means the caller pops the hidden pointer for returning
aggregate structures larger than 8 bytes.
Collaborator

xianyi commented Sep 24, 2012

Hi @franko ,

I think I fixed this issue on the develop branch. Please test it.

Thank you

Zhang Xianyi

Author

franko commented Sep 26, 2012

Hi @xianyi ,

I was able to test yesterday with the GSL test and everything was fine with the latest "develop" version.

Thank you very much for your help. I guess that now the issue can be considered closed but this is up to you.

Francesco

xianyi closed this as completed Sep 26, 2012