Error using OpenBLAS in an OpenMP Application #85
Comments
Hi, Thank you for this report. Could you give me some test code to reproduce this error? You can send Thank you again. Xianyi grisuthedragon wrote (sent 12-4-2, 3:56 PM):
|
I use Ubuntu 10.04 LTS $ gcc --version $ cat /proc/cpuinfo | grep "model name" | head -n1 I use 64-bit integers everywhere; unfortunately I don't have a working minimal example. The code has been running with OpenBLAS for more than half a year, and now it crashes. |
Hi, From the following, #0 ?? () at ../kernel/x86_64/copy_sse2.S:592 from /scratch/koehlerm/mess/OpenBLAS/libopenblas.so.0 #1 0x00007ffff5737bee in ger_kernel (args=0x7fffffffbb00, I think it crashed in the dger function. Could you give more information args is a structure pointer; range_m and range_n are int arrays. Thank you Xianyi grisuthedragon wrote (sent 12-4-3, 4:41 PM):
|
I have crashes, too. Unfortunately, I cannot provide any helpful gdb output as the code that crashes is a Python module using OpenBlas and I don't have all debugging symbols at hand. Maybe helpful: I did not have crashes with 0.1 alpha 2.5, the crashes only started after the upgrade to 0.1. CPU: model name : Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz |
What's your CPU and OS? 64-bit or 32-bit? Xianyi On 2012-4-5, at 10:56 PM, Alexander Eberspächer reply@reply.github.com wrote:
|
CPU: model name : Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz I am happy to provide any further information you need. Remark: the tests after building pass. The crashes do not occur on every run of my code. |
Hi grisuthedragon, I just tested OpenBLAS with INTERFACE = 64 and USE_OPENMP=1. I cannot reproduce dger or copy errors. Thanks |
Hi Alexander, Could you build OpenBLAS with DEBUG=1? Then, enable core dumps as follows. Next, run the program until it crashes. gdb your_program core This will show which function crashed. You can also type "bt" to show the backtrace. Xianyi |
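For readers following along: the exact command for enabling core dumps is not preserved in the comment above (in a shell it is commonly done with `ulimit -c unlimited`, which is an assumption here, not a quote). Since the crashing program in this thread is launched from Python, a minimal sketch of the same idea from Python might look like this. The helper name `run_with_core_dumps` is hypothetical, and it assumes a POSIX system.

```python
import resource
import subprocess


def run_with_core_dumps(cmd):
    """Run cmd with the core-file size limit raised as far as allowed.

    Hypothetical helper, POSIX only. Whether a core file is actually
    written also depends on system settings outside this script.
    """
    def raise_core_limit():
        # Executed in the child just before exec: lift the soft limit
        # to the hard limit (similar in spirit to `ulimit -c unlimited`
        # when the hard limit is unlimited).
        _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

    return subprocess.run(cmd, preexec_fn=raise_core_limit).returncode
```

If the program crashes and a core file is written, it can then be inspected with `gdb your_program core` as described above.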
Xianyi, unfortunately I cannot find out which function leads to the crash. I do not have all required debugging symbols at hand (I run a Python script that uses a Python module created from Fortran using f2py, which itself calls OpenBlas). My code has an awful lot of dependencies for which I cannot get debugging symbols. All I have in the backtrace is #0 0x00007f451200269c in ?? () Please let me know if I can help in any other way. |
Hi Alexander, Do you know whether your application uses the shared OpenBLAS library or the static library? Xianyi |
Hi Alexander, Please test this: export OPENBLAS_NUM_THREADS=1 Xianyi |
Dear Xianyi, I use the shared object version of OpenBlas. I had already replaced the .so file with the debug version, but gdb still cannot tell me which function crashed. However, the problem seems to vanish if I run my script with OMP_NUM_THREADS=1. At least that's what I infer here: I tested several runs and saw no crashes. With higher OMP_NUM_THREADS I have crashes in about 30 to 50% of runs. |
The segfaults are still there on the develop branch. When a segfault happens, it appears immediately after my code starts, even before the first BLAS routine is called. Maybe that helps to narrow it down. Could there be something wrong in the build process? |
Thank you for this information. I think it may be related to memory allocation or thread creation. Xianyi On April 27, 2012, at 3:54 PM, Alexander Eberspächer <
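Because the crash is intermittent, it can help to measure how often it happens per thread count instead of eyeballing a few runs. The following is only a sketch under assumptions: `crash_rate` is a hypothetical helper, the command passed in stands for whatever script is crashing, and it relies on POSIX behavior where Python's `subprocess` reports a child killed by signal N as return code -N.

```python
import os
import signal
import subprocess


def crash_rate(cmd, num_threads, runs=20):
    """Run cmd `runs` times with OMP_NUM_THREADS=num_threads and return
    the fraction of runs that were killed by SIGSEGV (POSIX only)."""
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # A negative return code -N means the child died from signal N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes / runs
```

Calling it as, say, `crash_rate(["python", "my_script.py"], 4)` (the script name is a placeholder) and comparing against `num_threads=1` would give a rough number to report alongside the backtrace.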
|
Hi Alexander, Please help me do the following 2 experiments:
What's your Linux kernel version? Thanks, Xianyi |
Hi Xianyi, I applied your patch. The crashes are gone! I am using kernel 2.6.32-220.7.1 64-bit on Scientific Linux 6.2. If you are still interested in the outcome of the first experiment, please let me know. I am happy to test that as well. Thanks for the patch, Alex |
Hi Alex, I met this crash 1 year ago. It may be related to a kernel bug. Then, I I suggest you upgrade the kernel. I think the crash will be gone without Thank you Xianyi 2012/5/2 Alexander Eberspächer <
|
I had lots of compiler warnings, too. Unfortunately, I cannot upgrade my kernel. I think there are many other users stuck on Red Hat Enterprise Linux 6, CentOS 6 and Scientific Linux 6. All these users use the same kernel as I do. Given the importance of those distributions, it might be wise to document this patch. I could issue a pull request if needed. Just let me know if you think this is helpful. |
I get an error using OpenBLAS (master and development branch) in an OpenMP application.
The program crashes with a segmentation fault and gdb gives me the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff1a4c700 (LWP 12429)]
?? () at ../kernel/x86_64/copy_sse2.S:592 from /scratch/koehlerm/mess/OpenBLAS/libopenblas.so.0
592 movhps (X), %xmm0
(gdb) bt
#0 ?? () at ../kernel/x86_64/copy_sse2.S:592 from /scratch/koehlerm/mess/OpenBLAS/libopenblas.so.0
#1 0x00007ffff5737bee in ger_kernel (args=0x7fffffffbb00, range_m=0x100000000, range_n=0x7fffffffbb98, dummy1=0x7fffe6d4f080, buffer=0x7fffe6e4f080, pos=3) at ger_thread.c:88
#2 0x00007ffff5b229d6 in exec_threads (queue=0x7fffffffba58) at blas_server_omp.c:240
#3 0x00007ffff5b22b25 in exec_blas.omp_fn.0 (.omp_data_i=0x7fffffffb7f0) at blas_server_omp.c:268
#4 0x00007ffff49ce7ca in gomp_thread_start (xdata=Unhandled dwarf expression opcode 0xf3
) at ../../../libgomp/team.c:116
#5 0x00007ffff500a9ca in start_thread (arg=) at pthread_create.c:300
#6 0x00007ffff472970d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7 0x0000000000000000 in ?? ()
(gdb) list
587 ALIGN_3
588
589 .L41:
590 movsd (X), %xmm0
591 addq INCX, X
592 movhps (X), %xmm0
593 addq INCX, X
594 movsd (X), %xmm1
595 addq INCX, X
596 movhps (X), %xmm1
(gdb)
OpenBLAS is compiled with: