Segmentation faults when compiling in 32 bit on 64 bit Linux platform #186
Comments
I was trying to put in the info RE how I configured OpenBLAS, and it submitted the issue. openblas_compiled: |
BTW, I don't know if the problem has anything to do with the following comment re SEGFAULT, in common_linux.h ? static inline int my_mbind(void *addr, unsigned long len, int mode, |
Guys, The patch below fixes the problem for me. It looks like the "nodemask" argument to the From 09c398574079dd26379c62cddf02afc5bdcf327f Mon Sep 17 00:00:00 2001 common_linux.h | 7 ++++--- diff --git a/common_linux.h b/common_linux.h
1.7.9.6 |
Hi @danpovey, This is a known issue in OpenBLAS. On some Linux kernel versions, mbind cannot accept NULL, which is a kernel bug. You can apply segfaults.patch to work around this issue. For example, patch -ruN < segfaults.patch Xianyi |
Wouldn't it be simpler just to always compile with my patch? This code is On Mon, Jan 21, 2013 at 3:28 AM, Zhang Xianyi notifications@github.com wrote:
|
Hi @danpovey, OpenBLAS always passes NULL to mbind, which sets the memory policy to allocate memory on the local node and can therefore improve performance. I think recent Linux kernels have fixed this bug. Xianyi |
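For reference, here is a minimal sketch of the kind of workaround discussed in this thread: a wrapper that never hands a NULL nodemask to the kernel and passes an all-zero mask instead. It is not the contents of segfaults.patch or of the patch posted above. The names `mbind_no_null`, `map_local` and `SKETCH_MPOL_PREFERRED` are hypothetical, and the sketch assumes Linux and the behaviour described in the mbind(2) man page, where an empty node set with MPOL_PREFERRED means "allocate on the local node".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of MPOL_PREFERRED in the Linux uapi headers; defined locally so the
   sketch builds without libnuma's <numaif.h>. */
#define SKETCH_MPOL_PREFERRED 1

/* Hypothetical wrapper (not OpenBLAS's my_mbind): replaces a NULL nodemask
   with a pointer to an all-zero mask, which describes the same empty set. */
static long mbind_no_null(void *addr, unsigned long len, int mode,
                          const unsigned long *nodemask,
                          unsigned long maxnode, unsigned flags) {
    static const unsigned long empty_mask = 0;
    if (nodemask == NULL) {
        nodemask = &empty_mask;
        maxnode = 8 * sizeof(unsigned long); /* bits available in the mask */
    }
#ifdef SYS_mbind
    return syscall(SYS_mbind, addr, len, mode, nodemask, maxnode, flags);
#else
    /* No mbind syscall on this platform: treat the policy as a no-op. */
    (void)addr; (void)len; (void)mode; (void)maxnode; (void)flags;
    return 0;
#endif
}

/* Map an anonymous buffer and request local-node placement for it.
   The policy is treated as a hint: a failing mbind is ignored.
   Returns NULL if mmap itself fails. */
static void *map_local(size_t len) {
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    (void)mbind_no_null(p, len, SKETCH_MPOL_PREFERRED, NULL, 0, 0);
    return p;
}
```

A caller would use `map_local(size)` in place of a bare `mmap` followed by an `mbind` call. Whether the zeroed mask avoids the crash on the affected kernels is something this sketch does not demonstrate.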
Guys,
When I run my toolkit's tests using OpenBLAS on a particular platform, I get hard-to-replicate segmentation faults in memory.c. These occur inconsistently and not when I run under gdb; it's easier to enable core dumps and wait until one happens.
There is some information below.
svatava:matrix: gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/i686-linux/4.5.4/lto-wrapper
Target: i686-linux
Configured with: ../configure --build=i686-linux --with-arch=nocona --with-tune=core2 --with-thread=posix --with-as=/usr/local/bin/as --with-ld=/usr/local/bin/ld --with-system-zlib --program-suffix=-4.5
Thread model: posix
gcc version 4.5.4 (GCC)
BTW, the test code is not multi-threaded, and I configured OpenBLAS with:
make: Nothing to be done for `all'.
svatava:matrix: make test
Running matrix-lib-test ...... SUCCESS
Running kaldi-gpsr-test ...... SUCCESS
svatava:matrix: make test
Running matrix-lib-test .../bin/sh: line 1: 4758 Segmentation fault (core dumped) ./$x > /dev/null 2>&1
... FAIL
Running kaldi-gpsr-test ...... SUCCESS
make: *** [test] Error 1
svatava:matrix: gdb ./matrix-lib-test core.4758
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /mnt/matylda6/jhu09/qpovey/sourceforge/kaldi/trunk/src/matrix/matrix-lib-test...done.
[New Thread 4763]
[New Thread 4758]
[New Thread 4762]
[New Thread 4764]
[New Thread 4761]
[New Thread 4760]
[New Thread 4759]
[New Thread 4765]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /homes/eva/q/qpovey/sourceforge/kaldi/trunk/tools/OpenBLAS/install/lib/libopenblas.so.0...done.
Loaded symbols for /homes/eva/q/qpovey/sourceforge/kaldi/trunk/tools/OpenBLAS/install/lib/libopenblas.so.0
Reading symbols from /usr/local/lib/libgfortran.so.3...done.
Loaded symbols for /usr/local/lib/libgfortran.so.3
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./matrix-lib-test'.
Program terminated with signal 11, Segmentation fault.
#0 0xf6ea211b in alloc_mmap (address=0x0) at memory.c:433
433 *(long *)start = (long)start + PAGESIZE;
(gdb) up
#1 0xf6ea26be in blas_memory_alloc (procpos=2) at memory.c:987
987 map_address = (*func)((void *)base_address);
(gdb) p base_address
$1 = 0
(gdb) up
#2 0xf6ea34fb in blas_thread_server (arg=0x4) at blas_server.c:274
274 buffer = blas_memory_alloc(2);
(gdb) up
#3 0x4f08c832 in start_thread () from /lib/libpthread.so.0
(gdb) up
#4 0x4efcc4de in clone () from /lib/libc.so.6
(gdb) up
Initial frame selected; you cannot go up.
(gdb) p func
No symbol "func" in current context.
(gdb) down
#3 0x4f08c832 in start_thread () from /lib/libpthread.so.0
(gdb) down
#2 0xf6ea34fb in blas_thread_server (arg=0x4) at blas_server.c:274
274 buffer = blas_memory_alloc(2);
(gdb) down
#1 0xf6ea26be in blas_memory_alloc (procpos=2) at memory.c:987
987 map_address = (*func)((void *)base_address);
(gdb) p func
$2 = (void *(*)(void *)) 0xf350b344
(gdb) p base_address
$3 = 0
(gdb) down
#0 0xf6ea211b in alloc_mmap (address=0x0) at memory.c:433
433 *(long *)start = (long)start + PAGESIZE;
(gdb) p start
$4 = 3956314112
(gdb) p (long *)start
$5 = (long *) 0xebd09000
(gdb) p *((long *)start)
Cannot access memory at address 0xebd09000
(gdb) list
428
429 start = (BLASULONG)map_address;
430 current = (SCALING - 1) * BUFFER_SIZE;
431
432 while(current > 0) {
433 *(long *)start = (long)start + PAGESIZE;
434 start += PAGESIZE;
435 current -= PAGESIZE;
436 }
437
(gdb) p sizeof(long)
$6 = 4
(gdb) p sizeof(void *)
$7 = 4
(gdb) p map_address
$8 = (void *) 0xebd09000
(gdb) p memory
$9 = {{lock = 0, addr = 0xf4d0f000, pos = 0, used = 1, dummy = '\000' <repeats 47 times>}, {lock = 0, addr = 0x0, pos = -1, used = 1,
dummy = '\000' <repeats 47 times>}, {lock = 0, addr = 0x0, pos = -1, used = 1, dummy = '\000' <repeats 47 times>}, {lock = 0, addr = 0x0,
pos = -1, used = 1, dummy = '\000' <repeats 47 times>}, {lock = 0, addr = 0x0, pos = -1, used = 0,
dummy = '\000' <repeats 47 times>} <repeats 28 times>}
(gdb)
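For anyone reading the trace: the loop at memory.c lines 432-436 in the listing above writes, into the first word of each page of the freshly mapped buffer, the address of the following page. The write at line 433 is where the core dump shows the fault. The sketch below reproduces that access pattern in isolation so it can be tried on a suspect machine. The names `touch_pages` and `map_and_touch` are hypothetical, the sizes are not OpenBLAS's BUFFER_SIZE or SCALING, and `uintptr_t` is used where the original uses `long`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper mirroring the loop in the listing: store the address
   of the next page in the first word of every page of the region.
   Returns the number of pages written. */
static size_t touch_pages(void *region, size_t len, size_t page) {
    assert(region != NULL && page >= sizeof(uintptr_t));
    uintptr_t start = (uintptr_t)region;
    size_t touched = 0;
    while (len >= page) {
        /* This store faults if the page is not actually accessible. */
        *(uintptr_t *)start = start + page;
        start += page;
        len -= page;
        touched++;
    }
    return touched;
}

/* Map `pages` anonymous pages and touch each one.
   Returns NULL if mmap fails. */
static void *map_and_touch(size_t pages) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, pages * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    touch_pages(p, pages * page, page);
    return p;
}
```

Calling `map_and_touch(n)` in a loop from several threads is one way to exercise the same pattern without OpenBLAS in the picture. Note that it does not set any memory policy on the mapping.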