
Crashes in ARM-Cortex A15 architecture. #562

Closed
tripathyAbhijit opened this issue May 4, 2015 · 11 comments

Comments

@tripathyAbhijit

Hello Xianyi! Hope you're doing well.
I am using OpenBLAS for my neural network application on ARM.
OpenBLAS builds without any issues on Cortex-A15.
However, when I run my application, it always crashes with a segmentation fault.
When I inspect the core dump in GDB, I get the following info:
Program terminated with signal 11, Segmentation fault.
#0 0x00194dc0 in axpy_kernel_S4 ()

(gdb) where
#0 0x00194dc0 in axpy_kernel_S4 ()
#1 0x00167340 in cblas_saxpy ()
#2 0xb63946d0 in ?? ()
#3 0xb63946d0 in ?? ()

Backtrace stopped: previous frame identical to this frame (corrupt stack?)

FYI: my ARM board has 4 Cortex-A15 cores, and VFPv3, VFPv4, and NEON are all supported.

I compiled OpenBLAS with the following options supported by my toolchain:
-marm -mfpu=vfpv3 -mfloat-abi=softfp

Can you please look into this? It is becoming very difficult for me to debug the assembly code.

@wernsaar
Contributor

wernsaar commented May 4, 2015

Hi,

You cannot use softfp with OpenBLAS.

Best regards

Werner

@tripathyAbhijit
Author

@wernsaar Thanks for the quick help. I have some more questions: can we use hardfp or soft? I badly need a library that performs comparably to ATLAS, but I find it extremely difficult to cross-compile ATLAS for ARM. Any suggestions?

@wernsaar
Contributor

wernsaar commented May 4, 2015

Hi,

You cannot use softfp with OpenBLAS because many functions are
written in assembler, and you cannot call these functions using softfp.
Why do you want to use softfp?

Best regards

Werner

@tripathyAbhijit
Author

@wernsaar, sorry for the late reply. The problem is that my toolchain has been configured with softfp.
Here are some of my toolchain's configure options:
--with-fpu=vfpv3 --with-cpu=cortex-a15.cortex-a7 --with-tune=cortex-a15.cortex-a7 --with-float=softfp --disable-libatomic --enable-libgomp --enable-poison-system-directories --enable-long-long --enable-threads --enable-languages=c,c++,fortran --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-pkgversion=VDLinux.v7a15a7.GA1.2014-04-25 --with-gnu-as --with-gnu-ld --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-

As you can see, only softfp has been enabled for the toolchain. There is no stubs-hard.h.
I made a symlink stubs-hard.h -> stubs-soft.h and recompiled with -mfloat-abi=hard. However, OpenBLAS no longer compiles.

Is there any workaround, or do I have to rebuild my toolchain with hard-float support?
Please let me know if I am doing something horribly wrong. Looking forward to hearing from you.
Thanks in advance.
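
Editor's note: the default float ABI of a GCC toolchain can be read from the "Configured with:" line that the compiler prints for `-v`. The following is a minimal Python sketch, not part of the original thread; the helper name `configured_defaults` is hypothetical, and the sample string is copied from the options quoted above with the truncated tail left out.

```python
def configured_defaults(configure_line):
    """Collect the --with-KEY=VALUE defaults from a GCC configure line."""
    defaults = {}
    for token in configure_line.split():
        if token.startswith("--with-") and "=" in token:
            key, value = token[len("--with-"):].split("=", 1)
            defaults[key] = value
    return defaults


# Options copied from the post above (truncated tail omitted).
line = ("--with-fpu=vfpv3 --with-cpu=cortex-a15.cortex-a7 "
        "--with-tune=cortex-a15.cortex-a7 --with-float=softfp "
        "--disable-libatomic --enable-libgomp")

defaults = configured_defaults(line)
print(defaults.get("float"))  # softfp
```

A value of `softfp` or `soft` here means the toolchain's own libraries were most likely built for that ABI too, which is consistent with the missing stubs-hard.h.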

@wernsaar
Contributor

wernsaar commented May 5, 2015

Hi,

I have published binaries for ARM on SourceForge.
Please download
http://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-armv7a.tar.gz,
extract the file, and try to run a benchmark, for example:
cd benchmark
./dgemm.goto 1024 1024 1

This is a test to check whether the libraries for -mfloat-abi=hard are
installed.

Best regards
werner
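
Editor's note: another quick check, assuming a glibc-based Linux image, is to look for the dynamic loader that hard-float binaries request. The sketch below is not from the thread; the loader paths are the conventional glibc names on 32-bit ARM and may differ on other C libraries or distributions, and the function name `installed_float_abis` is hypothetical.

```python
import os

# Conventional glibc loader paths on 32-bit ARM Linux (an assumption;
# other C libraries or distributions may use different names).
LOADERS = {
    "hard": "lib/ld-linux-armhf.so.3",
    "soft/softfp": "lib/ld-linux.so.3",
}


def installed_float_abis(root="/"):
    """Return the float ABIs whose dynamic loader exists under root."""
    return sorted(abi for abi, rel in LOADERS.items()
                  if os.path.exists(os.path.join(root, rel)))


if __name__ == "__main__":
    print(installed_float_abis())
```

If "hard" is missing from the result, a prebuilt hard-float executable cannot be started on that image, regardless of what the CPU supports.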

@tripathyAbhijit
Author

@wernsaar, I tried that on my target A15 board. You were right: sadly, it seems the libraries for -mfloat-abi=hard are not installed. When I run ./dgemm.goto 1024 1024 1 in my shell, it reports ./dgemm.goto: file not found.
I have one more question: is there a flag in the Makefile that makes the build generate platform-independent C code?
For example, after a successful build, DGEMM would use dgemm_kernel_4x2_vfp.c, dgemm_kernel_4x4_vfpv3.c, etc. in place of the corresponding .S files. I am aware that this will degrade performance, but I still want to use that solution, as I need the OpenBLAS library for matrix operations.

Looking forward to hearing from you. Thanks in advance!
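
Editor's note: for orientation, the routine at the top of the backtrace, cblas_saxpy, computes y := alpha*x + y over n elements with strides incx and incy. A portable, unoptimized version is a single strided loop. The Python sketch below only illustrates those semantics; it is not OpenBLAS code, and it handles positive strides only, which is a simplification.

```python
def saxpy(n, alpha, x, incx, y, incy):
    """Reference y := alpha * x + y over n elements (positive strides only)."""
    if n <= 0 or incx <= 0 or incy <= 0:
        return y  # nothing to do in this simplified sketch
    for i in range(n):
        y[i * incy] += alpha * x[i * incx]
    return y


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
print(saxpy(3, 2.0, x, 1, y, 1))  # [12.0, 24.0, 36.0]
```

The scalar alpha is the floating-point argument here, so it is the value most directly affected by how the calling convention passes floats between C code and an assembly kernel.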

@wernsaar
Contributor

wernsaar commented May 5, 2015

Hi,

I don't have a platform that supports softfp,
but you can try it.

Please edit Makefile.rule and set/edit two lines:
TARGET = ARMV5
USE_OPENMP = 1

Now edit Makefile.arm.

Replace:

ifeq ($(CORE), ARMV5)
CCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=hard -march=armv6
FCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=hard -march=armv6
endif

with:

ifeq ($(CORE), ARMV5)
CCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=softfp -march=armv6
FCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=softfp -march=armv6
endif

Then simply type make.

Best regards

Werner
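
Editor's note: the Makefile.arm change described above can also be scripted. The following Python sketch is not from the thread; the function name `switch_to_softfp` is hypothetical, and it assumes the block is written exactly as quoted, with no nested conditionals inside it. It rewrites only the flags inside the matching `ifeq ($(CORE), ARMV5)` block.

```python
def switch_to_softfp(makefile_text, core="ARMV5"):
    """Replace -mfloat-abi=hard with -mfloat-abi=softfp, but only inside
    the 'ifeq ($(CORE), <core>)' ... 'endif' block."""
    out, inside = [], False
    for line in makefile_text.splitlines():
        stripped = line.strip()
        if stripped == f"ifeq ($(CORE), {core})":
            inside = True
        elif inside and stripped == "endif":
            inside = False
        elif inside:
            line = line.replace("-mfloat-abi=hard", "-mfloat-abi=softfp")
        out.append(line)
    return "\n".join(out) + "\n"


# The block quoted in the reply above.
sample = """ifeq ($(CORE), ARMV5)
CCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=hard -march=armv6
FCOMMON_OPT += -marm -mfpu=vfp -mfloat-abi=hard -march=armv6
endif
"""

print(switch_to_softfp(sample), end="")
```

To apply it, read Makefile.arm into a string, pass it through the function, and write the result back after keeping a backup copy. Blocks for other cores are left untouched.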

@tripathyAbhijit
Author

@wernsaar, thanks a lot! 👍 It worked. I was just wondering: would it be wise to use NEON instead of VFP? Any suggestions?

@wernsaar
Contributor

wernsaar commented May 7, 2015

Hi,

On ARMv7 and later processors, NEON is not faster than VFP, and it is deprecated.
With NEON you have fewer instructions but the same number of cycles, and
NEON is not IEEE compliant.
On ARM64 (AArch64), NEON is replaced by ASIMD, which gives very good
performance.

If you need high performance on your ARM platform, I would recommend
replacing the OS image with one that supports -mfloat-abi=hard.
Fedora and Ubuntu are good choices.

Best regards
Werner

@wernsaar
Contributor

@tripathyAbhijit, can I close this issue?

Best regards
Werner

@wernsaar
Contributor

Closed, because it worked
