ARMV7 (with hard float flag) did not run with correct result #1145

Open
gangm opened this issue Apr 7, 2017 · 15 comments

@gangm

gangm commented Apr 7, 2017

Hello,

Recently we have been using OpenBLAS to set up a Caffe environment on our ARMv7 platform, but we ran into a problem when running OpenBLAS built with the hard-float flag.
We compiled OpenBLAS with the following command:
       make CC=arm-linux-gnueabihf-gcc FC=arm-linux-gnueabihf-gfortran HOSTCC=gcc TARGET=ARMV7 libs

Then we ran a simple test using the following code:
#include <cblas.h>
#include <cstdlib>
#include <iostream>

using namespace std;

int main()
{
    const enum CBLAS_ORDER Order=CblasRowMajor;
    const enum CBLAS_TRANSPOSE TransA=CblasNoTrans;
    const enum CBLAS_TRANSPOSE TransB=CblasNoTrans;
    const int M=4;   // rows of A and C
    const int N=2;   // columns of B and C
    const int K=3;   // columns of A, rows of B
    const float alpha=1;
    const float beta=0;
    const int lda=K; // row stride of A (row-major)
    const int ldb=N; // row stride of B
    const int ldc=N; // row stride of C
    const float A[M*K]={1.123434543534,2.33234241365,3.4534545454,4.45435435345,5.454554545,6.45452345345,7.454545465,8.454545245,9.2345245625,8.45234545,7.423564545,6.425452454};
    const float B[K*N]={5.4523452345,4.34526547,3.462354544,2.52436254,1.262565262,0.265364564565};
    float C[M*N];

    // C = alpha * A * B + beta * C
    cblas_sgemm(Order, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);

    for (int i = 0; i < M; i++)
    {
        for (int j = 0; j < N; j++)
        {
            // C is row-major with row stride ldc (= N), not M
            cout << C[i*ldc + j] << " ";
        }
        cout << endl;
    }

    return EXIT_SUCCESS;
}

Command used to compile the test code:
arm-linux-gnueabihf-g++ -mfpu=vfpv3 -mfloat-abi=hard -o test testblas.cpp /usr/local/arm/openblas/lib/libopenblas_armv7p-r0.2.20.dev.a -I/usr/local/arm/boost/include/ -lpthread

But when we run the test code on our ARMv7 platform, we get a strange result:
1.4013e-45 1.4013e-45
1.4013e-45 1.4013e-45
1.4013e-45 1.4013e-45
1.4013e-45 1.4013e-45

This is not the correct result.

When we use the OpenBLAS library in our Caffe code, calls into the OpenBLAS API cause a core dump.

Can you help with this? Thank you very much.

@martin-frbg
Collaborator

Would your hardware also allow building with the 64-bit ARMv8 target for comparison? There was a similar report in #1088 where I suggested reverting a small change from a year ago; unfortunately it seems nobody tried.

@ashwinyes
Contributor

I tried your code on ARM32 QEMU (since I don't have an ARMv7 machine) with the latest OpenBLAS develop branch. The result is as follows.

18.561 11.6857 
81.5766 56.1848
1.12343 2.33234
5.45455 6.45452

I get the same result on ARMv8 and Intel as well.

So the issue I believe, is related to your ARMv7 setup, and not OpenBLAS.

@gangm
Author

gangm commented Apr 10, 2017

Hello,

1. Our hardware doesn't support 64-bit ARMv8, so we cannot build for it for comparison.

2. "So the issue I believe, is related to your ARMv7 setup, and not OpenBLAS."
What does "related to your ARMv7 setup" mean? Do you mean our hardware setup or our ARM cross-compile environment?

3. I have another question: which version (branch) should I use?
I am now trying the "arm_soft_fp_abi" branch with the compile command "make CC=arm-none-linux-gnueabi-gcc TARGET=ARMV7 NOFORTRAN=1 HOSTCC=gcc ARM_SOFTFP_ABI=1", and the result is correct.
(Hard-float mode does not work on this branch either.)
But when I tried the "master" branch with a similar compile command (make CC=arm-none-linux-gnueabi-gcc TARGET=ARMV7 NOFORTRAN=1 HOSTCC=gcc NO_LAPACK=1 ONLY_CBLAS=1, whether using hard, softfp, or soft mode), the result is strange (sometimes all zeros, sometimes values like 1.4013e-45, and so on).

@ashwinyes
Contributor

"So the issue I believe, is related to your ARMv7 setup, and not OpenBLAS."
What does "related to your ARMv7 setup" mean? Do you mean our hardware setup or our ARM cross-compile environment?

I meant your hardware setup. Could you please share the output of /proc/cpuinfo from your ARMv7 machine?

@gangm
Author

gangm commented Apr 10, 2017

Hello,

@ashwinyes, the cpuinfo is below:

Processor : ARMv7 Processor rev 10 (v7l)
processor : 0
BogoMIPS : 1988.29

processor : 1
BogoMIPS : 1988.29

processor : 2
BogoMIPS : 1988.29

processor : 3
BogoMIPS : 1988.29

Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

Hardware : Freescale i.MX 6Quad/DualLite/Solo Sabre-SD Board
Revision : 63015
Serial : 0c17a1d4e6b573b3

@ashwinyes
Contributor

@gangm Thanks for sharing the cpuinfo. I wanted to check that your processor actually supports vfpv3.

Another thing to check is that all libraries being used (including boost, caffe, pthread, etc.) conform to "-mfloat-abi=hard". You can use the steps described in http://stackoverflow.com/questions/20555594/how-can-i-know-if-an-arm-library-is-using-hardfp to check this.

OR

You can try building your standalone program without using any library other than OpenBLAS.

@ashwinyes
Contributor

Googling further, I found the following issues, which also look related to this one.

sh1r0/caffe-android-lib#27
sh1r0/caffe-android-lib#37
#777

@xianyi would be the right person to comment on the extent of softfp support in the latest OpenBLAS code.

@gangm
Author

gangm commented Apr 10, 2017

@ashwinyes Thanks for your reply.

I tried a standalone program that uses only the OpenBLAS library, and I can see that it supports vfpv3, as shown below:
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_VFP_args: VFP registers
Tag_CPU_unaligned_access: v6
Tag_DIV_use: Not allowed

The result is still strange:
1.34409e+38 1.34409e+38
1.34409e+38 1.34409e+38
1.34409e+38 1.34409e+38
1.34409e+38 1.34409e+38

@xianyi, do you know whether OpenBLAS supports ARMv7 in hard-float mode? I tried many branches and many methods, but none seem to work.
How much does performance improve in hard-float mode compared with softfp mode? (We can use softfp mode instead, but the performance is a little slow.)

@ashwinyes
Contributor

@gangm Can you give the "readelf" output for your pthread and boost libraries as well?

@gangm
Author

gangm commented Apr 10, 2017

@ashwinyes The readelf output for pthread is below:
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_optimization_goals: Aggressive Speed
Tag_DIV_use: Not allowed
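
The two attribute listings in this thread can be compared mechanically. In `readelf -A` output, the line `Tag_ABI_VFP_args: VFP registers` is generally taken to mark an object built for the hard-float calling convention; the first listing above contains it and the pthread listing does not. The sketch below only scans text for that line. `has_vfp_args_tag` is a hypothetical helper, and reading a missing tag as "not hard-float" is an assumption that should be checked against the toolchain in use.

```cpp
#include <sstream>
#include <string>

// Return true if the given `readelf -A` text contains the attribute line
// "Tag_ABI_VFP_args: VFP registers" (leading/trailing whitespace ignored).
bool has_vfp_args_tag(const std::string& readelf_output)
{
    std::istringstream lines(readelf_output);
    std::string line;
    while (std::getline(lines, line)) {
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue;  // blank line
        }
        const std::string::size_type last = line.find_last_not_of(" \t\r");
        if (line.substr(first, last - first + 1) == "Tag_ABI_VFP_args: VFP registers") {
            return true;
        }
    }
    return false;
}
```

Feeding it the saved readelf output of every library on the link line makes a mixed hard-float and softfp link easier to spot.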

@solrex

solrex commented Apr 28, 2017

@gangm I encountered the same problem. More info:

  1. Built everything with armeabi-v7a-hard with NEON and got wrong results, just as in your case.

  2. Built everything else with armeabi-v7a with NEON and changed the OpenBLAS build script:

sed -i -e 's/float-abi=hard/float-abi=softfp/g' Makefile.arm

Loading the Caffe model worked, but running Forward() crashed. I guess the problem is in OpenBLAS (the forward pass uses BLAS, loading does not).

Update

The branch https://github.com/xianyi/OpenBLAS/tree/arm_soft_fp_abi works for armeabi-v7a with NEON; ignore my point 2.

@martin-frbg
Copy link
Collaborator

Please note that the arm_soft_fp_abi branch is a work in progress, only a handful of functions have been modified for the softfp abi so far.

@scorpeeon

Hello!

I compiled OpenBLAS for Android and linked with the specified flags to avoid issues with hard float. I see the same issue, as well as other issues reported by others on armv7 with hard float on Android (some functions returning zero or other incorrect values, or not returning at all). They work fine on all other architectures I tried, including armv5, arm64 (armv8), and x86.
I did not open another issue because I found plenty of open ones that mention the same problem, including #777, #853, #894, and #1088.

Do we know anything about the cause of these issues?

I understand that adding soft-float support is in progress (as also mentioned in this thread), but that it has been done for only a handful of functions on another branch.
Are there other ways around this issue?
One workaround would be to use the armv5 libraries, which also work fine on armv7, but in my benchmarks they were around 50 times slower for operations such as multiplying large float or double matrices, which is pretty much expected.
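
For anyone wanting to reproduce such a comparison between builds, a rough timing harness is sketched below. It is not the benchmark behind the figure above; `seconds_per_call` is a made-up helper, and the callable passed in is expected to wrap whatever BLAS call is being measured.

```cpp
#include <chrono>

// Average wall-clock seconds per invocation of `work`, over `repeats` runs.
// One untimed warm-up call is made first so that one-off setup costs
// (page faults, thread start-up) do not dominate the measurement.
template <typename Work>
double seconds_per_call(Work work, int repeats)
{
    if (repeats <= 0) {
        return 0.0;
    }
    work();  // warm-up, not timed
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

Wrapping the same matrix multiplication in a lambda and timing it against the armv5 and armv7 libraries gives the slowdown ratio directly.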

@ctgushiwei

ctgushiwei commented Jun 20, 2017

@gangm @scorpeeon @martin-frbg I also use the "arm_soft_fp_abi" branch with the compile command "make TARGET=ARMV7 NOFORTRAN=1 HOSTCC=gcc ARM_SOFTFP_ABI=1", and the result is correct.

But when I use the hard-float flag, it compiles successfully, yet cblas_sgemm() does not work properly in testing: it fails with a segmentation fault. I think the cause is the assembly code, because when I use the C code it works fine.

@ping996

ping996 commented Aug 17, 2019

@ctgushiwei Do you have any ideas about my question: https://stackoverflow.com/questions/57534249/compile-the-flutter-engine-with-hard-float-type-library

Thanks for your help in advance!
