About OpenBLAS #27
Hi sh1r0:
How did you get it to work? Did you cross-compile the OpenBLAS library with hard float support?
AFAIK, Eigen can simply be used as a header-only library, and it is quite competitive with other BLAS-like libraries (refer to the benchmark, and note that OpenBLAS is based on GotoBLAS). I'm not going to say that Eigen is the best choice in all cases, but it's a simple and great one, at least in my case.
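For illustration only (not code from this thread), a minimal standalone sketch of what "header-only" means in practice: nothing to link, only the Eigen headers on the include path (the file name and include path below are hypothetical).

// Minimal Eigen usage sketch: compiles with just the Eigen headers available,
// e.g. g++ -I/path/to/eigen3 eigen_demo.cpp
#include <Eigen/Dense>
#include <iostream>

int main() {
    Eigen::MatrixXf A = Eigen::MatrixXf::Random(256, 256);
    Eigen::MatrixXf B = Eigen::MatrixXf::Random(256, 256);
    Eigen::MatrixXf C = A * B;  // single-precision matrix product, computed by Eigen itself
    std::cout << "C(0,0) = " << C(0, 0) << std::endl;
    return 0;
}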
There is a specific OpenBLAS branch for "deep learning" at https://github.com/xianyi/OpenBLAS/tree/optimized_for_deeplearning?files=1
I changed the flag "-mfloat-abi=hard" to "softfp" (an error occurred when OpenBLAS was cross-compiled with hard float while Caffe was built with softfp). @sh1r0 I tried the outdated pre-built one and https://github.com/xianyi/OpenBLAS/tree/optimized_for_deeplearning?files=1
To use the pre-built OpenBLAS:
@@ -19,7 +19,7 @@ OPENCV_ROOT=${ANDROID_LIB_ROOT}/opencv/sdk/native/jni
PROTOBUF_ROOT=${ANDROID_LIB_ROOT}/protobuf
GFLAGS_HOME=${ANDROID_LIB_ROOT}/gflags
BOOST_HOME=${ANDROID_LIB_ROOT}/boost_1.56.0
-export OpenBLAS_HOME=${ANDROID_LIB_ROOT}/openblas
+export OpenBLAS_HOME=${ANDROID_LIB_ROOT}/openblas-android
export EIGEN_HOME=${ANDROID_LIB_ROOT}/eigen3
rm -rf "${BUILD_DIR}"
@@ -40,7 +40,7 @@ cmake -DCMAKE_TOOLCHAIN_FILE="${WD}/android-cmake/android.toolchain.cmake" \
-DUSE_LMDB=OFF \
-DUSE_LEVELDB=OFF \
-DUSE_HDF5=OFF \
- -DBLAS=eigen \
+ -DBLAS=open \
-DBOOST_ROOT="${BOOST_HOME}" \
-DGFLAGS_INCLUDE_DIR="${GFLAGS_HOME}/include" \
-DGFLAGS_LIBRARY="${GFLAGS_HOME}/lib/libgflags.a" \
On the other hand, regarding the master or optimized_for_deeplearning branch of OpenBLAS, hard float support is required. And as I said, it works for native executables but not for JNI libs. If you want to build this project with hard float support, you can simply set the flag in the shell script.
Thank you very much, @sh1r0. With your help, it worked with OpenBLAS-0.2.15.tar.gz once I had compiled all dependencies with hard float support. But it seemed that using OpenBLAS is faster than Eigen in the forward pass of the Caffe model (400-800 ms faster). I thought it might be because my Eigen version was 3.2.5 and not the latest, while OpenBLAS was the latest.
I used the latest version of Eigen (3.2.7), but got the same result... I wonder whether some flag (like "neon", etc.) needs to be set for Eigen when compiling Caffe with Eigen.
Hi @wuxuewu , good to know that. Do you mean that you have succeeded in getting JNI to work with hard float? Could you share your experience? Thanks.
@wuxuewu
Hi sh1r0: [timing output for OpenBLAS and Eigen omitted] Note: Caffe model, CPU mode, Eigen 3.2.7, OpenBLAS 0.2.15.
Hi @wuxuewu , |
Hi sh1r0, [timing comparison table of OpenBLAS vs. Eigen on phone A and phone B omitted] phone A: AArch64, Android 6.0, 8 cores.
I count the time with the following change in caffe_mobile.cpp, because I found that when predicting on phone A the function "clock()" was not precise: the log output was "Prediction time: 3900ms" while I saw the app return results in less than one second. So I used the following way to count the time. (The log is printed and the time can be read in Eclipse's logcat window.)
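The actual diff is not quoted above. As a hedged sketch of the idea (not the author's real change; the function name and log tag are hypothetical): clock() reports CPU time accumulated over all threads, so with multi-threading it can exceed wall-clock time, whereas a steady_clock measurement logged to logcat reflects real elapsed time.

// Hypothetical timing sketch for caffe_mobile.cpp: measure wall-clock time
// instead of clock() and print it to logcat so it is visible in Eclipse.
#include <android/log.h>
#include <chrono>

void timedPrediction() {
    auto t0 = std::chrono::steady_clock::now();
    // ... run the forward pass of the network here ...
    auto t1 = std::chrono::steady_clock::now();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    __android_log_print(ANDROID_LOG_INFO, "caffe_mobile",
                        "Prediction time: %lld ms", static_cast<long long>(ms));
}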
Hi @wuxuewu , it seems that your prediction results are correct? I mean, for example,
Hi @sh1r0 , the script for building OpenBLAS is below:
if [ -z "$NDK_ROOT" ] && [ "$#" -eq 0 ]; then
#export OPENBLAS_NUM_THREADS=1
cd OpenBLAS
make clean
rm -rf "$INSTALL_DIR/openblas"
I used "caffe/examples/images/cat.jpg" to predict, but I did not focus on the result of the prediction. I modified the last layer of the caffe model to have only 4 outputs, but I did not change synset_words.txt, which still has 1000 classes. Does that matter?
@wuxuewu BTW, what NDK version do you use?
Hi @sh1r0 , my NDK version is
And the time it took on phone B was almost the same as above (phone B: ARMv7 rev 1, Android 4.4.2, 4 cores). I just tested it on phone B.
Hi @wuxuewu , |
I just got another phone to test; the results were (unsurprisingly?) incorrect, too. Perhaps the device is not the problem. My tests follow this (Note: the attached image is my prediction result of
I think maybe the key to the question is the caffemodel. You could try another caffemodel... I use a caffemodel downloaded from
@wuxuewu ,
cd caffe
./scripts/download_model_binary.py models/bvlc_reference_caffenet
I downloaded Caffe on Dec. 22. And the Caffe zip name is
Why do you need to download Caffe?
Sorry, I don't quite get the idea. So, if possible, let me know the results of your build with the latest master branch.
Hi @wuxuewu , |
Hi @wuxuewu , I cannot figure out why multi-threading does not work on my devices. Both of my devices are quad-core. It's really a pity that the computation power is not fully utilized.
@sh1r0 Is the issue related to the fact that "The JNI interface pointer (JNIEnv *) is only valid in the current thread"? Have you tested with OpenMP flags? See https://github.com/xianyi/OpenBLAS/wiki/faq#multi-threaded
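For reference, a generic sketch (not code from this project) of what that JNI rule implies: a native worker thread has to attach itself to the VM to obtain its own JNIEnv before making any JNI calls.

// Generic JNI sketch: JNIEnv* is per-thread, so a worker thread must attach
// to the JavaVM to get a valid env, and detach before it exits.
#include <jni.h>

void workerThread(JavaVM* vm) {
    JNIEnv* env = nullptr;
    if (vm->AttachCurrentThread(&env, nullptr) != JNI_OK) {
        return;  // cannot make JNI calls on this thread
    }
    // ... JNI calls through env ...
    vm->DetachCurrentThread();
}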
@bhack According to the reports above from @wuxuewu , I think NUM_THREADS with a value greater than 1 works for him. However, some people mentioned in OpenMathLib/OpenBLAS#363 that OpenBLAS for Android works only single-threaded (?).
If the native code in Caffe called through JNI uses threads, OpenBLAS needs to be parallelized with OpenMP.
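As a rough illustration of that point (a sketch assuming an OpenBLAS build that exports openblas_set_num_threads, not code from this project), the BLAS thread count can also be pinned at runtime from the native side to check how it interacts with threads created on the JNI side:

// Sketch: cap OpenBLAS worker threads before any BLAS call, e.g. to compare
// single-threaded vs. multi-threaded prediction times on the device.
// (Equivalently, the OPENBLAS_NUM_THREADS environment variable can be set
// before the process starts.)
#include <cblas.h>   // OpenBLAS CBLAS header

extern "C" void openblas_set_num_threads(int);  // exported by OpenBLAS

void configureBlasThreads(int num_threads) {
    openblas_set_num_threads(num_threads);  // try 1 vs. the device core count
}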
@bhack Thanks for your information. I just updated the master branch to support OpenMP.
@sh1r0 As a next step, CUDA support on Android Tegra K1 and X1 could be very useful.
@bhack
Great work! I also want to play with Caffe on Android :)
Hi sh1r0:
I'm very interested in your project. This project is wonderful. It works very well with Eigen, but it seems not to work with OpenBLAS. I ran it on Android, but it crashed in the function "cblas_sgemm".
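As a debugging aid (a sketch under the assumption that the cross-compiled libopenblas is linked; it is not part of this project), a minimal standalone cblas_sgemm call can help tell whether the crash is in the OpenBLAS build itself or in how Caffe invokes it:

// Minimal cblas_sgemm smoke test: C = A * B with all-ones matrices, so every
// entry of C should equal n. If this crashes on the device, the OpenBLAS build
// is the problem rather than Caffe.
#include <cblas.h>
#include <cstdio>
#include <vector>

int main() {
    const int n = 64;
    std::vector<float> A(n * n, 1.0f), B(n * n, 1.0f), C(n * n, 0.0f);
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n,
                1.0f, A.data(), n,
                B.data(), n,
                0.0f, C.data(), n);
    std::printf("C[0] = %.1f (expected %d)\n", C[0], n);
    return 0;
}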