getConfidenceScore returns NaN #57

Closed
kingjason opened this issue Aug 5, 2016 · 14 comments

Comments

@kingjason

We are running caffe-android-lib on a wide variety of Android phones, and we are seeing getConfidenceScore periodically return NaN. We've traced it back far enough to know that the float vector returned by Forward has NaN values. After it returns NaN, it continues to return NaN regardless of the image we send to Caffe until the Android app is restarted. It occurs much more often on certain phones.

I don't think it is an issue with our Caffe model, as we have tried several different ones and all show the same behavior. No image consistently scores NaN: if an image scores NaN, we re-score it after restarting the app, and it then scores correctly.

@sh1r0
Owner

sh1r0 commented Aug 13, 2016

Did you try running your model on desktop?

@kingjason
Author

Yes, we've done a lot of testing on desktop and haven't seen similar issues. We've mainly used OpenBLAS on desktop and Eigen on Android, so that could be a factor.

@sh1r0
Owner

sh1r0 commented Aug 14, 2016

I would recommend using OpenBLAS on both platforms. I'm actually planning to remove Eigen support from my fork of Caffe.

@kingjason
Author

I'll give it another try. The last time I tried it, OpenBLAS was much slower than Eigen. With the model I was using, Eigen took less than half a second and OpenBLAS took around 10 seconds to predict a single image. Any tips on how to speed that up?

@sh1r0
Owner

sh1r0 commented Aug 15, 2016

Hmm... That's a little bit strange. According to #27, there should not be such a big performance difference between the two libraries. Could you benchmark CaffeNet on your devices with both libraries?
Thanks.

@woodthom2

woodthom2 commented Aug 15, 2016

Hi,
I have had the same problem (inconsistent output from the network), so I was advised to switch from Eigen to OpenBLAS.

I ran the following commands before building:

export USE_OPENBLAS=1
export ANDROID_ABI="armeabi"

I evaluated GoogLeNet on an HTC One with 4 threads. I found that OpenBLAS is about 4 times slower than Eigen.

-----

HTC One with Eigen:
20 iterations:
Time taken: 197025.0
=> 9.8 sec to process 1 image

----

HTC One with OpenBLAS:
20 iterations:
Time taken: 821187.0
=> 41 sec to process 1 image
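
The per-image figures above follow from dividing the total time by the 20 iterations. A minimal sketch of that arithmetic, assuming the "Time taken" values are in milliseconds (the unit is not stated, but it is the one consistent with the quoted 9.8 sec and 41 sec); the function names are hypothetical:

```shell
#!/bin/sh
# per_image_seconds TOTAL N
# Average seconds per image, ASSUMING TOTAL is in milliseconds.
per_image_seconds() {
    awk -v total="$1" -v n="$2" 'BEGIN { printf "%.2f\n", total / n / 1000 }'
}

# slowdown SLOW FAST
# Ratio between two total times measured over the same number of iterations.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

per_image_seconds 197025.0 20    # Eigen
per_image_seconds 821187.0 20    # OpenBLAS
slowdown 821187.0 197025.0       # OpenBLAS relative to Eigen
```

This prints 9.85, 41.06 and 4.2, which matches the "about 4 times slower" estimate.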

@sh1r0
Owner

sh1r0 commented Aug 16, 2016

@woodthom2 With ANDROID_ABI="armeabi", OpenBLAS is built single-threaded, and that ABI actually targets ARMv5.
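
One way to catch this combination before a long build is a small guard in the shell that sets the variables. This is only a sketch: `check_openblas_abi` is a hypothetical helper, not part of caffe-android-lib, and it assumes the build reads `USE_OPENBLAS` and `ANDROID_ABI` from the environment as in the commands quoted earlier.

```shell
#!/bin/sh
# check_openblas_abi USE_OPENBLAS ANDROID_ABI
# Hypothetical helper: returns 1 (and warns on stderr) when OpenBLAS is
# requested together with the plain "armeabi" ABI, which per this thread
# gives a single-threaded OpenBLAS build aimed at ARMv5.
check_openblas_abi() {
    if [ "$1" = "1" ] && [ "$2" = "armeabi" ]; then
        echo "warning: OpenBLAS with ANDROID_ABI=armeabi is single-threaded (ARMv5)" >&2
        return 1
    fi
    return 0
}

# Check the current environment without aborting the calling script.
check_openblas_abi "${USE_OPENBLAS:-0}" "${ANDROID_ABI:-}" || true
```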

@woodthom2

woodthom2 commented Aug 16, 2016

Thank you.
When I switch to

export ANDROID_ABI="armeabi-v7a-hard-softfp with NEON"

I get an error when building Boost:

sources/cxx-stl/gnu-libstdc++/4.9/include/cstddef:44:28: fatal error: bits/c++config.h: No such file or directory
 #include <bits/c++config.h>
                            ^
compilation terminated.

Which ANDROID_ABI value do you recommend?

@sh1r0
Owner

sh1r0 commented Aug 16, 2016

I think armeabi-v7a-hard-softfp with NEON is the right one for your case. Sorry, I have not seen that error before. Please check your build environment (NDK, CMake, etc.).

sh1r0 closed this as completed Aug 29, 2016

@hpsaturn

hpsaturn commented Oct 22, 2016

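I get the same error when I change from "arm64-v8a" to "armeabi-v7a-hard-softfp with NEON". With the arm64-v8a setup, compiling, linking, and execution all work fine.

To replicate the error I made a Dockerfile (shown with the working "arm64-v8a" value; change ANDROID_ABI to "armeabi-v7a-hard-softfp with NEON" to trigger the error):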

FROM ubuntu:15.10

ENV OPENCV_VERSION=master \
    FFMPEG_VERSION=3.0.1 \
    ANDROID_NDK_HOME=/opt/android-ndk \
    NDK_ROOT=/opt/android-ndk \
    ANDROID_ABI="arm64-v8a" 

RUN apt-get update -qq && apt-get install --yes \
  build-essential \
  cmake \
  ca-certificates \
  curl \
  wget \
  dh-autoreconf \
  git \
  libass-dev \
  libjpeg-dev \
  libtiff5-dev \
  libjasper-dev \
  libpng12-dev \
  libavcodec-dev \
  libavformat-dev \
  libopenjpeg-dev \
  libswscale-dev \
  libv4l-dev \
  libatlas-base-dev \
  libfreetype6-dev \
  librtmp-dev \
  libxvidcore-dev \
  libx264-dev \
  libtbb-dev \
  libavutil-dev \
  libavdevice-dev \
  libavfilter-dev \
  libavresample-dev \
  libpostproc-dev \
  libopenexr-dev \
  pkg-config \
  yasm \
  unzip \
  && apt-get clean && rm -rf /var/tmp/* /var/lib/apt/lists/* /tmp/*

# ----------------------------------------- Android NDK ----------------------------------------
# download
RUN mkdir /opt/android-ndk-tmp \
    && cd /opt/android-ndk-tmp \
# get the latest NDK version
    && wget -q http://dl.google.com/android/repository/android-ndk-r12-linux-x86_64.zip \
    && cd /opt/android-ndk-tmp \ 
    && unzip -q android-ndk-r12-linux-x86_64.zip \
# move to its final location
    && cd /opt/android-ndk-tmp \ 
    && mv ./android-ndk-r12 /opt/android-ndk \
# remove temp dir
    && rm -rf /opt/android-ndk-tmp

# -----------------------------------Tools and libraries -------------------------------------
RUN mkdir -p /opt/ \
    && git clone --recursive https://github.com/sh1r0/caffe-android-lib.git /opt/src/

# -----------------------------------Compile all libraries scripts ----------------------------------------
RUN cd /opt/src && ./build.sh

error output:

Scanning dependencies of target boost_atomic
[  0%] Building CXX object CMakeFiles/boost_atomic.dir/boost_1_56_0/libs/atomic/src/lockpool.cpp.o
In file included from /opt/src/boost/boost_1_56_0/libs/atomic/src/lockpool.cpp:15:0:
/opt/android-ndk/sources/cxx-stl/gnu-libstdc++/4.9/include/cstddef:44:28: fatal error: bits/c++config.h: No such file or directory
 #include <bits/c++config.h>
                            ^
compilation terminated.
CMakeFiles/boost_atomic.dir/build.make:54: recipe for target 'CMakeFiles/boost_atomic.dir/boost_1_56_0/libs/atomic/src/lockpool.cpp.o' failed
make[2]: *** [CMakeFiles/boost_atomic.dir/boost_1_56_0/libs/atomic/src/lockpool.cpp.o] Error 1

@sh1r0
Owner

sh1r0 commented Oct 22, 2016

@hpsaturn As I commented before, I think your problem is caused by CMake. Please upgrade CMake to at least v3.5.2. Thanks.

EDIT: Please use android-ndk-r11c for armeabi-v7a-hard-softfp with NEON, because support for the armeabi-v7a-hard ABI was removed in android-ndk-r12 (https://android.googlesource.com/platform/ndk/+/ndk-r12-release/docs/HardFloatAbi.md).
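
Both requirements (CMake of at least v3.5.2, and an NDK older than r12 for the hard-float ABI) can be checked before running the build. The following is a rough sketch, not part of the project's build scripts. It assumes GNU `sort -V` is available and that the NDK ships a `source.properties` file containing a line of the form `Pkg.Revision = <major>.<minor>.<build>`. The helper names and the `/opt/android-ndk` fallback path (taken from the Dockerfile above) are illustrative.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeeds when HAVE >= WANT.
# Relies on GNU `sort -V` (version sort).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# ndk_major FILE: prints the major revision found in an NDK
# source.properties file (ASSUMED format: "Pkg.Revision = 11.x.y").
ndk_major() {
    sed -n 's/^Pkg\.Revision *= *\([0-9][0-9]*\).*/\1/p' "$1"
}

# CMake check: `cmake --version` starts with "cmake version X.Y.Z".
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | sed -n 's/^cmake version \([0-9.]*\).*/\1/p')
    version_at_least "$cmake_version" 3.5.2 \
        || echo "warning: cmake $cmake_version is older than 3.5.2" >&2
fi

# NDK check: warn when the installed NDK is r12 or newer.
props="${NDK_ROOT:-/opt/android-ndk}/source.properties"
if [ -f "$props" ]; then
    major=$(ndk_major "$props")
    if [ -n "$major" ] && [ "$major" -ge 12 ]; then
        echo "warning: NDK r$major has no armeabi-v7a-hard support; use r11c" >&2
    fi
fi
```

The script only prints warnings, so it can be run at the top of a build wrapper without changing the build itself.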

@hpsaturn

Yes, when I changed the NDK to r11c, the "armeabi-v7a-hard-softfp with NEON" build works! The final Dockerfile for this architecture:

FROM ubuntu:15.10

ENV OPENCV_VERSION=master \
    FFMPEG_VERSION=3.0.1 \
    ANDROID_NDK_HOME=/opt/android-ndk \
    NDK_ROOT=/opt/android-ndk \
    ANDROID_ABI="armeabi-v7a-hard-softfp with NEON" 

RUN apt-get update -qq && apt-get install --yes \
  build-essential \
  cmake \
  ca-certificates \
  curl \
  wget \
  dh-autoreconf \
  git \
  libass-dev \
  libjpeg-dev \
  libtiff5-dev \
  libjasper-dev \
  libpng12-dev \
  libavcodec-dev \
  libavformat-dev \
  libopenjpeg-dev \
  libswscale-dev \
  libv4l-dev \
  libatlas-base-dev \
  libfreetype6-dev \
  librtmp-dev \
  libxvidcore-dev \
  libx264-dev \
  libtbb-dev \
  libavutil-dev \
  libavdevice-dev \
  libavfilter-dev \
  libavresample-dev \
  libpostproc-dev \
  libopenexr-dev \
  pkg-config \
  yasm \
  unzip \
  && apt-get clean && rm -rf /var/tmp/* /var/lib/apt/lists/* /tmp/*

# ----------------------------------------- Android NDK ----------------------------------------
# download
RUN mkdir /opt/android-ndk-tmp \
    && cd /opt/android-ndk-tmp \
# get NDK r11c (r12 removed armeabi-v7a-hard support)
    && wget -q https://dl.google.com/android/repository/android-ndk-r11c-linux-x86_64.zip \
    && cd /opt/android-ndk-tmp \ 
    && unzip -q android-ndk-r11c-linux-x86_64.zip \
# move to its final location
    && cd /opt/android-ndk-tmp \ 
    && mv ./android-ndk-r11c /opt/android-ndk \
# remove temp dir
    && rm -rf /opt/android-ndk-tmp

# -----------------------------------Tools and libraries -------------------------------------
RUN mkdir -p /opt/ \
    && git clone --recursive https://github.com/sh1r0/caffe-android-lib.git /opt/src/ 
# -----------------------------------RUN all libraries scripts ----------------------------------------
RUN cd /opt/src && ./build.sh

@mehatab-shaikh

@kingjason Have you found a solution?
I have the same problem.

@mehatab-shaikh

@sh1r0 getConfidenceScore periodically returns NaN.
