NCNN speed seems much slower than caffe2 on armv7l #81

Closed
duangenquan opened this issue Aug 4, 2017 · 8 comments

Comments

@duangenquan

I followed the sample and ran the sample code on the device described below. Inference takes about 2.2 s, while caffe2 takes 260 ms on the same device.

Is anything misconfigured in my tests?

I updated CMakeLists.txt as below:
find_package(OpenCV REQUIRED core highgui imgproc)
find_package(OpenMP REQUIRED)
find_package(Threads REQUIRED)

include_directories(/tank/workspace/stone/ncnn/build/install/include)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../src)
include_directories(${CMAKE_CURRENT_BINARY_DIR}/../src)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fopenmp -lpthread")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenmp -lpthread")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fopenmp -lpthread")
add_executable(squeezenet squeezenet.cpp)
target_link_libraries(squeezenet ncnn ${OpenCV_LIBS})

The device info is as follows:
Architecture: armv7l
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Model name: ARMv7 Processor rev 1 (v7l)
CPU max MHz: 1800.0000
CPU min MHz: 126.0000
Hypervisor vendor: (null)
Virtualization type: full

Thanks!

@duangenquan changed the title from "NCNN speed seems slower than caffe2 on armv7l" to "NCNN speed seems much slower than caffe2 on armv7l" on Aug 4, 2017
@saturosfz

In my experiments, ncnn running SqueezeNet with OpenMP enabled takes under 200 ms on a Nexus 5.

@duangenquan
Author

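Thanks for your reply. I also find it a bit strange, which is why I posted the ncnn configuration and the device specs above. Is it because the device is too weak?

The introduction says ncnn is optimized best for Android, then iOS, then Linux...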

@nihui
Member

nihui commented Aug 4, 2017

See #28.

@duangenquan
Author

duangenquan commented Aug 4, 2017

Following #28, I added "true" at line 37 of src/CMakeLists.txt, and the speed is now ~300 ms.

@BKZero

BKZero commented Aug 4, 2017

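My test results are similar to the original poster's. caffe is stable at 150 ms, and ncnn is slightly slower than that.
OpenMP is enabled, and I have tested on several Android devices. On some devices, changing the thread count makes a visible difference in speed and CPU usage; on most devices, no stable resource usage can be observed. I ran into this problem in other projects before and don't know why the OpenMP speedup fails to show up.
Also, even on phone models where the OpenMP threads start normally, the speedup is not significant: 130 ms at best, and very unstable. I recall a performance comparison chart on Zhihu showing multithreading reaching 70 ms, though I may be misremembering. I don't know what device or OS those results came from.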

@BKZero

BKZero commented Aug 4, 2017

But the one to two seconds you measured is way too extreme...

@duangenquan
Author

I switched to a different timer, and the speed is about 300 ms. That looks normal now, though still slightly slow.

@baby313

baby313 commented Sep 7, 2017

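@BKZero On Android, enabling parallelism triggers frequency throttling on most phones, which makes it too slow. On my device, a single core is actually more stable and faster. The published benchmarks were probably run on a platform without throttling, or on a dev board.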
