
Performance degradation from 2.8.1 to 2.9.2 #2951

Closed
caiotoledo-lunasystems opened this issue Jul 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@caiotoledo-lunasystems

Platform (include the target platform as well if cross-compiling):

Processor: SoC QCM2290.

GitHub Version:

Version tags:

Compiling Method:

cmake \
  -DCMAKE_INSTALL_PREFIX=${SCRIPTPATH}/.install_android/ \
  -DCMAKE_TOOLCHAIN_FILE=~/Android/Sdk/ndk/25.2.9519653/build/cmake/android.toolchain.cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DANDROID_ABI="arm64-v8a" \
  -DANDROID_STL=c++_shared \
  -DMNN_USE_LOGCAT=ON \
  -DMNN_ARM82=ON \
  -DMNN_SUPPORT_BF16=ON \
  -DMNN_OPENCL=ON \
  -DMNN_VULKAN=ON \
  -DMNN_BUILD_OPENCV=ON \
  -DMNN_IMGCODECS=ON \
  -DMNN_JNI=ON \
  -DANDROID_NATIVE_API_LEVEL=android-21 \
  -DMNN_BUILD_FOR_ANDROID_COMMAND=true \
  -DNATIVE_LIBRARY_OUTPUT=. -DNATIVE_INCLUDE_OUTPUT=. \
  -DMNN_BUILD_TEST=ON \
  -DMNN_BUILD_CONVERTER=ON \
  -DMNN_BUILD_BENCHMARK=ON \
  ../


The performance of version 2.9.2 is worse than that of 2.8.1, as measured with the benchmark tool:

  • Version 2.9.2
bengal_2w:/data/local/tmp/mnn-2.9.2-lib-arm64 # LD_LIBRARY_PATH=./:../cpp_shared/arm64-v8a/ ./benchmark.out ../ai-models/ 10 3 3
MNN benchmark
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn      max =   87.178 ms  min =   84.706 ms  avg =   85.698 ms
  • Version 2.8.1:
bengal_2w:/data/local/tmp/MNN # LD_LIBRARY_PATH=./lib/ bin/benchmark.out models/ 10 3 3
MNN benchmark
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn         max =   65.977 ms  min =   62.635 ms  avg =   63.617 ms

The model used for this test is YOLOv8 Nano from Ultralytics, with an image input size of 160x160.
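To put the two runs above in perspective, the averages imply roughly a 35% slowdown. A minimal sketch that computes the regression from the reported averages (numbers copied from the benchmark output above):

```python
def regression_pct(old_avg_ms: float, new_avg_ms: float) -> float:
    """Percentage slowdown of the new average relative to the old one."""
    return (new_avg_ms - old_avg_ms) / old_avg_ms * 100.0

# Average latencies (ms) reported by benchmark.out above
v281_avg = 63.617  # version 2.8.1
v292_avg = 85.698  # version 2.9.2

print(f"2.9.2 vs 2.8.1: {regression_pct(v281_avg, v292_avg):.1f}% slower")
```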

@jxt1234
Collaborator

jxt1234 commented Jul 22, 2024

Received; we will look into it.

@jxt1234 jxt1234 added the bug Something isn't working label Jul 22, 2024
@jxt1234
Collaborator

jxt1234 commented Sep 6, 2024

The fix has been made in our internal code and will be synced out soon.

@jxt1234
Collaborator

jxt1234 commented Sep 12, 2024

This is fixed in 2.9.5; please update and test. Our internal verification shows it is faster than version 2.8.1.

@jxt1234 jxt1234 closed this as completed Sep 12, 2024
@caiotoledo-lunasystems
Author

@jxt1234,

I tested the same model again on the QCM2290 board, but I'm still getting worse numbers than with version 2.8.1:

bengal_2w:/data/local/tmp/mnn-2.9.5 # LD_LIBRARY_PATH=./lib/ ./bin/benchmark.out ../mnn-models/ 10 3 3
MNN benchmark                                                                                  
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn         max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms

However, the performance is indeed better than in 2.9.2.

Using our internal applications, I see roughly the same performance numbers as with your benchmark binary.
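When comparing runs across versions, it helps to extract the numbers mechanically rather than by eye. A small sketch, assuming the `[ - ] model  max = … ms  min = … ms  avg = … ms` line format that benchmark.out prints above:

```python
import re

# Matches benchmark.out result lines, e.g.
# "[ - ] yolov8n_160.mnn   max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms"
LINE_RE = re.compile(
    r"\[ - \]\s+(?P<model>\S+)"
    r"\s+max\s*=\s*(?P<max>[\d.]+)\s*ms"
    r"\s+min\s*=\s*(?P<min>[\d.]+)\s*ms"
    r"\s+avg\s*=\s*(?P<avg>[\d.]+)\s*ms"
)

def parse_benchmark(text: str) -> dict:
    """Return {model: {'max': ms, 'min': ms, 'avg': ms}} from benchmark.out output."""
    return {
        m.group("model"): {k: float(m.group(k)) for k in ("max", "min", "avg")}
        for m in LINE_RE.finditer(text)
    }

sample = "[ - ] yolov8n_160.mnn   max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms"
print(parse_benchmark(sample))
```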
