
Performance degradation from 2.8.1 to 2.9.2 #2951

Closed
caiotoledo-lunasystems opened this issue Jul 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@caiotoledo-lunasystems

Platform (include the target platform as well if cross-compiling):

Processor: SoC QCM2290.

GitHub Version:

Version tags:

Compiling Method:

cmake \
  -DCMAKE_INSTALL_PREFIX=${SCRIPTPATH}/.install_android/ \
  -DCMAKE_TOOLCHAIN_FILE=~/Android/Sdk/ndk/25.2.9519653/build/cmake/android.toolchain.cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DANDROID_ABI="arm64-v8a" \
  -DANDROID_STL=c++_shared \
  -DMNN_USE_LOGCAT=ON \
  -DMNN_ARM82=ON \
  -DMNN_SUPPORT_BF16=ON \
  -DMNN_OPENCL=ON \
  -DMNN_VULKAN=ON \
  -DMNN_BUILD_OPENCV=ON \
  -DMNN_IMGCODECS=ON \
  -DMNN_JNI=ON \
  -DANDROID_NATIVE_API_LEVEL=android-21 \
  -DMNN_BUILD_FOR_ANDROID_COMMAND=true \
  -DNATIVE_LIBRARY_OUTPUT=. -DNATIVE_INCLUDE_OUTPUT=. \
  -DMNN_BUILD_TEST=ON \
  -DMNN_BUILD_CONVERTER=ON \
  -DMNN_BUILD_BENCHMARK=ON \
  ../


The performance of version 2.9.2 is worse than that of 2.8.1, as measured with the benchmark tool:

  • Version 2.9.2
bengal_2w:/data/local/tmp/mnn-2.9.2-lib-arm64 # LD_LIBRARY_PATH=./:../cpp_shared/arm64-v8a/ ./benchmark.out ../ai-models/ 10 3 3
MNN benchmark
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn      max =   87.178 ms  min =   84.706 ms  avg =   85.698 ms
  • Version 2.8.1:
bengal_2w:/data/local/tmp/MNN # LD_LIBRARY_PATH=./lib/ bin/benchmark.out models/ 10 3 3
MNN benchmark
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn         max =   65.977 ms  min =   62.635 ms  avg =   63.617 ms

The model used for this test is YOLOv8 Nano from Ultralytics, with an image input size of 160x160.
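To put the two runs above in perspective, the averages imply roughly a 35% slowdown. A minimal sketch that computes the regression from the reported averages (numbers copied from the benchmark output above):

```python
def regression_pct(old_avg_ms: float, new_avg_ms: float) -> float:
    """Percentage slowdown of the new average relative to the old one."""
    return (new_avg_ms - old_avg_ms) / old_avg_ms * 100.0

# Average latencies (ms) reported by benchmark.out above
v281_avg = 63.617  # version 2.8.1
v292_avg = 85.698  # version 2.9.2

print(f"2.9.2 vs 2.8.1: {regression_pct(v281_avg, v292_avg):.1f}% slower")
```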

@jxt1234
Collaborator

jxt1234 commented Jul 22, 2024

Received; we will look into it.

@jxt1234 jxt1234 added the bug Something isn't working label Jul 22, 2024
@jxt1234
Collaborator

jxt1234 commented Sep 6, 2024

The fix has been made in our internal code and will be synced out soon.

@jxt1234
Collaborator

jxt1234 commented Sep 12, 2024

This is fixed in 2.9.5; please update and test. Our internal verification shows it is faster than version 2.8.1.

@jxt1234 jxt1234 closed this as completed Sep 12, 2024
@caiotoledo-lunasystems
Author

@jxt1234,

I tested the same model again on the QCM2290 board, but I'm still getting worse numbers than with version 2.8.1:

bengal_2w:/data/local/tmp/mnn-2.9.5 # LD_LIBRARY_PATH=./lib/ ./bin/benchmark.out ../mnn-models/ 10 3 3
MNN benchmark                                                                                  
Forward type: OpenCL thread=4 precision=2 sparsity=0 sparseBlockOC=1 testQuantizedModel=0
--------> Benchmarking... loop = 10, warmup = 3
[ - ] yolov8n_160.mnn         max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms

However, the performance is indeed better than in 2.9.2.

Using our internal applications, I see roughly the same performance numbers as with your benchmark binary.
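When comparing runs across versions, it helps to extract the numbers mechanically rather than by eye. A small sketch, assuming the `[ - ] model  max = … ms  min = … ms  avg = … ms` line format that benchmark.out prints above:

```python
import re

# Matches benchmark.out result lines, e.g.
# "[ - ] yolov8n_160.mnn   max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms"
LINE_RE = re.compile(
    r"\[ - \]\s+(?P<model>\S+)"
    r"\s+max\s*=\s*(?P<max>[\d.]+)\s*ms"
    r"\s+min\s*=\s*(?P<min>[\d.]+)\s*ms"
    r"\s+avg\s*=\s*(?P<avg>[\d.]+)\s*ms"
)

def parse_benchmark(text: str) -> dict:
    """Return {model: {'max': ms, 'min': ms, 'avg': ms}} from benchmark.out output."""
    return {
        m.group("model"): {k: float(m.group(k)) for k in ("max", "min", "avg")}
        for m in LINE_RE.finditer(text)
    }

sample = "[ - ] yolov8n_160.mnn   max =   83.413 ms  min =   81.695 ms  avg =   82.418 ms"
print(parse_benchmark(sample))
```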
