Inference time #5
Nope. Someone must write a standard inference speed test that we could use to measure it. I'll test the current wheel against the latest OpenVINO release on my CPU (i7-8550U @ 1.80GHz), but I do not know exactly when. |
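For reference, a minimal sketch of what such a standard inference speed test could look like (this is not an existing script from the repo; the model file names and input shape below are placeholders):

import time
import cv2
import numpy as np

def mean_forward_ms(xml_path, bin_path, shape=(1, 3, 224, 224), runs=50):
    # Load an IR model and report the mean forward-pass time in milliseconds
    net = cv2.dnn.readNet(xml_path, bin_path)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setInput(np.random.standard_normal(shape).astype(np.float32))
    net.forward()  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        net.forward()
    return (time.perf_counter() - start) / runs * 1000

print(f"{mean_forward_ms('model.xml', 'model.bin'):.1f} ms per forward pass")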
I've created two Ubuntu 18.04 LTS instances: one for OpenVINO and one for this wheel. On the first instance, I downloaded and installed the OpenVINO toolkit. Inference speed was tested and measured as described here. Code for downloading the models on the OpenVINO instance:
#!/bin/bash
# urls, filenames and checksums are from:
# <https://github.com/opencv/open_model_zoo/blob/2020.1/models/intel/text-detection-0004/model.yml>
declare -a models=("text-detection-0004.xml"
"text-detection-0004.bin")
url_start="https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1"
for i in "${models[@]}"; do
    if [ ! -f "$i" ]; then
wget "${url_start}/${i%.*}/FP32/${i}"
else
sha256sum -c "${i}.sha256sum"
fi
done
Conclusion -- no difference. My best guess for your problem: you were using different target/backend combinations, or your CPU has the AVX512 instruction set (it is enabled in OpenVINO and disabled in my wheel), or your environment variables got mixed up. Try to repeat my steps and check. |
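A quick way to check the AVX512 part of that guess from Python; these are standard OpenCV calls, and the build summary at the end also shows which backends and CPU features were compiled in:

import cv2

# True if the host CPU supports the given instruction set
print("AVX2:     ", cv2.checkHardwareSupport(cv2.CPU_AVX2))
print("AVX-512F: ", cv2.checkHardwareSupport(cv2.CPU_AVX_512F))
# Full build summary (backends, parallel framework, CPU feature dispatch, etc.)
print(cv2.getBuildInformation())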
Thanks for checking. My model is se_resnext50, converted from pytorch_toolbelt -> ONNX -> IR. With the ver7 IR models I had previously (not se_resnext50), all is OK, same speed. But with the new one, ver10, there is a 10x difference: 300 ms for opencv-python-inference-engine vs 30 ms for the OpenVINO version on the same machine (Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz, no AVX512).
I noticed this when my production code suddenly became 10x slower with that model. |
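For context, a hedged sketch of the PyTorch -> ONNX step in that pipeline; the torchvision resnet18 below is only a stand-in for the actual se_resnext50 from pytorch_toolbelt, and the file names are illustrative:

import torch
import torchvision

model = torchvision.models.resnet18()  # stand-in model, not the real se_resnext50
model.eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "se_net.onnx", opset_version=11,
                  input_names=["input"], output_names=["output"])
# The ONNX file is then converted to IR with OpenVINO's Model Optimizer, e.g.:
#   python mo.py --input_model se_net.onnx --data_type FP32
# Model Optimizer from the 2020.x releases produces IR version 10 by default;
# the ver7 models mentioned above come from earlier Model Optimizer releases.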
With your net:
import cv2
import numpy as np
xml_model_path = "se_net.xml"
net = cv2.dnn.readNet(xml_model_path, xml_model_path[:-3] + 'bin')
blob = (np.random.standard_normal((1, 3, 224, 224)) * 255).astype(np.uint8)
net.setInput(blob)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
_ = net.forward()
%timeit _ = net.forward()
OpenVINO: 49.2 ms ± 2.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
So yes, there is a 10-fold inference speed difference on that net. But I have no idea why. Maybe it is because se_resnext50 has some new layers which are fast with some third-party libraries that I did not compile:
I'll try a few different builds. |
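One way to investigate the "some new layers" guess is per-layer profiling via net.getPerfProfile(); a sketch, assuming the same se_net files as above. Note that with the Inference Engine backend the per-layer vector may come back empty, in which case loading the intermediate ONNX model with the default OpenCV backend is an alternative:

import cv2
import numpy as np

net = cv2.dnn.readNet("se_net.xml", "se_net.bin")
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setInput((np.random.standard_normal((1, 3, 224, 224)) * 255).astype(np.uint8))
net.forward()

total, layer_times = net.getPerfProfile()
freq = cv2.getTickFrequency()
print(f"total: {total / freq * 1000:.1f} ms")
# Per-layer timings, slowest first (may be empty, depending on the backend)
times_ms = [float(t) / freq * 1000 for t in layer_times.flatten()]
for name, t in sorted(zip(net.getLayerNames(), times_ms), key=lambda x: -x[1])[:10]:
    print(f"{t:8.2f} ms  {name}")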
Well, I managed to replicate the third-party lib setup:
But nothing changed. |
Maybe it is somehow related to the MKL-DNN and TBB libraries? |
It could be, i.e., something about how dldt was compiled. But I'd better go and ask the dldt developers about it, and about how I could get the dldt build info. |
My question about dldt build info: https://stackoverflow.com/questions/60704887/q-how-can-i-get-dldt-buildinfo Also, it may be related to this: openvinotoolkit/openvino#166 I'll write here if I find something new. |
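On the wheel side, the closest thing available from Python is what cv2 reports about its own build. This does not expose dldt's internal GEMM/MKL-DNN choices (which is what the StackOverflow question asks about), but it does show whether the Inference Engine is linked in and which parallel framework (e.g. TBB) is used:

import cv2

for line in cv2.getBuildInformation().splitlines():
    if "Inference Engine" in line or "Parallel framework" in line:
        print(line.strip())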
So, I managed to solve it to a first approximation. I had to compile dldt with MKL. Now the inference speed of the provided NN is the same whether using OpenVINO or my wheel -- ~48 ms -- but it adds +125 MB (OpenVINO ships a ~30 MB mkl_tiny instead). |
Building mkl_tiny is not an option: uxlfoundation/oneDNN#674 But I built the wheel with OpenBLAS instead, and it has the same inference speed as the MKL build. |
It seems MKL should be faster (https://software.intel.com/en-us/articles/performance-comparison-of-openblas-and-intel-math-kernel-library-in-r). But if you are sure, I'm OK with that, thanks for your work! |
Well, I have problems compiling OpenBLAS in the fastest way possible. For now, the fastest OpenBLAS variant is to use the precompiled Ubuntu lib:
Inference time for se_net: this wheel is about 10% slower with your "se_net" than OpenVINO, and the without-gfortran wheel is 2 times slower than OpenVINO. Any feedback is welcome.
UPD: With the help of OpenBLAS contributors, I've managed to compile a 4.2 MB lib with approximately the same inference speed as mkl_tiny (for se_net). Please refer here for details: OpenMathLib/OpenBLAS#2528 |
Solved with v4.2.0.4 release |
Great update, thanks! |
FYI: a new version of OpenVINO has been released. |
@Kulikovpavel aha, thanks. I'll need to do something about this at the weekend. |
@Kulikovpavel one can now compile dldt with GEMM=JIT and get OpenBLAS-comparable speed on your net (and save 4.2 MB). |
Hi @banderlog, I have the same performance problem with your version of OpenCV in the new release, with the same network as above. |
@Kulikovpavel well, I'll check it in case I missed something, but I run inference speed tests with your network each time I build the library. Before this release I compared the inference speed of JIT vs OpenBLAS and kept OpenBLAS (approximately 50 vs 40 ms). Do you have the same network and setup as before? |
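For context, a sketch of the kind of per-build speed check described above, written as a pytest-style test; the file names and the 300 ms threshold are illustrative, not the actual contents of tests/prepare_and_run_tests.sh:

import time
import cv2
import numpy as np

def test_se_net_inference_time():
    net = cv2.dnn.readNet("se_net.xml", "se_net.bin")
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setInput(np.random.standard_normal((1, 3, 224, 224)).astype(np.float32))
    net.forward()  # warm-up
    start = time.perf_counter()
    for _ in range(10):
        net.forward()
    mean_ms = (time.perf_counter() - start) / 10 * 1000
    # Fail loudly on an order-of-magnitude regression like the 4.2.0.3 one
    assert mean_ms < 300, f"forward pass too slow: {mean_ms:.1f} ms"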
As you may see from the table below, everything is as it should be. I even tested the wheels from the repo and from PyPI separately.
In the message above I reported smaller numbers, but I achieved them in a different environment (an LXD Linux container). Code for replication:
# sudo snap install multipass
multipass launch -c 6 -d 10G -m 7G -n test
multipass shell test
sudo apt-get update
sudo apt install git python3 virtualenv
git clone https://github.com/banderlog/opencv-python-inference-engine
cd opencv-python-inference-engine/tests/
# wget https://files.pythonhosted.org/packages/f0/ee/36d75596ce0b6212821510efb56dfca962f4add3fdaf49345bc93920a984/opencv_python_inference_engine-4.3.0.2-py3-none-manylinux1_x86_64.whl
wget https://github.com/banderlog/opencv-python-inference-engine/releases/download/v4.3.0.2/opencv_python_inference_engine-4.3.0.2-py3-none-manylinux1_x86_64.whl
wget https://github.com/banderlog/opencv-python-inference-engine/releases/download/v4.3.0.1/opencv_python_inference_engine-4.3.0.1-py3-none-manylinux1_x86_64.whl
wget https://github.com/banderlog/opencv-python-inference-engine/releases/download/v4.2.0.4/opencv_python_inference_engine-4.2.0.4-py3-none-manylinux1_x86_64.whl
wget https://github.com/banderlog/opencv-python-inference-engine/releases/download/v4.2.0.3/opencv_python_inference_engine-4.2.0.3-py3-none-manylinux1_x86_64.whl
# first run will take a lot of time, because it will install all needed python packages
./prepare_and_run_tests.sh opencv_python_inference_engine-4.3.0.2*
./prepare_and_run_tests.sh opencv_python_inference_engine-4.3.0.1*
./prepare_and_run_tests.sh opencv_python_inference_engine-4.2.0.4*
./prepare_and_run_tests.sh opencv_python_inference_engine-4.2.0.3*
If you run it, you should get different inference times, but the 4.2.0.3 times should be about 10x greater than the others. |
@Kulikovpavel does your performance problem still persist? |
For some reason, inference time with an IR model of version 10 (the new one) is ten times slower on CPU with this wheel than when using the OpenCV integrated into the OpenVINO toolkit.
Any idea why?