Benchmark Caffe Time 2.0
The Caffe Time 2.0 tool lets you measure various performance indicators.
To use it, you need to compile Caffe with the -DPERFORMANCE_MONITORING=1 flag.
Example:
rm -fr build && mkdir build && cd build && export MKLDNNROOT="" && export MKLROOT=/opt/mklml_lnx_2017.0.2.20170110/ && cmake .. -DCPU_ONLY=ON -DUSE_MKL2017_AS_DEFAULT_ENGINE=ON -DPERFORMANCE_MONITORING=ON && make all -j && cd ..
Caffe has a built-in benchmark tool called caffe time. You can run it, for example, with the command below:
./build/tools/caffe time -model=models/default_googlenet_v2/train_val.prototxt
We created our own tool (caffe time 2.0) to make more precise measurements. To enable this more thorough benchmark, compile Caffe with the PERFORMANCE_MONITORING flag set and then run training, e.g.:
./build/tools/caffe train -solver=models/default_googlenet_v2/solver.prototxt
When training finishes, the output of our performance monitor appears at the end of the log. It reports how much time, in nanoseconds, was spent on operations in each layer, giving the average, minimum, and maximum time. There are two kinds of columns, with the suffixes total and proc. The proc columns show how much time was spent on calculations; the total columns also include time for reading/writing, lags, etc.
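The report columns described above can be pictured with a small accumulator. This is only a minimal sketch: the names TimeStats and EventStats are hypothetical and are not taken from performance.hpp, and the sketch takes the total and proc durations as inputs because how the monitor obtains them is not described here.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical accumulator for one column group (not the class used in
// performance.hpp). All durations are in nanoseconds.
struct TimeStats {
  std::uint64_t count = 0;
  std::uint64_t sum_ns = 0;
  std::uint64_t min_ns = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t max_ns = 0;

  // Record one measured duration.
  void Add(std::uint64_t ns) {
    ++count;
    sum_ns += ns;
    min_ns = std::min(min_ns, ns);
    max_ns = std::max(max_ns, ns);
  }

  // Average duration, or 0 when nothing has been recorded yet.
  double Average() const {
    return count == 0 ? 0.0 : static_cast<double>(sum_ns) / count;
  }
};

// Hypothetical report row for one event (e.g. one layer): a "total"
// column group and a "proc" column group.
struct EventStats {
  TimeStats total;  // whole measured region, including I/O and lags
  TimeStats proc;   // time attributed to calculations only

  void Add(std::uint64_t total_ns, std::uint64_t proc_ns) {
    total.Add(total_ns);
    proc.Add(proc_ns);
  }
};
```

Each finished measurement would add one (total, proc) pair to the row for its event; the report then prints Average(), min_ns, and max_ns for both groups.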
If you want to check how it is done in the code, take a look at the performance.hpp header file in caffe/include/caffe/util/performance.hpp.
The most important ones are defined at the top: PERFORMANCE_CREATE_MONITOR, PERFORMANCE_INIT_MONITOR, PERFORMANCE_MEASUREMENT_BEGIN, and PERFORMANCE_MEASUREMENT_END_STATIC.
(The static variant is a performance tweak: it reduces the number of calls to getEventIdByName in places where we know the name won't change, for example mkl_conversion.)
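The idea behind the static variant can be sketched as follows. This is an illustration under assumptions, not the code from performance.hpp: EventRegistry, its GetEventIdByName method, and ConversionEventId are hypothetical stand-ins, and the lookup counter exists only to make the saving visible.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical registry that maps an event name to a small integer id.
class EventRegistry {
 public:
  // Looks the name up and registers it on first use. Every call pays for
  // a string hash and a map lookup.
  int GetEventIdByName(const std::string& name) {
    ++lookups_;
    auto it = ids_.find(name);
    if (it != ids_.end()) return it->second;
    const int id = static_cast<int>(ids_.size());
    ids_.emplace(name, id);
    return id;
  }

  int lookups() const { return lookups_; }

 private:
  std::unordered_map<std::string, int> ids_;
  int lookups_ = 0;
};

// "Static" call site: the name is a fixed literal, so the id is resolved
// once, on the first call, and reused afterwards. Note that the cached id
// is tied to the registry passed on that first call.
int ConversionEventId(EventRegistry& registry) {
  static const int id = registry.GetEventIdByName("mkl_conversion");
  return id;
}
```

A call site whose event name can change at run time cannot use this trick and has to perform the lookup every time.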
Also notice the class Measurement, which is implemented as a sort of stack. This is needed because some measurements are nested inside other measurements, e.g. in MKL layers. For example, in src/caffe/mkl_memory.cpp you can see a call to PERFORMANCE_MEASUREMENT_BEGIN in line 198 and a call to PERFORMANCE_MEASUREMENT_END_STATIC in line 200.
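Why a stack suits nested measurements can be shown with a minimal sketch. MeasurementStack is a hypothetical class, not the Measurement class from performance.hpp, and timestamps are passed in explicitly (an assumption made so that the example is deterministic) instead of being read from a clock.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stack of open measurements. Begin pushes a start timestamp,
// End pops the most recent one, so an inner measurement always closes
// before the outer measurement that contains it.
class MeasurementStack {
 public:
  void Begin(std::uint64_t now_ns) { starts_.push_back(now_ns); }

  // Closes the innermost open measurement and returns its duration in
  // nanoseconds. Returns 0 if no measurement is open.
  std::uint64_t End(std::uint64_t now_ns) {
    if (starts_.empty()) return 0;
    const std::uint64_t start = starts_.back();
    starts_.pop_back();
    return now_ns - start;
  }

  // Number of measurements that are currently open.
  std::size_t Depth() const { return starts_.size(); }

 private:
  std::vector<std::uint64_t> starts_;
};
```

With an outer measurement started at 0 ns and an inner one started at 10 ns, ending the inner one at 25 ns yields 15 ns, and ending the outer one at 40 ns yields 40 ns, so the outer duration includes the nested one.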