Benchmarking bare metal

temap edited this page Apr 13, 2021 · 11 revisions

Size Benchmarks

CSiBE

CSiBE is GNU GCC's benchmark of choice when profiling for code size. The official release does not include the ARC configurations. The ARC-specific port can be obtained from here

Note: the commands below require Python 2. Running them with Python 3 causes multiple issues, so do not use it.

Basic usage

CSiBE provides a simple Python script that helps you take your first measurement. The script's help screen describes the commonly used features and options. To see the available options, run:

$ ./csibe.py --help

The most common usage is to call csibe.py without any options.

$ ./csibe.py

This creates a build/native/all_results.csv results file which contains all the sizes of the generated binaries.

If you would like to embed the CSiBE measurement routines in your own build or measurement framework, first check out the 'bin/create_sample_project' script, which walks you through creating your own measurement. To create a measurement project that uses a cross-compiler, check out the 'bin/create_sample_cross_compile_project' script.

Synopsys specific configurations

CSiBE is used to track the performance of the following configurations.

Architecture    Target          Configuration
ARC EM          gcc-arc-em      -Os -mdiv-rem -mcpu=arcem
ARC EM          gcc-arc-em4     -Os -mcpu=em4_dmips
ARC HS          gcc-arc-hs      -Os -mcpu=archs
ARC HS          gcc-arc-hs44    -Os -mcpu=hs4x -mtune=core3
ARM Cortex A7   gcc-cortex-a7   -Os -mcpu=cortex-a7 -mtune=cortex-a7 -mthumb
ARM Cortex R5   gcc-cortex-r5   -Os -mcpu=cortex-r5 -mtune=cortex-r5 -mthumb
ARM Cortex M0   gcc-cortex-m0   -Os -mcpu=cortex-m0 -mthumb
ARM Cortex M4   gcc-cortex-m4   -Os -mcpu=cortex-m4 -mthumb
RISCV           gcc-riscv       -Os -mtune=size -mdiv -msave-restore

Recommended way of running CSiBE for ARC configurations:

csibe.py gcc-arc-hs44 gcc-arc-em4 CSiBE-v2.1.1 

Result data is collected into gcc-arc-em4/all_results.csv and gcc-arc-hs44/all_results.csv files.

Recommended way of running CSiBE for ARM configurations:

csibe.py gcc-cortex-m0 gcc-cortex-m4 gcc-cortex-a7 gcc-cortex-r5 CSiBE-v2.1.1 

Result data is collected into the following files:

gcc-cortex-m0/all_results.csv
gcc-cortex-m4/all_results.csv
gcc-cortex-a7/all_results.csv
gcc-cortex-r5/all_results.csv

RISCV configuration is not tracked with CSiBE.
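
To confirm that a run produced every expected results file, a small shell helper can check for them. This is a minimal sketch and not part of CSiBE: the function name check_csibe_results is hypothetical, and the base directory argument is an assumption (the basic usage above writes under build/, so point it at wherever your run places the target directories).

```shell
# Hypothetical helper: report which expected result files exist and are non-empty.
# Usage: check_csibe_results BASE_DIR TARGET...
check_csibe_results() {
    base=$1
    shift
    missing=0
    for target in "$@"; do
        file="$base/$target/all_results.csv"
        if [ -s "$file" ]; then
            echo "ok      $file"
        else
            echo "missing $file"
            missing=1
        fi
    done
    # Non-zero exit status when at least one file is absent or empty.
    return "$missing"
}

# Example call for the ARC targets; "." as the base directory is an assumption.
check_csibe_results . gcc-arc-hs44 gcc-arc-em4 || echo "some results are missing"
```

The same call works for the ARM targets by passing gcc-cortex-m0, gcc-cortex-m4, gcc-cortex-a7 and gcc-cortex-r5 instead.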

Links

More information about the project can be found at http://www.csibe.org
Contact: http://www.sed.inf.u-szeged.hu/

Licenses

The license of the CSiBE framework can be found in License.txt. The open source projects under the source directory may have different license conditions. Each project's license can be found in its own folder.

EMBENCH

The EMBENCH benchmarks are designed to test the performance of deeply embedded systems. As such, they assume no OS, minimal C library support and, in particular, no output stream. The Synopsys variant is located [here].

Used configurations:

Configuration   CPU          Options
em              ARC EM       -Os -mcpu=arcem -mdiv-rem -ffunction-sections
hs44            ARC HS44     -Os -mcpu=hs4x -ffunction-sections
riscv32         RISCV32imc   -Os -march=rv32imc -ffunction-sections -msave-restore -mabi=ilp32
arm             ARM M4       -Os -mcpu=cortex-m4 -mthumb -ffunction-sections

Recommended way of profiling:

./build_all.py --builddir hs44 --arch arc --chip hs44 --board generic
./benchmark_size.py  --builddir hs44

./build_all.py --builddir em --arch arc --chip em --board generic
./benchmark_size.py  --builddir em

./build_all.py --builddir rv32imc --arch riscv32 --chip test-size-gcc --board ri5cyverilator
./benchmark_size.py  --builddir rv32imc

./build_all.py --builddir arm --arch arm --chip size-test-gcc --board generic
./benchmark_size.py  --builddir arm
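
The four build-and-measure pairs above differ only in their arguments, so they can be generated from one helper. The sketch below only prints the commands (a dry run). The function names are hypothetical; the flag values are copied from the commands above.

```shell
# Hypothetical helper: print the build and size-measurement commands for one configuration.
# Arguments: build directory, --arch value, --chip value, --board value.
embench_size_cmds() {
    printf './build_all.py --builddir %s --arch %s --chip %s --board %s\n' "$1" "$2" "$3" "$4"
    printf './benchmark_size.py --builddir %s\n' "$1"
}

# One call per configuration listed above.
embench_all_cmds() {
    embench_size_cmds hs44    arc     hs44          generic
    embench_size_cmds em      arc     em            generic
    embench_size_cmds rv32imc riscv32 test-size-gcc ri5cyverilator
    embench_size_cmds arm     arm     size-test-gcc generic
}

# Dry run: print the eight commands without executing anything.
embench_all_cmds
```

To execute the commands instead of printing them, pipe the output into a shell from the root of the EMBENCH checkout, for example with embench_all_cmds | sh -e, so that the first failing step stops the run.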

EEMBC

We use a limited set of EEMBC benchmarks as follows:

  • empty
  • a2time01
  • aifftr01
  • aifirf01
  • aiifft01
  • basefp01
  • bitmnp01
  • cacheb01
  • canrdr01
  • idctrn01
  • iirflt01
  • matrix01
  • pntrch01
  • puwmod01
  • rspeed01
  • tblook01
  • ttsprk01
  • autcor00
  • cjpeg
  • conven00
  • djpeg
  • fbital00
  • fft00
  • ip_pktcheck
  • ip_reassembly
  • nat
  • ospfv2
  • rgbcmy01
  • rgbyiq01
  • rgbhpg01
  • routelookup
  • tcp
  • viterb00
  • qos

The sources for these benchmarks can be found here.

Recommended way of profiling:

RISCV:

make clean
make size SIM=sim CPU=rv32imc ARC_PREFIX=riscv32-unknown-elf- 
cat *.rep | grep -P "^\s+[0-9a-f]+\s+" > gcc-rv32imc-eembc.txt

ARC HS44:

make clean
make size SIM=xcam CPU=hs4x
cat *.rep | grep -P "^\s+[0-9a-f]+\s+" > gcc-hs44-eembc.txt
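
The grep step in both recipes keeps only the data rows of the size reports: lines that begin with whitespace, followed by a run of hexadecimal digits, followed by more whitespace. The sketch below illustrates this on made-up input. It assumes the .rep files resemble GNU size output with a decimal text size in the first column, which this page does not confirm. The sample rows, file names and function names are all hypothetical. The pattern is spelled with POSIX character classes and grep -E, which should select the same lines as the grep -P form above for ASCII input.

```shell
# Keep only data rows: leading whitespace, digits, then more whitespace.
filter_size_rows() {
    grep -E '^[[:space:]]+[0-9a-f]+[[:space:]]+'
}

# Sum the first column. ASSUMPTION: that column is a decimal size.
sum_first_column() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Hypothetical report content, for illustration only.
sample_report() {
    printf '   text\t   data\t    bss\t    dec\t    hex\tfilename\n'
    printf '   1200\t     16\t      8\t   1224\t    4c8\tempty.elf\n'
    printf '   3400\t     32\t     64\t   3496\t    da8\ta2time01.elf\n'
}

# The header row is dropped; the two data rows are summed.
sample_report | filter_size_rows | sum_first_column
```

If the first column of the real reports turns out to be hexadecimal, the summing step would need to convert the values first; the filtering step is unaffected.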

Additional resources

EEMBC has recently released a number of benchmarks publicly; they can be found here

Speed Benchmarks

All benchmarks used for speed measurements are found here

The following benchmarks are used for speed measurements:

Benchmark     Platform   License
whetstone     XCAM       free
whetstoneDP   XCAM       free
searchgame    XCAM       free
cachebench    XCAM       free
median        XCAM       free
qsort         XCAM       free
rsort         XCAM       free
towers        XCAM       free
vvadd         XCAM       free
multiply      XCAM       free
dhrystone     XCAM       free
spmv          XCAM       free
coremark      XCAM       free
linpack       NCAM       free

Recommended way of benchmarking:

ARC HS4xD NCAM:

make clean
make reports
cat *.rep > arc_hs4xd_ncam.txt

ARC HS44 XCAM: Using hs44_perf_1c model.

make clean
make reports SIM=xcam CPU=hs4x
cat *.rep > arc_hs4x_xcam.txt

ARC HS38 XCAM: Using hs34_perf_1c model.

make clean
make reports SIM=xcam CPU=hs38
cat *.rep > arc_archs_xcam.txt

ARC EM4 XCAM: Using em4_dmips_v3 model.

make clean
make reports SIM=xcam CPU=em4
cat *.rep > arc_arcem_xcam.txt

ARC 700 XCAM: Using arc700 model.

make clean
make reports SIM=xcam CPU=arc700
cat *.rep > arc_arc700_xcam.txt

ARC 625 XCAM: Using arc625 model.

make clean
make reports SIM=xcam CPU=arc625
cat *.rep > arc_arc625_xcam.txt

For all performance runs, one needs:

  • a valid ARC compiler in path
  • the hostlink library (libhlt.a) for the specific processor
  • MDB
  • NSIM (for NCAM model)
  • A valid XCAM model
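
A quick pre-flight check can catch a missing tool before a long run. The sketch below only tests whether executables are on PATH; it does not verify the hostlink library or the XCAM model. The tool names arc-elf32-gcc, mdb and nsimdrv are assumptions about a typical installation, and require_tools is a hypothetical helper name.

```shell
# Hypothetical helper: print each tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
require_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# ASSUMED tool names; substitute the ones from your own installation.
require_tools arc-elf32-gcc mdb nsimdrv || echo "install the missing tools before running make reports"
```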