This sample measures the single-trip delivery latency of one cache-line-sized block between two cores of the same CPU.
One core writes a cache line to memory and a second core reads it. The second core then copies the cache line, and the original core reads the copy. The round-trip time is measured and divided by two.
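The ping-pong scheme described above can be sketched roughly as follows. This is a minimal illustration, not the sample's actual source: the names `Line` and `measure_single_trip_ns` are hypothetical, the 64-byte alignment is an assumed cache line size (typical for x86), it times with `std::chrono::steady_clock`, and it omits pinning the threads to specific cores.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper type: one value padded out to an assumed 64-byte cache line.
struct alignas(64) Line {
    std::atomic<std::uint64_t> seq{0};
};

// Estimates one-way latency in nanoseconds as (round-trip time) / 2.
inline double measure_single_trip_ns(std::uint64_t num_messages) {
    if (num_messages == 0) return 0.0;

    Line ping;  // written by the producer, read by the consumer
    Line pong;  // written by the consumer, read by the producer

    std::thread consumer([&] {
        for (std::uint64_t i = 1; i <= num_messages; ++i) {
            while (ping.seq.load(std::memory_order_acquire) != i) {}  // wait for message i
            pong.seq.store(i, std::memory_order_release);             // echo it back
        }
    });

    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 1; i <= num_messages; ++i) {
        ping.seq.store(i, std::memory_order_release);                 // send message i
        while (pong.seq.load(std::memory_order_acquire) != i) {}      // wait for the echo
    }
    const auto stop = std::chrono::steady_clock::now();
    consumer.join();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return total_ns / static_cast<double>(num_messages) / 2.0;
}
```

Keeping `ping` and `pong` on separate cache lines means each handoff moves exactly one line between the cores, which is the quantity the sample sets out to measure.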
- x86
- Linux
- multicore CPU
- a C++14-compatible compiler
- CMake 3.18 or later
- Create build folder
mkdir build && cd build
- Create cmake auxiliary files
cmake ..
- Build
cmake --build .
The CPU frequency of the cores involved must be known before starting.
You can look it up by running
cat /proc/cpuinfo | grep Hz
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 2800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 2794.449
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
model name : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
cpu MHz : 800.000
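The frequency is presumably needed to turn a cycle count into nanoseconds; the README does not spell this out, so treat the following as an assumption. Under that assumption, a frequency in GHz is the number of cycles per nanosecond, and the conversion is a single division. The function names are hypothetical. Note that the usage example passes 4.0, which matches the nominal frequency in the `model name` line rather than the fluctuating `cpu MHz` readings.

```cpp
// Hypothetical helpers, assuming the tool counts cycles and converts them to ns.

// freq_ghz is cycles per nanosecond, so ns = cycles / freq_ghz.
inline double cycles_to_ns(double cycles, double freq_ghz) {
    return cycles / freq_ghz;
}

// One-way latency is half of the measured round trip.
inline double single_trip_ns(double round_trip_cycles, double freq_ghz) {
    return cycles_to_ns(round_trip_cycles, freq_ghz) / 2.0;
}
```

For instance, a hypothetical round trip of 656 cycles at 4.0 GHz would correspond to 164 ns, or 82 ns per single trip.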
Select the two cores on which to run the benchmark (e.g., cores 2 and 3).
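Running on chosen cores implies pinning each thread to a CPU id. One way to do that on Linux is `sched_setaffinity`; the sketch below is an illustration with a hypothetical function name, and it is not necessarily how the sample does it.

```cpp
#include <sched.h>  // Linux-specific; g++ defines _GNU_SOURCE by default, which exposes the CPU_* macros

// Hypothetical helper: pins the calling thread to the given CPU id.
// Returns true on success, false if the id is out of range or not permitted.
inline bool pin_current_thread_to_cpu(int cpu) {
    if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Each benchmark thread would call this once at startup with its own CPU id, before any timing begins.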
./measure_cache_line_delivery_ns [<num_messages> (default 1M)] [<consumer cpu id> (default 0)] [<producer cpu id> (default 1)] [<cpu freq (GHz)> (default 3.0)]
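All four arguments are positional and optional. A sketch of how such a command line could be parsed with the defaults listed above follows; the `Params` struct and `parse_args` function are hypothetical names, and the tool's real parsing code may differ.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical container for the four positional arguments and their documented defaults.
struct Params {
    std::uint64_t num_messages = 1000000;  // default 1M
    int consumer_cpu = 0;                  // default 0
    int producer_cpu = 1;                  // default 1
    double freq_ghz = 3.0;                 // default 3.0
};

// Overrides each default only if the corresponding argument is present.
inline Params parse_args(int argc, const char* const* argv) {
    Params p;
    if (argc > 1) p.num_messages = std::strtoull(argv[1], nullptr, 10);
    if (argc > 2) p.consumer_cpu = std::atoi(argv[2]);
    if (argc > 3) p.producer_cpu = std::atoi(argv[3]);
    if (argc > 4) p.freq_ghz = std::strtod(argv[4], nullptr);
    return p;
}
```

Because the arguments are positional, supplying the frequency requires supplying the three arguments before it as well.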
Examples:
./measure_cache_line_delivery_ns 1000 2 3 4.0
A tool for measuring latency of a single cache line-length message delivery between different cores of a CPU
Usage: ./measure_cache_line_delivery_ns [<num_messages> (default 1M)] [<consumer cpu id> (default 0)] [<producer cpu id> (default 1)] [<cpu freq (HGz)> (default 3.0)]
example: ./measure_cache_line_delivery_ns 1000 5 6 4.0
Running the benchmark with parameters: num_messages=1000, core1=2, core2=3, freq=4GHz...
Single trip is 82ns
./measure_cache_line_delivery_ns
A tool for measuring latency of a single cache line-length message delivery between different cores of a CPU
Usage: ./measure_cache_line_delivery_ns [<num_messages> (default 1M)] [<consumer cpu id> (default 0)] [<producer cpu id> (default 1)] [<cpu freq (HGz)> (default 3.0)]
example: ./measure_cache_line_delivery_ns 1000 5 6 4.0
Running the benchmark with parameters: num_messages=1000000, core1=0, core2=1, freq=3GHz...
Single trip is 80ns