TSC Benchmark

A low-overhead TSC benchmark that measures small sections of code with nanosecond accuracy.

Links

RDTSC And RDTSCP Instructions
- Hardware Support
TSC Reordering
TSC Overhead
Run Benchmark

Example

benchmarking::TSCBenchmarking benchmark{};
benchmark.Initialize();
auto result = benchmark.Run(code_for_benchmarking, settings);
std::cout << "Benchmark result: " << result.time_ << " ns" << std::endl;

RDTSC And RDTSCP Instructions

The benchmark uses the rdtsc instruction and simple arithmetic operations to implement a clock with 1 ns precision. Reading this clock is fast and stable in latency, taking less than 10 ns.
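A minimal sketch of the arithmetic behind such a clock, assuming GCC or Clang. The names ReadTsc and TicksToNanoseconds are hypothetical and not part of the library, and the TSC frequency in kHz is assumed to come from a calibration step that is not shown.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtsc (GCC/Clang)

// Hypothetical helper: reads the raw time-stamp counter.
inline std::uint64_t ReadTsc() { return __rdtsc(); }
#endif

// Hypothetical helper: converts a tick delta to nanoseconds.
// tsc_khz is the number of TSC ticks per millisecond (assumed calibrated).
inline std::uint64_t TicksToNanoseconds(std::uint64_t ticks,
                                        std::uint64_t tsc_khz) {
  assert(tsc_khz > 0);
  constexpr std::uint64_t kNsPerMs = 1000000;
  // Split into whole milliseconds and a remainder so that the
  // multiplication cannot overflow for realistic frequencies.
  const std::uint64_t whole_ms = ticks / tsc_khz;
  const std::uint64_t rest = ticks % tsc_khz;
  return whole_ms * kNsPerMs + rest * kNsPerMs / tsc_khz;
}
```

With a 3 GHz TSC (tsc_khz = 3000000), a delta of 3000 ticks converts to 1000 ns.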

In addition, the rdtscp instruction can be used to check that the program did not migrate to another CPU between TSC reads, which can significantly distort the measurements. To check for CPU migration during benchmarks, set the bool CheckCpuMigration template parameter of TSCBenchmarking to true.
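The idea can be sketched as follows, assuming GCC or Clang. rdtscp returns the contents of the IA32_TSC_AUX register together with the counter; on Linux this register typically encodes the CPU number, so two reads with different values suggest a migration. TscSample, ReadTscp and SameCpu are hypothetical names, not the library's API.

```cpp
#include <cstdint>

// Hypothetical type: one TSC read together with the IA32_TSC_AUX value.
struct TscSample {
  std::uint64_t ticks;
  std::uint32_t aux;
};

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtscp (GCC/Clang)

// Reads the counter and the auxiliary value in a single instruction.
inline TscSample ReadTscp() {
  unsigned int aux = 0;
  const std::uint64_t ticks = __rdtscp(&aux);
  return TscSample{ticks, aux};
}
#endif

// Two samples with equal aux values were, under the assumption above,
// taken on the same CPU.
inline bool SameCpu(const TscSample& first, const TscSample& second) {
  return first.aux == second.aux;
}
```

A measurement whose start and end samples fail SameCpu can then be discarded or repeated.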

Hardware Support

The benchmark checks that your /proc/cpuinfo contains the nonstop_tsc and constant_tsc flags. In general, on modern x86 systems the TSC runs at a constant rate and never stops, across all P-states.
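A check of this kind might look like the sketch below. CpuInfoHasFlag is a hypothetical helper, not the library's function; it only inspects the first "flags" line, on the assumption that all cores report the same flags.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the first "flags" line of
// cpuinfo-formatted text lists `flag` as a whole word.
inline bool CpuInfoHasFlag(std::istream& cpuinfo, const std::string& flag) {
  std::string line;
  while (std::getline(cpuinfo, line)) {
    if (line.rfind("flags", 0) != 0) continue;  // line must start with "flags"
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream words(line.substr(colon + 1));
    for (std::string word; words >> word;) {
      if (word == flag) return true;
    }
    return false;  // first flags line inspected, flag not present
  }
  return false;  // no flags line found
}
```

To use it on a real system, open /proc/cpuinfo with std::ifstream and call the helper once per flag, reopening or rewinding the stream between calls.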

Also, the benchmark checks whether your system supports Invariant TSC, which can significantly affect the accuracy of measurements.
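One way to query this from code, sketched under the assumption of GCC or Clang: the Invariant TSC capability is reported in bit 8 of EDX for CPUID leaf 0x80000007. The function names are hypothetical and the library may perform the check differently.

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // __get_cpuid (GCC/Clang)
#endif

// Hypothetical helper: tests bit 8 of EDX from CPUID leaf 0x80000007.
inline bool InvariantTscBitSet(std::uint32_t edx) {
  return ((edx >> 8) & 1u) != 0;
}

#if defined(__x86_64__) || defined(__i386__)
// Hypothetical helper: asks the CPU whether it reports an invariant TSC.
inline bool CpuReportsInvariantTsc() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  // __get_cpuid returns 0 when the requested leaf is not supported.
  if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx)) return false;
  return InvariantTscBitSet(edx);
}
#endif
```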

TSC Reordering

The compiler or the CPU may reorder the TSC reads relative to the benchmarked code. To prevent this, the benchmarking::TSCClock<Barrier BarrierType> class is used, which implements several barrier approaches:

OneCpuId barrier (default barrier type):

cpuid
rdtsc
code
cpuid
rdtsc

LFence barrier:

cpuid
rdtsc
code
lfence
rdtsc
cpuid

MFence barrier:

cpuid
rdtsc
code
cpuid
rdtsc
mfence

Rdtscp barrier (Intel approach):

cpuid
rdtsc
code
rdtscp
cpuid

Four cpuid barrier:

cpuid
rdtsc
cpuid
code
cpuid
rdtsc
cpuid
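As an illustration, the Rdtscp sequence above could be written roughly as follows on x86-64 with GCC or Clang. This is a sketch, not the library's TSCClock implementation; SerializeWithCpuid and MeasureTicksRdtscp are hypothetical names.

```cpp
#include <cstdint>
#include <x86intrin.h>  // __rdtsc, __rdtscp (GCC/Clang, x86 only)

// Executes cpuid purely for its serializing effect; outputs are discarded.
// "volatile" and the "memory" clobber keep the compiler from removing the
// instruction or moving memory accesses across it.
inline void SerializeWithCpuid() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  asm volatile("cpuid"
               : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
               :
               : "memory");
}

// Follows the sequence: cpuid, rdtsc, code, rdtscp, cpuid.
// Returns the raw tick delta, including the overhead of the reads.
template <typename Code>
inline std::uint64_t MeasureTicksRdtscp(Code&& code) {
  SerializeWithCpuid();
  const std::uint64_t start = __rdtsc();
  code();
  unsigned int aux = 0;
  const std::uint64_t end = __rdtscp(&aux);
  SerializeWithCpuid();
  return end - start;
}
```

The other variants differ only in which instructions surround the two reads, for example _mm_lfence() before the second rdtsc in the LFence sequence.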

TSC Overhead

In the benchmarking::TSCBenchmarking::Initialize method, the benchmark prepares and configures the system and calibrates the TSC for accurate results.

In addition, it runs several tests to estimate the overhead of the TSC reads, which is then subtracted from the final measured time.
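A possible shape for such an overhead estimate is sketched below. It measures an empty region repeatedly and keeps the minimum; taking the minimum is an assumption made for this example, and the library may use a different statistic. EstimateReadOverhead and SubtractOverhead are hypothetical names, and the clock is passed in as a callable so that any tick source can be used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: measures an empty region `iterations` times and
// returns the smallest observed delta as the overhead of one measurement.
template <typename ReadTicks>
std::uint64_t EstimateReadOverhead(ReadTicks&& read_ticks, int iterations) {
  assert(iterations > 0);
  std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
  for (int i = 0; i < iterations; ++i) {
    const std::uint64_t start = read_ticks();
    const std::uint64_t end = read_ticks();
    best = std::min(best, end - start);
  }
  return best;
}

// Hypothetical helper: removes the overhead from a measured delta,
// clamping at zero so that noise cannot produce a wrapped-around result.
inline std::uint64_t SubtractOverhead(std::uint64_t measured,
                                      std::uint64_t overhead) {
  return measured > overhead ? measured - overhead : 0;
}
```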

Run Benchmark

After initialization, you can run the benchmark using the benchmarking::TSCBenchmarking::Run method.

This method sets the CPU on which the benchmark will run, warms up the benchmark and your code, runs your code several times, and returns the average time.
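The steps described above can be sketched as follows. This is an illustration under stated assumptions, not the body of Run: PinCurrentThreadToCpu and AverageTicks are hypothetical names, pinning uses the Linux sched_setaffinity call, and the clock is any callable that returns ticks.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__linux__)
#include <sched.h>  // sched_setaffinity, cpu_set_t (Linux, glibc)

// Hypothetical helper: restricts the calling thread to a single CPU.
inline bool PinCurrentThreadToCpu(int cpu) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}
#endif

// Hypothetical helper: runs `code` unmeasured for warm-up, then measures
// it `measured_runs` times and returns the average tick delta.
template <typename ReadTicks, typename Code>
std::uint64_t AverageTicks(ReadTicks&& read_ticks, Code&& code,
                           int warmup_runs, int measured_runs) {
  assert(measured_runs > 0);
  for (int i = 0; i < warmup_runs; ++i) code();
  std::uint64_t total = 0;
  for (int i = 0; i < measured_runs; ++i) {
    const std::uint64_t start = read_ticks();
    code();
    total += read_ticks() - start;
  }
  return total / static_cast<std::uint64_t>(measured_runs);
}
```

The result is still in raw ticks and includes the read overhead; converting to nanoseconds and subtracting the overhead are separate steps.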

In addition, you can use the minimal benchmarking::TSCBenchmarking::MeasureTime method, which does nothing except read the TSC. It can be used in a hot code path to take raw measurements first, and the results can then be translated into a more readable format in another process.
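The record-now, convert-later pattern might look like this sketch. RawTickRecorder is a hypothetical class and not part of the library; here the conversion runs in the same program for brevity, but the raw values could equally be dumped and converted by a separate process. The ticks_per_ns value is assumed to come from calibration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical recorder: the hot path only stores raw tick deltas in a
// preallocated buffer; conversion to nanoseconds happens later.
class RawTickRecorder {
 public:
  // Hot path: no allocation, no division. Samples beyond capacity are dropped.
  void Record(std::uint64_t start_ticks, std::uint64_t end_ticks) {
    if (size_ < samples_.size()) samples_[size_++] = end_ticks - start_ticks;
  }

  // Cold path: converts every recorded delta to nanoseconds.
  std::vector<double> ToNanoseconds(double ticks_per_ns) const {
    std::vector<double> result;
    result.reserve(size_);
    for (std::size_t i = 0; i < size_; ++i) {
      result.push_back(static_cast<double>(samples_[i]) / ticks_per_ns);
    }
    return result;
  }

 private:
  std::array<std::uint64_t, 1024> samples_{};
  std::size_t size_ = 0;
};
```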
