Implementation of random interleaving. (google#1105)
* Implementation of random interleaving. See
http://github.com/google/benchmark/issues/1051 for the feature requests.

Committer: Hai Huang (http://github.com/haih-g)

On branch fr-1051
Changes to be committed:
modified:   include/benchmark/benchmark.h
modified:   src/benchmark.cc
new file:   src/benchmark_adjust_repetitions.cc
new file:   src/benchmark_adjust_repetitions.h
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   src/benchmark_register.cc
modified:   src/benchmark_runner.cc
modified:   src/benchmark_runner.h
modified:   test/CMakeLists.txt
new file:   test/benchmark_random_interleaving_gtest.cc

* Fix benchmark_random_interleaving_gtest.cc for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc
modified:   src/benchmark_runner.cc
modified:   test/benchmark_random_interleaving_gtest.cc

* Fix macos build for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   src/benchmark_runner.cc

* Fix macos and windows build for fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_runner.cc

* Fix benchmark_random_interleaving_test.cc for macos and windows in fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   test/benchmark_random_interleaving_gtest.cc

* Fix int type benchmark_random_interleaving_gtest for macos in fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   test/benchmark_random_interleaving_gtest.cc

* Address dominichamon's comments 03/29 for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   test/benchmark_random_interleaving_gtest.cc

* Address dominichamon's comment on default min_time / repetitions for fr-1051.
Also change sentinel of random_interleaving_repetitions to -1. Hopefully it
fixes the failures on Windows.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h

* Fix windows test failures for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_runner.cc

* Add license blurb for fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_adjust_repetitions.cc
modified:   src/benchmark_adjust_repetitions.h

* Switch to std::shuffle() for fr-1105.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc

* Change to 1e-9 in fr-1105

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_adjust_repetitions.cc

* Fix broken build caused by bad merge for fr-1105.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_runner.cc

* Fix build breakage for fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   src/benchmark_register.cc
modified:   src/benchmark_runner.cc

* Print out reports as they come in if random interleaving is disabled (fr-1051)

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc

* size_t, int64_t --> int in benchmark_runner for fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_runner.cc
modified:   src/benchmark_runner.h

* Address comments from dominichamon for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc
modified:   src/benchmark_adjust_repetitions.cc
modified:   src/benchmark_adjust_repetitions.h
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   test/benchmark_random_interleaving_gtest.cc

* benchmar_indices --> size_t to make CI pass: fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark.cc

* Fix min_time not initialized issue for fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h

* min_time --> MinTime in fr-1051.

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   src/benchmark_api_internal.cc
modified:   src/benchmark_api_internal.h
modified:   src/benchmark_runner.cc

* Add doc for random interleaving for fr-1051

Committer: Hai Huang <haih@google.com>

On branch fr-1051
Your branch is up to date with 'origin/fr-1051'.

Changes to be committed:
modified:   README.md
new file:   docs/random_interleaving.md

Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
haihuang-ml and dominichamon authored May 20, 2021
1 parent c983c3e commit a6a738c
Showing 11 changed files with 772 additions and 85 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -180,7 +180,7 @@ BENCHMARK_MAIN();
```
To run the benchmark, compile and link against the `benchmark` library
(libbenchmark.a/.so). If you followed the build steps above, this library will
be under the build directory you created.
```bash
@@ -300,6 +300,8 @@ too (`-lkstat`).

[Setting the Time Unit](#setting-the-time-unit)

[Random Interleaving](docs/random_interleaving.md)

[User-Requested Performance Counters](docs/perf_counters.md)

[Preventing Optimization](#preventing-optimization)
@@ -400,8 +402,8 @@ Write benchmark results to a file with the `--benchmark_out=<filename>` option
(or set `BENCHMARK_OUT`). Specify the output format with
`--benchmark_out_format={json|console|csv}` (or set
`BENCHMARK_OUT_FORMAT={json|console|csv}`). Note that the 'csv' reporter is
deprecated and the saved `.csv` file
[is not parsable](https://github.com/google/benchmark/issues/794) by csv
parsers.

Specifying `--benchmark_out` does not suppress the console output.
26 changes: 26 additions & 0 deletions docs/random_interleaving.md
@@ -0,0 +1,26 @@
<a name="interleaving" />

# Random Interleaving

[Random Interleaving](https://github.com/google/benchmark/issues/1051) is a
technique to lower run-to-run variance. It breaks the execution of a
microbenchmark into multiple chunks and randomly interleaves them with chunks
from other microbenchmarks in the same benchmark test. Data shows it is able to
lower run-to-run variance by
[40%](https://github.com/google/benchmark/issues/1051) on average.

To use, set `--benchmark_enable_random_interleaving=true`.

Random interleaving is known to increase benchmark execution time if:

1. A benchmark has costly setup and / or teardown. Random interleaving will run
setup and teardown many times and may increase test execution time
significantly.
2. The time to run a single benchmark iteration is larger than the desired time
per repetition (i.e., `benchmark_min_time / benchmark_repetitions`).

The overhead of random interleaving can be controlled with
`--benchmark_random_interleaving_max_overhead`. The default value is 0.4, meaning
the total execution time under random interleaving is limited to 1.4 x the
original total execution time. Set it to `inf` for unlimited overhead.
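
The overhead limit described above is simple arithmetic, and a small C++ sketch may make it concrete. The names `MaxTotalTimeSecs` and `kUnlimitedOverhead` are hypothetical and not part of the benchmark library; only the `(1 + X)` relationship comes from the text.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not a library API: the upper bound on total execution
// time when the allowed overhead fraction is `max_overhead`.
double MaxTotalTimeSecs(double original_total_secs, double max_overhead) {
  assert(original_total_secs > 0.0 && max_overhead >= 0.0);
  return (1.0 + max_overhead) * original_total_secs;
}

// Value that corresponds to passing `inf`: no finite limit.
constexpr double kUnlimitedOverhead = std::numeric_limits<double>::infinity();
```

With the default of 0.4, a run that takes 10 seconds without interleaving would be allowed about 14 seconds, which is what `MaxTotalTimeSecs(10.0, 0.4)` evaluates to up to floating-point rounding.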
212 changes: 165 additions & 47 deletions src/benchmark.cc
@@ -33,8 +33,10 @@
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <limits>
#include <map>
#include <memory>
#include <random>
#include <string>
#include <thread>
#include <utility>
@@ -54,6 +56,18 @@
#include "thread_manager.h"
#include "thread_timer.h"

// Each benchmark can be repeated a number of times, and within each
// *repetition*, we run the user-defined benchmark function a number of
// *iterations*. The number of repetitions is determined based on flags
// (--benchmark_repetitions).
namespace {

// Attempt to make each repetition run for at least this much of time.
constexpr double kDefaultMinTimeTotalSecs = 0.5;
constexpr int kRandomInterleavingDefaultRepetitions = 12;

} // namespace

// Print a list of benchmarks. This option overrides all other options.
DEFINE_bool(benchmark_list_tests, false);

@@ -62,16 +76,39 @@ DEFINE_bool(benchmark_list_tests, false);
// linked into the binary are run.
DEFINE_string(benchmark_filter, ".");

// Minimum number of seconds we should run benchmark before results are
// considered significant. For cpu-time based tests, this is the lower bound
// on the total cpu time used by all threads that make up the test. For
// real-time based tests, this is the lower bound on the elapsed time of the
// benchmark execution, regardless of number of threads.
DEFINE_double(benchmark_min_time, 0.5);
// Do NOT read these flags directly. Use Get*() to read them.
namespace do_not_read_flag_directly {

// Minimum number of seconds we should run benchmark per repetition before
// results are considered significant. For cpu-time based tests, this is the
// lower bound on the total cpu time used by all threads that make up the test.
// For real-time based tests, this is the lower bound on the elapsed time of the
// benchmark execution, regardless of number of threads. If left unset, will use
// kDefaultMinTimeTotalSecs / FLAGS_benchmark_repetitions, if random
// interleaving is enabled. Otherwise, will use kDefaultMinTimeTotalSecs.
// Do NOT read this flag directly. Use GetMinTime() to read this flag.
DEFINE_double(benchmark_min_time, -1.0);

// The number of runs of each benchmark. If greater than 1, the mean and
// standard deviation of the runs will be reported.
DEFINE_int32(benchmark_repetitions, 1);
// standard deviation of the runs will be reported. By default, the number of
// repetitions is 1 if random interleaving is disabled, and up to
// kRandomInterleavingDefaultRepetitions if random interleaving is enabled.
// (Read the documentation for random interleaving to see why it might be less
// than kRandomInterleavingDefaultRepetitions.)
// Do NOT read this flag directly. Use GetRepetitions() to access this flag.
DEFINE_int32(benchmark_repetitions, -1);

} // namespace do_not_read_flag_directly

// The maximum overhead allowed for random interleaving. A value X means total
// execution time under random interleaving is limited by
// (1 + X) * original total execution time. Set to 'inf' to allow infinite
// overhead.
DEFINE_double(benchmark_random_interleaving_max_overhead, 0.4);

// If set, enable random interleaving. See
// http://github.com/google/benchmark/issues/1051 for details.
DEFINE_bool(benchmark_enable_random_interleaving, false);

// Report the result of each benchmark repetitions. When 'true' is specified
// only the mean, standard deviation, and other statistics are reported for
@@ -122,6 +159,30 @@ DEFINE_kvpairs(benchmark_context, {});

std::map<std::string, std::string>* global_context = nullptr;

// Performance measurements always come with random variances. Defines a
// factor by which the required number of iterations is overestimated in order
// to reduce the probability that the minimum time requirement will not be met.
const double kSafetyMultiplier = 1.4;

// Wraps --benchmark_min_time and returns valid default values if not supplied.
double GetMinTime() {
const double default_min_time = kDefaultMinTimeTotalSecs / GetRepetitions();
const double flag_min_time =
do_not_read_flag_directly::FLAGS_benchmark_min_time;
return flag_min_time >= 0.0 ? flag_min_time : default_min_time;
}

// Wraps --benchmark_repetitions and returns a valid default value if not
// supplied.
int GetRepetitions() {
const int default_repetitions =
FLAGS_benchmark_enable_random_interleaving
? kRandomInterleavingDefaultRepetitions
: 1;
const int flag_repetitions =
do_not_read_flag_directly::FLAGS_benchmark_repetitions;
return flag_repetitions >= 0 ? flag_repetitions : default_repetitions;
}

// FIXME: wouldn't LTO mess this up?
void UseCharPointer(char const volatile*) {}
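
The `GetMinTime()` and `GetRepetitions()` wrappers in this hunk use a negative sentinel to tell "flag not supplied" apart from an explicit value. A standalone sketch of that pattern follows. `Flags`, `ResolveRepetitions`, and `ResolveMinTime` are hypothetical names introduced for illustration; the two constants mirror the ones in this diff. Unlike the diff, the sketch asserts that the repetition count is positive before dividing.

```cpp
#include <cassert>

// Hypothetical stand-in for the flag storage; negative values mean "unset".
struct Flags {
  double min_time = -1.0;
  int repetitions = -1;
  bool random_interleaving = false;
};

constexpr double kDefaultMinTimeTotalSecs = 0.5;
constexpr int kRandomInterleavingDefaultRepetitions = 12;

// Returns the explicit repetition count, or a default that depends on whether
// random interleaving is enabled.
int ResolveRepetitions(const Flags& flags) {
  const int default_repetitions =
      flags.random_interleaving ? kRandomInterleavingDefaultRepetitions : 1;
  return flags.repetitions >= 0 ? flags.repetitions : default_repetitions;
}

// Returns the explicit minimum time, or the total default budget split evenly
// across repetitions.
double ResolveMinTime(const Flags& flags) {
  if (flags.min_time >= 0.0) return flags.min_time;
  const int repetitions = ResolveRepetitions(flags);
  assert(repetitions > 0);  // Assumption of this sketch, not checked in the diff.
  return kDefaultMinTimeTotalSecs / repetitions;
}
```

With everything unset, the sketch resolves to 1 repetition of 0.5 seconds. Enabling interleaving alone resolves to 12 repetitions of 0.5 / 12 seconds each, so the total default budget stays the same.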

@@ -241,23 +302,57 @@ void State::FinishKeepRunning() {
namespace internal {
namespace {

// Flushes streams after invoking reporter methods that write to them. This
// ensures users get timely updates even when streams are not line-buffered.
void FlushStreams(BenchmarkReporter* reporter) {
if (!reporter) return;
std::flush(reporter->GetOutputStream());
std::flush(reporter->GetErrorStream());
};

// Reports in both display and file reporters.
void Report(BenchmarkReporter* display_reporter,
BenchmarkReporter* file_reporter, const RunResults& run_results) {
auto report_one = [](BenchmarkReporter* reporter,
bool aggregates_only,
const RunResults& results) {
assert(reporter);
// If there are no aggregates, do output non-aggregates.
aggregates_only &= !results.aggregates_only.empty();
if (!aggregates_only)
reporter->ReportRuns(results.non_aggregates);
if (!results.aggregates_only.empty())
reporter->ReportRuns(results.aggregates_only);
};

report_one(display_reporter, run_results.display_report_aggregates_only,
run_results);
if (file_reporter)
report_one(file_reporter, run_results.file_report_aggregates_only,
run_results);

FlushStreams(display_reporter);
FlushStreams(file_reporter);
};

void RunBenchmarks(const std::vector<BenchmarkInstance>& benchmarks,
BenchmarkReporter* display_reporter,
BenchmarkReporter* file_reporter) {
// Note the file_reporter can be null.
CHECK(display_reporter != nullptr);

// Determine the width of the name field using a minimum width of 10.
bool might_have_aggregates = FLAGS_benchmark_repetitions > 1;
bool might_have_aggregates = GetRepetitions() > 1;
size_t name_field_width = 10;
size_t stat_field_width = 0;
for (const BenchmarkInstance& benchmark : benchmarks) {
name_field_width =
std::max<size_t>(name_field_width, benchmark.name().str().size());
might_have_aggregates |= benchmark.repetitions() > 1;

for (const auto& Stat : benchmark.statistics())
for (const auto& Stat : benchmark.statistics()) {
stat_field_width = std::max<size_t>(stat_field_width, Stat.name_.size());
}
}
if (might_have_aggregates) name_field_width += 1 + stat_field_width;

@@ -268,45 +363,61 @@ void RunBenchmarks(const std::vector<BenchmarkInstance>& benchmarks,
// Keep track of running times of all instances of current benchmark
std::vector<BenchmarkReporter::Run> complexity_reports;

// We flush streams after invoking reporter methods that write to them. This
// ensures users get timely updates even when streams are not line-buffered.
auto flushStreams = [](BenchmarkReporter* reporter) {
if (!reporter) return;
std::flush(reporter->GetOutputStream());
std::flush(reporter->GetErrorStream());
};

if (display_reporter->ReportContext(context) &&
(!file_reporter || file_reporter->ReportContext(context))) {
flushStreams(display_reporter);
flushStreams(file_reporter);

for (const auto& benchmark : benchmarks) {
RunResults run_results = RunBenchmark(benchmark, &complexity_reports);

auto report = [&run_results](BenchmarkReporter* reporter,
bool report_aggregates_only) {
assert(reporter);
// If there are no aggregates, do output non-aggregates.
report_aggregates_only &= !run_results.aggregates_only.empty();
if (!report_aggregates_only)
reporter->ReportRuns(run_results.non_aggregates);
if (!run_results.aggregates_only.empty())
reporter->ReportRuns(run_results.aggregates_only);
};

report(display_reporter, run_results.display_report_aggregates_only);
if (file_reporter)
report(file_reporter, run_results.file_report_aggregates_only);

flushStreams(display_reporter);
flushStreams(file_reporter);
FlushStreams(display_reporter);
FlushStreams(file_reporter);

// Without random interleaving, benchmarks are executed in the order of:
// A, A, ..., A, B, B, ..., B, C, C, ..., C, ...
// That is, repetition is within RunBenchmark(), hence the name
// inner_repetitions.
// With random interleaving, benchmarks are executed in the order of:
// {Random order of A, B, C, ...}, {Random order of A, B, C, ...}, ...
// That is, repetitions is outside of RunBenchmark(), hence the name
// outer_repetitions.
int inner_repetitions =
FLAGS_benchmark_enable_random_interleaving ? 1 : GetRepetitions();
int outer_repetitions =
FLAGS_benchmark_enable_random_interleaving ? GetRepetitions() : 1;
std::vector<size_t> benchmark_indices(benchmarks.size());
for (size_t i = 0; i < benchmarks.size(); ++i) {
benchmark_indices[i] = i;
}

std::random_device rd;
std::mt19937 g(rd());
// 'run_results_vector' and 'benchmarks' are parallel arrays.
std::vector<RunResults> run_results_vector(benchmarks.size());
for (int i = 0; i < outer_repetitions; i++) {
if (FLAGS_benchmark_enable_random_interleaving) {
std::shuffle(benchmark_indices.begin(), benchmark_indices.end(), g);
}
for (size_t j : benchmark_indices) {
// Repetitions will be automatically adjusted under random interleaving.
if (!FLAGS_benchmark_enable_random_interleaving ||
i < benchmarks[j].RandomInterleavingRepetitions()) {
RunBenchmark(benchmarks[j], outer_repetitions, inner_repetitions,
&complexity_reports, &run_results_vector[j]);
if (!FLAGS_benchmark_enable_random_interleaving) {
// Print out reports as they come in.
Report(display_reporter, file_reporter, run_results_vector.at(j));
}
}
}
}

if (FLAGS_benchmark_enable_random_interleaving) {
// Print out all reports at the end of the test.
for (const RunResults& run_results : run_results_vector) {
Report(display_reporter, file_reporter, run_results);
}
}
}
display_reporter->Finalize();
if (file_reporter) file_reporter->Finalize();
flushStreams(display_reporter);
flushStreams(file_reporter);
FlushStreams(display_reporter);
FlushStreams(file_reporter);
}

// Disable deprecated warnings temporarily because we need to reference
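
The execution order described in the `RunBenchmarks` comments in this hunk can be reproduced in isolation. The sketch below is an illustration under assumptions, not the library's code: `BuildSchedule` is a hypothetical function that returns benchmark indices in run order, it flattens the inner repetitions that the real code performs inside `RunBenchmark()`, and it ignores the per-benchmark `RandomInterleavingRepetitions()` limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: returns the order in which benchmark indices would run.
std::vector<std::size_t> BuildSchedule(std::size_t num_benchmarks,
                                       int repetitions, bool interleave,
                                       std::mt19937& rng) {
  assert(repetitions >= 0);
  std::vector<std::size_t> indices(num_benchmarks);
  std::iota(indices.begin(), indices.end(), std::size_t{0});

  std::vector<std::size_t> schedule;
  schedule.reserve(num_benchmarks * static_cast<std::size_t>(repetitions));

  if (!interleave) {
    // A, A, ..., A, B, B, ..., B, ...
    for (std::size_t index : indices) {
      for (int r = 0; r < repetitions; ++r) schedule.push_back(index);
    }
    return schedule;
  }

  // {random order of A, B, C, ...}, {random order of A, B, C, ...}, ...
  for (int r = 0; r < repetitions; ++r) {
    std::shuffle(indices.begin(), indices.end(), rng);
    schedule.insert(schedule.end(), indices.begin(), indices.end());
  }
  return schedule;
}
```

Each interleaved round is a full permutation of the benchmarks, so every benchmark still runs once per repetition; only its position within the round changes.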
@@ -456,6 +567,7 @@ void PrintUsageAndExit() {
" [--benchmark_filter=<regex>]\n"
" [--benchmark_min_time=<min_time>]\n"
" [--benchmark_repetitions=<num_repetitions>]\n"
" [--benchmark_enable_random_interleaving={true|false}]\n"
" [--benchmark_report_aggregates_only={true|false}]\n"
" [--benchmark_display_aggregates_only={true|false}]\n"
" [--benchmark_format=<console|json|csv>]\n"
@@ -476,10 +588,16 @@ void ParseCommandLineFlags(int* argc, char** argv) {
if (ParseBoolFlag(argv[i], "benchmark_list_tests",
&FLAGS_benchmark_list_tests) ||
ParseStringFlag(argv[i], "benchmark_filter", &FLAGS_benchmark_filter) ||
ParseDoubleFlag(argv[i], "benchmark_min_time",
&FLAGS_benchmark_min_time) ||
ParseInt32Flag(argv[i], "benchmark_repetitions",
&FLAGS_benchmark_repetitions) ||
ParseDoubleFlag(
argv[i], "benchmark_min_time",
&do_not_read_flag_directly::FLAGS_benchmark_min_time) ||
ParseInt32Flag(
argv[i], "benchmark_repetitions",
&do_not_read_flag_directly::FLAGS_benchmark_repetitions) ||
ParseBoolFlag(argv[i], "benchmark_enable_random_interleaving",
&FLAGS_benchmark_enable_random_interleaving) ||
ParseDoubleFlag(argv[i], "benchmark_random_interleaving_max_overhead",
&FLAGS_benchmark_random_interleaving_max_overhead) ||
ParseBoolFlag(argv[i], "benchmark_report_aggregates_only",
&FLAGS_benchmark_report_aggregates_only) ||
ParseBoolFlag(argv[i], "benchmark_display_aggregates_only",