[FR] Add the ability to dry-run benchmarks #1827

Closed

ldionne opened this issue Jul 31, 2024 · 12 comments
@ldionne
Contributor

ldionne commented Jul 31, 2024

Is your feature request related to a problem? Please describe.
As part of the libc++ test suite, we use GoogleBenchmark to write micro-benchmarks for various parts of the library. Running those benchmarks takes a lot of time, so we don't run them by default. However, this has the downside that benchmarks can easily rot and become invalid.

Describe the solution you'd like
We would like the ability to "dry-run" benchmarks. Basically, we'd like an argument that can be passed to a benchmark that forces it to do only one iteration and then return. This would allow us to build and dry-run the benchmarks in our normal testing configuration, and to run the full benchmarks when we actually want to.

There are a few different ways this could be achieved depending on what the maintainers think is best:

  • Add a new --benchmark_dry_run=True argument (or similar, but basically a boolean that represents a dry-run)
  • Add the ability to control the maximum number of iterations of a benchmark like --benchmark_max_iterations=N
  • Add the ability to bound the time taken for each benchmark, like --benchmark_max_time=TIME

Any of these APIs would solve our problem.
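For illustration, here is a minimal sketch of the kind of micro-benchmark this request is about, together with hypothetical invocations for each of the three proposals. None of these flags existed at the time of the request; the flag names below are taken from the proposals above and are assumptions, not existing options.

```c++
// Minimal Google Benchmark micro-benchmark (sketch only).
#include <benchmark/benchmark.h>

#include <vector>

static void BM_VectorPushBack(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> v;
    v.push_back(42);
    benchmark::DoNotOptimize(v.data());  // keep the work from being optimized away
  }
}
BENCHMARK(BM_VectorPushBack);

BENCHMARK_MAIN();

// Hypothetical invocations corresponding to the three proposals above
// (these flags are the *proposed* API, not something the library provides):
//   ./bm_vector --benchmark_dry_run=true       // proposal 1: boolean dry-run
//   ./bm_vector --benchmark_max_iterations=1   // proposal 2: iteration cap
//   ./bm_vector --benchmark_max_time=0.01s     // proposal 3: per-benchmark time cap
```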

@dmah42
Member

dmah42 commented Jul 31, 2024

I think the first is what I'd prefer.

I'd like to understand a bit more about how they rot, though: they still get compiled, right? So at least the APIs being called in the library under test get updated as necessary. Do you assert on anything in the benchmarks such that behaviour changes cause those assertions to fail?

@ldionne
Contributor Author

ldionne commented Jul 31, 2024

Yes, they get compiled, but we don't get any code coverage out of them beyond compilation.

Also, with the changes I am currently making to move the benchmarks out of our CMake setup and into our test suite (so that e.g. other implementations can also benefit from those benchmarks), they wouldn't get compiled at all anymore unless I jump through a lot of hoops to make that happen.

@LebedevRI
Collaborator

LebedevRI commented Jul 31, 2024

Why isn't --benchmark_min_time=1x enough? That is what I do, and I agree that the problem is real.

That being said, the effect of the proposed flag would be to fully override the per-BENCHMARK
MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which normally take priority.
So I guess this may make sense.
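For context, a minimal sketch (not from this thread) of the per-BENCHMARK overrides being referred to; the benchmark name and the specific values are illustrative only:

```c++
#include <benchmark/benchmark.h>

static void BM_Override(benchmark::State& state) {
  for (auto _ : state) {
    int x = 0;
    benchmark::DoNotOptimize(x);
  }
}

// Per-benchmark settings like these normally take priority over command-line
// flags such as --benchmark_min_time, which is why a dry-run flag would have
// to override them as well.
BENCHMARK(BM_Override)
    ->MinTime(2.0)        // measure for at least 2 seconds per repetition
    ->MinWarmUpTime(0.5)  // warm up for at least 0.5 seconds first
    ->Repetitions(3);     // repeat the whole measurement 3 times

// Alternatively, Iterations() pins the iteration count outright (it cannot be
// combined with MinTime() on the same benchmark):
//   BENCHMARK(BM_Override)->Iterations(1000)->Repetitions(3);

BENCHMARK_MAIN();
```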

@ldionne
Contributor Author

ldionne commented Jul 31, 2024

@LebedevRI My understanding is that benchmark_min_time controls the minimum amount of time that the benchmark will run for, not the maximum amount of time. Is there something I missed about the behavior of benchmark_min_time that would solve my problem?

@LebedevRI
Collaborator

--benchmark_min_time=1x would result in a single iteration being run. But it would not affect the repetition count, or the per-BENCHMARK() MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which take priority over command-line flags.

ldionne added a commit to ldionne/llvm-project that referenced this issue Jul 31, 2024
Instead of building the benchmarks separately via CMake and running them
separately from the test suite, this patch merges the benchmarks into
the test suite and handles both uniformly.

As a result:
- It is now possible to run individual benchmarks like we run tests
  (e.g. using libcxx-lit), which is a huge quality-of-life improvement.

- The benchmarks will be run under exactly the same configuration as
  the rest of the tests, which is a nice simplification. This does
  mean that one has to be careful to enable the desired optimization
  flags when running benchmarks, but that is easy with e.g.
  `libcxx-lit <...> --param optimization=speed`.

- Benchmarks can use the same annotations as the rest of the test
  suite, such as `// UNSUPPORTED` & friends.

When running the tests via `check-cxx`, we only compile the benchmarks
because running them would be too time consuming. This introduces a bit
of complexity in the testing setup, and instead it would be better to
allow passing a --dry-run flag to GoogleBenchmark executables, which is
the topic of google/benchmark#1827.

I am not really satisfied with the layering violation of adding the
%{benchmark_flags} substitution to cmake-bridge, however I believe
this can be improved in the future.
ldionne added a commit to ldionne/llvm-project that referenced this issue Jul 31, 2024
@ldionne
Contributor Author

ldionne commented Jul 31, 2024

--benchmark_min_time=1x would result in a single iteration being run. But it would not affect the repetition count, or the per-BENCHMARK() MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which take priority over command-line flags.

Interesting! I tested this out, and while it does run much faster, it doesn't seem to force exactly one iteration: I was seeing several benchmarks run for more than one iteration even though they don't override the iteration count. Is that possible?

@LebedevRI
Collaborator

  • Does the bundled benchmark version support said syntax? (see ./some_benchmark --help)
  • Can you show the console output of such a benchmark run (including the run line)?
  • Can you point to the file with said benchmark?

ldionne added a commit to ldionne/llvm-project that referenced this issue Aug 2, 2024
@ldionne
Contributor Author

ldionne commented Aug 2, 2024

I think you were right -- it actually works for most benchmarks. However, it seems like it doesn't work for benchmarks that use state.KeepRunningBatch(...) instead of state.KeepRunning().

Do you know if there's a way for even those benchmarks to honor benchmark_min_time=1x? Or should we try to rewrite them to avoid using KeepRunningBatch (although I think that was the better way of writing those benchmarks, last I checked)?

@LebedevRI
Collaborator

I think you were right -- it actually works for most benchmarks. However, it seems like it doesn't work for benchmarks that use state.KeepRunningBatch(...) instead of state.KeepRunning().

Oh that makes sense :(

Do you know if there's a way for even those benchmarks to honor benchmark_min_time=1x? Or should we try to rewrite them to avoid using KeepRunningBatch (although I think that was the better way of writing those benchmarks, last I checked)?

Looks like the docs don't even mention KeepRunningBatch, but they say this:
https://github.com/google/benchmark/blob/ef73a30083ccd4eb1ad6e67a68b23163bf195561/docs/user_guide.md#a-faster-keep-running-loop

I think KeepRunningBatch() is meant for nano-benchmarks; are you sure you can't just use the conventional for (auto _ : state) { /*???*/ } loop syntax?
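For reference, a minimal sketch contrasting the two loop styles under discussion (the benchmark names and batch size are illustrative, not taken from libc++). The batched loop accounts for a whole batch of iterations per check, which is consistent with the observation above that --benchmark_min_time=1x does not bring such benchmarks down to a single iteration:

```c++
#include <benchmark/benchmark.h>

#include <vector>

// Conventional loop: the iteration budget is checked on every pass, so a
// single-iteration budget (e.g. --benchmark_min_time=1x) runs the body once.
static void BM_RangeFor(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> v(64);
    benchmark::DoNotOptimize(v.data());
  }
}
BENCHMARK(BM_RangeFor);

// Batched loop: each KeepRunningBatch() call accounts for kBatchSize
// iterations at once, so the effective iteration count gets rounded up to a
// multiple of the batch size even when only one iteration was requested.
static void BM_Batched(benchmark::State& state) {
  constexpr benchmark::IterationCount kBatchSize = 1000;
  while (state.KeepRunningBatch(kBatchSize)) {
    for (benchmark::IterationCount i = 0; i != kBatchSize; ++i) {
      std::vector<int> v(64);
      benchmark::DoNotOptimize(v.data());
    }
  }
}
BENCHMARK(BM_Batched);

BENCHMARK_MAIN();
```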

@Shaan-Mistry
Contributor

Could you assign this issue to me? I would like to contribute to this. Thank you!

ldionne added a commit to ldionne/llvm-project that referenced this issue Oct 23, 2024
ldionne added a commit to ldionne/llvm-project that referenced this issue Oct 23, 2024
@ldionne
Contributor Author

ldionne commented Oct 31, 2024

@dmah42 I think this can be closed, since this was implemented in #1851?

@ldionne
Contributor Author

ldionne commented Oct 31, 2024

(Thank you @Shaan-Mistry by the way)

dmah42 closed this as completed on Nov 1, 2024
ldionne added a commit to llvm/llvm-project that referenced this issue Nov 7, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024