[FR] Add the ability to dry-run benchmarks #1827

Closed

ldionne opened this issue Jul 31, 2024 · 12 comments
@ldionne
Contributor

ldionne commented Jul 31, 2024

Is your feature request related to a problem? Please describe.
As part of the libc++ test suite, we use GoogleBenchmark to write micro-benchmarks for various parts of the library. Running those benchmarks takes a lot of time, so we don't run them by default. However, this has the downside that benchmarks can easily rot and become invalid.

Describe the solution you'd like
We would like the ability to "dry-run" benchmarks. Basically, we'd like an argument that can be passed to a benchmark that forces it to do only one iteration and then return. This would allow us to build and dry-run the benchmarks in our normal testing configuration, and to run the full benchmarks when we actually want to.

There are a few different ways this could be achieved depending on what the maintainers think is best:

  • Add a new --benchmark_dry_run=True argument (or similar, but basically a boolean that represents a dry-run)
  • Add the ability to control the maximum number of iterations of a benchmark like --benchmark_max_iterations=N
  • Add the ability to bound the time taken for each benchmark, like --benchmark_max_time=TIME

Any of these APIs would solve our problem.
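For illustration, here is a minimal sketch of the kind of micro-benchmark this request is about, together with hypothetical invocations for each of the three proposals. None of these flags existed at the time of the request; the flag names below are taken from the proposals above and are assumptions, not existing options.

```c++
// Minimal Google Benchmark micro-benchmark (sketch only).
#include <benchmark/benchmark.h>

#include <vector>

static void BM_VectorPushBack(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> v;
    v.push_back(42);
    benchmark::DoNotOptimize(v.data());  // keep the work from being optimized away
  }
}
BENCHMARK(BM_VectorPushBack);

BENCHMARK_MAIN();

// Hypothetical invocations corresponding to the three proposals above
// (these flags are the *proposed* API, not something the library provides):
//   ./bm_vector --benchmark_dry_run=true       // proposal 1: boolean dry-run
//   ./bm_vector --benchmark_max_iterations=1   // proposal 2: iteration cap
//   ./bm_vector --benchmark_max_time=0.01s     // proposal 3: per-benchmark time cap
```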

@dmah42
Member

dmah42 commented Jul 31, 2024

I think the first is what I'd prefer.

I'd like to understand a bit more about how they rot, though: they still get compiled, right? So at least the APIs being called in the library under test get updated as necessary. Do you assert on anything in the benchmarks such that behaviour changes cause those assertions to fail?

@ldionne
Contributor Author

ldionne commented Jul 31, 2024

Yes, they get compiled, but we don't get any code coverage out of them beyond compilation.

Also, with the changes I am currently making to move the benchmarks out of our CMake setup and into our test suite (so that e.g. other implementations can also benefit from those benchmarks), they wouldn't get compiled at all anymore unless I jump through a lot of hoops to make that happen.

@LebedevRI
Collaborator

LebedevRI commented Jul 31, 2024

Why isn't --benchmark_min_time=1x enough? That is what I do, and I agree that the problem is real.

That being said, the effect of the proposed flag would be to fully override the per-BENCHMARK
MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which normally take priority.
So I guess this may make sense.
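For context, a minimal sketch (not from this thread) of the per-BENCHMARK overrides being referred to; the benchmark name and the specific values are illustrative only:

```c++
#include <benchmark/benchmark.h>

static void BM_Override(benchmark::State& state) {
  for (auto _ : state) {
    int x = 0;
    benchmark::DoNotOptimize(x);
  }
}

// Per-benchmark settings like these normally take priority over command-line
// flags such as --benchmark_min_time, which is why a dry-run flag would have
// to override them as well.
BENCHMARK(BM_Override)
    ->MinTime(2.0)        // measure for at least 2 seconds per repetition
    ->MinWarmUpTime(0.5)  // warm up for at least 0.5 seconds first
    ->Repetitions(3);     // repeat the whole measurement 3 times

// Alternatively, Iterations() pins the iteration count outright (it cannot be
// combined with MinTime() on the same benchmark):
//   BENCHMARK(BM_Override)->Iterations(1000)->Repetitions(3);

BENCHMARK_MAIN();
```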

@ldionne
Contributor Author

ldionne commented Jul 31, 2024

@LebedevRI My understanding is that benchmark_min_time controls the minimum amount of time that the benchmark will run for, not the maximum amount of time. Is there something I missed about the behavior of benchmark_min_time that would solve my problem?

@LebedevRI
Collaborator

--benchmark_min_time=1x would result in a single iteration being run. But it would not affect the repetition count, or the per-BENCHMARK() MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which take priority over command-line flags.

ldionne added a commit to ldionne/llvm-project that referenced this issue Jul 31, 2024
Instead of building the benchmarks separately via CMake and running them
separately from the test suite, this patch merges the benchmarks into
the test suite and handles both uniformly.

As a result:
- It is now possible to run individual benchmarks like we run tests
  (e.g. using libcxx-lit), which is a huge quality-of-life improvement.

- The benchmarks will be run under exactly the same configuration as
  the rest of the tests, which is a nice simplification. This does
  mean that one has to be careful to enable the desired optimization
  flags when running benchmarks, but that is easy with e.g.
  `libcxx-lit <...> --param optimization=speed`.

- Benchmarks can use the same annotations as the rest of the test
  suite, such as `// UNSUPPORTED` & friends.

When running the tests via `check-cxx`, we only compile the benchmarks
because running them would be too time consuming. This introduces a bit
of complexity in the testing setup, and instead it would be better to
allow passing a --dry-run flag to GoogleBenchmark executables, which is
the topic of google/benchmark#1827.

I am not really satisfied with the layering violation of adding the
%{benchmark_flags} substitution to cmake-bridge, however I believe
this can be improved in the future.
ldionne added a commit to ldionne/llvm-project that referenced this issue Jul 31, 2024
@ldionne
Contributor Author

ldionne commented Jul 31, 2024

--benchmark_min_time=1x would result in a single iteration being run. But it would not affect the repetition count, or the per-BENCHMARK() MinTime()/MinWarmUpTime()/Iterations()/Repetitions() overrides, which take priority over command-line flags.

Interesting! I tested this out, and while it does run much faster, it doesn't seem to force exactly one iteration: I was seeing several benchmarks run for more than one iteration even though they don't override the iteration count. Is that possible?

@LebedevRI
Collaborator

  • Does the bundled benchmark version support said syntax? (see ./some_benchmark --help)
  • Can you show the console output of such a benchmark run (including the run line)?
  • Can you point to the file with said benchmark?

ldionne added a commit to ldionne/llvm-project that referenced this issue Aug 2, 2024
@ldionne
Contributor Author

ldionne commented Aug 2, 2024

I think you were right -- it actually works for most benchmarks. However, it seems like it doesn't work for benchmarks that use state.KeepRunningBatch(...) instead of state.KeepRunning().

Do you know if there's a way for even those benchmarks to honor benchmark_min_time=1x? Or should we try to rewrite them to avoid using KeepRunningBatch (although I think that was the better way of writing those benchmarks, last I checked)?

@LebedevRI
Collaborator

I think you were right -- it actually works for most benchmarks. However, it seems like it doesn't work for benchmarks that use state.KeepRunningBatch(...) instead of state.KeepRunning().

Oh that makes sense :(

Do you know if there's a way for even those benchmarks to honor benchmark_min_time=1x? Or should we try to rewrite them to avoid using KeepRunningBatch (although I think that was the better way of writing those benchmarks, last I checked)?

Looks like the docs don't even mention KeepRunningBatch, but they say this:
https://github.com/google/benchmark/blob/ef73a30083ccd4eb1ad6e67a68b23163bf195561/docs/user_guide.md#a-faster-keep-running-loop

I think KeepRunningBatch() is meant for nano-benchmarks; are you sure you can't just use the conventional for (auto _ : state) { /*???*/ } loop syntax?
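For reference, a minimal sketch contrasting the two loop styles under discussion (the benchmark names and batch size are illustrative, not taken from libc++). The batched loop accounts for a whole batch of iterations per check, which is consistent with the observation above that --benchmark_min_time=1x does not bring such benchmarks down to a single iteration:

```c++
#include <benchmark/benchmark.h>

#include <vector>

// Conventional loop: the iteration budget is checked on every pass, so a
// single-iteration budget (e.g. --benchmark_min_time=1x) runs the body once.
static void BM_RangeFor(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> v(64);
    benchmark::DoNotOptimize(v.data());
  }
}
BENCHMARK(BM_RangeFor);

// Batched loop: each KeepRunningBatch() call accounts for kBatchSize
// iterations at once, so the effective iteration count gets rounded up to a
// multiple of the batch size even when only one iteration was requested.
static void BM_Batched(benchmark::State& state) {
  constexpr benchmark::IterationCount kBatchSize = 1000;
  while (state.KeepRunningBatch(kBatchSize)) {
    for (benchmark::IterationCount i = 0; i != kBatchSize; ++i) {
      std::vector<int> v(64);
      benchmark::DoNotOptimize(v.data());
    }
  }
}
BENCHMARK(BM_Batched);

BENCHMARK_MAIN();
```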

@Shaan-Mistry
Contributor

Could you assign this issue to me? I would like to contribute to this. Thank you!

ldionne added a commit to ldionne/llvm-project that referenced this issue Oct 23, 2024
ldionne added a commit to ldionne/llvm-project that referenced this issue Oct 23, 2024
@ldionne
Contributor Author

ldionne commented Oct 31, 2024

@dmah42 I think this can be closed, since this was implemented in #1851?

@ldionne
Contributor Author

ldionne commented Oct 31, 2024

(Thank you @Shaan-Mistry by the way)

dmah42 closed this as completed on Nov 1, 2024
ldionne added a commit to llvm/llvm-project that referenced this issue Nov 7, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024