Benchmark parameters II (API improvements) #63

arximboldi · 2016-08-10T16:12:36Z

This pull request tries to improve the parameters API. It is a continuation of #56. Following up on the discussion on that pull-request, I try here to make the parameters API more terse and robust. The last commit includes more details.

This branch also has a commit addressing #10. It could be extracted, but I just happened to do it in this branch.

The user may declare parameters by doing, at the top level: ```cpp NONIUS_PARAM(name, <default_value>); ``` This declares a parameter. Parameters can be then passed to the `main` program by passing arguments of the form: ``` -p <name>:<value> ``` The parameters have to be accessed from the chronometer. Since parameters can have any type and are string-addressed, they can only be finally parsed or unboxed and looked up in the benchmark. To stop users from mistakenly adding that cost to their tests, they are forced to extract the parameters before, from the chronometer. Of course they can cheat by capturing the chronometer, but the documentation should guide users in the right direction. A good example: ``` NONIUS_BENCHMARK("push_back vector", [](nonius::chronometer meter) { auto n = meter.param("size"); auto storage = std::vector<std::vector<char>>(meter.runs()); meter.measure([&](int i) { auto& x = storage[i]; for (auto j = 0; j < n; ++j) x.push_back(static_cast<char>(j)); }); }) ```

The parameter name is now required to be a string already. There is no fundamental reason why parameter names should be limited to identifier names rules. The fact that `NONIUS_DETAIL_UNIQUE_NAME` already exists is convenient here. This changes syntax for the user from: ```cpp NONIUS_PARAM(name, 42); ``` To: ```cpp NONIUS_PARAM("name", 42) ``` Note the semicolon is not needed anymore either.

The syntax is: ``` -p <name>:<oper>:<init>:<delta>:<steps> ``` This tells that the benchmark should be run with `<steps>` different values for parameter `<name>`. Each value is computed as: ``` next = last <oper> deta ``` Where `<oper>` may be `+` or `*` as for testing with linear or exponential increments (the later is useful when checking logarithmic algorithms or subtle non-linearities in the "constant factors" of an implementation).

The old code was convoluted, with too many assertions and implicit assumptions. Instead of generating all the samples that we want to try in the argument parsing, we delay this logic to a more appropriate place later, without loss of information. The only drawback of the new configuration representation is that it may be less flexible, as the previous one allowed to easily add passing explicitly all the various parmeters to try. We can, though, easily add this later by adding more options to the configuration.

The functions `go_benchmark` and `go_param` where extracted, since the function had grown too big to my taste after the parameter running logic had been introduced. Also, the main loop has been simplified to avoid the need to `continue`, which is more structured IMHO.

We always store parameters generically as `std::string`. The user implicitly suggests a concrete value type with the type of the default value of the parameter. We use that type when we need to do arithmetic on the parameter and also when checking that it parses correctly when passed from the command line. We do this by casting the strings back to the original `T` using the `param_spec` interface. This might seem overkill. Another approach could be to type erase the concrete values once intially parsed from the command line, using a `boost::any`-like type erasure class. However that approach is less flexible, because it requires the user to provide the specific concrete type when calling `chronometer::param<T>()` and just a mistake in signedness or whatever could be fatal. Just parsing everything into and back from strings is more forgiving on numeric types, which is the most common case. This commit adds a new `example7` that where using ranged parameter would not work correctly previously.

In the last commit, we broke parameters of types that do not support `+` and `*` operators. Some of these parameters might be useful, for example `std::string`. Now, parameters may or may not support these operations. When they do not support, the benchmarks will fail when we try to run the parameter through a range with those operations. Funny, we can do: ``` ./bin/examples/example6 -p string:+:hola-mundo:-yeah:10 ``` But we can't: ``` ./bin/examples/example6 -p string:*:hola-mundo:-oh,no!:10 ```

This commit refactors a lot of the code added in this branch. The original goal was to reverse how `go()` runs through various parameter values. Before it did like: for each benchmark for each parameter run-benchmark() Now it is more like for each parameter for each benchmark run-benchmark() This new structure should make writing reporters easier, since it better fits how the data could be presented to the user in meaningful ways -- we'll see more of this in later commits. A couple of more refactorings has sneaked in this commit: - `param_map` and `param_registry` etc. are not in `detail::` anymore. This is to better fit other parts of the library (e.g. benchmark, reporter) that are not in `detail` and arguably at the same level of abstraction from the user API POV. - A couple of functions now have a more "data driven" approach, trying to reuse previously computed data more often, as to improve the overall performance.

The old API for benchmarks that have fixtures is like this: ```cpp NONIUS_BENCHMARK([] (nonius::chronometer meter){ // setup code m.measure([&] { // benchmark code }) }); ``` That API is nice, but since it has to assume that the benchmark code can always mutate the setup, the setup can not be reused across samples. This is ok in general, but for benchmarks that have expensive setups relative to their measured time, time might be wasted uneedesly. Our alternative syntax is like: ```cpp NONIUS_BENCHMARK([] (nonius::parameters meter){ // immutable setup code return [=] { // benchmark code } }); ``` Or even further: ```cpp NONIUS_BENCHMARK([] (nonius::parameters params){ // immutable setup code return [=] (nonius::chronometer meter) { // mutable setup code meter.measure([&] { // benchmark code }) } }); ``` The idea is that when the benchmark function can take a `parameters` argument, it is assumed to be a factory that then returns the actual benchmark function. This factory function is called *only once* for a given set of parameters (in `prepare(...)`) and the resulting benchmark function is kept in the `execution_plan` and reused for collecting all the samples. Furthermore, this syntax may be more intuitive for people that need parameters, but do not need a chronometer.

When trying to figure out how long to run a test to get a meaningful value, we may fall into an infinite loop when the whole benchmark has been optimized away. We now try to detect this case by hard-limiting the number of iterations we do in the loop. This partly solves libnonius#52. A complete solution involves properly deoptimizing the return value of the benchmark function.

The exception message would get mixed together with other lines of messages creating confusion.

Now all the logic of deciding which parameters to execute is contained in `generate_params`. We removed the "feature" that `parameters` looks up default values. This way the code we remove dependencies on global state, and make the feature more testable. `generate_params` now merges the default values with the fixed values with the values generated for a parameter run.

The command line parser would only look at the first instance of an argument. This would prevent us from passing multiple parameters to the benchmark runner. Examples that did not work but now do work: ``` # Would only set string=foo, discarding size example6 -p string:foo -p size:42 ``` ``` # Would only run size:42,52,..., discarding string example6 -p size:+:42:10:10 -p string:foo ```

The HTML reporter has been changed to generate multiple plots instead of just one. The plots may be choosed with a drop-down menu at the top. The drop down menu is always focused such that the user may use the keyboard arrows to quickly navigate the plots. These are the plots that are generated: **summary plot:** - If the benchmarks are run with only one set of parameters, it is a bar graph showing the mean and stddev of each benchmark. - If the benchmarks are run with against various parameters, it is a line graph showing the evolution of the benchmark results for the different parameters. The graph is shown in logarithmic scale if when the parameters are generated using `*`. **samples plot:** - For every set of parameters, a plot is generated depicting every sample taken for every benchmark. This plot is like the one that was generated before this change.

rmartinho · 2016-08-10T16:12:43Z

The build server cannot build this PR because the author is not whitelisted. A meatbag will have to look at it.

This is an automated message.

griwes · 2016-08-11T08:27:59Z

I like this. 👍

ThePhD · 2016-08-11T11:44:40Z

🎉 Cheers!

rmartinho · 2016-08-11T16:49:49Z

This looks nice. I'll probably only merge all this next month when I'm back from holiday, once I write docs for it: #64.

rmartinho · 2016-08-16T11:57:08Z

Can you move the commit that addresses #10 out? There's some discussion to be had on that and it'd be better if it was done in a separate PR.

Before, parameters were store as strings. No type information about the parameter was stored. The user had to specify the type both at the declaration site and at the usage site. The conversion rules where thouse of parsing with `>>` and `<<`, but some operations (e.g. `+` or `*`) where applied in the initially declared type. That was messy. Instead, now declarations define a special type tag that encapsulates name and concrete type. Parameter values are parsed only once and then carried around in a COW-enabled type-erased wrapper. When the user queries a parameter, they just pass the tag, and get a value casted in the right type already. Sehr gut.

arximboldi · 2016-08-17T09:06:51Z

Done!

arximboldi added 22 commits August 5, 2016 11:39

Define basic API for declaring and using parameters

376c5a2

Run through a parameter range when the option is enabled

77a4a26

Rename param_map to parameters

ccaaaba

Improve output of standard_reporter on error

cb0f2a4

The exception message would get mixed together with other lines of messages creating confusion.

Test changes in detail::benchmark_function

be91a3e

Test new parameters API

fa1f10f

Test parameter run generation

cacc15b

Fix missing override warning

c5b48e6

Fix large numbers would be truncated incorrectly in HTML reporter

ebb807a

arximboldi force-pushed the moar-params branch 2 times, most recently from 0261a8c to 4023d3e Compare August 11, 2016 08:21

rmartinho added feature interface enhancement labels Aug 11, 2016

rmartinho removed the feature label Aug 11, 2016

rmartinho added this to the Parameterized benchmarking milestone Aug 11, 2016

rmartinho mentioned this pull request Aug 11, 2016

Document parameters #64

Open

rmartinho self-assigned this Aug 15, 2016

arximboldi force-pushed the moar-params branch from 4023d3e to f681566 Compare August 17, 2016 09:05

rmartinho merged commit f681566 into libnonius:devel Sep 1, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmark parameters II (API improvements) #63

Benchmark parameters II (API improvements) #63

arximboldi commented Aug 10, 2016

rmartinho commented Aug 10, 2016

griwes commented Aug 11, 2016

ThePhD commented Aug 11, 2016

rmartinho commented Aug 11, 2016

rmartinho commented Aug 16, 2016

arximboldi commented Aug 17, 2016

Benchmark parameters II (API improvements) #63

Benchmark parameters II (API improvements) #63

Conversation

arximboldi commented Aug 10, 2016

rmartinho commented Aug 10, 2016

griwes commented Aug 11, 2016

ThePhD commented Aug 11, 2016

rmartinho commented Aug 11, 2016

rmartinho commented Aug 16, 2016

arximboldi commented Aug 17, 2016