Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark parameters II (API improvements) #63

Merged
merged 23 commits into from
Sep 1, 2016

Conversation

arximboldi
Copy link
Contributor

This pull request tries to improve the parameters API. It is a continuation of #56. Following up on the discussion on that pull-request, I try here to make the parameters API more terse and robust. The last commit includes more details.

This branch also has a commit addressing #10. It could be extracted, but I just happened to do it in this branch.

The user may declare parameters by doing, at the top level:
```cpp
    NONIUS_PARAM(name, <default_value>);
```

This declares a parameter.  Parameters can be then passed to the `main`
program by passing arguments of the form:
```
    -p <name>:<value>
```

The parameters have to be accessed from the chronometer.  Since
parameters can have any type and are string-addressed, they can only be
finally parsed or unboxed and looked up in the benchmark.  To stop users
from mistakenly adding that cost to their tests, they are forced to
extract the parameters before, from the chronometer.  Of course they can
cheat by capturing the chronometer, but the documentation should guide
users in the right direction.  A good example:
```
NONIUS_BENCHMARK("push_back vector", [](nonius::chronometer meter)
{
    auto n = meter.param("size");
    auto storage = std::vector<std::vector<char>>(meter.runs());
    meter.measure([&](int i) {
        auto& x = storage[i];
        for (auto j = 0; j < n; ++j)
            x.push_back(static_cast<char>(j));
    });
})
```
The parameter name is now required to be a string already.  There is no
fundamental reason why parameter names should be limited to identifier
names rules.  The fact that `NONIUS_DETAIL_UNIQUE_NAME` already exists
is convenient here.

This changes syntax for the user from:
```cpp
    NONIUS_PARAM(name, 42);
```

To:
```cpp
    NONIUS_PARAM("name", 42)
```

Note the semicolon is not needed anymore either.
The syntax is:
```
    -p <name>:<oper>:<init>:<delta>:<steps>
```

This tells that the benchmark should be run with `<steps>` different
values for parameter `<name>`.  Each value is computed as:
```
    next = last <oper> deta
```

Where `<oper>` may be `+` or `*` as for testing with linear or
exponential increments (the later is useful when checking logarithmic
algorithms or subtle non-linearities in the "constant factors" of an
implementation).
The old code was convoluted, with too many assertions and implicit
assumptions.  Instead of generating all the samples that we want to try
in the argument parsing, we delay this logic to a more appropriate place
later, without loss of information.

The only drawback of the new configuration representation is that it may
be less flexible, as the previous one allowed to easily add passing
explicitly all the various parmeters to try.  We can, though, easily add
this later by adding more options to the configuration.
The functions `go_benchmark` and `go_param` where extracted, since the
function had grown too big to my taste after the parameter running logic
had been introduced.

Also, the main loop has been simplified to avoid the need to `continue`,
which is more structured IMHO.
We always store parameters generically as `std::string`.  The user
implicitly suggests a concrete value type with the type of the default
value of the parameter.  We use that type when we need to do arithmetic
on the parameter and also when checking that it parses correctly when
passed from the command line.  We do this by casting the strings back to
the original `T` using the `param_spec` interface.

This might seem overkill.  Another approach could be to type erase the
concrete values once intially parsed from the command line, using a
`boost::any`-like type erasure class.  However that approach is less
flexible, because it requires the user to provide the specific concrete
type when calling `chronometer::param<T>()` and just a mistake in
signedness or whatever could be fatal.  Just parsing everything into and
back from strings is more forgiving on numeric types, which is the most
common case.

This commit adds a new `example7` that where using ranged parameter
would not work correctly previously.
In the last commit, we broke parameters of types that do not support `+`
and `*` operators.  Some of these parameters might be useful, for
example `std::string`.

Now, parameters may or may not support these operations.  When they do
not support, the benchmarks will fail when we try to run the parameter
through a range with those operations.

Funny, we can do:
```
   ./bin/examples/example6 -p string:+:hola-mundo:-yeah:10
```

But we can't:
```
   ./bin/examples/example6 -p string:*:hola-mundo:-oh,no!:10
```
This commit refactors a lot of the code added in this branch. The
original goal was to reverse how `go()` runs through various
parameter values.  Before it did like:

    for each benchmark
        for each parameter
            run-benchmark()

Now it is more like

    for each parameter
        for each benchmark
            run-benchmark()

This new structure should make writing reporters easier, since it better
fits how the data could be presented to the user in meaningful ways --
we'll see more of this in later commits.

A couple of more refactorings has sneaked in this commit:

- `param_map` and `param_registry` etc. are not in `detail::` anymore.
  This is to better fit other parts of the library (e.g. benchmark,
  reporter) that are not in `detail` and arguably at the same level of
  abstraction from the user API POV.

- A couple of functions now have a more "data driven" approach, trying
  to reuse previously computed data more often, as to improve the
  overall performance.
The old API for benchmarks that have fixtures is like this:
```cpp
NONIUS_BENCHMARK([] (nonius::chronometer meter){
  // setup code
  m.measure([&] {
      // benchmark code
  })
});
```

That API is nice, but since it has to assume that the benchmark code can
always mutate the setup, the setup can not be reused across samples.
This is ok in general, but for benchmarks that have expensive setups
relative to their measured time, time might be wasted uneedesly.

Our alternative syntax is like:
```cpp
NONIUS_BENCHMARK([] (nonius::parameters meter){
  // immutable setup code
  return [=] {
      // benchmark code
  }
});
```

Or even further:
```cpp
NONIUS_BENCHMARK([] (nonius::parameters params){
  // immutable setup code
  return [=] (nonius::chronometer meter) {
      // mutable setup code
      meter.measure([&] {
          // benchmark code
      })
  }
});
```

The idea is that when the benchmark function can take a `parameters`
argument, it is assumed to be a factory that then returns the actual
benchmark function.  This factory function is called *only once* for a
given set of parameters (in `prepare(...)`) and the resulting benchmark
function is kept in the `execution_plan` and reused for collecting all
the samples.

Furthermore, this syntax may be more intuitive for people that need
parameters, but do not need a chronometer.
When trying to figure out how long to run a test to get a meaningful
value, we may fall into an infinite loop when the whole benchmark has
been optimized away.  We now try to detect this case by hard-limiting
the number of iterations we do in the loop.

This partly solves libnonius#52.  A complete solution involves properly
deoptimizing the return value of the benchmark function.
The exception message would get mixed together with other lines of
messages creating confusion.
Now all the logic of deciding which parameters to execute is contained
in `generate_params`.  We removed the "feature" that `parameters` looks
up default values.  This way the code we remove dependencies on global
state, and make the feature more testable.  `generate_params` now merges
the default values with the fixed values with the values generated for a
parameter run.
The command line parser would only look at the first instance of an
argument.  This would prevent us from passing multiple parameters to the
benchmark runner. Examples that did not work but now do work:
```
    # Would only set string=foo, discarding size
    example6 -p string:foo -p size:42
```
```
    # Would only run size:42,52,..., discarding string
    example6 -p size:+:42:10:10 -p string:foo
```
The HTML reporter has been changed to generate multiple plots instead of
just one.  The plots may be choosed with a drop-down menu at the top.
The drop down menu is always focused such that the user may use the
keyboard arrows to quickly navigate the plots.

These are the plots that are generated:

**summary plot:**

  - If the benchmarks are run with only one set of parameters, it is a bar
    graph showing the mean and stddev of each benchmark.

  - If the benchmarks are run with against various parameters, it is a
    line graph showing the evolution of the benchmark results for the
    different parameters.  The graph is shown in logarithmic scale if
    when the parameters are generated using `*`.

**samples plot:**

   - For every set of parameters, a plot is generated depicting every
     sample taken for every benchmark.  This plot is like the one that
     was generated before this change.
@rmartinho
Copy link
Collaborator

The build server cannot build this PR because the author is not whitelisted. A meatbag will have to look at it.

This is an automated message.

@arximboldi arximboldi force-pushed the moar-params branch 2 times, most recently from 0261a8c to 4023d3e Compare August 11, 2016 08:21
@griwes
Copy link

griwes commented Aug 11, 2016

I like this. 👍

@ThePhD
Copy link
Contributor

ThePhD commented Aug 11, 2016

🎉 Cheers!

@rmartinho
Copy link
Collaborator

This looks nice. I'll probably only merge all this next month when I'm back from holiday, once I write docs for it: #64.

@rmartinho rmartinho mentioned this pull request Aug 11, 2016
@rmartinho rmartinho self-assigned this Aug 15, 2016
@rmartinho
Copy link
Collaborator

Can you move the commit that addresses #10 out? There's some discussion to be had on that and it'd be better if it was done in a separate PR.

Before, parameters were store as strings.  No type information about the
parameter was stored.  The user had to specify the type both at the
declaration site and at the usage site.  The conversion rules where
thouse of parsing with `>>` and `<<`, but some operations (e.g. `+` or
`*`) where applied in the initially declared type.

That was messy.

Instead, now declarations define a special type tag that encapsulates
name and concrete type.  Parameter values are parsed only once and then
carried around in a COW-enabled type-erased wrapper.  When the user
queries a parameter, they just pass the tag, and get a value casted in
the right type already.

Sehr gut.
@arximboldi
Copy link
Contributor Author

Done!

@rmartinho rmartinho merged commit f681566 into libnonius:devel Sep 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants