
Statistics: iteration count #586

Closed
LebedevRI opened this issue May 4, 2018 · 4 comments

@LebedevRI (Collaborator)

Right now each statistic contains the same iteration count as the first repetition:

```c++
data.iterations = run_iterations;

CHECK_EQ(run_iterations, run.iterations);

int64_t const run_iterations = reports.front().iterations;
```

Are we sure this is the correct value that we should be outputting? We don't actually store each of the iteration times; we average them and operate "on averages". Are we sure we don't want to put the repetition count there instead?

I'm currently looking into finally adding t-test support to the tools, and am thus thinking about what the actual number of observations is. (I guess the answer is: the repetition count.)
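
To make the "number of observations" question concrete, here is a minimal sketch (hypothetical, simplified types, not the library's actual code) of the situation being described: each repetition contributes a single time that was already averaged over its iterations, so any aggregate operates on one value per repetition.

```c++
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one repetition's report: the repetition ran
// `iterations` times internally, but only the averaged per-iteration time
// survives into the statistics.
struct RepetitionResult {
  std::int64_t iterations;   // e.g. 70000000; only used for the averaging
  double avg_time_per_iter;  // the single value this repetition contributes
};

// Any aggregate (mean, stddev, a future t-test, ...) sees one value per
// repetition, so its sample size is reps.size(), not
// reps.front().iterations. Assumes reps is non-empty.
double MeanAcrossRepetitions(const std::vector<RepetitionResult>& reps) {
  double sum = 0.0;
  for (const auto& r : reps) sum += r.avg_time_per_iter;
  return sum / static_cast<double>(reps.size());
}
```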

@dmah42 (Member)

dmah42 commented May 8, 2018 via email

@LebedevRI (Collaborator, Author)

LebedevRI commented May 8, 2018

Thank you for replying!

I also agree that the number of iterations in a statistic should likely follow the statistic: the mean should be the mean across repetitions, the std dev should be the std dev across repetitions, etc., to give more information about what's going on with the benchmark.

So what is the TL;DR: should we use the repetition count?
If not, please feel free to close the issue :)

@dmah42 (Member)

dmah42 commented May 8, 2018

I don't know the right answer. I think the iteration count is "wrong" for statistics outputs, because you expect the values in the row to reflect the statistic: i.e., for the 'mean' row it should be the 'mean' number of iterations across repetitions. That doesn't make sense today, because we assume that every repetition runs the same number of iterations. It likely does, but I'm not sure that's guaranteed with batch running.

Perhaps, then, it makes sense to not output the iterations at all for statistics rows.
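
For illustration, with three repetitions the current behavior looks roughly like this (made-up benchmark name and numbers, not verbatim reporter output): the aggregate rows inherit the iteration count of the first repetition, even though each aggregate is computed from 3 values.

```
BM_Something/repeats:3          10.1 ns   10.1 ns   69000000
BM_Something/repeats:3          10.3 ns   10.3 ns   69000000
BM_Something/repeats:3          10.2 ns   10.2 ns   69000000
BM_Something/repeats:3_mean     10.2 ns   10.2 ns   69000000   <- copied from the first repetition
BM_Something/repeats:3_median   10.2 ns   10.2 ns   69000000
BM_Something/repeats:3_stddev    0.1 ns    0.1 ns   69000000
```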

@LebedevRI (Collaborator, Author)

After looking a bit more into reporting separate iterations, I now believe that for aggregates, the actual count of rows used to compute that aggregate should be displayed. If you have one iteration per repetition, all the aggregates will claim they are over one iteration, when they are really over N repetitions.

Similarly, the in-repetition averaging dramatically cripples those measurements (to the point that it is outright wrong to compute anything other than the median on such "results"), so again, claiming that those aggregates are over I iterations is misleading. They are over N repetitions.

I'll write a patch.
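
A minimal sketch of the direction such a patch might take (simplified stand-in types, not the exact change): attach the count of reports that went into the aggregate, rather than the first run's iteration count.

```c++
#include <cstdint>
#include <vector>

// Simplified stand-in for a single benchmark report.
struct Run {
  std::int64_t iterations;
  double real_accumulated_time;
};

// The iteration count to attach to an aggregate row: the aggregate is
// computed from one observation per report, so report that count.
std::int64_t AggregateIterations(const std::vector<Run>& reports) {
  // Previously: return reports.front().iterations;
  return static_cast<std::int64_t>(reports.size());
}
```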

LebedevRI added a commit to LebedevRI/benchmark that referenced this issue Oct 17, 2018

It is incorrect to say that an aggregate is computed over a run's iterations, because those iterations were already averaged. Similarly, if there are N repetitions with 1 iteration each, an aggregate will be computed over N measurements, not 1. Thus it is best to simply use the count of separate reports.

Fixes google#586.

LebedevRI added a commit that referenced this issue Oct 18, 2018 (same commit message; Fixes #586)
EricWF pushed a commit to efcs/benchmark that referenced this issue Nov 29, 2018 (same commit message)
JBakamovic pushed a commit to JBakamovic/benchmark that referenced this issue Dec 6, 2018 (same commit message)