
Statistics: iteration count #586

Closed
LebedevRI opened this issue May 4, 2018 · 4 comments

@LebedevRI (Collaborator)

Right now each statistic contains the same iteration count as the first repetition:

```c++
data.iterations = run_iterations;

CHECK_EQ(run_iterations, run.iterations);

int64_t const run_iterations = reports.front().iterations;
```

Are we sure this is the correct value that we should be outputting? We don't actually store each of the iteration times; we average them and operate "on averages". Are we sure we don't want to put the repetition count there instead?

I'm currently looking into finally adding t-test support to the tools, and am thus thinking about what the actual number of observations is. (I guess the answer is: the repetition count.)
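
To make the "number of observations" question concrete, here is a minimal sketch (hypothetical, simplified types, not the library's actual code) of the situation being described: each repetition contributes a single time that was already averaged over its iterations, so any aggregate operates on one value per repetition.

```c++
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one repetition's report: the repetition ran
// `iterations` times internally, but only the averaged per-iteration time
// survives into the statistics.
struct RepetitionResult {
  std::int64_t iterations;   // e.g. 70000000; only used for the averaging
  double avg_time_per_iter;  // the single value this repetition contributes
};

// Any aggregate (mean, stddev, a future t-test, ...) sees one value per
// repetition, so its sample size is reps.size(), not
// reps.front().iterations. Assumes reps is non-empty.
double MeanAcrossRepetitions(const std::vector<RepetitionResult>& reps) {
  double sum = 0.0;
  for (const auto& r : reps) sum += r.avg_time_per_iter;
  return sum / static_cast<double>(reps.size());
}
```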

@dmah42 (Member)

dmah42 commented May 8, 2018 via email

@LebedevRI (Collaborator, Author)

LebedevRI commented May 8, 2018

Thank you for replying!

I also agree that the number of iterations in a statistic should likely follow the statistic: the mean should be the mean across repetitions, the std dev should be the std dev across repetitions, etc., to give more information about what's going on with the benchmark.

So what is the TL;DR: should we use the repetition count?
If not, please feel free to close the issue :)

@dmah42 (Member)

dmah42 commented May 8, 2018

I don't know the right answer. I think the iteration count is "wrong" for statistics outputs, because you expect the values in the row to reflect the statistic: i.e., for the 'mean' row it should be the 'mean' number of iterations across repetitions. That doesn't make sense today, because we assume that every repetition runs the same number of iterations. It likely does, but I'm not sure that's guaranteed with batch running.

Perhaps, then, it makes sense to not output the iterations at all for statistics rows.
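
For illustration, with three repetitions the current behavior looks roughly like this (made-up benchmark name and numbers, not verbatim reporter output): the aggregate rows inherit the iteration count of the first repetition, even though each aggregate is computed from 3 values.

```
BM_Something/repeats:3          10.1 ns   10.1 ns   69000000
BM_Something/repeats:3          10.3 ns   10.3 ns   69000000
BM_Something/repeats:3          10.2 ns   10.2 ns   69000000
BM_Something/repeats:3_mean     10.2 ns   10.2 ns   69000000   <- copied from the first repetition
BM_Something/repeats:3_median   10.2 ns   10.2 ns   69000000
BM_Something/repeats:3_stddev    0.1 ns    0.1 ns   69000000
```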

@LebedevRI (Collaborator, Author)

After looking a bit more into reporting separate iterations, I now believe that for aggregates, the actual count of rows used to compute that aggregate should be displayed. If you have one iteration per repetition, all the aggregates will claim they are over one iteration, when they are really over N repetitions.

Similarly, the in-repetition averaging dramatically cripples those measurements (to the point that it is outright wrong to compute anything other than the median on such "results"), so again, claiming that those aggregates are over I iterations is misleading. They are over N repetitions.

I'll write a patch.
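
A minimal sketch of the direction such a patch might take (simplified stand-in types, not the exact change): attach the count of reports that went into the aggregate, rather than the first run's iteration count.

```c++
#include <cstdint>
#include <vector>

// Simplified stand-in for a single benchmark report.
struct Run {
  std::int64_t iterations;
  double real_accumulated_time;
};

// The iteration count to attach to an aggregate row: the aggregate is
// computed from one observation per report, so report that count.
std::int64_t AggregateIterations(const std::vector<Run>& reports) {
  // Previously: return reports.front().iterations;
  return static_cast<std::int64_t>(reports.size());
}
```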

LebedevRI added a commit to LebedevRI/benchmark that referenced this issue Oct 17, 2018

It is incorrect to say that an aggregate is computed over a run's iterations, because those iterations were already averaged. Similarly, if there are N repetitions with 1 iteration each, an aggregate will be computed over N measurements, not 1. Thus it is best to simply use the count of separate reports.

Fixes google#586.

LebedevRI added a commit that referenced this issue Oct 18, 2018 (same commit message; Fixes #586)
EricWF pushed a commit to efcs/benchmark that referenced this issue Nov 29, 2018 (same commit message)
JBakamovic pushed a commit to JBakamovic/benchmark that referenced this issue Dec 6, 2018 (same commit message)