
[Tooling] Rewrite generate_difference_report(). #678

Merged: 2 commits merged into google:master from LebedevRI:tooling-unbreak-repetitions on Sep 19, 2018

Conversation

@LebedevRI (Collaborator) commented on Sep 16, 2018

My knowledge of Python is not great, so this is kinda horrible.

Two things:

  1. If there were repetitions, for the RHS (i.e. the new value) we were always using the first repetition,
     which naturally results in incorrect change reports for the second and following repetitions.
     And what is even worse, that completely broke the U test. :( (See the sketch after this list.)
  2. Better support for differing repetition counts in the U test was missing.
     It matters if we are to be able to report 'iteration as repetition',
     since it is rather likely that the iteration counts will mismatch.
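
To make the first point concrete, here is a tiny, hypothetical illustration of the pairing bug; the values and names are made up, not taken from the tool:

```python
# Hypothetical repetitions of one benchmark on each side.
lhs_reps = [10.1, 10.2, 10.3]   # old run, 3 repetitions
rhs_reps = [12.1, 12.2]         # new run, 2 repetitions

# Buggy pairing: every lhs repetition was compared against the
# *first* rhs repetition, so the reports for rep1+ were wrong.
buggy_pairs = [(l, rhs_reps[0]) for l in lhs_reps]

# Fixed pairing: repetitions are matched index-by-index; zip()
# naturally diffs only the overlap when the counts differ.
fixed_pairs = list(zip(lhs_reps, rhs_reps))
```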

Now, the rough idea of how this is implemented. I think this is the right solution. (A Python sketch of steps 1-7 follows the list.)

  1. Get all benchmark names (in order) from the lhs benchmark.
  2. While preserving the order, keep only the unique names.
  3. Get all benchmark names (in order) from the rhs benchmark.
  4. While preserving the order, keep only the unique names.
  5. Intersect `2.` and `4.` to get the list of unique benchmark names that exist on both sides.
  6. Now, group (partition) all the benchmarks with the same name:

     ```
     BM_FOO:
         [lhs]: BM_FOO/repetition0 BM_FOO/repetition1
         [rhs]: BM_FOO/repetition0 BM_FOO/repetition1 BM_FOO/repetition2
     ...
     ```

     We also drop mismatches in `time_unit` here.
     _(whose bright idea was it to store arbitrarily scaled timers in json **?!**)_
  7. Iterate over each partition:
     7.1. Conditionally, diff the overlapping repetitions (the repetition counts may differ).
     7.2. Conditionally, do the U test:
          7.2.1. Get **all** the values of the `"real_time"` field from the lhs benchmark.
          7.2.2. Get **all** the values of the `"cpu_time"` field from the lhs benchmark.
          7.2.3. Get **all** the values of the `"real_time"` field from the rhs benchmark.
          7.2.4. Get **all** the values of the `"cpu_time"` field from the rhs benchmark.
          NOTE: the repetition counts may differ, but we want *all* the values!
          7.2.5. Do the rest of the U test computation.
          7.2.6. Print the U test results.
  8. ???
  9. **PROFIT**!
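
For concreteness, here is a minimal, self-contained Python sketch of steps 1-7 above. It is not the actual implementation in tools/gbench/report.py: every function and variable name below is hypothetical, and the exact `time_unit`-mismatch handling is an assumed interpretation.

```python
# A sketch of the partitioning scheme described above. NOT the real
# code in tools/gbench/report.py; all names here are hypothetical.
from scipy.stats import mannwhitneyu


def unique_names_in_order(benchmarks):
    """Steps 1-4: benchmark names, first-seen order, duplicates dropped."""
    seen = set()
    names = []
    for bench in benchmarks:
        if bench['name'] not in seen:
            seen.add(bench['name'])
            names.append(bench['name'])
    return names


def partition_benchmarks(lhs_runs, rhs_runs):
    """Steps 5-6: intersect the two name lists (keeping lhs order) and
    group all same-named runs, dropping time_unit mismatches."""
    rhs_names = set(unique_names_in_order(rhs_runs))
    partitions = []
    for name in unique_names_in_order(lhs_runs):
        if name not in rhs_names:
            continue  # step 5: keep only names present on both sides
        lhs_group = [b for b in lhs_runs if b['name'] == name]
        rhs_group = [b for b in rhs_runs if b['name'] == name]
        # Step 6: drop repetitions whose time_unit differs from the
        # first lhs repetition's unit (assumed interpretation).
        unit = lhs_group[0]['time_unit']
        lhs_group = [b for b in lhs_group if b['time_unit'] == unit]
        rhs_group = [b for b in rhs_group if b['time_unit'] == unit]
        partitions.append((name, lhs_group, rhs_group))
    return partitions


def u_test(lhs_group, rhs_group):
    """Steps 7.2.1-7.2.5: Mann-Whitney U test over *all* repetitions on
    each side; the repetition counts do not have to match."""
    pvalues = {}
    for field in ('real_time', 'cpu_time'):
        lhs_values = [b[field] for b in lhs_group]
        rhs_values = [b[field] for b in rhs_group]
        _, pvalues[field] = mannwhitneyu(lhs_values, rhs_values,
                                         alternative='two-sided')
    return pvalues
```

Step 7.1 would then just `zip()` each partition's `lhs_group` and `rhs_group` to diff the overlapping repetitions, while `u_test()` deliberately consumes every repetition from both sides, even when the counts differ.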

Fixes #677

@coveralls commented on Sep 16, 2018

Coverage Status

Coverage remained the same at 89.022% when pulling d2e9a41 on LebedevRI:tooling-unbreak-repetitions into a5e9c06 on google:master.

@AppVeyorBot

Build benchmark 1440 failed (commit 5f31abde8c by @LebedevRI)

Review comment on tools/compare.py (outdated):

```python
dest='display_aggregates_only',
action="store_true",
help="If there are repetitions, by default, we display everything - the"
" actual runs, and the aggregates computed. Sometimes, it is "
```

A project Member commented:

nit: please put the space before 'actual' on the previous line.
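
The requested change just moves that space to the end of the preceding string fragment. A minimal sketch of the fix; the flag name is assumed from the `dest` above, and the help text is truncated exactly as in the excerpt:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    '--display_aggregates_only',
    dest='display_aggregates_only',
    action="store_true",
    # The space now ends the first fragment instead of starting the
    # second one, as the review asked. Adjacent string literals are
    # concatenated by Python at compile time.
    help="If there are repetitions, by default, we display everything - the "
         "actual runs, and the aggregates computed. Sometimes, it is ")
    # (remainder of the help text elided in the excerpt)
```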

@LebedevRI force-pushed the tooling-unbreak-repetitions branch from 996adfc to 6a3fc63 on September 17, 2018 09:02
@LebedevRI changed the title from "[Do Not Merge][Tooling] Rewrite generate_difference_report()." to "[Tooling] Rewrite generate_difference_report()." on Sep 17, 2018
@AppVeyorBot

Build benchmark 1443 failed (commit 24bab0e192 by @LebedevRI)

@LebedevRI (Collaborator, Author) commented:

Further thoughts here? Moar tests?

@LebedevRI (Collaborator, Author) commented:

Alright, thank you for the review!

@LebedevRI merged commit aad33aa into google:master on Sep 19, 2018
@LebedevRI deleted the tooling-unbreak-repetitions branch on September 19, 2018 12:59
JBakamovic pushed a commit to JBakamovic/benchmark that referenced this pull request Dec 6, 2018