[Tooling] Rewrite generate_difference_report(). #678
Merged
LebedevRI merged 2 commits into google:master from LebedevRI:tooling-unbreak-repetitions on Sep 19, 2018
Conversation
❌ Build benchmark 1440 failed (commit 5f31abde8c by @LebedevRI)
dmah42 reviewed Sep 17, 2018
tools/compare.py (Outdated)

```python
    dest='display_aggregates_only',
    action="store_true",
    help="If there are repetitions, by default, we display everything - the"
         " actual runs, and the aggregates computed. Sometimes, it is "
```
nit: please put the space before 'actual' on the previous line.
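A quick illustration of why this is purely cosmetic: Python concatenates adjacent string literals at compile time, so moving the space only changes where the source line breaks, not the rendered help text. (Sketch only, using the two literals from the snippet above.)

```python
# Help string as written in the diff vs. with the reviewer's nit applied:
# the concatenated result is identical either way.
before = ("If there are repetitions, by default, we display everything - the"
          " actual runs, and the aggregates computed. Sometimes, it is ")
after = ("If there are repetitions, by default, we display everything - the "
         "actual runs, and the aggregates computed. Sometimes, it is ")
assert before == after
```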
LebedevRI force-pushed the tooling-unbreak-repetitions branch from 996adfc to 6a3fc63 (Compare)
❌ Build benchmark 1443 failed (commit 24bab0e192 by @LebedevRI)
Further thoughts here? Moar tests?
dmah42 approved these changes Sep 19, 2018
Alright, thank you for the review!
JBakamovic pushed a commit to JBakamovic/benchmark that referenced this pull request on Dec 6, 2018
My knowledge of Python is not great, so this is kinda horrible. Two things:

1. If there were repetitions, for the RHS (i.e. the new value) we were always using the first repetition, which naturally results in incorrect change reports for the second and following repetitions. And what is even worse, that completely broke the U test. :(
2. Better support for differing repetition counts in the U test was missing. It's important if we are to be able to report 'iteration as repetition', since it is rather likely that the iteration counts will mismatch.

Now, the rough idea of how this is implemented. I think this is the right solution:

1. Get all benchmark names (in order) from the lhs benchmark.
2. While preserving the order, keep the unique names.
3. Get all benchmark names (in order) from the rhs benchmark.
4. While preserving the order, keep the unique names.
5. Intersect `2.` and `4.` to get the list of unique benchmark names that exist on both sides.
6. Now, we want to group (partition) all the benchmarks with the same name:

   ```
   BM_FOO:
     [lhs]: BM_FOO/repetition0  BM_FOO/repetition1
     [rhs]: BM_FOO/repetition0  BM_FOO/repetition1  BM_FOO/repetition2
   ...
   ```

   We also drop mismatches in `time_unit` here. _(whose bright idea was it to store arbitrarily scaled timers in json **?!**)_
7. Iterate over each partition:
   7.1. Conditionally, diff the overlapping repetitions (the count of repetitions may differ.)
   7.2. Conditionally, do the U test:
   7.2.1. Get **all** the values of the `"real_time"` field from the lhs benchmark
   7.2.2. Get **all** the values of the `"cpu_time"` field from the lhs benchmark
   7.2.3. Get **all** the values of the `"real_time"` field from the rhs benchmark
   7.2.4. Get **all** the values of the `"cpu_time"` field from the rhs benchmark
   NOTE: the repetition counts may differ, but we want *all* the values!
   7.2.5. Do the rest of the U test stuff
   7.2.6. Print the U test
8. ???
9. **PROFIT**!

Fixes google#677
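A rough Python sketch of the partition-and-intersect idea described above. The helper names and the exact field access are assumptions for illustration, not the actual functions in tools/compare.py; the U test is sketched here with scipy.stats.mannwhitneyu, which accepts samples of different sizes.

```python
# Illustrative sketch only: helper names are made up. Assumes each benchmark
# entry is a dict with 'name', 'real_time', 'cpu_time' and 'time_unit' keys,
# as in the library's JSON output.
from scipy.stats import mannwhitneyu


def unique_names_in_order(benchmarks):
    """Steps 1-4: benchmark names in order, de-duplicated preserving order."""
    seen, names = set(), []
    for bench in benchmarks:
        if bench['name'] not in seen:
            seen.add(bench['name'])
            names.append(bench['name'])
    return names


def partition_benchmarks(lhs_benchmarks, rhs_benchmarks):
    """Steps 5-6: intersect the name lists, then group both sides by name,
    dropping rhs entries whose time_unit does not match the lhs."""
    rhs_names = set(unique_names_in_order(rhs_benchmarks))
    partitions = []
    for name in unique_names_in_order(lhs_benchmarks):
        if name not in rhs_names:
            continue
        lhs = [b for b in lhs_benchmarks if b['name'] == name]
        rhs = [b for b in rhs_benchmarks
               if b['name'] == name and b['time_unit'] == lhs[0]['time_unit']]
        if lhs and rhs:
            partitions.append((name, lhs, rhs))
    return partitions


def u_test_pvalues(lhs, rhs):
    """Step 7.2: feed *all* repetitions from each side to the U test;
    the Mann-Whitney U test does not require equal sample sizes."""
    time_pvalue = mannwhitneyu(
        [b['real_time'] for b in lhs], [b['real_time'] for b in rhs],
        alternative='two-sided').pvalue
    cpu_pvalue = mannwhitneyu(
        [b['cpu_time'] for b in lhs], [b['cpu_time'] for b in rhs],
        alternative='two-sided').pvalue
    return time_pvalue, cpu_pvalue
```

For each partition, the real report would additionally diff the overlapping repetitions pairwise (step 7.1) before printing the U-test row; that part is omitted from this sketch.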