[BUG] Benchmark result depends on the order in which benchmarks were registered #1469
Comments
This seems to be a problem with how the assembly is placed in memory, I think. I can get similar problems when statically linking something that does nothing (some assembler files that never get called), and magically 3 of 10 tests get 40-60% worse timings on variable data sizes and iterations.
But in my case all benchmarks call the same function. It is also strange that this situation occurs only in the specific case of 1000 queues with 1 element in each. I do not have this problem with 1000 queues and 2 elements in each.
Can you try running with random interleaving? See #1051 for the details.
Yes, it helped. Thank you!
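To illustrate the idea behind the random interleaving suggestion above, here is a minimal standalone sketch. It is not Google Benchmark's implementation: the names `Timing` and `RunInterleaved` are hypothetical, and it uses only the standard library. Instead of running every repetition of one benchmark before moving on to the next, it shuffles the execution order on each repetition, so no benchmark is always the first to run. In the library itself the feature is, to my knowledge, enabled with the `--benchmark_enable_random_interleaving=true` flag; see #1051 for the actual design.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical accumulator: how often a benchmark ran and its total time.
struct Timing {
    std::size_t runs = 0;
    std::chrono::nanoseconds total{0};
};

// Runs every benchmark once per repetition, in a freshly shuffled order
// each time, and accumulates the wall-clock time per benchmark.
std::vector<Timing> RunInterleaved(
        const std::vector<std::function<void()>>& benchmarks,
        std::size_t repetitions, unsigned seed) {
    assert(repetitions > 0);
    std::vector<Timing> timings(benchmarks.size());

    // order[i] is the index of the benchmark to run at position i.
    std::vector<std::size_t> order(benchmarks.size());
    std::iota(order.begin(), order.end(), std::size_t{0});

    std::mt19937 rng(seed);
    for (std::size_t rep = 0; rep < repetitions; ++rep) {
        // A new permutation per repetition spreads position-dependent
        // effects (cold caches, allocator state) across all benchmarks.
        std::shuffle(order.begin(), order.end(), rng);
        for (std::size_t index : order) {
            const auto start = std::chrono::steady_clock::now();
            benchmarks[index]();
            const auto stop = std::chrono::steady_clock::now();
            timings[index].total +=
                std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
            ++timings[index].runs;
        }
    }
    return timings;
}
```

Dividing `total` by `runs` gives a per-benchmark mean that no longer depends on a fixed execution order.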
Describe the bug
I tried to measure different solutions of the "merge from multiple sources" problem.
In order to do that I needed to prepare a `vector` of sorted `queue`s which I was going to merge. I decided to put this `queue` generation code inside the measuring loop and simply created a separate benchmark where I measure the time required for queue generation only.
The problem is that in one corner case (`1000` queues with `1` element in each) the application showed that it is faster to generate the data and merge it than to ONLY generate it. Other cases, even 1000 queues with 2 elements in each, showed adequate results: generating the data alone is faster than generating it and then manipulating it.
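For readers unfamiliar with the problem being measured, the following is a minimal sketch of the setup described above. It is not the original code from this issue: `GenerateSource` and `MergeSorted` are hypothetical names, and the heap-based merge is just one of the possible solutions. It builds a `vector` of sorted `queue`s and merges them into one sorted sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical generator: builds `queue_count` queues, each holding
// `queue_size` values in ascending order.
std::vector<std::queue<int>> GenerateSource(std::size_t queue_count,
                                            std::size_t queue_size) {
    std::vector<std::queue<int>> source(queue_count);
    for (std::size_t q = 0; q < queue_count; ++q) {
        for (std::size_t i = 0; i < queue_size; ++i) {
            // Values are interleaved across queues, so the merge has to
            // alternate between sources instead of draining them in turn.
            source[q].push(static_cast<int>(i * queue_count + q));
        }
    }
    return source;
}

// Merges the sorted queues with a min-heap keyed on each queue's front.
std::vector<int> MergeSorted(std::vector<std::queue<int>> source) {
    using Entry = std::pair<int, std::size_t>;  // (front value, queue index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    std::size_t total = 0;
    for (std::size_t q = 0; q < source.size(); ++q) {
        total += source[q].size();
        if (!source[q].empty()) heap.emplace(source[q].front(), q);
    }

    std::vector<int> merged;
    merged.reserve(total);
    while (!heap.empty()) {
        const Entry smallest = heap.top();
        heap.pop();
        merged.push_back(smallest.first);
        // Advance the queue the value came from and re-insert its new front.
        std::queue<int>& origin = source[smallest.second];
        origin.pop();
        if (!origin.empty()) heap.emplace(origin.front(), smallest.second);
    }
    assert(merged.size() == total);
    return merged;
}
```

With this shape, a "generation only" benchmark would time `GenerateSource` alone, and a "generation plus merge" benchmark would time `MergeSorted(GenerateSource(...))`; the second should never be cheaper than the first.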
Anyway, in order to rule out the possibility of the compiler optimizing something away, I tried moving the function which generates the data to a separate file and using `benchmark::DoNotOptimize` everywhere, but the result was still meaningless.
The result for the problematic case becomes adequate if I register the benchmark which measures only data generation after at least one other benchmark.
That is, the order of benchmark registration by the `BENCHMARK()` macro influences the measured values.
System
OS Ubuntu 20.04
The bug was reproduced on both
The bug can also be reproduced on the Quick Bench website with clang 13.0.
To reproduce
Here is the link to the quick bench website where I was able to reproduce the bug
https://quick-bench.com/q/ALwB-w76BalgONHH_rypCFOtwH0
Here is the code which reproduces the bug, in case the link above does not work:
`GenerateSourceBM` is the benchmark which measures the duration of data generation only. The results of this benchmark depend on whether or not it is registered first.
Expected behavior
I expect benchmark results not to depend on the order of their registration.
Screenshots
Results when the data generation benchmark is first in the list
Results when the data generation benchmark is last in the list