Benchmarking guidelines #606
Mostly I think this is good advice. A few quibbles:
|
There's a plethora of ways to get it wrong. The notion of
I know a VM for which this might be false depending on a benchmark as it can do deep speculations on values. Graal.
I'm not convinced. Would you elaborate? |
There's lots of stuff to get wrong with microbenchmarking, including trying to benchmark extremely inexpensive operations with overhead that dominates. So it's not exactly more reliable; it just shifts what you need to worry about, and renders more things unobservable. I suppose that it's a better default, but I wouldn't select it when the person writing the benchmark has some reason to think that some real operation is reasonably appropriate. Also, if Graal is really good at deeply speculating on values but Blackhole.consume prevents the speculation, is that actually more representative of running real code? Again, I'm not sure it's really better, just different. Most of the time these things are going to matter much less than your microbenchmark being run in the wrong optimization context and without the normal GC load. As far as the RNG goes, uhm, have you checked what it does mod 8, for instance? |
(I mean--you haven't even implemented it right. If you do implement it right, it's still pretty lousy, especially with differences which matter due to caching effects at least on array lookups.) |
My benchmarks suggest that
Because we shouldn't be benchmarking the rare "good" case of our collection. We should provide collections that are guaranteed to perform well in the common case, when optimizations fail. |
Fair enough. Maybe |
If you mean that I didn't go into longs as the benchmark suggests, then yes. I'm not arguing that my RNG is the best RNG. I'm trying to see your reasoning behind your critique of using fast RNGs here. Or did you argue that this particular one is a bad one? |
This is a particularly bad choice given the behavior for modulus of small powers of two. A fast RNG is in principle fine; the trick is to get one that doesn't make you measure something other than what you think you're measuring. If you care about random stride length, none of the LCGs are safe. If you just care about hitting arbitrary elements, some LCGs are safe, but the particular implementation you chose (mod 2^32 instead of 2^32-1) is really not good for small powers of two. |
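To make the mod-8 point concrete, here is a minimal sketch. The RNG under discussion isn't shown in this thread, so the constants below (the well-known Numerical Recipes LCG) and the names `LcgLowBits`, `next`, `residues` and `period` are stand-ins for illustration, not the code being criticised. It relies on Scala `Int` arithmetic wrapping mod 2^32.

```scala
object LcgLowBits {
  // Hypothetical LCG (Numerical Recipes constants); Int overflow gives "mod 2^32" for free.
  def next(state: Int): Int = state * 1664525 + 1013904223

  // First `n` outputs reduced modulo `m`, where `m` is assumed to be a power of two,
  // as one might do to pick an index into a small table.
  def residues(seed: Int, n: Int, m: Int): Vector[Int] =
    Iterator.iterate(next(seed))(next).take(n).map(_ & (m - 1)).toVector

  // Smallest p such that xs(i) == xs(i + p) for every pair that fits in the sample.
  def period(xs: Vector[Int]): Int =
    (1 to xs.length)
      .find(p => xs.indices.forall(i => i + p >= xs.length || xs(i) == xs(i + p)))
      .getOrElse(xs.length)
}
```

With these constants the low three bits cycle with period 8 and the lowest bit simply alternates, so "random" indices into an 8-element array visit the slots in one fixed order over and over. Any access pattern derived from those bits is perfectly regular, which is the kind of thing that can make a benchmark measure something other than intended.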
OK, I agree.
It was fine in my case given that magical numbers for that collection
aren't powers of two, but I see your reasoning in general.
I'll update the guideline to your suggestion with an array.
|
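The array suggestion itself isn't quoted in this thread, so the following is only one plausible reading of it: draw the random indices once, outside the measured code, and have the benchmark loop read them from an array. The names `PrecomputedIndices`, `make` and `sumAt` are hypothetical.

```scala
import scala.util.Random

object PrecomputedIndices {
  // Build the index table once, outside the measured region.
  // A fixed seed keeps runs reproducible; `bound` must be positive.
  def make(count: Int, bound: Int, seed: Long): Array[Int] = {
    val rnd = new Random(seed)
    Array.fill(count)(rnd.nextInt(bound))
  }

  // The loop that would be measured: only array reads, no RNG arithmetic.
  def sumAt(data: Array[Int], indices: Array[Int]): Long = {
    var i = 0
    var acc = 0L
    while (i < indices.length) {
      acc += data(indices(i))
      i += 1
    }
    acc
  }
}
```

This takes the generator's cost and quality out of the measurement, but the index array occupies cache of its own, so the caching caveat raised earlier in the thread still applies.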
Very nice recommendations by @DarkDimius and @Ichoran! I'd like to build upon this list with some more points/questions (just food for thought):
BTW, do you know whether we have support for JMH's WDYT @DarkDimius? References |
That's the exact opposite of what you want. You actually want it to become megamorphic because this is the common case.
I'd say a strong "yes"
Linux only. |
Doesn't it work if you build hsdis https://github.com/AdoptOpenJDK/jitwatch/wiki/Building-hsdis ? |
@lrytz, there is no |
Ah yes, thanks. I remember now seeing this tweet :) |
Closing, as we're not going to directly take action from this - but it's good info and it's still reachable through search. |
I'm writing this as a guide on how to make contributed benchmarks "more valid", so that they actually measure what they are supposed to. Note that the benchmarks currently in this repo don't follow these guidelines.
[…] 0, 1, 2, 3, 4, 7, 8, 15, 16, 17, 39, 282, 73121, 7312102.
Use Blackhole.consume instead of returning a value from a method.
Use Blackhole.consume to ensure that data you are accessing is actually accessed.
Don't use foreach as a way to run multiple iterations of a benchmark. Use a hand-written while loop and @OperationsPerInvocation, e.g. https://github.com/DarkDimius/DelayingArrays/blob/rewrite/src/main/scala/me/d_d/delaying/Benchmarks.scala#L151
[…] list.head, make sure to have a set-up run to elide it from memory. This is hard to get right, which is why I recommend not including such benchmarks.
If you want to get reliable numbers when running on your machine, make sure that you have:
Additions and suggestions are welcome.
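A rough sketch of how the Blackhole.consume and @OperationsPerInvocation points might look together in a JMH benchmark written in Scala. It assumes JMH is on the classpath (for example via the sbt-jmh plugin) and that the list of numbers above is meant as collection sizes; the class name `HeadOptionBenchmark`, the operation being measured and the repeat count of 1000 are made up for illustration and are not taken from this repo.

```scala
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.NANOSECONDS)
class HeadOptionBenchmark { // hypothetical benchmark class
  // Assumption: the numbers in the guideline are the sizes to run at.
  @Param(Array("0", "1", "2", "3", "4", "7", "8", "15", "16", "17",
               "39", "282", "73121", "7312102"))
  var size: Int = _

  var xs: List[Int] = _

  // Build the collection outside the measured method.
  @Setup(Level.Trial)
  def setup(): Unit = xs = List.tabulate(size)(identity)

  // Hand-written while loop instead of foreach; JMH divides the measured
  // time by the declared 1000 operations per invocation.
  @Benchmark
  @OperationsPerInvocation(1000)
  def headOption(bh: Blackhole): Unit = {
    var i = 0
    while (i < 1000) {
      bh.consume(xs.headOption) // consume the result rather than returning it
      i += 1
    }
  }
}
```

The literal in @OperationsPerInvocation has to match the loop bound by hand, since the annotation argument must be a compile-time constant. Whether Blackhole.consume is the right choice for a given benchmark is exactly what is debated in the comments, so treat this as a starting point rather than a recommendation.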