Longer running benchmarks #220
Comments
What is the history of using 20 events?
I think, way back in the KNC days, this was a compromise between wall clock time and throughput. Then, when we introduced the MEIF tests, it was, as you say, to fit within the 5K events of the binary file / 256 threads for KNL. So, indeed, we can test what makes sense to do on phi3 (and also if we want to use …)
We can now enable looping over the same file multiple times. This should remove the constraint from the total number of available input events.
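A minimal sketch of the wrap-around idea this implies (the function name and signature are illustrative assumptions, not the actual mkFit reader API): a global event counter is mapped back into the finite set of events stored in the binary file.

```python
def file_event_index(global_index: int, n_events_in_file: int) -> int:
    """Map a global event counter onto the events actually stored in the
    input file by wrapping around, so a benchmark can request more events
    than the file contains."""
    return global_index % n_events_in_file

# With a 5000-event file, the 5120th event (index 5119) re-reads event 119.
assert file_event_index(5119, 5000) == 119
```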
In my CHEP19 area I have two different options in addition to the default:
I should probably test these options and make a PR... I am not sure if we need to loop over the same file multiple times.
That's what I did for CHEP18: (`N_events = 20 * N_threads`, ignore `N_meif`)
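To illustrate that scaling rule: the 20-events-per-thread factor and the 5K-event file size come from this thread, while the particular thread counts below are just an assumed scan. The sketch shows where a fixed-size input file becomes the binding constraint.

```python
# Events-per-thread scaling as in the CHEP18 setup (N_events = 20 * N_threads).
EVENTS_PER_THREAD = 20
FILE_EVENTS = 5000  # events available in the input binary file

for n_threads in (1, 2, 4, 8, 16, 32, 64, 128, 256):
    n_events = EVENTS_PER_THREAD * n_threads
    # At 256 threads this already asks for 5120 events, more than the
    # 5K in the file -- hence the interest in looping over the file.
    fits = "yes" if n_events <= FILE_EVENTS else "no"
    print(f"{n_threads:>3} threads -> {n_events:>5} events (fits in file: {fits})")
```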
As demonstrated in the throughput studies from @makortel, running with a larger number of events eliminates edge effects and improves parallel throughput performance.
Quoting Matti in the chat:
It seems that it may be beneficial to rewrite part of the benchmarking scripts to use more events per thread to achieve higher parallel utilization. The question is: is this solely "forConf", to have our "best" results on display, or should we be doing this with every PR as well?
The case for doing it on every PR (although it will lengthen the time needed to run the benchmarks) is that compute performance gains and losses could be hiding behind this under-utilization in some systematic way. I should mention that we partially account for this when running the standard benchmarks, since we drop the first event from the average build time: we have seen that its time per event is in fact an order of magnitude different from the average. The question is whether, even with this first event dropped, the average time per event improves when processing more events.
Let me know what you think (and who might want to tackle this).
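A minimal sketch of the drop-first-event averaging described above (the function name and the timing values are assumptions for illustration, not the benchmarking scripts' actual code):

```python
def average_build_time(times_per_event):
    """Average build time per event, dropping the first event: its time
    has been observed to differ from the rest by an order of magnitude
    (warm-up effects), so including it would skew the average."""
    if len(times_per_event) < 2:
        raise ValueError("need at least two events to drop the first")
    steady_state = times_per_event[1:]
    return sum(steady_state) / len(steady_state)

# Example with an assumed ~10x slower first event:
print(average_build_time([1.20, 0.11, 0.10, 0.12, 0.11]))  # ~0.11
```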