Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

Merged
merged 8 commits into from
Aug 19, 2019

Conversation

tresreid
Copy link
Collaborator

@tresreid tresreid commented Aug 16, 2019

Changes to the benchmarking scripts.

  1. phi1 and phi2 have been dropped from the default. The useLNX flag has been changed to useARCH flag.
    useARCH=0 (default) : phi3 only
    useARCH=1: lnx4108 and lnx7188 only
    useARCH=2: phi3 + lnx4108 and lnx7188
    useARCH=3: phi1+phi2+phi3
    useARCH=4: phi1+phi2+phi3 +lnx4108 +lnx7188

  2. CMSSW val and SIM val have both been dropped. They have been replaced by MTV and MTV with required seeds

  3. the turbo setting and scaling governor are printed out to the terminal when running. By default, the benchmarking script will crash if turbo is on or if the machine is not in performance mode.

Benchmarks can be found here:
http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_phidrop_08_16_19/benchmarks_Performance_phidrop_8_16_19v2/
http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_phidrop_08_16_19/benchmarks_Powersave_phidrop_8_16_19v2/

@srlantz
Copy link
Collaborator

srlantz commented Aug 16, 2019

To enforce the agreed-upon hardware settings, namely no_turbo=1 and scaling_governor=performance, all 5 test machines are now equipped with the turbo-chmod service and have their tuned profile set to throughput-performance, as recommended by @dan131riley in the thread for issue #233. As a result, our preferred settings should now survive reboots. With these OS changes in place, everyone should now be able to switch to no_turbo=0 on a temporary basis (AND change it back!!!!), but only sudoers can mess with the scaling_governor.

@@ -79,7 +79,7 @@ done
######################################

# Make SimTrack Validation directories
simdir="SIMVAL"
simdir="SIMVAL_MTV"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tresreid , since the contents of SIMVAL_MTV and SIMVAL_MTV_SEED will be the same, we can reduce the copy-paste, and just make a for loop over the two directories.

@kmcdermo
Copy link
Collaborator

@tresreid , this looks really great -- thanks for doing this. This brings the following issues along

#233 - new baselines for scaling_governor (although I guess we will have to wait for phi1 to come back to life to close this for good)
#232 - scaling_governor MT
#230 - drop phi1/phi2 + switch validation

I made a few comments, otherwise the changes look good to me. @srlantz , I think it makes sense if you review the code as well (and then hit the merge when you are satisfied and all requested changes are in + phi1 benchmarks).

@kmcdermo
Copy link
Collaborator

@areinsvo , can you have a look at the MTV (+MTV-seed) validation plots as well? I will review the plots later today as well.

@tresreid
Copy link
Collaborator Author

@kmcdermo I've made the changes you suggested.
I also plan to rerun the full benchmarks once all the machines are up and running so that the final benchmarks can all be in one place.

@srlantz
Copy link
Collaborator

srlantz commented Aug 16, 2019

The script changes good to me. I'll let someone else (@areinsvo or @kmcdermo) do the merge after checking the physics performance plots, since those involve the newer metrics.

Comments on the benchmarks: as expected, the 3 SKL-SP machines show the same kind of dramatic improvement in multithreaded performance that we first saw (serendipitously) on phi3. KNL and SNB (in the plots that ave SNB data) don't show any real change. This is in line with expectations on KNL, which does almost no CPU frequency scaling. But I thought we might have seen some difference on SNB. (True, SKL-SP is much more aggressive about frequency scaling than SNB; SKL-SP can make separate adjustments on each core, and do so at a higher cadence.)

However, I think something might be wrong with the 8/16/19 SNB plots, because the same ones from 7/3/19 (when powersave was in effect) show significantly worse multithreaded performance on phi1, while the others are basically the same:

http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_7_3_19_saferdel/Benchmarks/

I wonder, was powersave mode ever really in effect on 8/16/19? It would be good to recheck when Tres presents the final-final benchmark plots for this PR.

@kmcdermo
Copy link
Collaborator

@kmcdermo I've made the changes you suggested.
I also plan to rerun the full benchmarks once all the machines are up and running so that the final benchmarks can all be in one place.

Looks good to me, thanks for doing this. Once the final-final benchmarks are in, I will make sure to take a look (and also @areinsvo , if you could also look to confirm MTV looks good, just so we are at a nice clean baseline).

@tresreid
Copy link
Collaborator Author

FYI: I've edited the original comment to put in the new benchmarks

@areinsvo
Copy link
Collaborator

The MTV-like and MTV-require-seeds plots look consistent with the original set posted here: http://areinsvo.web.cern.ch/areinsvo/MkFit/Benchmarks/July2019_AddValOpt

From my point of view this can be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants