Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

tresreid · 2019-08-16T18:19:16Z

Changes to the benchmarking scripts.

phi1 and phi2 have been dropped from the default. The useLNX flag has been changed to useARCH flag.
useARCH=0 (default) : phi3 only
useARCH=1: lnx4108 and lnx7188 only
useARCH=2: phi3 + lnx4108 and lnx7188
useARCH=3: phi1+phi2+phi3
useARCH=4: phi1+phi2+phi3 +lnx4108 +lnx7188
CMSSW val and SIM val have both been dropped. They have been replaced by MTV and MTV with required seeds
the turbo setting and scaling governor are printed out to the terminal when running. By default, the benchmarking script will crash if turbo is on or if the machine is not in performance mode.

Benchmarks can be found here:
http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_phidrop_08_16_19/benchmarks_Performance_phidrop_8_16_19v2/
http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_phidrop_08_16_19/benchmarks_Powersave_phidrop_8_16_19v2/

val_scripts/validation-cmssw-benchmarks.sh

srlantz · 2019-08-16T18:34:33Z

To enforce the agreed-upon hardware settings, namely no_turbo=1 and scaling_governor=performance, all 5 test machines are now equipped with the turbo-chmod service and have their tuned profile set to throughput-performance, as recommended by @dan131riley in the thread for issue #233. As a result, our preferred settings should now survive reboots. With these OS changes in place, everyone should now be able to switch to no_turbo=0 on a temporary basis (AND change it back!!!!), but only sudoers can mess with the scaling_governor.

kmcdermo · 2019-08-16T18:35:50Z

web/collectBenchmarks.sh

@@ -79,7 +79,7 @@ done
 ######################################

 # Make SimTrack Validation directories
-simdir="SIMVAL"
+simdir="SIMVAL_MTV"


@tresreid , since the contents of SIMVAL_MTV and SIMVAL_MTV_SEED will be the same, we can reduce the copy-paste, and just make a for loop over the two directories.

kmcdermo · 2019-08-16T18:44:19Z

@tresreid , this looks really great -- thanks for doing this. This brings the following issues along

#233 - new baselines for scaling_governor (although I guess we will have to wait for phi1 to come back to life to close this for good)
#232 - scaling_governor MT
#230 - drop phi1/phi2 + switch validation

I made a few comments, otherwise the changes look good to me. @srlantz , I think it makes sense if you review the code as well (and then hit the merge when you are satisfied and all requested changes are in + phi1 benchmarks).

kmcdermo · 2019-08-16T19:09:36Z

@areinsvo , can you have a look at the MTV (+MTV-seed) validation plots as well? I will review the plots later today as well.

tresreid · 2019-08-16T19:42:25Z

@kmcdermo I've made the changes you suggested.
I also plan to rerun the full benchmarks once all the machines are up and running so that the final benchmarks can all be in one place.

srlantz · 2019-08-16T20:12:17Z

The script changes good to me. I'll let someone else (@areinsvo or @kmcdermo) do the merge after checking the physics performance plots, since those involve the newer metrics.

Comments on the benchmarks: as expected, the 3 SKL-SP machines show the same kind of dramatic improvement in multithreaded performance that we first saw (serendipitously) on phi3. KNL and SNB (in the plots that ave SNB data) don't show any real change. This is in line with expectations on KNL, which does almost no CPU frequency scaling. But I thought we might have seen some difference on SNB. (True, SKL-SP is much more aggressive about frequency scaling than SNB; SKL-SP can make separate adjustments on each core, and do so at a higher cadence.)

However, I think something might be wrong with the 8/16/19 SNB plots, because the same ones from 7/3/19 (when powersave was in effect) show significantly worse multithreaded performance on phi1, while the others are basically the same:

http://mireid.web.cern.ch/mireid/Benchmarks/benchmarks_7_3_19_saferdel/Benchmarks/

I wonder, was powersave mode ever really in effect on 8/16/19? It would be good to recheck when Tres presents the final-final benchmark plots for this PR.

kmcdermo · 2019-08-16T21:27:55Z

@kmcdermo I've made the changes you suggested.
I also plan to rerun the full benchmarks once all the machines are up and running so that the final benchmarks can all be in one place.

Looks good to me, thanks for doing this. Once the final-final benchmarks are in, I will make sure to take a look (and also @areinsvo , if you could also look to confirm MTV looks good, just so we are at a nice clean baseline).

tresreid · 2019-08-19T14:06:28Z

FYI: I've edited the original comment to put in the new benchmarks

areinsvo · 2019-08-19T15:30:41Z

The MTV-like and MTV-require-seeds plots look consistent with the original set posted here: http://areinsvo.web.cern.ch/areinsvo/MkFit/Benchmarks/July2019_AddValOpt

From my point of view this can be merged.

tresreid added 7 commits August 16, 2019 09:17

drop phi12 initial

73ff291

added system settings for checking turbo and scaling mode

ac66b2f

changed validation to mtv and mtv with seeds

3ddea45

post benchmarks

95702cb

changed benchmark range on th time

6100c93

changed useLNX-> useARCH. Adjusted README

58ff0ad

cleanup

74ff8ca

kmcdermo reviewed Aug 16, 2019

View reviewed changes

val_scripts/validation-cmssw-benchmarks.sh Show resolved Hide resolved

kmcdermo reviewed Aug 16, 2019

View reviewed changes

validation cleanup

6111efa

kmcdermo merged commit 371a025 into trackreco:devel Aug 19, 2019

tresreid deleted the phidrop branch August 19, 2019 15:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

tresreid commented Aug 16, 2019 •

edited

Loading

srlantz commented Aug 16, 2019

kmcdermo Aug 16, 2019

kmcdermo commented Aug 16, 2019

kmcdermo commented Aug 16, 2019

tresreid commented Aug 16, 2019

srlantz commented Aug 16, 2019

kmcdermo commented Aug 16, 2019

tresreid commented Aug 19, 2019

areinsvo commented Aug 19, 2019

Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

Benchmarking changes: drop phi1+2, change validation to MVT and check turbo and scaling settings #238

Conversation

tresreid commented Aug 16, 2019 • edited Loading

srlantz commented Aug 16, 2019

kmcdermo Aug 16, 2019

Choose a reason for hiding this comment

kmcdermo commented Aug 16, 2019

kmcdermo commented Aug 16, 2019

tresreid commented Aug 16, 2019

srlantz commented Aug 16, 2019

kmcdermo commented Aug 16, 2019

tresreid commented Aug 19, 2019

areinsvo commented Aug 19, 2019

tresreid commented Aug 16, 2019 •

edited

Loading