PhaseII: optimize seeding, reduce cpu-timing #16885

Merged: 15 commits into cms-sw:CMSSW_9_0_X on Dec 9, 2016


Contributor

@VinInn VinInn commented Dec 6, 2016

This PR builds upon #16858 and

  1. removes PixelPair seeding,
    which makes PhaseII consistent with PhaseI
  2. removes redundant layer combinations without affecting physics performance:
    a) triplets with gaps
    b) very-forward combinations
    c) an intermediate FPix combination
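
To illustrate what a "triplet with gaps" (item 2a) looks like in the layer lists, here is a minimal Python sketch. The helpers `fpix_disks` and `has_gap` are hypothetical and are not part of this PR, which edits the lists by hand; the sketch only looks at the FPix disk numbers in a combination string and ignores any BPix layers.

```python
import re

def fpix_disks(combination):
    """Return the FPix disk numbers found in a layer combination string."""
    return [int(n) for n in re.findall(r"FPix(\d+)_(?:pos|neg)", combination)]

def has_gap(combination):
    """True if two consecutive FPix disks in the combination are not adjacent."""
    disks = fpix_disks(combination)
    return any(b - a > 1 for a, b in zip(disks, disks[1:]))

# Triplet strings as they appear in the diff snippets of this PR.
triplets = [
    'FPix6_pos+FPix7_pos+FPix8_pos', 'FPix6_neg+FPix7_neg+FPix8_neg',
    'FPix6_pos+FPix7_pos+FPix9_pos', 'FPix6_neg+FPix7_neg+FPix9_neg',
]
with_gaps = [t for t in triplets if has_gap(t)]
print(with_gaps)
```

Here the two combinations that skip FPix8 are the ones flagged as having a gap.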

Timing for the ttbar 200PU TkOnly workflow is reduced from 406 seconds to 169 seconds (on a 3.5 GHz Haswell);
see details in
https://twiki.cern.ch/twiki/bin/viewauth/CMS/ViTkPerfPhase21216?sortcol=3;table=1;up=1#sorted_table
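
As a quick worked check of the timing figures quoted above, the following sketch converts the before/after times into a speed-up factor and a percentage reduction. The function names are illustrative only.

```python
def speedup(before, after):
    """Ratio of the old time to the new time."""
    return before / after

def reduction_pct(before, after):
    """Percentage of the old time that was saved."""
    return 100.0 * (before - after) / before

before_s, after_s = 406.0, 169.0  # seconds, as quoted in the PR description
print(f"speed-up: {speedup(before_s, after_s):.2f}x")
print(f"reduction: {reduction_pct(before_s, after_s):.1f}%")
```

With the quoted numbers this is a speed-up of roughly 2.4x, or about 58% less time.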

Physics performance is even better than in #16858:
http://innocent.home.cern.ch/innocent/RelVal/tkRecoD4_9pre2/

http://innocent.home.cern.ch/innocent/RelVal/tkRecoD4_9pre2/plots_highPurity/effandfake1.pdf
please note also
http://innocent.home.cern.ch/innocent/RelVal/tkRecoD4_9pre2/plots_highPurity/hitsAndPt.pdf
http://innocent.home.cern.ch/innocent/RelVal/tkRecoD4_9pre2/plots_selectedOfflinePrimaryVertices/recovsgen.pdf
http://innocent.home.cern.ch/innocent/RelVal/tkRecoD4_9pre2/plots_selectedOfflinePrimaryVertices/pvtagging.pdf

Contributor

cmsbuild commented Dec 6, 2016

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_9_0_X.

It involves the following packages:

RecoPixelVertexing/PixelTriplets
RecoTracker/ConversionSeedGenerators
RecoTracker/FinalTrackSelectors
RecoTracker/IterativeTracking

@cmsbuild, @cvuosalo, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@ghellwig, @makortel, @felicepantaleo, @GiacomoSguazzoni, @rovere, @VinInn, @mschrode, @gpetruc, @dgulhan this is something you requested to watch as well.
@slava77, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

Contributor Author

VinInn commented Dec 6, 2016

@ebrondol , @boudoul , @venturia

Contributor Author

VinInn commented Dec 6, 2016

@cmsbuild , please test

Contributor

cmsbuild commented Dec 6, 2016

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/16807/console Started: 2016/12/06 16:00

'FPix6_pos+FPix7_pos+FPix8_pos+FPix9_pos', 'FPix6_neg+FPix7_neg+FPix8_neg+FPix9_neg']
# 'FPix5_pos+FPix6_pos+FPix7_pos+FPix8_pos', 'FPix5_neg+FPix6_neg+FPix7_neg+FPix8_neg',
# 'FPix5_pos+FPix6_pos+FPix7_pos+FPix9_pos', 'FPix5_neg+FPix6_neg+FPix7_neg+FPix9_neg',
# 'FPix6_pos+FPix7_pos+FPix8_pos+FPix9_pos', 'FPix6_neg+FPix7_neg+FPix8_neg+FPix9_neg'
Contributor

Is this commented-out code needed?
Please remove it, or add an inline comment in the code explaining why the commented-out block is relevant.

indivShareFrac = [1.0,0.16,0.095,0.09,0.09,0.09],
selectedTrackQuals = cms.VInputTag(cms.InputTag("initialStepSelector","initialStep"),
cms.InputTag("highPtTripletStepSelector","highPtTripletStep"),
cms.InputTag("lowPtQuadStepSelector","lowPtQuadStep"),
cms.InputTag("lowPtTripletStepSelector","lowPtTripletStep"),
cms.InputTag("detachedQuadStep"),
cms.InputTag("pixelPairStepSelector","pixelPairStep")
# cms.InputTag("pixelPairStepSelector","pixelPairStep")
Contributor

Is this commented-out code needed?
Please remove it, or add an inline comment in the code explaining why the commented-out block is relevant.

'FPix6_pos+FPix7_pos+FPix8_pos', 'FPix6_neg+FPix7_neg+FPix8_neg',
'FPix6_pos+FPix7_pos+FPix9_pos', 'FPix6_neg+FPix7_neg+FPix9_neg']
# 'FPix6_pos+FPix7_pos+FPix8_pos', 'FPix6_neg+FPix7_neg+FPix8_neg',
# 'FPix6_pos+FPix7_pos+FPix9_pos', 'FPix6_neg+FPix7_neg+FPix9_neg']
Contributor

Is this commented-out code needed?
Please remove it, or add an inline comment in the code explaining why the commented-out block is relevant.

Contributor Author

@slava77, as you realize, this is for this Xmas production.
We cannot exclude reviving these combinations later on.
I prefer to keep the comments so that everybody knows how it was...

Contributor

If there is a good reason to keep this, please add a comment inline (in the code) that clarifies why the commented code is here

@@ -63,7 +63,7 @@
"LowPtQuadStep",
"LowPtTripletStep",
"DetachedQuadStep",
"PixelPairStep",
# "PixelPairStep",
Contributor

Is this commented-out code needed?
Please remove it, or add an inline comment in the code explaining why the commented-out block is relevant.

Contributor Author

No, this one is not...

Contributor Author

VinInn commented Dec 6, 2016

@cmsbuild , please test

Contributor

cmsbuild commented Dec 6, 2016

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/16810/console Started: 2016/12/06 16:48

Contributor

cmsbuild commented Dec 6, 2016

Pull request #16885 was updated. @cmsbuild, @cvuosalo, @slava77, @davidlange6 can you please check and sign again.

Contributor

cmsbuild commented Dec 6, 2016

Contributor

cmsbuild commented Dec 6, 2016

Comparison job queued.

Contributor

cmsbuild commented Dec 6, 2016

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-16885/16810/summary.html

The workflows 1003.0, 1001.0, 1000.0, 140.53, 136.731, 4.22 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

@@ -40,13 +40,16 @@
'BPix1+BPix2+BPix3+FPix1_pos','BPix1+BPix2+BPix3+FPix1_neg',
'BPix1+BPix2+FPix1_pos+FPix2_pos', 'BPix1+BPix2+FPix1_neg+FPix2_neg',
'BPix1+FPix1_pos+FPix2_pos+FPix3_pos', 'BPix1+FPix1_neg+FPix2_neg+FPix3_neg',
'FPix1_pos+FPix2_pos+FPix3_pos+FPix4_pos', 'FPix1_neg+FPix2_neg+FPix3_neg+FPix4_neg',
# removed as redundant in current geometry (here for documentation)
# 'FPix1_pos+FPix2_pos+FPix3_pos+FPix4_pos', 'FPix1_neg+FPix2_neg+FPix3_neg+FPix4_neg',
Contributor

Does this mean T1/T2 will have some losses here?

Contributor Author

@VinInn VinInn Dec 7, 2016

Did not check...
Is it relevant?
We do not have a dynamic seeding approach to geometry layout (yet...).
In any case, triplets should kick in (though it may run slower).

Contributor

There was still no clear message that we can abandon T1/T2 (phase-1 like PIX detector).
There was only a clear message that the main samples will be the T3 (D4 full detector layout),
but then there were references to studies possibly still needed for tracker trigger etc
@boudoul @atricomi @delaere

Contributor Author

Apparently PixelDigitizer has even more serious issues with T1/T2 ....
(and things will continue to diverge).

Contributor

We just decided at the Pixel Detector Modelling and Simulations meeting to rework the tracker geometry scenarios to keep only the "new ph2 pixel", with either the flat or tilted OT. All instances of the "ph1-like pixel" will be removed. So we are good to go.

Contributor

OK, given https://github.com/cms-sw/cmssw/pull/16885/files#r91294457 we can ignore pixel-related performance changes in phase1-like layouts in T1/T2

'pixelPairStepTracks'],
hasSelector = [1,1,1,1,1,1],
],
hasSelector = [1,1,1,1,1],
indivShareFrac = [1.0,0.16,0.095,0.09,0.09,0.09],
Contributor

should the indivShareFrac length be changed to 5 as well? (I'm assuming it's cosmetic, given that the code ran; no length match asserts)
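
The length mismatch raised here can be caught with a small consistency check. The sketch below is plain Python and does not use the CMSSW configuration API; `mismatched_lengths` is a hypothetical helper, and the producer labels are assumed for illustration (only `pixelPairStepTracks` appears in the diff above). The two parameter lists are copied from the snippet under review.

```python
def mismatched_lengths(track_producers, **per_producer_lists):
    """Return {parameter name: length} for every list whose length
    differs from the number of track producers."""
    expected = len(track_producers)
    return {
        name: len(values)
        for name, values in per_producer_lists.items()
        if len(values) != expected
    }

# Assumed labels for the five iterations left after dropping the pixel-pair step.
track_producers = [
    'initialStepTracks',
    'highPtTripletStepTracks',
    'lowPtQuadStepTracks',
    'lowPtTripletStepTracks',
    'detachedQuadStepTracks',
]

issues = mismatched_lengths(
    track_producers,
    hasSelector=[1, 1, 1, 1, 1],
    indivShareFrac=[1.0, 0.16, 0.095, 0.09, 0.09, 0.09],
)
print(issues)
```

With these inputs only `indivShareFrac` is reported, since it still has six entries for five producers.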


trackingPhase1PU70.toReplaceWith(initialStepHitQuadruplets, _initialStepHitQuadrupletsMerging)
Contributor

this edit for trackingPhase1PU70 is intended, right? it just stands out in a "PhaseII" PR.
@makortel
Given limited scope of this era, it's probably not that important either way.

Contributor

@slava77
The "change" preserves the current behaviour for trackingPhase1PU70. Before, the initialStepHitQuadruplets label pointed to PixelQuadrupletMergerEDProducer, while Erica's change makes it point to PixelQuadrupletEDProducer instead. So this line changes the label to point back to PixelQuadrupletMergerEDProducer for trackingPhase1PU70.

> Given limited scope of this era, it's probably not that important either way.

Indeed, I'll disable the trackingPhase1PU70 workflows in a PR to be submitted soon (leaving the actual cleanup for later to avoid conflicting developments for pre2).

Contributor

slava77 commented Dec 7, 2016

In 900pre1 with the PU200 D4Timing full workflow, the "winners" (top time consumers) are now clearly the validation modules:
https://slava77sk.web.cern.ch/slava77sk/reco/time_CMSSW_9_0_0_pre1-sign806a.2023PU200.from-step2.23034.0_TTbar_14TeV.step3_100_fullMT8.txt

compared to my older tests
https://slava77sk.web.cern.ch/slava77sk/reco/time_CMSSW_9_0_X_2016-11-29-2300-sign803.8a315d6.2023PU200.23034.0_TTbar_14TeV.step3_100_fullMT4.txt

mem profiles and (other than already posted MTV ) plot checks will come later

Contributor

slava77 commented Dec 8, 2016

Technical for PU200 23034-equivalent in 900pre1

delta/moduleMean delta/jobMean  baseline           PR                     module
   -1.973754      -6.98%    179185.00 ms/ev ->      1183.51 ms/ev detachedQuadStepHitQuadruplets
   -1.851616      -4.25%    112624.00 ms/ev ->      4338.84 ms/ev lowPtQuadStepHitQuadruplets
   -1.801749      -0.99%     26746.40 ms/ev ->      1394.75 ms/ev initialStepHitQuadruplets
   -1.580470      -6.04%    174444.00 ms/ev ->     20439.90 ms/ev ecalDrivenElectronSeeds
   -1.202900      -0.62%     21185.40 ms/ev ->      5272.37 ms/ev detachedQuadStepTrackCandidates
   -1.145867      -0.37%     13095.30 ms/ev ->      3555.50 ms/ev allConversions
   -0.947412      -0.33%     13237.10 ms/ev ->      4727.27 ms/ev gsfGeneralInOutOutInConversionTrackMerger
   -0.917723      -6.01%    243648.00 ms/ev ->     90376.80 ms/ev lowPtTripletStepTrackCandidates
   -0.868800      -0.75%     31476.90 ms/ev ->     12411.70 ms/ev electronCkfTrackCandidates
   -0.848011      -2.41%    103142.00 ms/ev ->     41719.80 ms/ev pfTrackElec
   -0.766076      -0.71%     32779.80 ms/ev ->     14622.80 ms/ev electronGsfTracks
   -0.710509      -3.89%    189404.00 ms/ev ->     90106.60 ms/ev highPtTripletStepTrackCandidates
   -0.548154      -0.19%     11179.30 ms/ev ->      6369.56 ms/ev unsortedOfflinePrimaryVertices1D
   -0.500005      -1.09%     69682.90 ms/ev ->     41809.50 ms/ev unsortedOfflinePrimaryVertices4D
   -0.443686      -0.18%     12350.10 ms/ev ->      7865.43 ms/ev lowPtQuadStepHitTriplets
   -0.438799      -0.64%     45110.70 ms/ev ->     28877.70 ms/ev unsortedOfflinePrimaryVertices
   +0.344100      +1.31%     80259.70 ms/ev ->    113616.00 ms/ev lowPtQuadStepTrackCandidates
   -0.332981      -0.11%     10245.00 ms/ev ->      7320.51 ms/ev pixelVertices
   -0.286055      -0.25%     25320.00 ms/ev ->     18983.40 ms/ev particleFlowDisplacedVertex
   -0.165572      -0.37%     61379.50 ms/ev ->     51993.80 ms/ev initialStepTrackCandidates
   -0.159311      -0.11%     18785.30 ms/ev ->     16013.40 ms/ev muons1stStep
   -0.045020      -0.23%    133232.00 ms/ev ->    127366.00 ms/ev recoTauAK4PFJets08RegionBoosted
Total printed: 1.63214e+06 -> 710365
Job total:  2549.58 s/ev ==> 1494.55 s/ev
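
The two delta columns appear to be consistent with the definitions sketched below. These are inferred from the printed numbers, not taken from the script that produced the table: `delta/moduleMean` matches the change divided by the mean of the module's baseline and PR times, and `delta/jobMean` matches the change divided by the baseline job total. Function names are illustrative.

```python
def delta_over_module_mean(baseline_ms, pr_ms):
    """Change in a module's time divided by the mean of its baseline and PR times."""
    return (pr_ms - baseline_ms) / (0.5 * (baseline_ms + pr_ms))

def delta_over_job_pct(baseline_ms, pr_ms, job_baseline_ms):
    """Change in a module's time as a percentage of the baseline job total."""
    return 100.0 * (pr_ms - baseline_ms) / job_baseline_ms

# "Job total" values from the table, converted from s to ms.
JOB_BASELINE_MS = 2549.58 * 1000.0
JOB_PR_MS = 1494.55 * 1000.0

# Two rows copied from the table: (baseline, PR) in ms.
rows = {
    "detachedQuadStepHitQuadruplets": (179185.00, 1183.51),
    "lowPtQuadStepTrackCandidates": (80259.70, 113616.00),
}
for name, (base, pr) in rows.items():
    d_mod = delta_over_module_mean(base, pr)
    d_job = delta_over_job_pct(base, pr, JOB_BASELINE_MS)
    print(f"{d_mod:+.6f} {d_job:+.2f}% {name}")

# Overall reduction of the job total.
job_reduction_pct = 100.0 * (JOB_BASELINE_MS - JOB_PR_MS) / JOB_BASELINE_MS
print(f"job total reduced by {job_reduction_pct:.1f}%")
```

Under these assumed definitions the job total drops by about 41%.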

sorted times are available in
https://slava77sk.web.cern.ch/slava77sk/reco/time_CMSSW_9_0_0_pre1-orig.2023PU200.from-step2.23034.0_TTbar_14TeV.step3_100_fullMT8.txt for baseline
and
https://slava77sk.web.cern.ch/slava77sk/reco/time_CMSSW_9_0_0_pre1-sign806a.2023PU200.from-step2.23034.0_TTbar_14TeV.step3_100_fullMT8.txt for this PR

Memory peak in full job step3 with validation included changed from 7.15 GiB/thread to 6.07 GiB/thread.
A job without VALIDATION,DQM and writing RECO,AOD,MINIAOD had a peak of 2.44 GiB/thread.

RECO size is down by 25% to 34 MB/evt from a combination of:

  • x2.5 less allConversions
  • x2.3 less electronGsfTracks
  • x5 less electronMergedSeeds
  • 15% less muons
  • 25% less pixelTracks
  • 8% less generalTracks

some plots to come in a separate post

Contributor

slava77 commented Dec 8, 2016

Some plots

23034 PU200:
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_log10recopfcandidates_particleflowegamma__reco_obj_pt]
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_log10recopfjets_ak4pfjets__reco_obj_et]
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_log10recopfjets_ak4pfjetschs__reco_obj_et]

there is no significant change in the GEN-matched jet response.

20% fewer PVs:
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_log10recovertexs_offlineprimaryvertices__reco_obj_xerror]
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_log10recovertexs_offlineprimaryvertices__reco_obj_zerror]

[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_recoelectronseeds_electronmergedseeds__reco_obj_nhits]

ecalDrivenGsfElectrons are almost unchanged in the endcaps, while the GsfTrack population is down there quite a bit
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_recogsftracks_electrongsftracks__reco_obj_eta]

ecalDrivenGsfElectrons change mostly in the barrel (apparently similar reduction as gedGsfElectrons)
[plot: all_sign806avsorig_ttbar14tev2023d4timingpu200wf23034p0c_recogsfelectrons_gedgsfelectrons__reco_obj_esuperclusteroverp]

B-tagging is apparently better
[plot: wf23034pu200_csvv2_b_vs_l]
[plot: wf23034pu200_jbp_b_vs_l]

============================
without PU:
changes are rather small, except for the reduced coverage for tracks with eta>~3.5

generalTrack MTV plots are in line with what's posted with the PR description

electronGsf MTV plots suggest there is a small loss of efficiency
in jets (wf 21253):
[plot: wf21253_gsf_eff_pt]
in ZEE it's barely visible:
[plot: wf21246_gsf_eff_global]

Contributor

slava77 commented Dec 8, 2016

+1

for #16885 1ea58bb

  • changes in iter tracking configuration are as described: made in phase-2 era to cope with CPU&memory use, also with some improvements in performance in physics (eff ~ same, vs lower fakes)
  • jenkins tests pass and comparisons with baseline show differences only in 2023 workflows
  • tests with PU200 and some higher stat tests in QCD and ZEE wflows (no PU) in 2023 D4 show roughly expected behavior

Contributor

cmsbuild commented Dec 8, 2016

This pull request is fully signed and it will be integrated in one of the next CMSSW_9_0_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar

@davidlange6
Contributor

+1

@cmsbuild cmsbuild merged commit 363f619 into cms-sw:CMSSW_9_0_X Dec 9, 2016
7 participants