Add ParticleNet to NanoAOD #31096

Merged: 3 commits into cms-sw:master on Aug 26, 2020

@hqucms hqucms commented Aug 7, 2020

PR description:

This PR adds the ParticleNet tagger to NanoAOD. The following scores will be included in NanoAOD:

  • ParticleNet: TvsQCD, WvsQCD, ZvsQCD, HbbvsQCD, HccvsQCD, H4qvsQCD, QCD
  • ParticleNet-MD: Xbb, Xcc, Xqq, QCD

The ParticleNet tagger needs to run in the NanoAOD sequence until we have the tagger stored in MiniAOD. The updated training [V01] should be used for both UL and EOY samples.

FYI @camclean @alefisico

cmsbuild commented Aug 7, 2020

The code-checks are being triggered in jenkins.

cmsbuild commented Aug 7, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31096/17641

cmsbuild commented Aug 7, 2020

A new Pull Request was created by @hqucms (Huilin Qu) for master.

It involves the following packages:

PhysicsTools/NanoAOD

@gouskos, @cmsbuild, @fgolf, @mariadalfonso, @santocch, @peruzzim can you please review it and eventually sign? Thanks.
@gpetruc this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

slava77 commented Aug 7, 2020

@cmsbuild please test

cmsbuild commented Aug 7, 2020

The tests are being triggered in jenkins.

cmsbuild commented Aug 7, 2020

+1
Tested at: 872f4c7
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c69aaa/8657/summary.html
CMSSW: CMSSW_11_2_X_2020-08-07-1100
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild commented Aug 7, 2020

Comparison job queued.

mariadalfonso commented Aug 7, 2020

This PR adds 9 floats, for a total of 37 ID floats for the AK8 jets?

btagHbb = Var("bDiscriminator('pfBoostedDoubleSecondaryVertexAK8BJetTags')",float,doc="Higgs to BB tagger discriminator",precision=10),
btagDDBvL_noMD = Var("bDiscriminator('pfDeepDoubleBvLJetTags:probHbb')",float,doc="DeepDoubleX discriminator (no mass-decorrelation) for H(Z)->bb vs QCD",precision=10),
btagDDCvL_noMD = Var("bDiscriminator('pfDeepDoubleCvLJetTags:probHcc')",float,doc="DeepDoubleX discriminator (no mass-decorrelation) for H(Z)->cc vs QCD",precision=10),
btagDDCvB_noMD = Var("bDiscriminator('pfDeepDoubleCvBJetTags:probHcc')",float,doc="DeepDoubleX discriminator (no mass-decorrelation) for H(Z)->cc vs H(Z)->bb",precision=10),
btagDDBvL = Var("bDiscriminator('pfMassIndependentDeepDoubleBvLJetTags:probHbb')",float,doc="DeepDoubleX (mass-decorrelated) discriminator for H(Z)->bb vs QCD",precision=10),
btagDDCvL = Var("bDiscriminator('pfMassIndependentDeepDoubleCvLJetTags:probHcc')",float,doc="DeepDoubleX (mass-decorrelated) discriminator for H(Z)->cc vs QCD",precision=10),
btagDDCvB = Var("bDiscriminator('pfMassIndependentDeepDoubleCvBJetTags:probHcc')",float,doc="DeepDoubleX (mass-decorrelated) discriminator for H(Z)->cc vs H(Z)->bb",precision=10),
deepTag_TvsQCD = Var("bDiscriminator('pfDeepBoostedDiscriminatorsJetTags:TvsQCD')",float,doc="DeepBoostedJet tagger top vs QCD discriminator",precision=10),
deepTag_WvsQCD = Var("bDiscriminator('pfDeepBoostedDiscriminatorsJetTags:WvsQCD')",float,doc="DeepBoostedJet tagger W vs QCD discriminator",precision=10),
deepTag_ZvsQCD = Var("bDiscriminator('pfDeepBoostedDiscriminatorsJetTags:ZvsQCD')",float,doc="DeepBoostedJet tagger Z vs QCD discriminator",precision=10),
deepTag_H = Var("bDiscriminator('pfDeepBoostedJetTags:probHbb')+bDiscriminator('pfDeepBoostedJetTags:probHcc')+bDiscriminator('pfDeepBoostedJetTags:probHqqqq')",float,doc="DeepBoostedJet tagger H(bb,cc,4q) sum",precision=10),
deepTag_QCD = Var("bDiscriminator('pfDeepBoostedJetTags:probQCDbb')+bDiscriminator('pfDeepBoostedJetTags:probQCDcc')+bDiscriminator('pfDeepBoostedJetTags:probQCDb')+bDiscriminator('pfDeepBoostedJetTags:probQCDc')+bDiscriminator('pfDeepBoostedJetTags:probQCDothers')",float,doc="DeepBoostedJet tagger QCD(bb,cc,b,c,others) sum",precision=10),
deepTag_QCDothers = Var("bDiscriminator('pfDeepBoostedJetTags:probQCDothers')",float,doc="DeepBoostedJet tagger QCDothers value",precision=10),
deepTagMD_TvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:TvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger top vs QCD discriminator",precision=10),
deepTagMD_WvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:WvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger W vs QCD discriminator",precision=10),
deepTagMD_ZvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z vs QCD discriminator",precision=10),
deepTagMD_ZHbbvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZHbbvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z/H->bb vs QCD discriminator",precision=10),
deepTagMD_ZbbvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZbbvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z->bb vs QCD discriminator",precision=10),
deepTagMD_HbbvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:HbbvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger H->bb vs QCD discriminator",precision=10),
deepTagMD_ZHccvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZHccvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z/H->cc vs QCD discriminator",precision=10),
deepTagMD_H4qvsQCD = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:H4qvsQCD')",float,doc="Mass-decorrelated DeepBoostedJet tagger H->4q vs QCD discriminator",precision=10),
deepTagMD_bbvsLight = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:bbvsLight')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z/H/gluon->bb vs light flavour discriminator",precision=10),
deepTagMD_ccvsLight = Var("bDiscriminator('pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ccvsLight')",float,doc="Mass-decorrelated DeepBoostedJet tagger Z/H/gluon->cc vs light flavour discriminator",precision=10),
particleNet_TvsQCD = Var("bDiscriminator('pfParticleNetDiscriminatorsJetTags:TvsQCD')",float,doc="ParticleNet tagger top vs QCD discriminator",precision=10),
particleNet_WvsQCD = Var("bDiscriminator('pfParticleNetDiscriminatorsJetTags:WvsQCD')",float,doc="ParticleNet tagger W vs QCD discriminator",precision=10),
particleNet_ZvsQCD = Var("bDiscriminator('pfParticleNetDiscriminatorsJetTags:ZvsQCD')",float,doc="ParticleNet tagger Z vs QCD discriminator",precision=10),
particleNet_HbbvsQCD = Var("bDiscriminator('pfParticleNetDiscriminatorsJetTags:HbbvsQCD')",float,doc="ParticleNet tagger H(->bb) vs QCD discriminator",precision=10),
particleNet_QCD = Var("bDiscriminator('pfParticleNetJetTags:probQCDbb')+bDiscriminator('pfParticleNetJetTags:probQCDcc')+bDiscriminator('pfParticleNetJetTags:probQCDb')+bDiscriminator('pfParticleNetJetTags:probQCDc')+bDiscriminator('pfParticleNetJetTags:probQCDothers')",float,doc="ParticleNet tagger QCD(bb,cc,b,c,others) sum",precision=10),
particleNetMD_Xbb = Var("bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probXbb')",float,doc="Mass-decorrelated ParticleNet tagger raw X->bb score. For X->bb vs QCD tagging, use Xbb/(Xbb+QCD)",precision=10),
particleNetMD_Xcc = Var("bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probXcc')",float,doc="Mass-decorrelated ParticleNet tagger raw X->cc score. For X->cc vs QCD tagging, use Xcc/(Xcc+QCD)",precision=10),
particleNetMD_Xqq = Var("bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probXqq')",float,doc="Mass-decorrelated ParticleNet tagger raw X->qq (uds) score. For X->qq vs QCD tagging, use Xqq/(Xqq+QCD). For W vs QCD tagging, use (Xcc+Xqq)/(Xcc+Xqq+QCD)",precision=10),
particleNetMD_QCD = Var("bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probQCDbb')+bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probQCDcc')+bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probQCDb')+bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probQCDc')+bDiscriminator('pfMassDecorrelatedParticleNetJetTags:probQCDothers')",float,doc="Mass-decorrelated ParticleNet tagger raw QCD score",precision=10),

tau1 = Var("userFloat('NjettinessAK8Puppi:tau1')",float, doc="Nsubjettiness (1 axis)",precision=10),
tau2 = Var("userFloat('NjettinessAK8Puppi:tau2')",float, doc="Nsubjettiness (2 axis)",precision=10),
tau3 = Var("userFloat('NjettinessAK8Puppi:tau3')",float, doc="Nsubjettiness (3 axis)",precision=10),
tau4 = Var("userFloat('NjettinessAK8Puppi:tau4')",float, doc="Nsubjettiness (4 axis)",precision=10),
n2b1 = Var("userFloat('ak8PFJetsPuppiSoftDropValueMap:nb1AK8PuppiSoftDropN2')", float, doc="N2 with beta=1", precision=10),
n3b1 = Var("userFloat('ak8PFJetsPuppiSoftDropValueMap:nb1AK8PuppiSoftDropN3')", float, doc="N3 with beta=1", precision=10),

Are SFs available for all of these?
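
The doc strings of the particleNetMD_* variables listed above describe how to combine the raw mass-decorrelated scores into binary discriminants. A minimal Python sketch of those ratios follows, assuming the four raw scores have already been read for a jet. The function names and the -1.0 fallback for a non-positive denominator are illustrative choices, not part of this PR.

```python
def _ratio(num: float, den: float) -> float:
    """Return num/den, or -1.0 (illustrative sentinel) if den is not positive."""
    return num / den if den > 0 else -1.0


def xbb_vs_qcd(xbb: float, qcd: float) -> float:
    """X->bb vs QCD: Xbb / (Xbb + QCD)."""
    return _ratio(xbb, xbb + qcd)


def xcc_vs_qcd(xcc: float, qcd: float) -> float:
    """X->cc vs QCD: Xcc / (Xcc + QCD)."""
    return _ratio(xcc, xcc + qcd)


def xqq_vs_qcd(xqq: float, qcd: float) -> float:
    """X->qq vs QCD: Xqq / (Xqq + QCD)."""
    return _ratio(xqq, xqq + qcd)


def w_vs_qcd(xcc: float, xqq: float, qcd: float) -> float:
    """W vs QCD: (Xcc + Xqq) / (Xcc + Xqq + QCD)."""
    return _ratio(xcc + xqq, xcc + xqq + qcd)


if __name__ == "__main__":
    # Made-up scores for a single jet, for illustration only.
    print(xbb_vs_qcd(0.6, 0.2))
    print(w_vs_qcd(0.1, 0.3, 0.4))
```

In an analysis these would typically be evaluated per jet on the FatJet_particleNetMD_* branches; the branch prefix is an assumption based on the variable names above.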

@@ -313,18 +317,21 @@ def nanoAOD_customizeCommon(process):
nanoAOD_addDeepBTag_switch = cms.untracked.bool(False),
nanoAOD_addDeepBoostedJet_switch = cms.untracked.bool(True), # will deactivate this in future miniAOD releases
nanoAOD_addDeepDoubleX_switch = cms.untracked.bool(True),
nanoAOD_addParticleNet_switch = cms.untracked.bool(True),
Contributor

Do you need to re-run ParticleNet even when it is available in the MiniAOD?
For the re-mini, can we use the scores stored in the MiniAOD?
For the old samples [ run2_nanoAOD_94X2016, run2_nanoAOD_94XMiniAODv1, run2_nanoAOD_94XMiniAODv2, run2_nanoAOD_102Xv1, run2_nanoAOD_106Xv1 ], we do indeed need to re-run.

Contributor Author

@hqucms hqucms Aug 7, 2020

> Do you need to re-run ParticleNet even when it is available in the MiniAOD?

No. We can use the ones stored in MiniAOD once they are available (e.g., in UL re-mini).

Contributor

OK. Good, the taggers are already available in the master version.

--> for master:
    nanoAOD_produceParticleNet_switch = cms.untracked.bool(False)
    if run2_nanoAOD_106Xv1 + anyOld:
        nanoAOD_produceParticleNet_switch = cms.untracked.bool(True)
        add deepParticleNet

for 10_6:
    if NOT run2_miniAOD_devel:
        nanoAOD_produceParticleNet_switch = cms.untracked.bool(True)
        add ParticleNet

Probably the same logic should be applied to all the previous taggers.

Contributor Author

> OK. Good, the taggers are already available in the master version.
>
> --> for master:
>     nanoAOD_produceParticleNet_switch = cms.untracked.bool(False)
>     if run2_nanoAOD_106Xv1 + anyOld:
>         nanoAOD_produceParticleNet_switch = cms.untracked.bool(True)
>         add deepParticleNet
>
> for 10_6:
>     if NOT run2_miniAOD_devel:
>         nanoAOD_produceParticleNet_switch = cms.untracked.bool(True)
>         add ParticleNet
>
> Probably the same logic should be applied to all the previous taggers.

Yes indeed. Shall I change this or would you prefer to do it centrally afterwards?

Contributor

should be done here and not afterwards

Contributor Author

> should be done here and not afterwards

Done in 2c7b9ae.
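
The "for master" rule proposed in this thread can be sketched in plain Python as follows. This is only an illustration of the decision, not the code in 2c7b9ae: in CMSSW the same choice would be expressed with era modifiers, and the names `OLD_MINIAOD_ERAS` and `produce_particlenet` are hypothetical. The era names are the ones quoted in the review comment above.

```python
# Eras whose MiniAOD inputs do not contain the ParticleNet scores
# (list taken from the review comment; the constant name is hypothetical).
OLD_MINIAOD_ERAS = frozenset({
    "run2_nanoAOD_94X2016",
    "run2_nanoAOD_94XMiniAODv1",
    "run2_nanoAOD_94XMiniAODv2",
    "run2_nanoAOD_102Xv1",
    "run2_nanoAOD_106Xv1",
})


def produce_particlenet(active_eras) -> bool:
    """Decide whether the tagger must be re-run in the NanoAOD sequence.

    Default is False (scores are read from MiniAOD); it becomes True as soon
    as one of the old eras is active, mirroring the "for master" pseudocode.
    """
    return any(era in OLD_MINIAOD_ERAS for era in active_eras)


if __name__ == "__main__":
    print(produce_particlenet([]))                       # new MiniAOD input
    print(produce_particlenet(["run2_nanoAOD_106Xv1"]))  # old MiniAOD input
```

The 10_6 variant in the pseudocode inverts the default (re-run unless run2_miniAOD_devel is active) and is not covered by this sketch.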

hqucms commented Aug 7, 2020

@mariadalfonso
SFs will be available for ParticleNet TvsQCD, WvsQCD and all the ParticleNet-MD scores. The long-term plan is to replace most of the DeepAK8 taggers by their ParticleNet counterparts, but depending on the production plans I guess the DeepAK8 taggers will still be needed by ongoing analyses.

Also given that on average the number of high-pt AK8 jets in typical events (e.g., ttbar) is <<1, the overall space increase should be quite small.

cmsbuild commented Aug 7, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c69aaa/8657/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 25 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2612401
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2612352
  • DQMHistoTests: Total skipped: 48
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 1.155 KiB( 34 files compared)
  • DQMHistoSizes: changed ( 1325.7 ): 1.155 KiB Physics/NanoAODDQM
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

cmsbuild commented Aug 17, 2020

The tests are being triggered in jenkins.
Test Parameters:

@cmsbuild

+1
Tested at: 2c7b9ae
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c69aaa/8788/summary.html
CMSSW: CMSSW_11_2_X_2020-08-16-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild

Comparison job queued.

@cmsbuild

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c69aaa/8788/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-c69aaa/10224.15_TTbar_13+2017PU_JMENano+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoPU+HARVESTPU+Nano
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-c69aaa/1325.61_TTbar_13_106Xv1NanoAODINPUT+TTbar_13_106Xv1NanoAODINPUT+NANOAODMC2017_106XMiniAODv1
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-c69aaa/1325.81_TTbar_13_106Xv1NanoAODINPUT+TTbar_13_106Xv1NanoAODINPUT+NANOEDMMC2017_106XMiniAODv1+HARVESTNANOAODMC2017_106XMiniAODv1
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-c69aaa/136.8522_RunJetHT2018A_nanoUL+RunJetHT2018A_nanoUL+NANOEDM2018_106Xv1+HARVESTNANOAOD2018_106Xv1

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 532 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2608246
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2608197
  • DQMHistoTests: Total skipped: 48
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 1.417 KiB( 34 files compared)
  • DQMHistoSizes: changed ( 1325.7 ): 1.417 KiB Physics/NanoAODDQM
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@santocch

+1

@mariadalfonso

> @mariadalfonso
> SFs will be available for ParticleNet TvsQCD, WvsQCD and all the ParticleNet-MD scores. The long-term plan is to replace most of the DeepAK8 taggers by their ParticleNet counterparts, but depending on the production plans I guess the DeepAK8 taggers will still be needed by ongoing analyses.
> Also given that on average the number of high-pt AK8 jets in typical events (e.g., ttbar) is <<1, the overall space increase should be quite small.

Can you quote the space increase separately for Mini and Nano? ttbar events are fine.

@qliphy
Do you have any news on this?

qliphy commented Aug 21, 2020

> > @mariadalfonso
> > SFs will be available for ParticleNet TvsQCD, WvsQCD and all the ParticleNet-MD scores. The long-term plan is to replace most of the DeepAK8 taggers by their ParticleNet counterparts, but depending on the production plans I guess the DeepAK8 taggers will still be needed by ongoing analyses.
> > Also given that on average the number of high-pt AK8 jets in typical events (e.g., ttbar) is <<1, the overall space increase should be quite small.
>
> Can you quote the space increase separately for Mini and Nano? ttbar events are fine.
>
> @qliphy
> Do you have any news on this?

Maybe this is for @hqucms?

hqucms commented Aug 25, 2020

> Can you quote the space increase separately for Mini and Nano? ttbar events are fine.

@mariadalfonso

The increase in Mini is negligible (1 out of 110 b-tag discriminants per AK8 jet).
The increase in Nano is topology dependent -- for ttbar it is ~0.01 kB/evt [1].

[1] before: http://hqu.web.cern.ch/hqu/dev/pr31096-pre.html
after: http://hqu.web.cern.ch/hqu/dev/pr31096-post.html
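
The per-score cost in Nano is also influenced by the precision=10 argument used in the Var definitions quoted earlier in this thread. Assuming that `precision` denotes the number of float32 mantissa bits kept before compression (an assumption here, not stated in this thread), the effect can be sketched in plain Python. `reduce_mantissa` is an illustrative helper, not the CMSSW implementation, and it does not special-case NaN or infinity.

```python
import struct


def reduce_mantissa(x: float, bits: int) -> float:
    """Round x, viewed as a float32, to `bits` mantissa bits (0 < bits < 23)."""
    if not 0 < bits < 23:
        # Nothing to drop: just return the float32 representation of x.
        return struct.unpack("<f", struct.pack("<f", x))[0]
    (raw,) = struct.unpack("<I", struct.pack("<f", x))
    shift = 23 - bits                        # number of low mantissa bits dropped
    mask = (0xFFFFFFFF >> shift) << shift    # keeps sign, exponent, top mantissa bits
    half = 1 << (shift - 1)                  # adding half a step rounds to nearest
    raw = (raw + half) & mask & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", raw))[0]


if __name__ == "__main__":
    score = 0.7312  # made-up tagger score, for illustration only
    stored = reduce_mantissa(score, 10)
    print(score, stored, abs(stored - score) / score)
```

With 10 mantissa bits the relative rounding error stays below about 2^-11, so a score in [0, 1] keeps roughly three significant decimal digits, while the zeroed low bits tend to compress well.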

@mariadalfonso

+xpog

This PR adds ParticleNet to Nano for EOY, UL and master.
For DeepAK8, this PR deactivates the recomputation for UL Mini and for Mini made with master, since it is already available in MiniAOD, while it still recomputes it for EOY.

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

mariadalfonso commented Aug 25, 2020

@hqucms please consider making the backport to 10_6

qliphy commented Aug 26, 2020

+1

@cmsbuild cmsbuild merged commit 30a9d8f into cms-sw:master Aug 26, 2020