Fix HLT Timing script for CI #45705

Merged
merged 2 commits into from
Aug 21, 2024

rovere
Contributor

@rovere rovere commented Aug 15, 2024

PR description:

This Pull Request addresses an issue with the Phase2 HLT performance monitoring script, which has been non-functional for several weeks due to incompatible changes introduced by L1T. The script has been updated to use a more recent version of the input dataset that includes all the necessary products. Additionally, the geometry and Global Tag (GT) have been updated to ensure compatibility with the new dataset.

PR validation:

Ran locally without problems.

@cmsbuild
Contributor

cmsbuild commented Aug 15, 2024

cms-bot internal usage

@cmsbuild
Contributor

A new Pull Request was created by @rovere for master.

It involves the following packages:

  • HLTrigger/Configuration (hlt)

@Martin-Grunewald, @cmsbuild, @mmusich can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @SohamBhattacharya, @VourMa, @missirol, @mmusich, @silviodonato this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@rovere
Contributor Author

rovere commented Aug 15, 2024

enable hlt_p2_timing

@rovere
Contributor Author

rovere commented Aug 15, 2024

@cmsbuild please test

@mmusich
Contributor

mmusich commented Aug 15, 2024

> which has been non-functional for several weeks

For my own education, where is one supposed to check?

@rovere
Contributor Author

rovere commented Aug 16, 2024

> which has been non-functional for several weeks

> For my own education, where is one supposed to check?

Hi @mmusich,

Typically, this test is included in the regular test suite for every Integration Build (IB). The results, particularly the HLT pie chart, are usually accessible via a link from the main IB portal. However, the link is currently missing because the test is broken. This PR aims to address this issue.

@cmsbuild
Contributor

-1

Failed Tests: HLTP2Timing
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f2d24e/40941/summary.html
COMMIT: 1ad0870
CMSSW: CMSSW_14_1_X_2024-08-15-1100/el8_amd64_gcc12
Additional Tests: HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45705/40941/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 45
  • DQMHistoTests: Total histograms compared: 3422510
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3422481
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 44 files compared)
  • Checked 196 log files, 165 edm output root files, 45 DQM output files
  • TriggerResults: no differences found

@smuzaffar
Contributor

@rovere, the HLT timing test still fails with this error:

+ cmsDriver.py Phase2 -s L1P2GT,HLT:75e33_timing --processName=HLTX --conditions auto:phase2_realistic_T33 --geometry Extended2026D110 --era Phase2C17I13M9 --customise SLHCUpgradeSimulations/Configuration/aging.customise_aging_1000 --eventcontent FEVTDEBUGHLT --filein= --mc --nThreads 4 '--inputCommands=keep *, drop *_hlt*_*_HLT, drop triggerTriggerFilterObjectWithRefs_l1t*_*_HLT' -n 1000 --no_exec '--output={}'
Traceback (most recent call last):
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02850/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_X_2024-08-14-2300/bin/el8_amd64_gcc12/cmsDriver.py", line 40, in <module>
    run()
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02850/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_X_2024-08-14-2300/bin/el8_amd64_gcc12/cmsDriver.py", line 11, in run
    options = OptionsFromCommandLine()
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02850/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_14_1_X_2024-08-15-1100/src/Configuration/Applications/python/cmsDriverOptions.py", line 36, in OptionsFromCommandLine
    options=OptionsFromItems(sys.argv[1:])
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02850/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_14_1_X_2024-08-15-1100/src/Configuration/Applications/python/cmsDriverOptions.py", line 153, in OptionsFromItems
    options.filein=trimmedEvtType+"_"+prec_step[first_step]+"."+filesuffix
KeyError: 'L1P2GT'

@rovere
Contributor Author

rovere commented Aug 16, 2024

Hi @smuzaffar, the first error is a consequence of the missing dataset.
I'll copy the dataset to the online machine and retrigger the test in this PR later this afternoon.
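
For readers following the traceback, here is a minimal Python sketch of how an empty `--filein=` can end in `KeyError: 'L1P2GT'`. It assumes that cmsDriver only tries to guess a default input file name when no input file is given, and that the guess is a dictionary lookup keyed on the first step, as the failing line in the traceback suggests. The table contents and the function name `default_input_file` are hypothetical and are not taken from `cmsDriverOptions.py`.

```python
# Hypothetical stand-in for the step -> preceding-step table used by cmsDriver.
# The entries are illustrative only; the real table lives in cmsDriverOptions.py.
PREC_STEP = {
    "HLT": "RAW",
    "RECO": "DIGI",
}


def default_input_file(evt_type, steps, filein, filesuffix="root"):
    """Return the input file name, guessing a default only when none is given."""
    if filein:
        # An explicit input file was provided, so nothing has to be guessed.
        return filein
    # The first step is the text before the first comma, without any ":option".
    first_step = steps.split(",")[0].split(":")[0]
    # Same shape as the failing line in the traceback: a first step that is
    # missing from the table raises KeyError.
    return evt_type + "_" + PREC_STEP[first_step] + "." + filesuffix


try:
    default_input_file("Phase2", "L1P2GT,HLT:75e33_timing", filein="")
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'L1P2GT'
```

Under these assumptions, once the dataset is available and `--filein` is non-empty, the lookup is never reached, which would be consistent with the error disappearing after the files were copied.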

@rovere
Contributor Author

rovere commented Aug 16, 2024

@smuzaffar, the files are now available on the machine. Is there a way to trigger only the hlt-p2-timing test without re-running everything else?

@rovere
Contributor Author

rovere commented Aug 16, 2024

@cmsbuild please test

@cmsbuild
Contributor

Pull request #45705 was updated. @Martin-Grunewald, @cmsbuild, @mmusich can you please check and sign again.

@rovere
Contributor Author

rovere commented Aug 19, 2024

@cmsbuild please test

@cmsbuild
Contributor

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f2d24e/41012/summary.html
COMMIT: 3665896
CMSSW: CMSSW_14_1_X_2024-08-18-2300/el8_amd64_gcc12
Additional Tests: HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45705/41012/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

@mmusich
Contributor

mmusich commented Aug 19, 2024

cmsbot tests run, this HLT P2 Timing: chart looks empty though.

@rovere
Contributor Author

rovere commented Aug 20, 2024

> cmsbot tests run, this HLT P2 Timing: chart looks empty though.

indeed, maybe @smuzaffar knows why?

@smuzaffar
Contributor

I am looking into this

@smuzaffar
Contributor

@rovere, hlt p2 timing chart is working now

@mmusich
Contributor

mmusich commented Aug 20, 2024

> hlt p2 timing chart is working now

is it normal to have such a large "idle" fraction?

@rovere
Contributor Author

rovere commented Aug 21, 2024

> hlt p2 timing chart is working now

> is it normal to have such a large "idle" fraction?

That's a good question. In general, I don't think we should have such a big contribution from Idle. Maybe @fwyzard could remind us, once again, what Idle means in this context, exactly?

I think, in any case, that this PR could proceed no matter what since the mechanics of the test are working ok. Agree?

@mmusich
Contributor

mmusich commented Aug 21, 2024

> I think, in any case, that this PR could proceed no matter what since the mechanics of the test are working ok. Agree?

agreed.

@mmusich
Contributor

mmusich commented Aug 21, 2024

+hlt

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@fwyzard
Contributor

fwyzard commented Aug 21, 2024

> Maybe @fwyzard could remind us, once again, what Idle means in this context, exactly?

It should mean that the TBB worker threads do not have any task they can do, so are sitting idle.
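
As a rough illustration of that definition, the sketch below treats "idle" as the share of available thread-time in which the workers had no task. This accounting is an assumption made for illustration and is not necessarily how the timing service fills the chart. The function name `idle_fraction` and the numbers are invented, and only the thread count of 4 matches the `--nThreads 4` used in the test command.

```python
def idle_fraction(wall_time_s, n_threads, busy_time_s):
    """Share of available thread-time during which workers had no task.

    wall_time_s: elapsed wall-clock time of the job, in seconds.
    n_threads:   number of worker threads.
    busy_time_s: per-thread time spent running tasks, in seconds.
    """
    available = wall_time_s * n_threads
    if available <= 0:
        raise ValueError("wall time and thread count must be positive")
    # Whatever part of the available thread-time is not spent in tasks is idle.
    idle = max(available - sum(busy_time_s), 0.0)
    return idle / available


# Invented numbers: a 100 s job on 4 threads that were busy for 90, 60, 30 and 20 s.
print(idle_fraction(100.0, 4, [90.0, 60.0, 30.0, 20.0]))  # 0.5
```

With this picture, a large idle slice points to threads waiting for work (for example because of serial sections or too few concurrent events) rather than to a slow module.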

@mandrenguyen
Contributor

+1

@cmsbuild cmsbuild merged commit daa9c3d into cms-sw:master Aug 21, 2024
12 checks passed