Crashing on GE2/1 demonstrator in RelVals Validation #36826
Comments
A new Issue was created by @srimanob Phat Srimanobhas. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign pdmv |
New categories assigned: pdmv @bbilin,@wajidalikhan,@jordan-martins,@kskovpen you have been requested to review this Pull request/Issue and eventually sign? Thanks |
Checking on 12_3_0_pre2, I don't see the issue.
FYI @ptcox (just my quick check on Error that shows up CSCDigiValidation:cscDigiValidation (crashed)) |
Here is a stack trace from one of the crashing jobs
|
Hi Phat,
I was not aware of this crash. CSC don't actually make use of any of
those old DigiValidation workflows, but last year Sven Dildick put a lot
of effort into updating them with the object of making more use of them
for the new GEM/CSC integration of trigger primitives. Unfortunately he
has left CMS. I presume a crash is likely to be due to GEM-related code.
Regards,
Tim
Phat Srimanobhas wrote on 1/28/22 15:52:
…
Checking on 12_3_0_pre2, I don't see the issue.
* https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_12_3_0_pre2__fullsim_PU_2021_14TeV-1640179268
* https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_12_3_0_pre2__fullsim_PU_2021_14TeV_HLT-1640182493
FYI @ptcox <https://github.com/ptcox> (just my quick check on Error
that shows up CSCDigiValidation:cscDigiValidation (crashed))
|
assign dqm |
New categories assigned: dqm @jfernan2,@ahmad3213,@rvenditti,@emanueleusai,@pbo0,@pmandrik you have been requested to review this Pull request/Issue and eventually sign? Thanks |
here is the source of crash: FYI @cvuosalo @cms-sw/gem-dpg-l2 |
The crash happens in
being called from
That function has three calls to cmssw/Validation/MuonHits/src/GEMSimHitMatcher.cc Lines 91 to 100 in 9c3a5ca
In principle cmssw/Validation/MuonGEMDigis/plugins/GEMPadDigiValidation.cc Lines 252 to 255 in 9c3a5ca
On a separate note, repeated
) |
FYI @watson-ij |
@watson-ij, in the code shown above, a check is likely needed between lines 98 and 99 that the hit is not a demo chamber hit. If I understand correctly, demo chambers should not be involved in this type of validation. |
@civanch I had a look and I don't think it should be crashing there. The issue looks to be that the DB it's picking up has the GE2/1 chamber, but not the corresponding superchamber for the demonstrator. In PR #36835 I put in a quick fix so that it will build the superchamber, and the file will run through, though I'm not sure whether we could fix the DB instead. The PR also has another fix for a later error that comes up after the superchamber is built. |
@watson-ij Thanks for the quick fix. I think what is missing in the DB geometry should be fixed. Do you have a clue why it is missing in the last updated GEM DB geometry? |
@watson-ij, can you please specify the name of the supervolume involved? This may help @cvuosalo understand what is wrong in the DB. I myself am not sure that some volumes are absent in the DB, because in that case there would be no hits in the demo chambers. |
Well, it's the RecoIdealGeometry that's the issue, so I don't think it's a matter of volumes being missing or of issues in the sim geometry. The conddb builder is just running through the detIds inside of that, and it is picking up the GE1/1 correctly. The GE2/1 chamber and eta partitions are included, but the superchamber isn't, and the superchambers are used as the signal to include the particular stations in the GEM reco geometry. The fix just looks for the demo chamber and adds the superchamber for it if it's missing. I'm not sure of the process to build the RecoIdealGeometry or why the GE1/1 superchambers got included but not the GE2/1 demonstrator, so that's where I would need help. |
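For readers following along, the idea of that fix (scan the detIds, and add a superchamber entry when a chamber has none) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `DetIdSketch` and `addMissingSuperChambers` are hypothetical names, the struct is a simplified stand-in rather than the CMSSW `GEMDetId` class, and the "layer 0 means superchamber" convention is assumed here. The sketch also adds a superchamber for any chamber lacking one, whereas the fix described above looks specifically for the demo chamber.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for a GEM detector id (not the CMSSW GEMDetId class).
struct DetIdSketch {
  int region;   // -1 or +1 endcap
  int station;  // 1 for GE1/1, 2 for GE2/1
  int chamber;  // chamber number within the ring
  int layer;    // assumed convention: 0 is the superchamber, >0 is a chamber layer

  bool operator<(const DetIdSketch& o) const {
    return std::tie(region, station, chamber, layer) <
           std::tie(o.region, o.station, o.chamber, o.layer);
  }
};

// Return a copy of the input ids, with a superchamber id appended for every
// chamber whose superchamber is not already present.
std::vector<DetIdSketch> addMissingSuperChambers(const std::vector<DetIdSketch>& ids) {
  std::set<DetIdSketch> known(ids.begin(), ids.end());
  std::vector<DetIdSketch> out(ids);
  for (const auto& id : ids) {
    if (id.layer == 0)
      continue;  // already a superchamber entry
    DetIdSketch sc{id.region, id.station, id.chamber, 0};
    // insert().second is true only if the superchamber was not known yet
    if (known.insert(sc).second)
      out.push_back(sc);
  }
  assert(out.size() >= ids.size());  // entries are only ever added
  return out;
}
```

Running the function a second time on its own output adds nothing, so a guard of this kind would be harmless once the DB payload itself contains the superchamber.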
Hi All, does anyone know at which point the relevant demonstrator superchamber went missing in the geometry database? |
Hi @srimanob, since the issue is not in the CSC chambers, would you consider renaming this issue? (Today I briefly thought we also needed a new CSC geometry, which is not the case as I understand it.) |
Done. |
We are working on loading the corrected GEM reco geometry into the DB. But after that is done, how can we test to see that the error is fixed? It seems that the crash is quite rare and hard to reproduce. |
Hi @cvuosalo Thanks for taking care of this. If all the fixes are in place, I can run with private production on 10k events to see before we cut the release. I think the error is not so rare, as we see crashes in all relvals, especially on ZMM samples with 0 success. |
@srimanob, thanks for the proposal. It is the best that can be done. |
The candidate GT with the updated GEM reco geometry is 123X_mcRun3_2021_design_Candidate_2022_01_31_23_02_13. |
Running through high stat samples from scratch, everything runs smoothly. Thanks @cvuosalo @watson-ij
|
I also created a backport #36872 to 12_2_X, for just the Validation/MuonGEMHits, since Carl's backport contains all the builder code. |
Looking on Run-3 relvals with CMSSW_12_3_0_pre4 from
Workflows are facing an issue in the 3rd step, RECO+Validation.
It seems we have low statistics, and the issue seems to come from a failure in CSCDigiValidation:cscDigiValidation. Reports from several workflows point to the same issue.