-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failure of Validation/Geometry unit test in CMSSW_11_0_ROOT6_X IBs #136
Comments
A new Issue was created by @Dr15Jones Chris Jones. @davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
Thanks @Dr15Jones , I have opened a JIRA issue https://sft.its.cern.ch/jira/browse/ROOT-10353 |
@Dr15Jones , DataFormats/SiPixelDetId/src/classes_def.xml:
|
@vgvassilev , @oshadura any idea about this. With latest ROOT master although the python ROOT is fixed but this unit test is still failing. |
Looks like a teardown issue. Is this the full backtrace? ping @pcanal. |
yes this is full backtrace |
Which ROOT commit is that with? |
these are the commits when we noticed this failure root-project/root@7256943...5c1a587 |
Well that's not good. Some of those were supposed to 'fix' this kind of problem. So I will need a reproducer :( |
on any of cmsdev machines ( e.g. cmsdev21 ) do this. Note that root in this release is already build in debug mode. You might want to run it under
|
The crash is caused by an exception thrown during a TClass construction (and the TClass constructor not being exception safe):
|
So next we need to understand why:
|
and the problem seems to be that the python interpreter is 'shutdown/finalized' in the middle of the cmsRun process and that has the (new) consequence or marking TCling has been in shutdown mode which prevents it to give 'interpreter' information to TClass. essential part of the stack trace
full stack trace:
|
Do I understand correctly that the CMS code in question tries to:
|
the python state (job configuration) is not kept in memory through the computational phase of the job.
… On Oct 17, 2019, at 3:39 PM, Axel Naumann ***@***.***> wrote:
Do I understand correctly that the CMS code in question tries to:
• shut down the (python) interpreter
• then opens a TFile?
What's the intent behind that - is this a "temporary python interpreter"?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
@etejedor any ideas? |
If Python is being shut down then the teardown handler that PyROOT programs will be executed and I understand this is a problem only now because before we did not call |
Yes that is the challenge. We added the call to TCling::Shutdown so that for standalone python the process tear down would not trigger fresh initialization (that ends up using some objects that have already been tear down) but this must not be called if python is shutdown before the end of the process ... since, as it is the case here, the process goes on to need ROOT for the rest of its duration (and has another point where the shutdown is called, i.e. the TApplication object tear down). |
Should we call TROOT::EndOfProcessCleanups from within python? What is the motivation for doing that as opposed to leaving ROOT itself to decide when to call it? |
@vgvassilev See 7a592f5: [Exp PyROOT] Run cleanup to nonify Python proxies at teardown This is related to:
|
I see, then we should probably have a special flag to the end of process cleanups denoting some entries are with ‘externally controlled lifetime’ meaning that they will be deleted externally before end of process. If they are not root will do that then. Something like shared_ptr semantics. |
We now get this error with ROOT 6.18 too. We do not get this error with root tag v6-18-04 but we do see it with tip of root 6-18-00-patches branch i.e. one of these ( root-project/root@v6-18-04...869553a ) changes are causing it |
Thanks, that seems to contradict @pcanal 's diagnosis in #136 (comment) that this is related to 7a592f5 I've created https://sft.its.cern.ch/jira/browse/ROOT-10469 to track this on our side. |
Indeed (but it does not really change the end result), the real cause is a 'tightening'/'increase' of the ROOT internal shutdown mechanism. I.e. now (in master and 6.18), the fact that the Shutdown has started is recorded and that information is use to prevent the use of (assumingly but incorrectly in this use case) deleted internal resources (like global static objects used for caching) and result in a (intentional) non-functional state. I.e. we still need to resolve the dichotomy between the 'python-is-tear-down-mid-process' (CMS case) vs 'python-is-tear-down-at-the-end-of-process' (python prompt case). |
@smuzaffar, @etejedor, has pushed a fix in the master, however, could you test if this root-project#4675 solves your issue. This would help us better understand a class of problems that show up from time to time in ROOT. |
@vgvassilev , @Dr15Jones had fixed this unit test cms-sw/cmssw#28619 and the change is already merged. So testing root-project#4675 is not going to tell if it fixes the issue or not. |
Ah, too bad :( |
this was fixed |
The unit test materialBudgetTrackerPlots in Validation/Geometry has been failing. I've tracked the problem down to doing
import ROOT
in the cmsRun configuration file. This even fails just by doing the same import from withinpython
itself. The import causes an assert with the messageThe text was updated successfully, but these errors were encountered: