PyROOT in root 6.24 branch hangs while loading CMSSW library #7718

Closed
smuzaffar opened this issue Mar 26, 2021 · 69 comments · Fixed by #7773, #7780, #9764 or #9798

Labels: bug, experiment (Affects an experiment / reported by its software & computing experts), priority:critical

@smuzaffar
Contributor

Hi,
We are trying to update to the root 6.24 branch (commit 7c0cfac) in the CMSSW special integration builds (https://github.com/cms-sw/cmsdist/pull/6746/files ), but it looks like PyROOT fails or hangs for some specific dictionaries.

While building cmssw, we use https://github.com/cms-sw/cmssw/blob/master/FWCore/Utilities/scripts/edmCheckClassVersion to check the class versions of the root dictionaries. This works for most of our dictionaries, e.g. the following two run fine ( https://github.com/cms-sw/cmssw/blob/master/DataFormats/TauReco/src/classes_def_hlt.xml, https://github.com/cms-sw/cmssw/blob/master/DataFormats/TauReco/src/classes_def_1.xml ):

> src/FWCore/Utilities/scripts/edmCheckClassVersion -l libDataFormatsTauReco.so -x src/DataFormats/TauReco/src/classes_def_hlt.xml
> src/FWCore/Utilities/scripts/edmCheckClassVersion -l libDataFormatsTauReco.so -x src/DataFormats/TauReco/src/classes_def_1.xml

but it fails/hangs for https://github.com/cms-sw/cmssw/blob/master/DataFormats/TauReco/src/classes_def_2.xml

> src/FWCore/Utilities/scripts/edmCheckClassVersion -l libDataFormatsTauReco.so -x src/DataFormats/TauReco/src/classes_def_2.xml

Most of the time the above command just hangs with the error in https://muzaffar.web.cern.ch/root624/err1.log , but once I was able to get this error: https://muzaffar.web.cern.ch/root624/err.log . Can you please look into it and see if this log helps?
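
Because the check mostly hangs rather than failing outright, one way to tell a hang from a crash in an automated build is to run the command under a timeout. The following is a minimal sketch and is not part of edmCheckClassVersion or the CMSSW build; the function name `run_with_timeout` and the 600 second default are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=600.0):
    """Run cmd and classify the outcome as 'ok', 'failed' or 'hung'.

    Returns a (status, returncode) tuple; returncode is None for a hang.
    A process killed by a signal (e.g. a segfault) has a negative
    returncode and is reported as 'failed'.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return "hung", None
    return ("ok" if proc.returncode == 0 else "failed"), proc.returncode


if __name__ == "__main__":
    # Stand-in command; in practice cmd would be the edmCheckClassVersion
    # command line from the report, split into a list of arguments.
    status, code = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30)
    print(status, code)
```

Passing the command as a list avoids shell quoting issues with the long build paths.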

In case you want to try it yourself, you can go to cmsdev25 and do

> cd /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100
> cmsenv
> /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/src/FWCore/Utilities/scripts/edmCheckClassVersion -l libDataFormatsTauReco.so -x /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/src/DataFormats/TauReco/src/classes_def_2.xml

FYI @mrodozov @makortel @Dr15Jones

@smuzaffar smuzaffar added the bug label Mar 26, 2021
@smuzaffar
Contributor Author

The root commit b802a6b was working fine, so the changes we are testing are b802a6b...7c0cfac
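
With a known-good and a known-bad commit, the range can in principle be narrowed by bisection, which is what `git bisect` automates. A minimal sketch of the idea, assuming an ordered list of commits and a hypothetical `is_good` predicate that would build ROOT at a commit and run the check; the intermediate commit names below are placeholders, not real ROOT commits.

```python
def first_bad(commits, is_good):
    """Return the first commit for which is_good() is False.

    Assumes commits[0] is good, commits[-1] is bad, and that once a
    commit is bad every later one is bad too (a single regression).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


if __name__ == "__main__":
    # Only the two end points come from the report; c1..c5 are placeholders.
    history = ["b802a6b", "c1", "c2", "c3", "c4", "c5", "7c0cfac"]
    broken_from = history.index("c4")  # pretend c4 introduced the hang
    print(first_bad(history, lambda c: history.index(c) < broken_from))
```

Each `is_good` call corresponds to a full ROOT and CMSSW build, so the logarithmic number of steps matters.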

@Axel-Naumann Axel-Naumann self-assigned this Mar 29, 2021
@Axel-Naumann Axel-Naumann added this to the 6.24/00 milestone Mar 29, 2021
@Axel-Naumann
Member

Would you be able to provide me with a valgrind report? --num-callers=50 --track-origins=yes --suppressions=$ROOTSYS/etc/valgrind-root.supp would help a lot.
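
For reference, the requested options can be assembled into a full command line. A minimal sketch, assuming valgrind is wrapped around the Python interpreter that runs the script; the helper name `valgrind_argv` and the way the target command is passed in are assumptions, and only the three valgrind options come from the comment above.

```python
import os


def valgrind_argv(target_cmd, rootsys=None):
    """Prefix target_cmd with valgrind and the options requested above."""
    if rootsys is None:
        # ROOTSYS is expected to be set in the environment of the ROOT build.
        rootsys = os.environ.get("ROOTSYS", "")
    supp = os.path.join(rootsys, "etc", "valgrind-root.supp")
    return [
        "valgrind",
        "--num-callers=50",
        "--track-origins=yes",
        "--suppressions=" + supp,
    ] + list(target_cmd)


if __name__ == "__main__":
    cmd = valgrind_argv(
        # Assumption: the script is started through the python executable.
        ["python", "src/FWCore/Utilities/scripts/edmCheckClassVersion",
         "-l", "libDataFormatsTauReco.so",
         "-x", "src/DataFormats/TauReco/src/classes_def_2.xml"],
        rootsys="/path/to/root",  # placeholder, not a real install location
    )
    print(" ".join(cmd))
```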

@Axel-Naumann
Member

Axel-Naumann commented Mar 29, 2021

This might be a dupe of #7657 - but I'd like to make progress with both of them independently to not serialize progress towards v6.24/00! (I.e. I'd still very much appreciate the valgrind report.)

@smuzaffar
Contributor Author

@Axel-Naumann , valgrind also hangs without printing any useful information. Under gdb I see this [a]. If I build root in Debug mode then I do not get this segmentation fault.
[a]

(gdb) where
#0  0x00007ffff6f5d272 in _int_malloc () from /lib64/libc.so.6
#1  0x00007ffff6f6078c in malloc () from /lib64/libc.so.6
#2  0x00007ffff67ad7c5 in operator new (sz=127) at ../../../../libstdc++-v3/libsupc++/new_op.cc:50
#3  0x00007ffff683fc6d in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate (this=this@entry=0x7fffffff38c0, __pos=63, __len1=__len1@entry=0,
    __s=0x7ffff471c3cb "::", __len2=2)
    at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_11_1_0_pre6-slc7_amd64_gcc900/build/CMSSW_11_1_0_pre6-build/BUILD/slc7_amd64_gcc900/external/gcc/9.3.0/gcc-9.3.0/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:993
#4  0x00007ffff684127b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append (this=0x7fffffff38c0, __s=<optimized out>, __n=<optimized out>)
    at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_11_1_0_pre6-slc7_amd64_gcc900/build/CMSSW_11_1_0_pre6-build/BUILD/slc7_amd64_gcc900/external/gcc/9.3.0/gcc-9.3.0/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/char_traits.h:300
#5  0x00007ffff199d41f in cling::LookupHelper::findScope(llvm::StringRef, cling::LookupHelper::DiagSetting, clang::Type const**, bool) const ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCling.so
#6  0x00007ffff19237ef in TClingClassInfo::TClingClassInfo(cling::Interpreter*, char const*, bool) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCling.so
#7  0x00007ffff189c644 in TCling::GetInterpreterTypeName(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, bool) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCling.so
#8  0x00007ffff6b7745a in TClass::GetClass(char const*, bool, bool, unsigned long, unsigned long) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#9  0x00007ffff6b9f27b in TProtoClass::FillTClass(TClass*) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#10 0x00007ffff6b80bea in TClass::Init(char const*, short, std::type_info const*, TVirtualIsAProxy*, char const*, char const*, int, int, ClassInfo_t*, bool) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#11 0x00007ffff6b821c2 in TClass::TClass(char const*, short, std::type_info const&, TVirtualIsAProxy*, char const*, char const*, int, int, bool) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#12 0x00007ffff6b826d9 in ROOT::CreateClass(char const*, short, std::type_info const&, TVirtualIsAProxy*, char const*, char const*, int, int) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#13 0x00007ffff6b918ea in ROOT::TGenericClassInfo::GetClass() ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#14 0x00007ffff6b76fea in TClass::GetClass(char const*, bool, bool, unsigned long, unsigned long) ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#15 0x00007fffdedbc05b in ?? ()
#16 0x000000000000042e in ?? ()
#17 0x0000000006c6f350 in ?? ()
#18 0x0000000006d34680 in ?? ()
#19 0x0000000006c76630 in ?? ()
#20 0x0000000006c764c0 in ?? ()
#21 0x0000000006c6f350 in ?? ()
#22 0x0000000006c76630 in ?? ()
#23 0x00007fffffff42a8 in ?? ()
#24 0x00000000057251f0 in ?? ()
#25 0x00007ffff6b777d0 in ?? ()
   from /build/muz/r624/w/tmp/BUILDROOT/ea8220342d406ab7dbc2d210a1e9351b/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_ROOT624_X_2021-03-25-1100/external/slc7_amd64_gcc900/lib/libCore.so
#26 0x00007fffffff42a8 in ?? ()
#27 0x00007fffffff4220 in ?? ()
#28 0x0000000100000000 in ?? ()
#29 0x0000000000000000 in ?? ()
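
The resolved frames above go through `TClass::GetClass`. A minimal PyROOT sketch that exercises the same entry point for a single class, assuming (this is not confirmed in the thread) that the check loads the library and then looks up each class by name; `check_class` is a hypothetical helper, and the block needs a working ROOT installation to be called.

```python
def check_class(library, class_name):
    """Load a dictionary library and look up one class through TClass.

    Returns the class version, or None if ROOT has no TClass for the name.
    Raises RuntimeError if the library cannot be loaded.
    """
    import ROOT  # imported lazily so this module loads without ROOT

    # TSystem::Load returns a negative value on failure
    # (0 means loaded, 1 means it was already loaded).
    if ROOT.gSystem.Load(library) < 0:
        raise RuntimeError("could not load " + library)

    cls = ROOT.TClass.GetClass(class_name)  # same call as frames #8 and #14
    if not cls:  # a null TClass pointer is falsy in PyROOT
        return None
    return cls.GetClassVersion()
```

Calling this for one class at a time from classes_def_2.xml could help isolate which lookup triggers the hang.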

@Axel-Naumann
Member

"In case you want to try it yourself then you go to cmsdev25"

Help! @pcanal maybe? What I'm after, given the backtrace of #7718 (comment) , is which frames cause the inf loop, i.e. which is the frame that doesn't return.
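
One generic way to answer that is to interrupt the hung process several times and compare the backtraces: frames that are identical in every sample have not returned, and the innermost of them is the candidate for the loop. A minimal sketch of that comparison, assuming the stacks have already been collected as lists of function names, outermost first; the sample data is invented for illustration and is not taken from this report.

```python
def stuck_frame(samples):
    """Return the innermost frame shared by all stack samples.

    Each sample is a list of function names, outermost frame first.
    Frames below the returned one differ between samples, so they do
    return; the returned frame is the deepest one that never does.
    Returns None if the samples share no leading frame.
    """
    common = []
    for frames in zip(*samples):  # stops at the shortest sample
        if len(set(frames)) != 1:
            break
        common.append(frames[0])
    return common[-1] if common else None


if __name__ == "__main__":
    # Invented samples: three interrupts of the same hung process.
    samples = [
        ["main", "load_library", "fill_class", "lookup", "allocate"],
        ["main", "load_library", "fill_class", "lookup"],
        ["main", "load_library", "fill_class", "resolve_name"],
    ]
    print(stuck_frame(samples))
```

Comparing by name alone cannot distinguish a frame that never returned from one that was re-entered between samples, so frame addresses would be a stricter key.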

@smuzaffar what we can also try, because of the "only happens in release builds", is #7752

@smuzaffar
Contributor Author

@Axel-Naumann , do you have #7752 equivalent for v6.24 branch?

@Axel-Naumann
Member

Axel-Naumann commented Mar 31, 2021

@smuzaffar that's #7767

@smuzaffar
Contributor Author

@Axel-Naumann , it looks like some recent development in the v6.24 branch has fixed the hanging issue. I have tested 126c9c8 (without #7767) and this time the cmssw build was successful. We get runtime errors now; see the details here: cms-sw/cmsdist#6777 (comment) . You can find the crash log at https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a7de73/13897/runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step3_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log.

With your changes in #7767 (on top of 126c9c8), things look much better. PR tests ( cms-sw#153 (comment) ) show no build or runtime errors, but we do see some comparison differences for our reconstruction code.

@Axel-Naumann
Member

Thank you, @smuzaffar ! Am I reading this correctly that it's currently unknown whether the comparison differences are introduced by #7752 or were pre-existing?

@smuzaffar
Contributor Author

smuzaffar commented Apr 1, 2021

These differences were not there the last time we updated to the v6.24 branch commit b802a6b (see cms-sw/cmsdist#6730 (comment) and https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_11_3_X_2021-03-14-2300+963134/41710/validateJR.html ).

It could be that some of the latest cmssw changes caused these differences. I am re-running the tests based on the latest cmssw release now.

@Axel-Naumann
Member

Thanks. To make progress towards v6.24 I will merge #7752 and #7767 - IIUC they improve the situation also for CMS (they do for ATLAS). I hope that this does not complicate finding the source of the comparison failure.

@smuzaffar
Contributor Author

@Axel-Naumann , based on the latest cmssw ROOT624 IB, the comparison results look good now: cms-sw#153 (comment) .

@Axel-Naumann
Member

Phew, thanks, @smuzaffar ! But I have bad news: I needed to revisit the PRs #7752 / #7767 . I will have to ask you the favor of testing another PR replacing those, sorry about that! (But at least we now know what the baseline is: the comparison is good!) I'll be back...

@smuzaffar
Contributor Author

no problem, just ping me when you have something to test @Axel-Naumann

@Axel-Naumann
Member

I now have the following:

Could you let me know whether that also solves the issues you saw? Any one of them is enough. Thank you, @smuzaffar !

@smuzaffar
Contributor Author

@Axel-Naumann , v6.24 fix looks good for CMS . All tests passed cms-sw#155 (comment)

@pcanal
Member

pcanal commented Jun 8, 2021

See also #7754

@Axel-Naumann Axel-Naumann modified the milestones: 6.24/02, 6.26/00 Jun 8, 2021
@hahnjo
Member

hahnjo commented Jul 19, 2021

@pcanal @Axel-Naumann this issue is open, but marked as fixed. Reading #7718 (comment), should this maybe also say 6.24/02?

@mrodozov

mrodozov commented Sep 29, 2021

@pcanal I'd like to bring the 6.22 issue from here to your attention, to answer a question by @osschar from yesterday

@osschar
Contributor

osschar commented Sep 29, 2021

3480394

@alja and I saw exceptions when reading a PFCandidate vector; the size obtained by the auto-generated code was O(10^20).

We have a relatively slim reproducer on top of FWLite and can dig it out / give access to the machine here at UCSD if relevant.

#0  0x00007ffff17ae32e in __cxxabiv1::__cxa_throw (obj=obj@entry=0xf49a880, tinfo=0x7ffff18d4270 <typeinfo for std::length_error>, dest=0x7ffff17c34f0 <std::length_error::~length_error()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:78
#1  0x00007ffff17a4cd4 in std::__throw_length_error (__s=0x7ffff7f98726 "vector::_M_fill_insert") at ../../../../../libstdc++-v3/src/c++11/functexcept.cc:78
#2  0x00007ffff7f758d2 in std::vector<void const*, std::allocator<void const*> >::_M_check_len (this=0x13ff4df8, __n=18446744073692016332, __s=0x7ffff7f98726 "vector::_M_fill_insert") at /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/bits/stl_vector.h:1756
#3  0x00007ffff7f7ab11 in std::vector<void const*, std::allocator<void const*> >::_M_fill_insert (this=0x13ff4df8, __position=..., __n=18446744073692016332, __x=@0x7fffffff9320: 0x0) at /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/bits/vector.tcc:558
#4  0x00007ffff7f7a2df in std::vector<void const*, std::allocator<void const*> >::resize (this=0x13ff4df8, __new_size=18446744073692016332, __x=@0x7fffffff9320: 0x0) at /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/bits/stl_vector.h:957
#5  0x00007fffd599ae1c in ROOT::read_recocLcLPFCandidate_2 (target=0x13ff4c40 "\250\373\245\325\377\177", oldObj=0x7fffffff9380) at DataFormatsParticleFlowCandidate/a/DataFormatsParticleFlowCandidate_xr.cc:2669
#6  0x00007ffff472c4e5 in TStreamerInfo::ReadBufferArtificial<char**> (this=this@entry=0x33e2c20, b=..., arr=@0x7fffffff9530: 0x14259120, aElement=aElement@entry=0x85f9470, narr=narr@entry=1529, eoffset=eoffset@entry=0)
    at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/io/io/src/TStreamerInfoReadBuffer.cxx:550
#7  0x00007ffff47f483c in TStreamerInfo::ReadBuffer<char**> (this=0x33e2c20, b=..., arr=<optimized out>, compinfo=0x13c373a8, first=<optimized out>, last=1, narr=1529, eoffset=0, arrayMode=1)
    at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/io/io/src/TStreamerInfoReadBuffer.cxx:1672
#8  0x00007ffff469259d in TStreamerInfoActions::VectorLooper::GenericRead (buf=..., start=0x13ff4c40, end=0x140b0e78, loopconfig=0x13385580, config=0x13c37390) at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/io/io/src/TStreamerInfoActions.cxx:1878
#9  0x00007ffff4586baf in TStreamerInfoActions::TConfiguredAction::operator() (this=0xe3d1670, buffer=..., start_collection=0x13ff4c40, end_collection=0x140b0e78, loopconf=0x13385580) at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/io/io/inc/TStreamerInfoActions.h:131
#10 0x00007ffff4584ca0 in TBufferFile::ApplySequence (this=0x7fffffff9620, sequence=..., start_collection=0x13ff4c40, end_collection=0x140b0e78) at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/io/io/src/TBufferFile.cxx:3638
#11 0x00007ffff529b9fb in TBranchElement::GetEntry (this=0x7fa4d20, entry=0, getall=0) at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/tree/tree/src/TBranchElement.cxx:2696
#12 0x00007ffff529b72f in TBranchElement::GetEntry (this=0x7fa2880, entry=0, getall=0) at /data2/alja/fwWeb/cmsBetaBld/BUILD/slc7_amd64_gcc900/lcg/root/6.25.01/root-6.25.01/tree/tree/src/TBranchElement.cxx:2658
#13 0x00007ffff77fc21c in fwlite::DataGetterHelper::getBranchData(edm::EDProductGetter const*, long long, fwlite::internal::Data&) const () from /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/cms/fwlite/CMSSW_11_3_4_FWLITE-cms/lib/slc7_amd64_gcc900/libDataFormatsFWLite.so
#14 0x00007ffff77feb26 in fwlite::DataGetterHelper::getByLabel(std::type_info const&, char const*, char const*, char const*, void*, long) const () from /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/cms/fwlite/CMSSW_11_3_4_FWLITE-cms/lib/slc7_amd64_gcc900/libDataFormatsFWLite.so
#15 0x00007ffff7803ef7 in fwlite::Event::getByLabel(std::type_info const&, char const*, char const*, char const*, void*) const () from /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/cms/fwlite/CMSSW_11_3_4_FWLITE-cms/lib/slc7_amd64_gcc900/libDataFormatsFWLite.so
#16 0x00007ffff7806f24 in fwlite::EventBase::getByLabelImpl(std::type_info const&, std::type_info const&, edm::InputTag const&) const () from /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/cms/fwlite/CMSSW_11_3_4_FWLITE-cms/lib/slc7_amd64_gcc900/libDataFormatsFWLite.so
#17 0x00007ffff7cc7d2d in bool edm::EventBase::getByLabel<edm::FWGenericObject>(edm::InputTag const&, edm::Handle<edm::FWGenericObject>&) const () from /data2/alja/fwWeb/cmsBetaBld/slc7_amd64_gcc900/cms/fwlite/CMSSW_11_3_4_FWLITE-cms/lib/slc7_amd64_gcc900/libFireworksCore.so
#18 0x00000000004073e4 in App::checkPFCandidatesFW (this=0x7fffffff9fd0) at /data2/matevz/CMSSW_11_3_4_FWLITE-cms/src/OssTests/BranchAddr/bin/test-bname-for.cc:122
#19 0x0000000000406547 in main (argc=2, argv=0x7fffffffa0f8) at /data2/matevz/CMSSW_11_3_4_FWLITE-cms/src/OssTests/BranchAddr/bin/test-bname-for.cc:188
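
The size requested in frames #2 to #4, 18446744073692016332, lies just below 2^64. The short check below works through the arithmetic and shows what the same 64 bits mean when read as a signed integer. That the value originates from a negative or uninitialized quantity is an assumption here, not something the backtrace establishes.

```python
import struct

requested = 18446744073692016332  # __n / __new_size from frames #2 to #4

# Distance from 2**64, the first value that no longer fits in 64 bits.
gap = 2**64 - requested

# Reinterpret the same 8 bytes as a signed 64-bit integer.
(as_signed,) = struct.unpack("<q", struct.pack("<Q", requested))

print(gap)        # 17535284
print(as_signed)  # -17535284
```

Either reading is far beyond what `std::vector::resize` can allocate, which matches the `std::length_error` thrown in frame #0.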

@osschar
Contributor

osschar commented Oct 20, 2021

@pcanal, this is a bit of a pain for CMS as we have to put an "undo commit" on all our 6.24 and master branches, e.g.:
cms-sw@425ac41

We probably have several incarnations of this undo in various branches :(

@pcanal
Member

pcanal commented Oct 20, 2021

I know :( ... This is the bug I am working on next. What I am mostly missing (because I stepped away from it for too long) is a reproducer of the failing case (ideally I will need a standalone reproducer to add to roottest).

@alja
Collaborator

alja commented Oct 21, 2021

@pcanal

If it is any help, I reference here a simple cmssw module [1]. When you compile it with FWLite (built with root master) you can reproduce the crash. Below are the binary from the test module and the sample file:
test-bname-for.exe /eos/cms/store/group/phys_muon/dmytro/tmp/BPH-RunIIAutumn18DRPremix-00015.root

The crash is at the line https://github.com/alja/OssTests/blob/root-test/BranchAddr/bin/test-bname-for.cc#L95

[1] https://github.com/alja/OssTests

@alja
Collaborator

alja commented Oct 21, 2021

@pcanal
FYI, When applying two commits from root cms branch for IB?CMSSW_12_X_rootmaster releases the problem is gone:
alja@cc4b249
741fe64

@smuzaffar
Contributor Author

@osschar , note that we are only applying the revert for the root master branch ( https://github.com/cms-sw/root/commits/cms/master/03d7710 ). The root 6.24 branch of cmssw ( https://github.com/cms-sw/root/commits/cms/v6-24-00-patches/f4ad42e ) does not need the revert.

@smuzaffar
Contributor Author

smuzaffar commented Nov 10, 2021

@pcanal , I am testing the cms root master branch ( https://github.com/cms-sw/root/commits/cms/master/03d7710 ) without the revert of the offending commit (cms-sw@f9834e3 ). Once it is available, I will hopefully be able to provide you with instructions for a reproducer.

@smuzaffar
Contributor Author

@pcanal , please use

/cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7442/20426/install.sh
cd CMSSW_12_2_ROOT6_X_2021-11-09-2300

to create cmssw dev and then run the commands in #7718 (comment) and #7718 (comment) to reproduce the errors.

@pcanal
Member

pcanal commented Nov 10, 2021

@smuzaffar Thanks.

@smuzaffar
Contributor Author

@pcanal note that the /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7442/20426 area (for externals) and the CMSSW_12_2_ROOT6_X_2021-11-09-2300 IB will be deleted on Sat 20 Nov. If you still need to debug the issue after that date, let me know so that I can rebuild it.

@pcanal
Member

pcanal commented Nov 15, 2021

@smuzaffar Thanks for the heads up. I made significant progress and will "hopefully" not need this build that long! :)

@Axel-Naumann Axel-Naumann added the experiment Affects an experiment / reported by its software & computimng experts label Jan 27, 2022
@pcanal
Member

pcanal commented Feb 3, 2022

@smuzaffar I pushed several related PRs to v6.24 and v6.26 (and master). Can you verify that one of them now works properly for this case? Thanks.

@alja
Collaborator

alja commented Feb 14, 2022

@pcanal @smuzaffar
My tests for creating a valid edm::Handle on the sample file are now successful using CMSSW_12_3_X_2022-02-10-1100_FWLITE.

@Axel-Naumann
Member

Thanks! This means v6-26-00-patches is now fine for CMS, @smuzaffar ?

@smuzaffar
Contributor Author

Yes, root 6.26, after @pcanal 's PR #9927 , looks good. It has fixed the issue and the IBs look good. We still have random root file read errors, but I do not think those are related to the root change.

@Axel-Naumann Axel-Naumann removed this from the 6.26/00 milestone Feb 22, 2022
@pcanal pcanal added this to the 6.26/00 milestone Mar 3, 2022
@pcanal
Member

pcanal commented Mar 3, 2022

Fair enough. Let's close this issue and open a new one if need be.

@pcanal pcanal closed this as completed Mar 3, 2022