-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix invalid read in TChain #17259
base: master
Are you sure you want to change the base?
Fix invalid read in TChain #17259
Conversation
Test Results 18 files 18 suites 4d 0h 58m 2s ⏱️ Results for commit d1bc656. ♻️ This comment has been updated with latest results. |
260dce9
to
60f2317
Compare
@pcanal investigating the test error led me to finding some faulty behaviour in the case of a chain moving to a new tree but not properly updating the information of the underlying TTreeIndex (specifically regarding the attached TTreeFormulas). You can find some description in the first commit message, and here is the full valgrind stacktrace valgrind_tchain_friend_updateformulaleaves.txt Let's see if this was all or just part of the reason for the CI failures. |
To avoid following valgrind error leading to crashes in multithreaded RDF runs: ``` ==735217== Invalid read of size 8 ==735217== at 0x6A4517F: TTreeFormula::UpdateFormulaLeaves() (TTreeFormula.cxx:5121) ==735217== by 0x6A5B9AF: TTreeIndex::UpdateFormulaLeaves(TTree const*) (TTreeIndex.cxx:625) ==735217== by 0x67A5447: TChain::LoadTree(long long) (TChain.cxx:1670) ==735217== by 0x6A94DBD: TTreeReader::SetEntryBase(long long, bool) (TTreeReader.cxx:669) ==735217== by 0x6A96027: TTreeReader::SetEntry(long long) (TTreeReader.h:225) ==735217== by 0x7576BAA: TTreeReader::Next() (TTreeReader.h:217) ==735217== by 0x756D591: (anonymous namespace)::validTTreeReaderRead(TTreeReader&) (RLoopManager.cxx:538) ==735217== by 0x756D9A9: ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}::operator()(TTreeReader&) const (RLoopManager.cxx:575) ==735217== by 0x7574F7E: void std::__invoke_impl<void, ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}&, TTreeReader&>(std::__invoke_other, ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}&, TTreeReader&) (invoke.h:61) ==735217== by 0x757480E: std::enable_if<is_invocable_r_v<void, ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}&, TTreeReader&>, void>::type std::__invoke_r<void, ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}&, TTreeReader&>(ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}&, TTreeReader&) (invoke.h:111) ==735217== by 0x757433A: std::_Function_handler<void (TTreeReader&), ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}>::_M_invoke(std::_Any_data const&, TTreeReader&) (std_function.h:290) ==735217== by 0x6ABCE3E: std::function<void (TTreeReader&)>::operator()(TTreeReader&) const (std_function.h:591) ==735217== Address 0x15b4e000 is 0 bytes inside a block of size 712 free'd ==735217== at 0x48456C6: operator delete(void*) (vg_replace_malloc.c:1131) ==735217== by 0x4E9494B: TStorage::ObjectDealloc(void*) (TStorage.cxx:325) ==735217== by 0x4E6B181: TObject::operator delete(void*) (TObject.cxx:1113) ==735217== by 0x67F47E9: TTree::~TTree() (TTree.cxx:1023) ==735217== by 0x4EED8F1: TCollection::GarbageCollect(TObject*) (TCollection.cxx:736) ==735217== by 0x4EF5D4C: TList::Delete(char const*) (TList.cxx:535) ==735217== by 0x4EF0428: THashList::Delete(char const*) (THashList.cxx:215) ==735217== by 0x5507848: TDirectoryFile::Close(char const*) (TDirectoryFile.cxx:585) ==735217== by 0x5523835: TFile::Close(char const*) (TFile.cxx:986) ==735217== by 0x5521226: TFile::~TFile() (TFile.cxx:563) ==735217== by 0x55215FF: TFile::~TFile() (TFile.cxx:600) ==735217== by 0x67A4990: TChain::LoadTree(long long) (TChain.cxx:1490) ==735217== Block was alloc'd at ==735217== at 0x4841FEC: operator new(unsigned long) (vg_replace_malloc.c:487) ==735217== by 0x4E948DF: TStorage::ObjectAlloc(unsigned long) (TStorage.cxx:293) ==735217== by 0x48C648E: TObject::operator new(unsigned long) (TObject.h:181) ==735217== by 0x672E04D: ROOT::new_TTree(void*) (G__Tree.cxx:4363) ==735217== by 0x4F4AA25: TClass::NewObject(TClass::ENewType, bool) const (TClass.cxx:5084) ==735217== by 0x4F4A932: TClass::New(TClass::ENewType, bool) const (TClass.cxx:5061) ==735217== by 0x556146A: TKey::ReadObj() (TKey.cxx:809) ==735217== by 0x55091D3: TDirectoryFile::Get(char const*) (TDirectoryFile.cxx:988) ==735217== by 0x67A4CA4: TChain::LoadTree(long long) (TChain.cxx:1542) ==735217== by 0x6A94DBD: TTreeReader::SetEntryBase(long long, bool) (TTreeReader.cxx:669) ==735217== by 0x6A96027: TTreeReader::SetEntry(long long) (TTreeReader.h:225) ==735217== by 0x6A94855: TTreeReader::SetEntriesRange(long long, long long) (TTreeReader.cxx:560) ```
60f2317
to
14b8763
Compare
14b8763
to
237629b
Compare
A new battery of four tests is added: 1. Try to build TChainIndex, fail, fallback to building TTreeIndex, check correct matching. For both the same cardinality and different cardinality cases (auxiliary smaller than primary). 2. Try to build TChainIndex and succeed. For both the same cardinality and different cardinality cases (auxiliary smaller than primary).
237629b
to
d1bc656
Compare
After merging #17301 , there is no need to create the logic to suppress the TChainIndex build error presently. The invalid read found by valgrind is still there, so this PR includes a test for the global join case and fixes the memory error.