Segmentation fault in the TrackProducer for EphemeralHLTPhysics #38869

Closed
ebrondol opened this issue Jul 27, 2022 · 28 comments · Fixed by #38881

Comments

ebrondol commented Jul 27, 2022

[as current shadow ORM]

A single job was paused while processing PromptReco for the EphemeralHLTPhysics PD due to a segmentation fault in the TrackProducer:

%MSG-w BasicTrajectoryState:  TrackProducer:detachedTripletStepTracks  26-Jul-2022 20:59:11 CEST Run: 356077 Event: 94087667
BasicTrajectoryState: attempt to access errors when none available  accessing local error..
freestate pointer: parameters
x =       6.70089    -0.113416     -19.3376
p =          -nan         -nan         -nan
no error defined.

local error valid/values :0
[         -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan ]
%MSG
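
The warning above reports a state whose momentum components are all NaN. As a minimal sketch, assuming the job log keeps the block layout shown here (a `%MSG-w` header carrying the module label, run and event, then the body, then a closing `%MSG`), a short Python scan could list every run/event where this happens. The names `find_nan_momentum` and `sample_log` are hypothetical and not part of CMSSW.

```python
import math
import re

# Header layout assumed from the warning quoted in this issue:
# %MSG-w <category>:  <module>  <timestamp> Run: <run> Event: <event>
HEADER = re.compile(
    r"^%MSG-\w\s+\w+:\s+(?P<module>\S+)\s+.*?"
    r"Run:\s*(?P<run>\d+)\s+Event:\s*(?P<event>\d+)"
)
# Momentum line of the free state dump, e.g. "p = -nan -nan -nan".
MOMENTUM = re.compile(r"^p\s*=\s*(?P<values>.+)$")


def _is_nan(token):
    """Return True if the token parses as a floating-point NaN."""
    try:
        return math.isnan(float(token))
    except ValueError:
        return False


def find_nan_momentum(log_text):
    """Return (module, run, event) for each message block with a NaN momentum."""
    hits = []
    current = None  # (module, run, event) of the block being read, if any
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = (header["module"], int(header["run"]), int(header["event"]))
        elif line == "%MSG":
            current = None  # end of the message block
        elif current is not None:
            momentum = MOMENTUM.match(line)
            if momentum and any(_is_nan(t) for t in momentum["values"].split()):
                hits.append(current)
                current = None  # report each block at most once
    return hits


sample_log = """%MSG-w BasicTrajectoryState:  TrackProducer:detachedTripletStepTracks  26-Jul-2022 20:59:11 CEST Run: 356077 Event: 94087667
BasicTrajectoryState: attempt to access errors when none available  accessing local error..
freestate pointer: parameters
x =       6.70089    -0.113416     -19.3376
p =          -nan         -nan         -nan
no error defined.
%MSG
"""

for module, run, event in find_nan_momentum(sample_log):
    print(f"{module}: NaN momentum in run {run}, event {event}")
```

Such a scan only locates the affected events; it says nothing about where the NaN values were first produced.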

The error is locally reproducible and the job tarball (including PSet and log files) can be found here:

/afs/cern.ch/user/c/cmst0/public/tarballs/Run2022C/vocms013.cern.ch-391090-3-log.tar.gz

The full description of the issue can be found in CMS Talk.

@mmusich , could you please have a look?

@cmsbuild

A new Issue was created by @ebrondol Erica Brondolin.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@ebrondol ebrondol changed the title Segmentation fault in the TrackProducer Segmentation fault in the TrackProducer for EphemeralHLTPhysics Jul 27, 2022

francescobrivio commented Jul 27, 2022

assign trk-dpg,tracking-pog

@cmsbuild

New categories assigned: trk-dpg

@connorpa,@mmusich,@tsusa you have been requested to review this Pull request/Issue and eventually sign? Thanks

tvami commented Jul 27, 2022

assign reconstruction

@cmsbuild

New categories assigned: tracking-pog,reconstruction

@jpata,@slava77,@mmusich,@clacaputo,@vmariani you have been requested to review this Pull request/Issue and eventually sign? Thanks

mmusich commented Jul 27, 2022

@ebrondol

> could you please have a look?

No.
I am out of office and will not look at this until maybe 10 days. Since it's an isolated job, just fail it.

mmusich commented Jul 27, 2022

@ebrondol

For the sake of those who will debug this, please also post the stack trace of the failing thread; the message you provided is not very informative.
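
A full cmsRun stack dump lists every thread, so it helps to narrow it down to the threads that are inside the module of interest. The following is a minimal sketch, assuming a gdb-style dump with `Thread N (` headers and `#k` frame lines as in the trace posted in this issue; the excerpt contains only a few frames copied from that trace, and the helpers `frames_by_thread` and `threads_mentioning` are hypothetical. The filter is purely textual and does not determine which thread actually received the signal.

```python
import re

THREAD = re.compile(r"^Thread (\d+) \(")  # e.g. 'Thread 14 (Thread 0x... (LWP 702) "cmsRun"):'
FRAME = re.compile(r"^#\d+\s+(.*)$")       # e.g. '#3  <signal handler called>'


def frames_by_thread(dump):
    """Map each thread number to the list of its frame texts, in dump order."""
    threads = {}
    current = None
    for line in dump.splitlines():
        thread = THREAD.match(line)
        if thread:
            current = int(thread.group(1))
            threads[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def threads_mentioning(dump, needle):
    """Return the thread numbers that have at least one frame containing needle."""
    return [
        tid
        for tid, frames in frames_by_thread(dump).items()
        if any(needle in frame for frame in frames)
    ]


# A few frames copied from the dump in this issue (most frames omitted).
excerpt = '''Thread 14 (Thread 0x2b4365401700 (LWP 702) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#14 0x00002b437dab8624 in TrackProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoTrackerTrackProducerPlugins.so

Thread 13 (Thread 0x2b4366201700 (LWP 701) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#6  0x00002b4386256919 in CAHitNtupletEDProducerT<CAHitTripletGenerator>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoPixelVertexingPixelTripletsPlugins.so
'''

print(threads_mentioning(excerpt, "TrackProducer::produce"))
```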

tvami commented Jul 27, 2022

> could you please have a look?
> No.
> I am out of office and will not look at this until maybe 10 days.

@slava77 maybe?

> Since it's an isolated job, just fail it.

Already been done.

slava77 commented Jul 27, 2022

Was anybody able to reproduce this with a "pointed" config? It would be quite useful to be able to get to the problematic event directly.
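
A minimal sketch of what such a "pointed" configuration could look like for the run and event quoted in the warning at the top of this issue. It assumes the job reads its input through a source that accepts the `eventsToProcess` parameter (as `PoolSource` does) and that the selector string has the form `run:event` or `run:lumi:event`. The helper `event_selector` is hypothetical, and the printed line is meant to be appended to the job's existing PSet rather than run on its own.

```python
from typing import Optional


def event_selector(run: int, event: int, lumi: Optional[int] = None) -> str:
    """Format a single-event selector as 'run:event' or 'run:lumi:event'."""
    if run <= 0 or event <= 0 or (lumi is not None and lumi <= 0):
        raise ValueError("run, lumi and event numbers must be positive")
    if lumi is None:
        return f"{run}:{event}"
    return f"{run}:{lumi}:{event}"


# Run and event taken from the MessageLogger warning in this issue.
selector = event_selector(356077, 94087667)

# Line to append to the existing configuration (assumes `process` and `cms`
# are already defined there, as in a standard CMSSW configuration file).
config_line = (
    f"process.source.eventsToProcess = cms.untracked.VEventRange('{selector}')"
)
print(config_line)
```

With only one event selected, the job should reach the failing event without processing the events before it, provided that event is present in the input files.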

@makortel

Here is the stack trace

Thread 14 (Thread 0x2b4365401700 (LWP 702) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b4341692149 in analyticalErrorPropagation(FreeTrajectoryState const&, Surface const&, SurfaceSideDefinition::SurfaceSide, GlobalTrajectoryParameters const&, double const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackPropagationRungeKutta.so
#5  0x00002b4341690e0e in RKPropagatorInS::propagateWithPath(FreeTrajectoryState const&, Plane const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackPropagationRungeKutta.so
#6  0x00002b434168cfb4 in Propagator::propagateWithPath(TrajectoryStateOnSurface const&, Plane const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackPropagationRungeKutta.so
#7  0x00002b434167f6f5 in PropagatorWithMaterial::propagateWithPath(TrajectoryStateOnSurface const&, Plane const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackingToolsMaterialEffects.so
#8  0x00002b4339eeebcf in Propagator::propagateWithPath(TrajectoryStateOnSurface const&, Surface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackingToolsGeomPropagators.so
#9  0x00002b4342d1c9d3 in KFTrajectoryFitter::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackingToolsTrackFitters.so
#10 0x00002b4342cdc361 in (anonymous namespace)::KFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#11 0x00002b4342cd069c in (anonymous namespace)::FlexibleKFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#12 0x00002b4377663eb6 in TrackProducerAlgorithm<reco::Track>::buildTrack(TrajectoryFitter const*, Propagator const*, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >&, TrajectoryStateOnSurface&, TrajectorySeed const&, float, reco::BeamSpot const&, edm::RefToBase<TrajectorySeed>, int, signed char) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoTrackerTrackProducer.so
#13 0x00002b43773c6ea2 in TrackProducerAlgorithm<reco::Track>::runWithCandidate(TrackingGeometry const*, MagneticField const*, std::vector<TrackCandidate, std::allocator<TrackCandidate> > const&, TrajectoryFitter const*, Propagator const*, TransientTrackingRecHitBuilder const*, reco::BeamSpot const&, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoEgammaEgammaPhotonProducers.so
#14 0x00002b437dab8624 in TrackProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoTrackerTrackProducerPlugins.so
#15 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 13 (Thread 0x2b4366201700 (LWP 701) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b434153b9a2 in CellularAutomaton::findTriplets(std::vector<HitDoublets const*, std::allocator<HitDoublets const*> > const&, std::vector<std::vector<unsigned int, std::allocator<unsigned int> >, std::allocator<std::vector<unsigned int, std::allocator<unsigned int> > > >&, TrackingRegion const&, CACut const&, CACut const&, float) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoPixelVertexingPixelTriplets.so
#5  0x00002b4341533304 in CAHitTripletGenerator::hitNtuplets(IntermediateHitDoublets const&, std::vector<OrderedHitSeeds, std::allocator<OrderedHitSeeds> >&, SeedingLayerSetsHits const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoPixelVertexingPixelTriplets.so
#6  0x00002b4386256919 in CAHitNtupletEDProducerT<CAHitTripletGenerator>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoPixelVertexingPixelTripletsPlugins.so
#7  0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 12 (Thread 0x2b4364a00700 (LWP 700) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b43c1edd001 in ?? ()
#5  0x00002b4363a25790 in ?? ()
#6  0x00002b4317aeee6b in TFormula::DoEval(double const*, double const*) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libHist.so
#7  0x00002b433f71fef6 in PerformancePayloadFromTFormula::getResult(PerformanceResult::ResultType, BinningPointByMap const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCondFormatsPhysicsToolsObjects.so
#8  0x00002b436819c7e0 in PFEnergyCalibration::aBarrel(double) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoParticleFlowPFClusterTools.so
#9  0x00002b436819deb6 in PFEnergyCalibration::energyEmHad(double, double&, double&, double, double) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoParticleFlowPFClusterTools.so
#10 0x00002b4366d37257 in PFAlgo::createCandidatesHCAL(reco::PFBlock const&, std::map<unsigned int, reco::PFBlock::Link, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, reco::PFBlock::Link> > >&, edm::OwnVector<reco::PFBlockElement, edm::ClonePolicy<reco::PFBlockElement> > const&, std::vector<bool, std::allocator<bool> >&, edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> > const&, ElementIndices&, std::vector<bool, std::allocator<bool> >&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoParticleFlowPFProducer.so
#11 0x00002b4366d3d69b in PFAlgo::processBlock(edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> > const&, std::__cxx11::list<edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> >, std::allocator<edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> > > >&, std::__cxx11::list<edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> >, std::allocator<edm::Ref<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock, edm::refhelper::FindUsingAdvance<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> >, reco::PFBlock> > > >&, PFEGammaFilters const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoParticleFlowPFProducer.so
#12 0x00002b4366d3ddff in PFAlgo::reconstructParticles(edm::Handle<std::vector<reco::PFBlock, std::allocator<reco::PFBlock> > > const&, PFEGammaFilters const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoParticleFlowPFProducer.so
#13 0x00002b437d0a8788 in PFProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoParticleFlowPFProducerPlugins.so
#14 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 11 (Thread 0x2b4363601700 (LWP 699) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b434a153706 in Eigen::internal::TensorBlockAssignment<float, 2, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const>, long>::Run(Eigen::internal::TensorBlockAssignment<float, 2, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const>, long>::Target const&, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long> const, 0, Eigen::MakePointer> const> const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#5  0x00002b434a182fbc in std::_Function_handler<void (long, long), Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const> const> const, Eigen::ThreadPoolDevice, true, (Eigen::internal::TiledEvaluation)1>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const> const> const&, Eigen::ThreadPoolDevice const&)::{lambda(long, long)#1}>::_M_invoke(std::_Any_data const&, long&&, long&&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#6  0x00002b4345bbff02 in Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::function<long (long)>, std::function<void (long, long)>) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#7  0x00002b4347a09be6 in Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::function<void (long, long)>) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#8  0x00002b434a180df4 in Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const> const> const, Eigen::ThreadPoolDevice, true, (Eigen::internal::TiledEvaluation)1>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_squared_difference_op<float>, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const, Eigen::TensorBroadcastingOp<Eigen::array<long, 2ul> const, Eigen::TensorMap<Eigen::Tensor<float const, 2, 1, long>, 16, Eigen::MakePointer> const> const> const> const&, Eigen::ThreadPoolDevice const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#9  0x00002b434a270a41 in tensorflow::BinaryOp<Eigen::ThreadPoolDevice, tensorflow::functor::squared_difference<float> >::Compute(tensorflow::OpKernelContext*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#10 0x00002b43514c15ca in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::Process(tensorflow::PropagatorState::TaggedNode, long) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#11 0x00002b43514b3741 in std::_Function_handler<void (), tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#12 0x00002b4345bc26e4 in tensorflow::thread::ThreadPool::Schedule(std::function<void ()>) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#13 0x00002b434e981861 in std::_Function_handler<void (std::function<void ()>), tensorflow::DirectSession::RunInternal(long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&)::{lambda(std::function<void ()>)#6}>::_M_invoke(std::_Any_data const&, std::function<void ()>&&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#14 0x00002b43514b4b02 in void tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&) [clone .constprop.0] () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#15 0x00002b43514b5b5e in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#16 0x00002b43514bd904 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::NodeDone(tensorflow::Status const&, absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::NodeExecStatsInterface*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#17 0x00002b43514c0b49 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::Process(tensorflow::PropagatorState::TaggedNode, long) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#18 0x00002b43514b3741 in std::_Function_handler<void (), tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#19 0x00002b4345bc26e4 in tensorflow::thread::ThreadPool::Schedule(std::function<void ()>) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#20 0x00002b434e981861 in std::_Function_handler<void (std::function<void ()>), tensorflow::DirectSession::RunInternal(long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&)::{lambda(std::function<void ()>)#6}>::_M_invoke(std::_Any_data const&, std::function<void ()>&&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#21 0x00002b43514b4b02 in void tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&) [clone .constprop.0] () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#22 0x00002b43514b5b5e in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#23 0x00002b43514bd904 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::NodeDone(tensorflow::Status const&, absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::NodeExecStatsInterface*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#24 0x00002b43514c0b49 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::Process(tensorflow::PropagatorState::TaggedNode, long) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#25 0x00002b43514b3741 in std::_Function_handler<void (), tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#26 0x00002b4345bc26e4 in tensorflow::thread::ThreadPool::Schedule(std::function<void ()>) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#27 0x00002b434e981861 in std::_Function_handler<void (std::function<void ()>), tensorflow::DirectSession::RunInternal(long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&)::{lambda(std::function<void ()>)#6}>::_M_invoke(std::_Any_data const&, std::function<void ()>&&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#28 0x00002b43514b4b02 in void tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)> >(std::_Bind<void (tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::*(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>*, tensorflow::PropagatorState::TaggedNode, long))(tensorflow::PropagatorState::TaggedNode, long)>&&) [clone .constprop.0] () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#29 0x00002b43514b5b5e in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#30 0x00002b43514bd904 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::NodeDone(tensorflow::Status const&, absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::NodeExecStatsInterface*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#31 0x00002b43514c0b49 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::Process(tensorflow::PropagatorState::TaggedNode, long) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#32 0x00002b43514c2038 in std::_Function_handler<void (), tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::RunTask<tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*)::{lambda()#2}>(tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*)::{lambda()#2}&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#33 0x00002b4345bc26e4 in tensorflow::thread::ThreadPool::Schedule(std::function<void ()>) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#34 0x00002b434e981861 in std::_Function_handler<void (std::function<void ()>), tensorflow::DirectSession::RunInternal(long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&)::{lambda(std::function<void ()>)#6}>::_M_invoke(std::_Any_data const&, std::function<void ()>&&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#35 0x00002b43514b5743 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::PropagatorState>::ScheduleReady(absl::lts_20210324::InlinedVector<tensorflow::PropagatorState::TaggedNode, 8ul, std::allocator<tensorflow::PropagatorState::TaggedNode> >*, tensorflow::PropagatorState::TaggedNodeReadyQueue*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#36 0x00002b43514bb5c9 in tensorflow::(anonymous namespace)::ExecutorImpl::RunAsync(tensorflow::Executor::Args const&, std::function<void (tensorflow::Status const&)>) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_framework.so.2
#37 0x00002b434e9938f8 in tensorflow::DirectSession::RunInternal(long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#38 0x00002b434e996332 in tensorflow::DirectSession::Run(tensorflow::RunOptions const&, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*, tensorflow::RunMetadata*, tensorflow::thread::ThreadPoolOptions const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libtensorflow_cc.so.2
#39 0x00002b434302928a in tensorflow::run(tensorflow::Session*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*, tensorflow::thread::ThreadPoolOptions const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libPhysicsToolsTensorFlow.so
#40 0x00002b43430292b9 in tensorflow::run(tensorflow::Session*, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*, tensorflow::thread::ThreadPoolInterface*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libPhysicsToolsTensorFlow.so
#41 0x00002b437e330975 in DeepMETProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoMETMETPUSubtraction_plugins.so
#42 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 10 (Thread 0x2b4362c00700 (LWP 698) "cmsRun"):
#3  0x00002b4317800a0b in sig_dostack_then_abort () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002b4356241f66 in SiPixelTemplate2D::interpolate(int, float, float, float, float) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCondFormatsSiPixelTransient.so
#6  0x00002b43561df57f in SiPixelTemplateReco2D::PixelTempReco2D(int, float, float, float, float, int, int, SiPixelTemplateReco2D::ClusMatrix&, SiPixelTemplate2D&, float&, float&, float&, float&, float&, float&, int&, float&, int&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#7  0x00002b43561b0bf0 in PixelCPEClusterRepair::callTempReco2D(PixelCPEBase::DetParam const&, PixelCPEClusterRepair::ClusterParamTemplate&, SiPixelTemplateReco2D::ClusMatrix&, int, Point3DBase<float, LocalTag>&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#8  0x00002b43561b2355 in PixelCPEClusterRepair::localPosition(PixelCPEBase::DetParam const&, PixelCPEBase::ClusterParam&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#9  0x00002b43561ad892 in PixelClusterParameterEstimator::getParameters(SiPixelCluster const&, GeomDet const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#10 0x00002b4340ce90b9 in TkClonerImpl::makeShared(SiPixelRecHit const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoTrackerTransientTrackingRecHit.so
#11 0x00002b4339ff0a00 in SiPixelRecHit::cloneSH_(TkCloner const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libDataFormatsTrackerRecHit2D.so
#12 0x00002b4342d21b9f in KFTrajectorySmoother::trajectory(Trajectory const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackingToolsTrackFitters.so
#13 0x00002b4342cdac1f in (anonymous namespace)::KFFittingSmoother::smoothingStep(Trajectory&&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#14 0x00002b4342cdbbe3 in (anonymous namespace)::KFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#15 0x00002b4342cd069c in (anonymous namespace)::FlexibleKFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#16 0x00002b4377663eb6 in TrackProducerAlgorithm<reco::Track>::buildTrack(TrajectoryFitter const*, Propagator const*, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >&, TrajectoryStateOnSurface&, TrajectorySeed const&, float, reco::BeamSpot const&, edm::RefToBase<TrajectorySeed>, int, signed char) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoTrackerTrackProducer.so
#17 0x00002b43773c6ea2 in TrackProducerAlgorithm<reco::Track>::runWithCandidate(TrackingGeometry const*, MagneticField const*, std::vector<TrackCandidate, std::allocator<TrackCandidate> > const&, TrajectoryFitter const*, Propagator const*, TransientTrackingRecHitBuilder const*, reco::BeamSpot const&, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoEgammaEgammaPhotonProducers.so
#18 0x00002b437dab8624 in TrackProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoTrackerTrackProducerPlugins.so
#19 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 9 (Thread 0x2b4361c01700 (LWP 697) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b4311c19183 in __memset_avx2_unaligned_erms () from /lib64/libc.so.6
#5  0x00002b43541fcb70 in MahiFit::doFit(std::array<float, 4ul>&, int) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalCaloHcalRecAlgos.so
#6  0x00002b43541fd63f in MahiFit::phase1Apply(HBHEChannelInfo const&, float&, float&, float&, bool&, float&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalCaloHcalRecAlgos.so
#7  0x00002b435421610c in SimpleHBHEPhase1Algo::reconstruct(HBHEChannelInfo const&, HcalRecoParam const*, HcalCalibrations const&, bool) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalCaloHcalRecAlgos.so
#8  0x00002b4367755454 in void HBHEPhase1Reconstructor::processData<QIE11DataFrame, HcalDataFrameContainer<QIE11DataFrame> >(HcalDataFrameContainer<QIE11DataFrame> const&, HcalTopology const&, HcalDbService const&, std::vector<HcalChannelProperties, std::allocator<HcalChannelProperties> > const&, bool, HBHEChannelInfo*, edm::SortedCollection<HBHEChannelInfo, edm::StrictWeakOrdering<HBHEChannelInfo> >*, edm::SortedCollection<HBHERecHit, edm::StrictWeakOrdering<HBHERecHit> >*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoLocalCaloHcalRecProducers.so
#9  0x00002b436774a619 in HBHEPhase1Reconstructor::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoLocalCaloHcalRecProducers.so
#10 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 8 (Thread 0x2b4361200700 (LWP 696) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b4368626824 in onnxruntime::BatchNorm<float>::Compute(onnxruntime::OpKernelContext*) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#5  0x00002b43689b3977 in onnxruntime::SequentialExecutor::Execute(onnxruntime::SessionState const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)> > > > const&, onnxruntime::logging::Logger const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#6  0x00002b436899daf0 in onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)> > > > const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#7  0x00002b43689a0254 in onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#8  0x00002b43683d2dfb in onnxruntime::InferenceSession::Run(OrtRunOptions const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<OrtValue, std::allocator<OrtValue> >*, std::vector<OrtDevice, std::allocator<OrtDevice> > const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#9  0x00002b4368383fd2 in OrtApis::Run(OrtSession*, OrtRunOptions const*, char const* const*, OrtValue const* const*, unsigned long, char const* const*, unsigned long, OrtValue**) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/external/el8_amd64_gcc10/lib/libonnxruntime.so.1.10.0
#10 0x00002b437d0f4b8a in cms::Ort::ONNXRuntime::run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::vector<float, std::allocator<float> >, std::allocator<std::vector<float, std::allocator<float> > > >&, std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, long) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libPhysicsToolsONNXRuntime.so
#11 0x00002b43a4d6eee7 in DeepFlavourONNXJetTagsProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoBTagONNXRuntimePlugins.so
#12 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Thread 1 (Thread 0x2b43129a1940 (LWP 563) "cmsRun"):
#2  0x00002b43177fd2c0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b4340ae6ccf in reco::parser::ExpressionVar::objToDouble(edm::ObjectWithDict const&, reco::method::TypeCode) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCommonToolsUtils.so
#5  0x00002b4340afe4ef in reco::parser::LazyInvoker::invokeLast(edm::ObjectWithDict const&, std::vector<edm::ObjectWithDict, std::allocator<edm::ObjectWithDict> >&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCommonToolsUtils.so
#6  0x00002b4340ae62a8 in reco::parser::ExpressionLazyVar::value(edm::ObjectWithDict const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCommonToolsUtils.so
#7  0x00002b4340b1e6d9 in reco::parser::BinarySelector::operator()(edm::ObjectWithDict const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCommonToolsUtils.so
#8  0x00002b4376d568bb in ObjectSelectorBase<SingleElementCollectionSelector<edm::View<reco::Candidate>, StringCutObjectSelector<reco::Candidate, true>, edm::PtrVector<reco::Candidate>, edm::PtrVector<reco::Candidate>, helper::SelectionPtrViewAdder<reco::Candidate> >, edm::PtrVector<reco::Candidate>, NonNullNumberSelector, helper::NullPostProcessor<edm::PtrVector<reco::Candidate> >, helper::CollectionStoreManager<edm::PtrVector<reco::Candidate>, helper::IteratorToObjectConverter<edm::PtrVector<reco::Candidate> > >, helper::ObjectSelectorBase<edm::PtrVector<reco::Candidate>, edm::stream::EDFilter<> >, reco::modules::SingleElementCollectionSelectorEventSetupInit<SingleElementCollectionSelector<edm::View<reco::Candidate>, StringCutObjectSelector<reco::Candidate, true>, edm::PtrVector<reco::Candidate>, edm::PtrVector<reco::Candidate>, helper::SelectionPtrViewAdder<reco::Candidate> > > >::filter(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginCommonToolsCandAlgos_plugins.so
#9  0x00002b430f538843 in edm::stream::EDFilterAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

Current Modules:
Module: TrackProducer:detachedTripletStepTracks (crashed)
Module: DeepFlavourONNXJetTagsProducer:pfDeepFlavourJetTagsSlimmedDeepFlavour
Module: PFProducer:particleFlowTmp
Module: CAHitTripletEDProducer:lowPtTripletStepHitTriplets
Module: DeepMETProducer:deepMETsResponseTune
Module: CandPtrSelector:TrkCands
Module: HBHEPhase1Reconstructor:hbhereco@cpu
Module: TrackProducer:detachedTripletStepTracks

@mmusich
Contributor

mmusich commented Jul 27, 2022

It would be quite useful/effective to be able to get to the problematic event directly.

I think this falls under the responsibilities of the ORM.
Who is it this week?

BTW, do I read correctly this is the problematic thread?

Thread 10 (Thread 0x2b4362c00700 (LWP 698) "cmsRun"):
#3  0x00002b4317800a0b in sig_dostack_then_abort () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002b4356241f66 in SiPixelTemplate2D::interpolate(int, float, float, float, float) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libCondFormatsSiPixelTransient.so
#6  0x00002b43561df57f in SiPixelTemplateReco2D::PixelTempReco2D(int, float, float, float, float, int, int, SiPixelTemplateReco2D::ClusMatrix&, SiPixelTemplate2D&, float&, float&, float&, float&, float&, float&, int&, float&, int&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#7  0x00002b43561b0bf0 in PixelCPEClusterRepair::callTempReco2D(PixelCPEBase::DetParam const&, PixelCPEClusterRepair::ClusterParamTemplate&, SiPixelTemplateReco2D::ClusMatrix&, int, Point3DBase<float, LocalTag>&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#8  0x00002b43561b2355 in PixelCPEClusterRepair::localPosition(PixelCPEBase::DetParam const&, PixelCPEBase::ClusterParam&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#9  0x00002b43561ad892 in PixelClusterParameterEstimator::getParameters(SiPixelCluster const&, GeomDet const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoLocalTrackerSiPixelRecHits.so
#10 0x00002b4340ce90b9 in TkClonerImpl::makeShared(SiPixelRecHit const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoTrackerTransientTrackingRecHit.so
#11 0x00002b4339ff0a00 in SiPixelRecHit::cloneSH_(TkCloner const&, TrajectoryStateOnSurface const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libDataFormatsTrackerRecHit2D.so
#12 0x00002b4342d21b9f in KFTrajectorySmoother::trajectory(Trajectory const&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libTrackingToolsTrackFitters.so
#13 0x00002b4342cdac1f in (anonymous namespace)::KFFittingSmoother::smoothingStep(Trajectory&&) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#14 0x00002b4342cdbbe3 in (anonymous namespace)::KFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#15 0x00002b4342cd069c in (anonymous namespace)::FlexibleKFFittingSmoother::fitOne(TrajectorySeed const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > > const&, TrajectoryStateOnSurface const&, TrajectoryFitter::fitType) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginTrackingToolsTrackFittersPlugins.so
#16 0x00002b4377663eb6 in TrackProducerAlgorithm<reco::Track>::buildTrack(TrajectoryFitter const*, Propagator const*, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >&, TrajectoryStateOnSurface&, TrajectorySeed const&, float, reco::BeamSpot const&, edm::RefToBase<TrajectorySeed>, int, signed char) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libRecoTrackerTrackProducer.so
#17 0x00002b43773c6ea2 in TrackProducerAlgorithm<reco::Track>::runWithCandidate(TrackingGeometry const*, MagneticField const*, std::vector<TrackCandidate, std::allocator<TrackCandidate> > const&, TrajectoryFitter const*, Propagator const*, TransientTrackingRecHitBuilder const*, reco::BeamSpot const&, std::vector<AlgoProductTraits<reco::Track>::AlgoProduct, std::allocator<AlgoProductTraits<reco::Track>::AlgoProduct> >&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoEgammaEgammaPhotonProducers.so
#18 0x00002b437dab8624 in TrackProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/pluginRecoTrackerTrackProducerPlugins.so
#19 0x00002b430f53a783 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_3/lib/el8_amd64_gcc10/libFWCoreFramework.so

?

@Dr15Jones
Contributor

Yes, thread 10 is where the problem lies.

@mmusich
Contributor

mmusich commented Jul 27, 2022

Yes, thread 10 is where the problem lies.

So the problem is actually with the 2D template interpolation and not with tracking per se.

@mmusich
Contributor

mmusich commented Jul 27, 2022

@tvami, please remove the tracking type and assignment.
Incidentally you are among the main template experts in CMS...

@tvami
Contributor

tvami commented Jul 27, 2022

Who is it this week?

@mdmorris and shadow-ORM @ebrondol

@tvami
Contributor

tvami commented Jul 27, 2022

unassign tracking-pog

@tvami
Contributor

tvami commented Jul 27, 2022

please remove the tracking type and assignment.

Done

Incidentally you are among the main template experts in CMS...

Yes, I'll have a look...

@Dr15Jones
Contributor

The simple configuration

skip_cfg.py

import FWCore.ParameterSet.Config as cms
from PSet import process


# Skip the first 3086 events so that processing starts at the 3087th one
process.source.skipEvents = cms.untracked.uint32(3086)

process.options.numberOfThreads = 1

If run in the same directory as PSet.py and PSet.pkl, this will fail in the first event to be processed.

@mdmorris

Two more paused jobs have been reported in the CMS Talk thread: https://cms-talk.web.cern.ch/t/paused-job-for-promptreco-ephemeralhltphysics-due-to-segfault/13318/3

I am testing these locally to see if the segmentation fault is similar

@mmusich
Contributor

mmusich commented Jul 28, 2022

reporting here also some private findings by @ferencek

I can reproduce the problem and in the stack trace I see

#6  0x00007f664179535b in SiPixelTemplate2D::interpolate (this=0x7ffddd544fa0, id=1032, cotalpha=-nan(0x400000), cotbeta=-nan(0x400000), locBz=-0.000118097785, locBx=0.0140094254) at /afs/cern.ch/work/f/ferencek/PixelOffline/SiPixelTemplate2D_Crash/vocms013.cern.ch-391090-3-log/CMSSW_12_4_3/src/CondFormats/SiPixelTransient/src/SiPixelTemplate2D.cc:759

which points to

entry00_ = &thePixelTemp_[index_id_].entry[iy0_][jx0_];

cotalpha=-nan(0x400000), cotbeta=-nan(0x400000) as arguments to SiPixelTemplate2D::interpolate look suspicious

It looks like the trajectory local parameters are not well defined, which is in line with the warning emitted just before the crash in the job:

%MSG-w BasicTrajectoryState:  TrackProducer:detachedTripletStepTracks  26-Jul-2022 20:59:11 CEST Run: 356077 Event: 94087667
BasicTrajectoryState: attempt to access errors when none available  accessing local error..
freestate pointer: parameters
x =       6.70089    -0.113416     -19.3376
p =          -nan         -nan         -nan
no error defined.

local error valid/values :0
[         -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan
          -nan        -nan        -nan        -nan        -nan ]
%MSG
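
To illustrate why NaN angles are dangerous for a table lookup, here is a minimal Python sketch. It is not the CMSSW code: `clamp_position` and its parameters are hypothetical names chosen for illustration. It shows that a comparison-based range check lets NaN through, because every ordered comparison involving NaN is false.

```python
import math

def clamp_position(x, x0, dx, n_bins):
    """Convert x to a fractional bin position, clamped to [0, n_bins - 1]."""
    pos = (x - x0) / dx
    if pos < 0:            # False when pos is NaN
        return 0.0
    if pos > n_bins - 1:   # also False when pos is NaN
        return float(n_bins - 1)
    return pos             # NaN falls through both range checks

nan = float("nan")
print(clamp_position(0.5, 0.0, 0.25, 10))    # value inside the range
print(clamp_position(-3.0, 0.0, 0.25, 10))   # clamped at the low edge
print(math.isnan(clamp_position(nan, 0.0, 0.25, 10)))  # NaN survives the clamp
```

In Python a later `int(nan)` raises `ValueError`. In C++, converting a NaN float to an integer type is undefined behaviour, so an index derived from it may point anywhere.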

Clearly there is something fishy in the trajectory building, but can we add a patch that protects against ill-defined trajectories?
E.g. with this patch:

diff --git a/CondFormats/SiPixelTransient/src/SiPixelTemplate2D.cc b/CondFormats/SiPixelTransient/src/SiPixelTemplate2D.cc
index d9e3441e357..9380f61f63e 100644
--- a/CondFormats/SiPixelTransient/src/SiPixelTemplate2D.cc
+++ b/CondFormats/SiPixelTransient/src/SiPixelTemplate2D.cc
@@ -626,6 +626,12 @@ bool SiPixelTemplate2D::getid(int id) {
 bool SiPixelTemplate2D::interpolate(int id, float cotalpha, float cotbeta, float locBz, float locBx) {
   // Interpolate for a new set of track angles
 
+  //check for nan's
+  if (!edm::isFinite(cotalpha) || !edm::isFinite(cotbeta)) {
+    success_ = false;
+    return success_;
+  }
+
   // Local variables
 
   float acotb, dcota, dcotb;
@@ -680,12 +686,6 @@ bool SiPixelTemplate2D::interpolate(int id, float cotalpha, float cotbeta, float
 #ifndef SI_PIXEL_TEMPLATE_STANDALONE
       throw cms::Exception("DataCorrupt")
           << "SiPixelTemplate2D::illegal subdetector ID = " << thePixelTemp_[index_id_].head.Dtype << std::endl;
-
-      //check for nan's
-      if (!edm::isFinite(cotalpha) || !edm::isFinite(cotbeta)) {
-        success_ = false;
-        return success_;
-      }
 #else
       std::cout << "SiPixelTemplate:2D:illegal subdetector ID = " << thePixelTemp_[index_id_].head.Dtype << std::endl;
 #endif

With this patch, the job reported in the initial comment (#38869 (comment)) runs successfully.
I am not sure how affordable it is to run this in production while a more proper fix is introduced.
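
The idea of the patch, checking the inputs before any index is computed, can be sketched in Python as follows. This is only an illustration under stated assumptions: `TABLE`, `COT_MIN`, `COT_STEP`, `N_BINS`, `bin_index` and `lookup` are hypothetical stand-ins, and a nearest-bin lookup replaces the real template interpolation.

```python
import math

# Hypothetical 2D table: rows indexed by cot(beta), columns by cot(alpha).
COT_MIN, COT_STEP, N_BINS = -1.0, 0.5, 5
TABLE = [[10 * iy + jx for jx in range(N_BINS)] for iy in range(N_BINS)]

def bin_index(value):
    """Nearest-bin index, clamped to the table range (expects finite input)."""
    idx = int(round((value - COT_MIN) / COT_STEP))
    return min(max(idx, 0), N_BINS - 1)

def lookup(cotalpha, cotbeta):
    """Return (success, entry); reject NaN/inf before computing any index."""
    if not (math.isfinite(cotalpha) and math.isfinite(cotbeta)):
        return False, None
    return True, TABLE[bin_index(cotbeta)][bin_index(cotalpha)]

print(lookup(0.0, 0.5))            # finite angles: the table is read
print(lookup(float("nan"), 0.5))   # NaN angle: rejected, table never touched
```

Placing the check first means the caller gets a clean failure flag regardless of which later branch would have been taken.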

@mmusich
Contributor

mmusich commented Jul 28, 2022

EDIT: it seems that the protection introduced in #34846 by @OzAmram happens too late in the code for the 2D case.

@tvami
Contributor

tvami commented Jul 28, 2022

I made a PR here: #38881
I also changed the 1D reco. I'm not sure whether that was really needed, but I could imagine the same problem coming up there too, right?

@Dr15Jones
Contributor

I am testing these locally to see if the segmentation fault is similar

Looking at the logs, I see that the tracebacks to where the segmentation fault occurs are the same.

@ebrondol
Contributor Author

I am testing these locally to see if the segmentation fault is similar

Looking at the logs, I see that the tracebacks to where the segmentation fault occurs are the same.

Yes, the Express_Run356323_StreamExpressAlignment job fails in the 828th record, while the Express_Run356323_StreamExpress job fails in the 651st.

@germanfgv
Contributor

germanfgv commented Aug 1, 2022

Hi all. We have another instance of this segmentation fault issue with EphemeralHLTPhysics. Tarball can be found here:

/afs/cern.ch/user/c/cmst0/public/tarballs/Run2022C/PromptReco_Run356378_EphemeralHLTPhysics13.tar.gz

It is not clear to me if this was completely solved by #38881

@mmusich
Contributor

mmusich commented Aug 1, 2022

/afs/cern.ch/user/c/cmst0/public/tarballs/Run2022C/PromptReco_Run356378_EphemeralHLTPhysics13.tar.gz
It is not clear to me if this was completely solved by #38881

@germanfgv, please clarify whether the Tier0 moved to a new release including #38881. That is impossible, given that no such release exists yet. Adjust your expectations accordingly.

@mmusich
Contributor

mmusich commented Aug 1, 2022

@fabiocos
Contributor

fabiocos commented Aug 1, 2022

I see another crash that is probably related to this issue. The backport has already been merged, and as far as I can tell the IB did not show any issues related to it. Could the release managers clarify the possible timescale for a (patch) release including the fix, or any other issues keeping it on hold?
