Track Processing update #285

tschuh · 2024-08-01T00:01:05Z

This PR mainly updates the modules TrackerTFP, TrackFindingTracklet and TrackTrigger.
It brings the Kalman Filter, Track Multiplexer, Duplicate Removal and Track Quality steps up to date (https://indico.cern.ch/event/1449862/contributions/6116147/attachments/2924526/5133676/thomas.pdf).

Detailed list for Ian:

DataFormats/L1TrackTrigger

TTBV update
added kf chi2 output product (pair of doubles)

L1Trigger/TrackFindingTracklet

HitPatternHelper updated to cope with the changes; likely broken
KFin deleted
KalmanFilter added
5 parameter fit option added (https://indico.cern.ch/event/1449862/contributions/6116147/attachments/2924526/5133676/thomas.pdf)
ProducerKalmanFilter added
State added
DRin renamed to TrackMultiplexer
L1FPGATrackProducer updated
DR renamed to DuplicateRemoval and updated
KFout removed
TBout removed
ChannelAssignment updated
FitTrack updated
L1TrackNtupleMaker updated with layer encoding specifics

L1Trigger/TrackTrigger

SensorModule updated
Setup updated
L1TrackQuality moved to trackerTFP

L1Trigger/TrackerDTC

DTC updated
LayerEncoding updated
Stub updated

L1Trigger/TrackerTFP

L1TrackQuality renamed to TrackQuality and turned into ESProduct
TrackFindingProcessor added
CleanTrackBuilder added
DataFormats updated
Demonstrator updated
DuplicateRemoval added
GeometricProcessor updated
HoughTransform updated
KalmanFilter updated
LayerEncoding updated
MiniHoughTransform deleted
State updated
ZHoughTransform deleted
KFin deleted

tomalin · 2024-08-05T10:03:26Z

As you're changing 160 files, the text describing this PR needs to be longer and more detailed. What changes have been implemented? And how do they affect tracking performance? Have you given a talk which describes the changes and their effect on performance in detail? If so, please link it from here.

tomalin · 2024-08-05T11:05:40Z

This fails CI quite spectacularly: https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7858190 . For example, for HYBRID_NEWKF the tracking efficiency is up by 0.3% (good), but the number of reconstructed tracks is down by 80% (very weird, given that no efficiency is lost!), and the z0 resolution is a factor 5 worse (bad).

For comparison, this is the git CI for a dummy PR that changed no code https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7924082 .

tomalin · 2024-08-13T15:40:51Z

Update: Thomas says the CI failure is explained by the poor z0 resolution, and he is retuning the digitisation ranges to fix this. He also believes that the tracklet helix parameters are wrong (inconsistent with the seed positions by ~1 cm) for a small fraction of the tracklets, which also contributes to the poor z0 resolution.

tschuh · 2024-08-14T12:50:23Z

The new KF maths is fixed now, and CI passes when using the new KF. However, CI using the old KF fails due to a slightly worse z0 resolution. My PR should not affect the old KF results...

tomalin · 2024-09-02T13:17:57Z

I understand that the ProducerTM usually orders tracks according to their seed type. This is important for the functioning of the DR, so should be described in the comment at the top of the class header file, which explains what the class does. And at the line of the code that causes this ordering to happen, an additional comment should appear to highlight it. A comment mentioning this inside ProducerDR would also be wise.

tschuh · 2024-09-02T13:25:44Z

done

tomalin · 2024-09-03T11:13:11Z

@Chriisbrown & @cgsavard -- Thomas's overview description of this PR says "L1TrackQuality renamed to TrackQuality and turned into ESProduct". Are you both, and Andreas, happy with this change?

L1Trigger/Phase2L1ParticleFlow/test/make_l1ctLayer1_dumpFiles_fromRAW_cfg.py

tomalin · 2024-09-03T11:30:12Z

At the top of the header of the KF class where you do the maths (KalmanFilter.h?), add the comments copied from L7-31 of https://github.com/cms-L1TK/cmssw/blob/tschuh/L1Trigger/TrackFindingTMTT/interface/KFParamsComb.h#L7 , so the maths is documented. Although as you treat r-phi and r-z planes independently, you may need to modify a few lines of this.

tomalin · 2024-09-03T11:32:13Z

Overview description of this PR should mention that new class TrackFindingTracklet/ProducerKF has been added.

tomalin · 2024-09-03T11:40:29Z

L1Trigger/TrackFindingTracklet/interface/DuplicateRemoval.h

@@ -9,42 +9,39 @@

 namespace trklet {

-  /*! \class  trklet::DR
+  /*! \class  trklet::DuplicateRemoval
   *  \brief  Class to bit- and clock-accurate emulate duplicate removal
   *          DR identifies duplicates based on pairs of tracks that share stubs in at least 3 layers.
   *          It keeps the first such track in each pair.


Modify this comment to make clear that the track order is determined by TrackMultiplexer.

Modify comment to mention ProducerTM, since it's not obvious that the Track Multiplexer can be found there.
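
To make the dependence on track order concrete, here is a minimal sketch of the rule stated in the class comment (tracks sharing stubs in at least 3 layers are duplicates, and the first one is kept). It is not the bit- and clock-accurate emulation: the dictionary track representation and the function names are hypothetical, and comparing each track only against already-kept tracks is an assumption.

```python
from typing import Dict, List

# Hypothetical track representation: layer id -> stub id.
Track = Dict[int, int]


def shared_layers(a: Track, b: Track) -> int:
    """Count the layers in which both tracks use the same stub."""
    return sum(1 for layer, stub in a.items() if b.get(layer) == stub)


def remove_duplicates(tracks: List[Track], min_shared: int = 3) -> List[Track]:
    """Keep a track unless it shares stubs in >= min_shared layers with an
    earlier kept track. The input order therefore decides which member of a
    duplicate pair survives."""
    kept: List[Track] = []
    for track in tracks:
        if all(shared_layers(track, k) < min_shared for k in kept):
            kept.append(track)
    return kept


tracks = [
    {1: 10, 2: 20, 3: 30, 4: 40},
    {1: 10, 2: 20, 3: 30, 5: 50},  # shares 3 layers with the first track
    {1: 11, 2: 20, 3: 31, 4: 41},  # shares only 1 layer with the first track
]
survivors = remove_duplicates(tracks)
```

Reversing the first two tracks changes which of them survives, which is why the ordering applied upstream deserves a comment in both classes.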

L1Trigger/TrackFindingTracklet/src/KalmanFilter.cc

cgsavard

I added the comments I could in the parts that I am familiar with. Unfortunately there are a lot of files I am unfamiliar with and so I couldn't be of much help there. I am not opposed to changing tracker quality to its own EDProducer if you feel that it is a better setup. Although right now the HYBRID version still uses it as a function and not EDProducer. Can we make that consistent?

There are three main concerns I brought up:

1. The change in naming scheme from chi2rz and chi2rphi to chi20 and chi21 only adds confusion for anyone else who looks at the code. I think it's better to change these back to the naming scheme that is generally accepted.
2. The MVA1 variable is not being calculated correctly, as many of the BDT inputs are wrong. The correct input variables can be seen further down in the TrackQuality.cc file, and I made notes on the changes needed for each variable.
3. The tttracks being created in TrackFindingProcessor.cc have incorrect chi2 and mva variables. The chi2rz and chi2rphi should not be pdof, and the mva should be in the 0-1 range and not binned upon tttrack initialization.

L1Trigger/TrackFindingTracklet/plugins/L1FPGATrackProducer.cc

L1Trigger/TrackFindingTracklet/src/HitPatternHelper.cc

L1Trigger/TrackFindingTracklet/src/TrackMultiplexer.cc

L1Trigger/TrackerTFP/python/Demonstrator_cfi.py

cgsavard · 2024-09-06T21:32:13Z

L1Trigger/TrackerTFP/python/Producer_cfi.py

+  InputLabelKF     = cms.string( "ProducerCTB"   ),  #
+  InputLabelDR     = cms.string( "ProducerKF"    ),  #
+  InputLabelTQ     = cms.string( "ProducerKF"    ),  #


I thought the order was DR -> KF -> TQ -> TFP, is there a mistake here? Is the DR output being used anywhere?

done and that is the tmtt track reco chain which runs KF -> DR -> TQ

I'm a bit confused about this other chain we have. Why is this in the TFP folder if it's only for TMTT? And is this actually ever used then?

I am not good at naming things. I put my DTC emulator in TrackerDTC and my TFP emulator in TrackerTFP.
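
The wiring question raised in this thread (is the DR output consumed by anything?) can be checked mechanically from the input labels. The sketch below is only an illustration: the module names "ProducerDR" and "ProducerTQ" and both helper functions are assumptions, and the "fixed" wiring is one reading of the "done" reply, not a copy of the final config.

```python
def unconsumed(input_labels, final):
    """Modules whose output no other module reads (excluding the final one)."""
    consumed = set(input_labels.values())
    return sorted(m for m in input_labels if m not in consumed and m != final)


def chain(input_labels, final):
    """Follow input labels backwards from the final module."""
    order = [final]
    while order[-1] in input_labels:
        order.append(input_labels[order[-1]])
    return order[::-1]


# Wiring as shown in the diff hunk: module -> label of the module it reads.
as_in_diff = {
    "ProducerKF": "ProducerCTB",
    "ProducerDR": "ProducerKF",
    "ProducerTQ": "ProducerKF",
}
# Assumed wiring for a KF -> DR -> TQ chain.
fixed = dict(as_in_diff, ProducerTQ="ProducerDR")

dangling_before = unconsumed(as_in_diff, "ProducerTQ")
dangling_after = unconsumed(fixed, "ProducerTQ")
order_after = chain(fixed, "ProducerTQ")
```

With the labels from the diff, the DR output is flagged as unused; with TQ reading from DR, the chain resolves to CTB, KF, DR, TQ.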

L1Trigger/TrackerTFP/src/TrackQuality.cc

L1Trigger/TrackerTFP/src/TrackFindingProcessor.cc

cgsavard · 2024-09-06T21:58:09Z

L1Trigger/TrackerTFP/src/TrackQuality.cc

+  // TQ MVA bin conversion LUT
+  constexpr array<double, numBinsMVA_> TrackQuality::mvaPreSigBins() const {
+    array<double, numBinsMVA_> lut = {};
+    lut[0] = -16.;
+    for (int i = 1; i < numBinsMVA_; i++)
+      lut[i] = invSigmoid(TTTrack_TrackWord::tqMVABins[i]);
+    return lut;
+  }


There was a discussion in PR #272 about how to set things up such that we avoid ever having to do the inverse sigmoid. I believe this was mainly to better compare with the firmware as we will not do this conversion there. Since it was recently decided there, I think we should stay true to that.
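
For readers unfamiliar with what the LUT in the hunk above does, here is a small sketch of the idea: bin edges defined on the post-sigmoid MVA (0 to 1) are mapped through the inverse sigmoid once, so that a raw pre-sigmoid BDT score can be binned directly. The edge values below are illustrative only and are not the contents of TTTrack_TrackWord::tqMVABins; the -16 floor for the first bin is taken from the diff.

```python
import math
from bisect import bisect_right


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def inv_sigmoid(p: float) -> float:
    """Logit: inverse of the logistic sigmoid, valid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Illustrative post-sigmoid bin edges (NOT the real tqMVABins values).
POST_SIGMOID_EDGES = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99]


def pre_sigmoid_edges(post_edges, floor=-16.0):
    """First edge is a fixed floor because inv_sigmoid(0) diverges."""
    return [floor] + [inv_sigmoid(e) for e in post_edges[1:]]


def bin_index(value, edges):
    """Index of the last edge <= value, clamped to the first bin."""
    return max(bisect_right(edges, value) - 1, 0)


pre_edges = pre_sigmoid_edges(POST_SIGMOID_EDGES)
scores = [-20.0, -5.0, -1.0, 0.3, 2.0, 6.0]  # raw pre-sigmoid scores
bins_pre = [bin_index(s, pre_edges) for s in scores]
bins_post = [bin_index(sigmoid(s), POST_SIGMOID_EDGES) for s in scores]
```

Both routes give the same bin for each score, so the choice between them is about which representation matches the firmware, as discussed in PR #272.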

cgsavard · 2024-09-06T21:59:56Z

L1Trigger/TrackerTFP/src/TrackQuality.cc

+    conifer::BDT<ap_fixed<10, 5>, ap_fixed<10, 5>> bdt(tq->model().fullPath());
+    // collect features and classify using bdt
+    const vector<ap_fixed<10, 5>>& output = bdt.decision_function({cot, zT, chi2B, nstub, ninterior, chi20, chi21});
+    const float mva = output[0].to_float();


Once the above changes to the input values are made, it should be checked that the MVA distribution still has a peak of fake tracks at 0 and of real tracks at 1, to make sure it's working again. Right now the variable is nonsense.

As long as it is not retrained for the current z0, cot and chi2 digitization, we should not expect great performance. When we retrain, we should change the input variables to zT (layer encoding granularity), chi2B, hitPattern, chi2rphi, chi2rz. And maybe create a version without chi2B, to have something we can test in f/w.

Sure I do think the performance won't be as good as before, but we should still expect a fake track peak about 0 and real track peak around 1 so it would be nice to double check this is the case and nothing else is way off. I agree on the retraining and will pass this info along.
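
The sanity check suggested here could look roughly like the following sketch. The function names, the 10-bin histogram and the 0.2 margin are all hypothetical choices, and the score lists are toy numbers for illustration, not results from this PR.

```python
def peak_bin_centre(scores, n_bins=10):
    """Centre of the most populated bin for scores assumed to lie in [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        i = max(0, min(int(s * n_bins), n_bins - 1))
        counts[i] += 1
    best = counts.index(max(counts))
    return (best + 0.5) / n_bins


def looks_sane(fake_scores, real_scores, margin=0.2):
    """True if fakes peak near 0 and real tracks peak near 1."""
    return (peak_bin_centre(fake_scores) < margin
            and peak_bin_centre(real_scores) > 1.0 - margin)


# Toy inputs only.
toy_fake = [0.02, 0.05, 0.03, 0.40, 0.08]
toy_real = [0.97, 0.99, 0.62, 0.95, 0.93]
ok = looks_sane(toy_fake, toy_real)
```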

tomalin · 2024-09-08T15:17:09Z

L1Trigger/TrackFindingTracklet/interface/TrackMultiplexer.h

-                 tt::StreamsTrack& acceptedTracks,
-                 tt::StreamsStub& lostStubs,
-                 tt::StreamsTrack& lostTracks);
+    void produce(tt::StreamsTrack& streamsTrack, tt::StreamsStub& streamsStub);


Given that consumes, produces & produce are key functions in CMSSW, common to all EDProducers, with definite meaning, I'm not keen on seeing the same function names being used in code that is not an EDProducer. For example, if one searches the code for "produce" to find where EDProducts are being created, one will get false matches in TrackMultiplexer etc.

Any progress on this comment?

I like using those names; I would recommend letting the CMSSW reviewers decide.

L1Trigger/TrackFindingTracklet/src/DuplicateRemoval.cc

tomalin · 2024-09-08T15:34:04Z

L1Trigger/TrackFindingTracklet/src/HitPatternHelper.cc

@@ -21,13 +21,13 @@ namespace hph {
        oldKFPSet_(iConfig.getParameter<edm::ParameterSet>("oldKFPSet")),
        setupTT_(setupTT),
        dataFormats_(dataFormats),
-        dfcot_(dataFormats_.format(trackerTFP::Variable::cot, trackerTFP::Process::kfin)),
-        dfzT_(dataFormats_.format(trackerTFP::Variable::zT, trackerTFP::Process::kfin)),
+        dfcot_(dataFormats_.format(trackerTFP::Variable::cot, trackerTFP::Process::gp)),


Is the HitPatternHelper, which is only being used in the Hybrid (or Tracklet) algo, being made dependent on the GP, which is only used in the TMTT algo? What is going on here?

KF depends on GP, Hybrid uses KF.

tomalin · 2024-09-08T16:00:22Z

All your code uses a few classes a lot, such as TTBV, Frame, FrameStub & FrameTrack, StreamStub & StreamTrack, DataFormats & DataFormat, Setup. But they're only documented in a few sentences in /L1Trigger/TrackerTFP/README.md . I suggest adding a dedicated section to this README, explaining these things and giving examples of common manipulations, such as extracting a float from the digi data.
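
As an illustration of the kind of README example meant here, the sketch below decodes a digitised value into a float. It deliberately does not use the TTBV or DataFormat API: the function names, the plain two's-complement convention and the "integer times base" scaling are assumptions, and the real classes may apply different conventions (for example bin-centre offsets).

```python
import math


def decode_signed(bits: str, base: float) -> float:
    """Interpret a two's-complement bit string as an integer and scale it
    by the granularity (base) of the variable."""
    value = int(bits, 2)
    if bits[0] == "1":  # negative number in two's complement
        value -= 1 << len(bits)
    return value * base


def encode_signed(x: float, width: int, base: float) -> str:
    """Digitise x with the given granularity into a width-bit string."""
    value = math.floor(x / base)
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{x} does not fit into {width} bits with base {base}")
    return format(value & ((1 << width) - 1), f"0{width}b")


word = encode_signed(1.3, width=4, base=0.25)
value = decode_signed(word, base=0.25)
```

Here 1.3 is stored as the 4-bit word 0101 and read back as 1.25, which shows the precision lost to the chosen granularity.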

tomalin · 2024-09-08T16:17:58Z

L1Trigger/TrackFindingTracklet/test/L1TrackNtupleMaker_cfg.py

@@ -151,6 +149,8 @@

 # HYBRID: prompt tracking
 if (L1TRKALGO == 'HYBRID'):
+    process.TrackTriggerSetup.GeometricProcessor.ChosenRofZ = 50.0


Why is the Hybrid depending on the TMTT GP? And why is the eta range of the Hybrid being set to 2.4? I seem to recall the L1TrkNtuple_eff_eta.pdf plot showing non-zero efficiency up to eta 2.5 in the past, although that currently doesn't seem to be the case. Is the eta cut in TrackTriggerSetup used only in the DTC & KF? It's also dodgy changing the default value of parameters for the baseline algo in a cfg.py, as anyone else running this algo will have to copy these parameter mods.

Performing those changes (restoring the past configuration) is dodgy to begin with; they are only made to pass CI.

Please make the CI more lenient so that we don't need these lines of code.

@skinnari do you remember if the Hybrid eta range for track reco was 2.5 or 2.4? It seems to be 2.4 now, but I recall it being 2.5.

tomalin · 2024-09-08T16:35:44Z

Please improve the comments at the top of the various .h and _cfg.py files with "Demonstrator" in the title, so that anyone wishing to use this code to check SW vs FW would understand what the various modules are doing and how they need to set up Vivado etc. to work with them.
e.g. TrackerTFP/test/AnalyzerDemonstrator.cc & TrackerTFP/plugins/ProducerDemonstrator.cc both have the same very short comment, despite the fact that one is an ESProducer and the other is an EDAnalyzer. Please make clear what the difference between them is. e.g. What is the ESProducer for? What is the EDAnalyzer doing?

L1Trigger/TrackFindingTracklet/python/Demonstrator_cfi.py

L1Trigger/TrackTrigger/python/Setup_cff.py

L1Trigger/TrackTrigger/python/Setup_cfi.py

tomalin · 2024-09-08T17:24:40Z

L1Trigger/TrackTrigger/test/CleanRelVal_cfg.py

+################################################################################################
+# Run bit-accurate TMTT L1 tracking emulation. 
+#
+# To run execute do


Why is this job called CleanRelVal if it runs the TMTT tracking? Also, the comment inside it says it is called test_cfg.py.
In any case, we already have TrackerTFP/test/test_cfg.py to run the TMTT tracking, so why add this second job?

will be obsolete soon

When will it be obsolete?

already is.

tomalin · 2024-09-08T17:28:04Z

Expand existing comment at start of LayerEncoding.h, to explain what the encoding actually is. i.e. How are the layers encoded?

tomalin · 2024-09-08T17:32:10Z

L1Trigger/TrackerTFP/interface/TrackFindingProcessor.h

+
+namespace trackerTFP {
+
+  // Class to format final tfp output


The name TrackFindingProcessor suggests that this class runs the entire TFP chain. But the comment here says that it only does some formatting of the TFP output, (which if true means it should be renamed to something suggesting that). I think it actually makes the TTTracks and the stream objects.

Any progress on this?

cgsavard · 2024-10-01T18:42:32Z

I am wondering if one can solve this simply by increasing the number of comparison modules.

The increase in duplicate high eta tracks does not go away when doubling the number of DR comparison modules to 64, which I just tried.

tomalin · 2024-10-02T16:32:42Z

Your code uses MessageLogger via edm::LogPrint("L1Trigger/TrackerTFP"). I believe the category has to be a simple string, so a forward slash is not allowed.
e.g. https://gitlab.cern.ch/ejclemen/cmssw_trackfinding_hlsframework/-/blob/master/TrackFindingTrackletHLS/test/L1tracking_cfg.py?ref_type=heads#L34 gives an example of how the level of printout from the "IRProducer" category is controlled. You can see that the category corresponds to the python variable name, and python variable names can't include a forward slash. (Also note that whenever the MessageLogger prints the category name, it automatically also prints the name of the EDProducer that called it.) Also, LogPrint is intended for "Warnings" (https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMessageLogger), so it should only be used when something has gone wrong.
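
The Python-side constraint described above can be illustrated with a one-line check. This only tests whether a category name could be used as a Python attribute name in a config file; it is a sketch of the reviewer's argument, not a statement of what MessageLogger itself accepts, and the helper name is hypothetical.

```python
import keyword


def usable_as_python_name(category: str) -> bool:
    """True if the category could appear as an attribute name in a cfg file."""
    return category.isidentifier() and not keyword.iskeyword(category)


with_slash = usable_as_python_name("L1Trigger/TrackerTFP")
without_slash = usable_as_python_name("TrackerTFP")
```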

* Make combined modules default
* tweak
* Improve USEHYBRID ifdef range
* Fix compiler error for pure Tracklet algo
* Move fitpattern.txt refs, so only used for pure Tracklet algo
* code format

* add TQ MVA output binning scheme
* remove code no longer used
* move logistic sigmoid into TTTrack class
* fix for proper precision after MVA conversion in track word
* label all MVA variables before logistic sigmoid with 'Pre'
* formatting
* fix filling of track word before running TQ MVA
* switch to post-sigmoid mva bins
* remove any onnx instances
* fix MVA1 initialization
* bin TQ MVA in pre-sigmoided bins in KFout
* move pre-sigmoid bins to track quality class
* change minimum bin

tomalin · 2024-12-04T21:39:34Z

@tschuh as this looks a significant improvement on the previous HYBRID_NEWKF results, I'm inclined to merge this as soon as it passes git CI. However, can you say here if you've managed to solve any of the issues with the HYBRID_NEWKF performance described by @cgsavard in her talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf ?

tomalin · 2024-12-05T17:43:47Z

I think @cgsavard 's comments here #285 (review) have not yet been addressed?

tschuh · 2024-12-05T17:50:17Z

those got addressed.

tschuh · 2024-12-05T17:53:40Z

The issues described in her talk seem to correspond to this issue. I am not a Tracklet expert and am not able to solve it.

tomalin · 2024-12-06T01:47:37Z

I see that this PR changes the python cfg parameter "UseTTStubResiduals" mentioned in the issue from False to True. So by default the KF no longer uses the digital output of the Tracklet stage, but instead recalculates the residuals itself. I assume your hypothesis is that the recalculated residuals are still wrong because they use the helix parameters of the Tracklet seeds, which you believe are wrong. If we looked at the TTTrack collection produced by the Tracklet part of the chain (i.e. by L1FPGATrackProducer.cc), should we see this? e.g. Should we see the bias in z0 reported in Claire's talk?

tschuh · 2024-12-06T14:46:32Z

The old KF in the Hybrid simulation is not fed with stub parameters calculated by the MatchProcessor. It is fed by stub collections determined by the TrackBuilder, but the stub parameters are taken from DTC stubs. Therefore you can't see the corruption in the hybrid s/w chain.

cgsavard · 2024-12-10T00:53:50Z

I think that the issues surrounding the TQ and the resolution shifts are ok to be left as an issue and not addressed here as they are separate from the motivation of this PR.

@tschuh Have you looked into the increase in duplicate tracks at high eta that is a result of the changes made in this PR? This is shown in slide 6 here. I do think this should at least be looked into and addressed a little before this PR is merged, as it is a direct result of the PR, whereas the other issues existed beforehand.

tschuh · 2024-12-11T13:18:35Z

I will make the mux order of seed types programmable in the TrackMultiplexer, that may give us a handle to improve the DR.
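
A programmable mux order could be pictured as in the sketch below. It is only an illustration of the idea: the track dictionaries, the function name and the seed-type labels used in the example are assumptions, not the TrackMultiplexer interface.

```python
def multiplex(tracks, seed_order):
    """Order tracks by a configurable seed-type priority list.

    sorted() is stable, so tracks of the same seed type keep their
    original relative order.
    """
    rank = {seed: i for i, seed in enumerate(seed_order)}
    return sorted(tracks, key=lambda t: rank[t["seed"]])


# Hypothetical tracks tagged with illustrative seed-type labels.
tracks = [
    {"id": 0, "seed": "D1D2"},
    {"id": 1, "seed": "L1L2"},
    {"id": 2, "seed": "L2L3"},
    {"id": 3, "seed": "L1L2"},
]
barrel_first = multiplex(tracks, ["L1L2", "L2L3", "D1D2"])
disk_first = multiplex(tracks, ["D1D2", "L1L2", "L2L3"])
```

Because the DR keeps the first track of each duplicate pair, changing this priority list changes which seed type wins when duplicates are removed.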

tschuh · 2024-12-11T14:05:55Z

I am not sure if that changed a lot; the summary printout shows no difference.
To judge whether the higher duplicate rate at high eta is a good or a bad thing, we need to understand how the efficiency and fake rate have changed at high eta. It could be that the KF has improved at high eta and now picks stub combinations that are more likely to match TPs. That would decrease the fake rate but also increase the efficiency and potentially the duplicate rate.

tomalin

Since this represents a significant improvement in tracking performance over the old KF, I'm merging it. The tracking performance still shows clear evidence of bugs, as shown in @cgsavard's talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf . Thomas suspects these are caused by this issue #150 , which needs debugging. Claire also suspects the TQ is buggy, but that is unrelated to this PR. She says she's OK with merging this PR too.

tschuh requested a review from tomalin August 1, 2024 00:01

tschuh self-assigned this Aug 1, 2024
