Recovering efficiency for short tracks #195

cerati · 2018-12-10T16:25:31Z

This issue is to track ideas to try to recover the efficiency for short tracks. Note that these are not monitored in the standard validation, see issue #193 to find the instructions to lower our selection.

A first set of changes can be to change maxHolesPerCand from 12 to 3 and chi2Cut from 30 to 15 (as already studied by Mario). The effect can be seen in the plots here, compared to those posted by Kevin. In particular it's interesting to see the effect on the efficiency vs pt (new, old) and vs eta (new, old). As expected, the fake rate also improves dramatically; see e.g. new vs old.
Another idea is to modify our candidate ranking. I think what we do is not correct for short tracks. If I understand correctly we compute the number of missing hits here. But in this way we are counting all invalid hits, even those after the last valid hit. Instead we should count only the invalid hits 'inside' a track. @mmasciov, can you please take a look? You can probably just loop backwards and count invalid hits only after a first valid hit is found.
We should review our track-building parameters while keeping the short tracks in the metric plots: @mmasciov of course, and also @areinsvo for the cluster charge cut; this is something you should take a look at.
Placeholder for more ideas!

kmcdermo · 2018-12-10T16:47:07Z

I think what we do is not correct for short tracks. If I understand correctly we compute the number of missing hits here. But in this way we are counting all invalid hits, even those after the last valid hit. Instead we should count only the invalid hits 'inside' a track.

Another way of putting it, we need to count the -1 hits up to the hit with -2. Couple ways of doing this. The first is what Giuseppe suggested. However, since we need to compute the score for every candidate (per seed) on every layer, this may be suboptimal.

Rather than looping over the hits on a track each time, we could continue to abuse the Status variable, and keep a counter (increments for every -1). When we go to addHitIdx(), we would just update the counter there. Then, to not over-count, we take the std::min(nMissed,Config::maxHolesPerCand) here.

OR, simply make the counter go forward, counting -1 up to -2. But this still suffers from having to loop on each candidate every layer.

cerati · 2018-12-10T16:55:36Z

Another way of putting it, we need to count the -1 hits up to the hit with -2.

No, if you count -1's up to the first -2, then you may count a number of -1 (up to maxHolesPerCand) after the last found hit. This penalizes short tracks. We should only count the -1's before the last non-negative index.
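
To make the counting rule above concrete, here is a minimal sketch of the backward loop. It assumes the hit indices of a candidate are available as a plain array, with >= 0 for a found hit, -1 for a missed layer and -2 for the stop marker; the function name `countInnerHoles` and the use of `std::vector<int>` are illustrative only and are not taken from the mkFit code.

```cpp
#include <vector>

// Hypothetical helper, not the mkFit implementation.
// hitIdx holds one entry per layer visited by the candidate:
//   >= 0 : index of a found hit
//   -1   : missed layer (candidate hole)
//   -2   : stop marker
// Returns the number of -1 entries located before the last found hit.
inline int countInnerHoles(const std::vector<int>& hitIdx)
{
  int nHoles = 0;
  bool seenValid = false;
  // Walk from the outermost entry inwards.
  for (auto it = hitIdx.rbegin(); it != hitIdx.rend(); ++it)
  {
    if (*it >= 0)
      seenValid = true;  // from here on, misses are 'inside' the track
    else if (*it == -1 && seenValid)
      ++nHoles;          // trailing -1's were skipped before this point
  }
  return nHoles;
}
```

For a candidate with indices {3, -1, 7, -1, -1, -2} only the -1 between the two found hits is counted, so the two trailing misses no longer penalize the short track.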

kmcdermo · 2018-12-10T17:11:56Z

Ah, so UP TO the last valid hit. Okay, then in that case, you really would have to loop backwards as you suggested.

kmcdermo · 2018-12-11T14:44:28Z

I will add then, perhaps we can still abuse the Status variable (that or create another data member) to store the index within the HitOnTrack array of the last found hit (i.e. one with a non-negative hit index). This would prevent looping through the end of the track if it is filled with -1's and such.
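
A rough sketch of this bookkeeping idea, under the assumption that the candidate owns its hit-index array and that every hit goes through a single add function. The struct `CandSketch` and its members are hypothetical names, not the actual mkFit classes. Besides the position of the last found hit, it also keeps a running hole count, so the number of inner holes is available without any loop.

```cpp
#include <vector>

// Hypothetical candidate used only for illustration.
struct CandSketch
{
  std::vector<int> hitIdx;  // >= 0 found hit, -1 missed layer, -2 stop marker
  int lastValidPos = -1;    // position in hitIdx of the last found hit
  int nHolesTotal = 0;      // all -1 entries added so far
  int nInnerHoles = 0;      // -1 entries located before the last found hit

  void addHitIdx(int idx)
  {
    hitIdx.push_back(idx);
    if (idx >= 0)
    {
      lastValidPos = static_cast<int>(hitIdx.size()) - 1;
      nInnerHoles = nHolesTotal;  // every miss so far is now inside the track
    }
    else if (idx == -1)
    {
      ++nHolesTotal;  // stays 'trailing' until another hit is found
    }
  }
};
```

With this layout the score can read `nInnerHoles` directly on every layer, while `nHolesTotal - nInnerHoles` gives the trailing misses that could still be used for the stopping condition.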

makortel · 2018-12-12T19:05:54Z

I took @cerati's first bullet

A first set of changes can be to change maxHolesPerCand from 12 to 3 and chi2Cut from 30 to 15

(thanks for the branch test-short-tracks) and ran it through CMSSW for MTV plots. Full set is here
https://mkortela.web.cern.ch/mkortela/tracking/mkfit/PR/test-short-tracks/
I summarize below the ttbar+50PU efficiency (blue is CMSSW, red mkFit before the branch, black mkFit with the branch)

and fake rate

that show substantial improvement (main migration from fake to true tracks I suppose)

kmcdermo · 2018-12-12T19:21:17Z

This is great. So we definitely move in the right direction. Do you have the efficiency vs nLayers?

kmcdermo · 2018-12-12T19:23:17Z

also, is the eff. vs eta with some pT cut (pT > 0.9)?

makortel · 2018-12-12T19:25:07Z

also, is the eff. vs eta with some pT cut (pT > 0.9)?

Yes, it has pT > 0.9 (and the eff vs pT has |eta| < 2.5).

areinsvo · 2018-12-12T21:18:33Z

Another thing to keep in mind is that the cluster charge cut is not applied in the integrated CMSSW code. I'll start working on implementing that, because it might give us another jump in efficiency in the MTV plots by preventing us from adding extra hits to short tracks.

makortel · 2018-12-12T21:24:06Z

And here is the efficiency vs. layers (since @kmcdermo asked :)

We are now pretty good for >= 8 layers.

cerati · 2019-04-12T17:45:40Z

Given the latest developments, I would like to try to outline what is still missing with respect to the short track efficiency issue.

1. fix treatment of tracks ending with '-2' for CE (we think SD is now doing things right)
2. check how much of the short tracks are lost due to the seed cleaning
3. further tune the parameters to recover more efficiency
4. verify that the efficiency loss is still due to picking too many hits (for instance we could look at how it scales with PU).

With respect to point 4., it is useful to keep in mind that CMSSW uses more handles to reject hits (cluster size, more refined CCC). We may try to tighten a bit our CCC cut (was not really tuned). Also, I think a basic cut on the cluster size should not be difficult to implement (we would have to store the number of strips in the Hit object using a few more bits and then compare this with the track angle with respect to that tangent plane at the hit radius or hit z). But of course I'd be happier if the tunings are enough to recover the efficiency...
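
As an illustration of what a basic cluster size cut could look like, here is a sketch that compares the stored number of strips with the width expected from the track incidence angle. Everything in it is an assumption made for the example: the struct `SensorSketch`, the simple geometric estimate (thickness times tan of the angle divided by pitch, plus one strip), the use of the angle with respect to the sensor normal, and the `tolerance` parameter. None of it is taken from mkFit or CMSSW.

```cpp
#include <cmath>

// Illustrative sensor description; the values would come from the geometry.
struct SensorSketch
{
  float thickness;  // active thickness, same length unit as pitch
  float pitch;      // strip pitch
};

// Expected number of strips crossed by a track whose direction makes the
// angle 'alpha' (radians) with the sensor normal, projected onto the
// measurement direction. Purely geometric estimate (assumption).
inline float expectedClusterWidth(const SensorSketch& s, float alpha)
{
  return s.thickness * std::fabs(std::tan(alpha)) / s.pitch + 1.0f;
}

// Accept the hit if the observed cluster is not much wider than expected.
inline bool passClusterSizeCut(int nStrips, const SensorSketch& s,
                               float alpha, float tolerance)
{
  return static_cast<float>(nStrips) <= expectedClusterWidth(s, alpha) + tolerance;
}
```

The tolerance would have to be tuned, in the same way as the CCC cut, and the angle would have to be derived from the track state at the hit radius or hit z.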

kmcdermo · 2019-05-03T20:50:09Z

Mostly solved by PR #214, will wait for retuning of parameters to close this.

kmcdermo · 2019-08-29T03:44:31Z

This can now be safely closed, as PR #239 has been merged. Glad to see this one go.

kmcdermo mentioned this issue Dec 12, 2018

Bug in counting of invalid hits for stopping a track and cand score #196

Closed

kmcdermo mentioned this issue May 8, 2019

Recovering short tracks continuing saga: adding overlapping hits + outlier rejection #223

Open

areinsvo mentioned this issue Aug 20, 2019

Update score calc, add warning for score overflow #239

Merged

kmcdermo closed this as completed Aug 29, 2019
