Recovering efficiency for short tracks #195

Closed
cerati opened this issue Dec 10, 2018 · 13 comments

@cerati
Collaborator

cerati commented Dec 10, 2018

This issue is to track ideas to try to recover the efficiency for short tracks. Note that these are not monitored in the standard validation, see issue #193 to find the instructions to lower our selection.

  • A first set of changes can be to change maxHolesPerCand from 12 to 3 and chi2Cut from 30 to 15 (as already studied by Mario). The effect can be seen by comparing the plots here to those posted by Kevin. In particular, it's interesting to see the effect on the efficiency vs pt (new, old) and vs eta (new, old). As expected, the fake rate also improves dramatically; see e.g. new vs old.

  • Another idea is to modify our candidate ranking. I think what we do is not correct for short tracks. If I understand correctly we compute the number of missing hits here. But in this way we are counting all invalid hits, even those after the last valid hit. Instead we should count only the invalid hits 'inside' a track. @mmasciov, can you please take a look? You can probably just loop backwards and count invalid hits only after a first valid hit is found.

  • We should review our building parameters while keeping the short tracks in the metric plots. @mmasciov of course, and also @areinsvo for the cluster charge cut: this is something you should take a look at.

  • Placeholder for more ideas!
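
The backward loop suggested in the second bullet could look roughly like the sketch below. It is an illustration under assumptions, not the mkFit code: the hits of a candidate are taken to be a plain list of hit indices, where a non-negative value is a found hit and -1 marks a layer without a hit, and the name `countInsideHoles` and the `std::vector<int>` layout are hypothetical.

```cpp
#include <vector>

// Hypothetical sketch: one hit index per layer, in building order.
//   idx >= 0  : a hit was found on that layer
//   idx == -1 : no hit found on that layer (a hole if it lies inside the track)
// Other negative values are treated as "no hit" but are not counted as holes.
int countInsideHoles(const std::vector<int>& hitIdx)
{
  int holes = 0;
  bool seenValid = false;
  // Walk backwards so that trailing invalid entries are skipped until the
  // last valid hit is reached; only -1 entries before it are counted.
  for (auto it = hitIdx.rbegin(); it != hitIdx.rend(); ++it) {
    if (*it >= 0) {
      seenValid = true;
    } else if (seenValid && *it == -1) {
      ++holes;
    }
  }
  return holes;
}
```

For the index list `{5, -1, 7, -1, -1, -2}` this returns 1: only the -1 between the two found hits is counted, so a short track is not penalized for the layers after its last hit.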

@kmcdermo
Collaborator

> I think what we do is not correct for short tracks. If I understand correctly we compute the number of missing hits here. But in this way we are counting all invalid hits, even those after the last valid hit. Instead we should count only the invalid hits 'inside' a track.

Another way of putting it, we need to count the -1 hits up to the hit with -2. Couple ways of doing this. The first is what Giuseppe suggested. However, since we need to compute the score for every candidate (per seed) on every layer, this may be suboptimal.

Rather than looping over the hits on a track each time, we could continue to abuse the Status variable, and keep a counter (increments for every -1). When we go to addHitIdx(), we would just update the counter there. Then, to not over-count, we take the std::min(nMissed,Config::maxHolesPerCand) here.

OR, simply make the counter go forward, counting -1 up to -2. But this still suffers from having to loop on each candidate every layer.

@cerati
Collaborator Author

cerati commented Dec 10, 2018

> Another way of putting it, we need to count the -1 hits up to the hit with -2.

No, if you count -1's up to the first -2, then you may count a number of -1 (up to maxHolesPerCand) after the last found hit. This penalizes short tracks. We should only count the -1's before the last non-negative index.

@kmcdermo
Collaborator

Ah, so UP TO the last valid hit. Okay, then in that case, you really would have to loop backwards as you suggested.

@kmcdermo
Collaborator

I will add then, perhaps we can still abuse the Status variable (that or create another data member) to store the index within the HitOnTrack array of the last found hit (i.e. one with a non-negative hit index). This would prevent looping through the end of the track if it is filled with -1's and such.
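
A rough sketch of that bookkeeping, combined with the counter idea from earlier in the thread, is given below. Everything in it is an assumption made for illustration: `CandSketch`, `kMaxHits`, `lastValidPos`, `pendingMisses` and `insideHoles` are hypothetical names, and the fixed-size array only stands in for the HitOnTrack array.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a track candidate; names and layout are
// illustrative and do not reproduce the mkFit classes.
struct CandSketch {
  static constexpr int kMaxHits = 64;  // assumed capacity
  std::array<int, kMaxHits> hitIdx{};
  int nHits = 0;          // entries filled so far (found or not)
  int lastValidPos = -1;  // position in hitIdx of the last found hit
  int pendingMisses = 0;  // -1 entries added since the last found hit
  int insideHoles = 0;    // -1 entries located before the last found hit

  void addHitIdx(int idx)
  {
    assert(nHits < kMaxHits);
    hitIdx[nHits] = idx;
    if (idx >= 0) {
      // A found hit turns all pending misses into holes inside the track.
      insideHoles += pendingMisses;
      pendingMisses = 0;
      lastValidPos = nHits;
    } else if (idx == -1) {
      ++pendingMisses;
    }
    ++nHits;
  }
};
```

With this layout the score could read `insideHoles` directly on every layer, with no loop over the hits, and `lastValidPos` remains available if a loop bounded by the last found hit is preferred.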

@makortel
Collaborator

I took @cerati's first bullet

> A first set of changes can be to change maxHolesPerCand from 12 to 3 and chi2Cut from 30 to 15

(thanks for the branch test-short-tracks) and ran it through CMSSW for MTV plots. Full set is here
https://mkortela.web.cern.ch/mkortela/tracking/mkfit/PR/test-short-tracks/
I summarize below the ttbar+50PU efficiency (blue is CMSSW, red is mkFit before the branch, black is mkFit with the branch)
[image: efficiency]
and fake rate
[image: fake rate]
which show a substantial improvement (mainly a migration from fake to true tracks, I suppose).

@kmcdermo
Collaborator

kmcdermo commented Dec 12, 2018

This is great. So we are definitely moving in the right direction. Do you have the efficiency vs nLayers?

@kmcdermo
Collaborator

also, is the eff. vs eta with some pT cut (pT > 0.9)?

@makortel
Collaborator

> also, is the eff. vs eta with some pT cut (pT > 0.9)?

Yes, it has pT > 0.9 (and the eff vs pT has |eta| < 2.5).

@areinsvo
Collaborator

Another thing to keep in mind is that the cluster charge cut is not applied in the integrated CMSSW code. I'll start working on implementing that, because it might give us another jump in efficiency in the MTV plots by preventing us from adding extra hits to short tracks.

@makortel
Collaborator

And here is the efficiency vs. layers (since @kmcdermo asked :)
image

We are now pretty good for >= 8 layers.

@cerati
Collaborator Author

cerati commented Apr 12, 2019

Given the latest developments, I would like to try to outline what is still missing with respect to the short track efficiency issue.

  1. fix treatment of tracks ending with '-2' for CE (we think SD is now doing things right)
  2. check how many of the short tracks are lost due to the seed cleaning
  3. further tune the parameters to recover more efficiency
  4. verify that the efficiency loss is still due to picking too many hits (for instance we could look at how it scales with PU).

With respect to point 4, it is useful to keep in mind that CMSSW uses more handles to reject hits (cluster size, more refined CCC). We may try to tighten our CCC cut a bit (it was not really tuned). Also, I think a basic cut on the cluster size should not be difficult to implement (we would have to store the number of strips in the Hit object using a few more bits and then compare this with the track angle with respect to the tangent plane at the hit radius or hit z). But of course I'd be happier if the tunings are enough to recover the efficiency...
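
To make the cluster size idea concrete, a very rough sketch is shown below. It is not the CMSSW implementation and all names and parameters are hypothetical: it assumes a flat sensor of known thickness and strip pitch, and it takes the tangent of the track angle measured from the sensor normal (the complement of the angle with respect to the tangent plane), projected onto the direction across the strips. Under those assumptions a track is expected to cross about thickness * |tan| / pitch + 1 strips, and hits whose clusters are much wider than that would be rejected.

```cpp
#include <cassert>
#include <cmath>

// Expected cluster width in strips for a track crossing a flat sensor.
// tanIncidence is the tangent of the angle between the track and the sensor
// normal, projected onto the direction across the strips (an assumption of
// this sketch). thickness and pitch must be given in the same length unit.
double expectedClusterWidth(double thickness, double pitch, double tanIncidence)
{
  assert(pitch > 0.0);
  return thickness * std::fabs(tanIncidence) / pitch + 1.0;
}

// Hypothetical cut: accept the hit if its cluster is not wider than the
// expected width plus a tunable tolerance (in strips).
bool passesClusterSizeCut(int nStrips, double thickness, double pitch,
                          double tanIncidence, double tolerance)
{
  return nStrips <= expectedClusterWidth(thickness, pitch, tanIncidence) + tolerance;
}
```

The tolerance would need tuning, in the same spirit as the CCC cut; the values of thickness and pitch depend on the detector layer and are not specified in this thread.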

@kmcdermo
Collaborator

kmcdermo commented May 3, 2019

Mostly solved by PR #214; will wait for retuning of parameters to close this.

@kmcdermo
Collaborator

This can now be safely closed, as PR #239 has been merged. Glad to see this one go.
