Efficiency loss for high pt tracks #248

Closed
areinsvo opened this issue Jan 6, 2020 · 4 comments

@areinsvo
Collaborator

areinsvo commented Jan 6, 2020

As discussed, there is a ~20% efficiency loss for mkFit relative to CMSSW for tracks with pt > ~50 GeV.

This can be seen in the standalone, MTV-like validation plots where sim tracks are required to have a corresponding seed (plots here are with the high pt 10μ sample using the offline quadruplet seeds).

This was also seen in Matti’s plots using the HLT quadruplet seeds (compare black and red curves).

One (unproven) theory for the efficiency loss is that the search windows are too small for straight tracks that have small errors. For instance, as far as I can tell, a minimum dq value is set, but there is no minimum window size in dphi. See code here.

It is important to note that for the SIMVAL_MTV_SEED plots, the ONLY way to lose efficiency is if we add incorrect hits to the track. The sim tracks in the denominator of the efficiency are required to be matched to a seed, meaning that all 4 hits in the seed belong to the sim track. See here. This means that if we add 0 hits to the seed, the 4-hit track will by definition be matched to the sim track and be in the numerator of our efficiency. Therefore, the efficiency loss shown in the plots above has to have a different origin than what Matevz was investigating for triplets in PR242.
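To make the efficiency definition above concrete, here is a minimal sketch. It is not the validation code itself: `SimTrackSummary` and `seedMatchedEfficiency` are hypothetical names introduced only for illustration, and the two flags stand in for whatever matching the validation actually performs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-sim-track summary (not an mkFit type).
struct SimTrackSummary {
  bool seedMatched;  // a seed exists whose hits all belong to this sim track
  bool recoMatched;  // a built track is matched back to this sim track
};

// Efficiency with a seed-matched denominator: only sim tracks that have a
// matching seed are counted, so a loss can only come from the building step.
inline double seedMatchedEfficiency(const std::vector<SimTrackSummary>& sims) {
  int denom = 0;
  int numer = 0;
  for (const auto& s : sims) {
    if (!s.seedMatched) continue;  // not in the denominator at all
    ++denom;
    if (s.recoMatched) ++numer;
  }
  assert(numer <= denom);
  return denom > 0 ? static_cast<double>(numer) / denom : 0.0;
}
```

With this definition, a seed that gains no hits still yields a matched 4-hit track, so it counts in both numerator and denominator.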

To investigate this issue, I will make a list of specific tracks that are affected, using the high pt 10μ sample. Slava is working on remaking the 10μ sample using HLT quadruplets for seeds, so we can confirm that fixing the issue offline also fixes the issue in the HLT configuration.

@osschar
Collaborator

osschar commented Jan 6, 2020

dphi_min is taken into account, see:

const auto calcdphi = [&](float dphi2) {

This all got a bit convoluted following several rounds of optimizations :)
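For readers without the code open, a floor on the phi window could look roughly like the sketch below. This is an assumption about the general shape, not the actual body of `calcdphi`: the function name `phiWindow` and the parameters `nSigma`, `minDphi` and `maxDphi` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not the mkFit implementation): half-width of the phi
// search window computed from the propagated phi variance dphi2, bounded
// below by minDphi and above by maxDphi.
inline float phiWindow(float dphi2, float nSigma, float minDphi, float maxDphi) {
  assert(minDphi <= maxDphi);
  // abs() keeps a slightly negative variance from turning into NaN.
  const float fromError = nSigma * std::sqrt(std::abs(dphi2));
  return std::min(std::max(fromError, minDphi), maxDphi);
}
```

With a floor like this, a very straight track with tiny errors still gets a window of at least `minDphi`.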

@areinsvo
Collaborator Author

areinsvo commented Jan 6, 2020

Good to know. Thanks for pointing that out!

@areinsvo
Collaborator Author

For posterity:
Matevz discovered that the affected tracks had negative entries in the covariance matrix. Manually changing the negative covariances to 1 allowed the seeds to get processed, which fixed the efficiency issue, at the expense of an increase in duplicate rate:
11.4% before, 18.4% after for offline high pt 10muon events
8.5% before, 9.4% after for ttbar PU50 events
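A minimal sketch of this kind of workaround is below. It is not the code from the PR: the 6x6 layout, the `Cov` alias and `patchNegativeVariances` are hypothetical, and it assumes the problematic entries are the diagonal ones (the variances), since off-diagonal covariances can legitimately be negative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kDim = 6;  // assumed track-parameter dimension
using Cov = std::array<std::array<float, kDim>, kDim>;  // hypothetical layout

// Replace negative diagonal entries (variances) with a fixed positive value
// and return how many entries were patched. Off-diagonal terms are untouched.
inline int patchNegativeVariances(Cov& cov, float replacement = 1.0f) {
  assert(replacement > 0.0f);
  int nPatched = 0;
  for (std::size_t i = 0; i < kDim; ++i) {
    if (cov[i][i] < 0.0f) {
      cov[i][i] = replacement;
      ++nPatched;
    }
  }
  return nPatched;
}
```

Setting a variance to 1 makes the seed usable but also inflates its uncertainty, which may be related to the higher duplicate rate quoted above.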

All of the plots can be seen in PR250. Once that PR gets merged, we can probably close this issue, unless people prefer to keep it open to remind ourselves to revisit the duplicate rate increase.

@areinsvo
Collaborator Author

We no longer have a loss of efficiency at high pt, so I’m going to close this issue. I opened a follow-up issue for implementing a proper fix for the negative entries in the covariance matrix.
