-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support pixel triplet seeds #242
Conversation
The benchmarks are here |
This PR (as of now) works fine in the standalone program. When ran from CMSSW with triplet seeds (made from quadruplet pixel tracks, as they are provided within the HLT menu), I get a segfault in
i.e. Lines 149 to 155 in d9dfbda
The Nhits is set inLine 60 in d9dfbda
Lines 775 to 780 in d9dfbda
where Config::nlayers_per_seed ultimately comes fromLine 19 in d9dfbda
Do I understand correctly that we are setting globally the exact number of hits per seed instead of a maximum? Any suggestions how to proceed? |
an "easy" option now would be to get CMSSW (configurably) do what we want. |
Why are we fitting seeds here -- this is where the crash happens? For patatrack seeds in our bin files we use the existing track params as input ... we do not refit them. The seed fit is only used for sim track seeds ... and there we do expect them to have 4 layers, indeed, as we construct the seeds by picking the first four hits from sim-track that lay on unique layers, dropping overlaps. Saying this, there are apparently two things for me to look at:
@kmcdermo no ... this has nothing to do with Track Kaboom |
I apologize for leading people astray on this. The last step of MkBuilder::PrepareSeeds() is MkBuilder::fit_seeds() (which is always called), and inside that is MkBuilder::fit_one_seed_set()... But as already mentioned, inside that last function there is a check if we are using CMSSW seeds, in which case if we are then the fitting is skipped and the seeds are imported directly into MkFitters. Sorry about that. The crash comes from the MkFitter::InputTracksAndHits() in MkBuilder::find_one_seed_set(). |
I disabled fitting for cmssw seeds at an earlier stage (so we never get to the piece of code where we had the crash) and fixed steering-params so that seeds are picked up also on earlier layers (this now expects the seeds to end on layer 2 (or later), i.e., if there is a doublet seed with hits on layers 0 and 1 it will not be picked up). https://github.com/trackreco/mkFit/tree/store-hits-per-seed @makortel can you give it a try together with your changes here and with your triplet dataset? |
Grr, I posted a wrong branch: |
@osschar Thanks. I (in my private repo) cherry-picked that on top of this branch (and rebased on top of
But that seems to be caused by #243 (and happens in the offline reconstruction configuration as well). Or, at least if I take my branch of #244, the offline reco works, if I take |
Probably because I did not do the necessary updates on the CMSSW side yet... |
Do you need me to do / check something? :) |
So let me try again
I (in my private repo) cherry-picked that on top of this branch (and rebased on top of #244) and tried again. At least technically the HLT workflow works now, need to check the physics performance next. |
Let me remind myself first has to be done from #243 first :) |
@osschar I finally got to complete the test, and https://github.com/trackreco/mkFit/tree/seed-fit-fix combined with this PR seems to work fine with pixel triplet seeds. How sould we proceed? (cherry-pick your commit here, two separate PRs, something else?) |
Hi Matti.
This is really good. Do we have a sense of (physics) performance of these?
… On Oct 9, 2019, at 8:59 AM, Matti Kortelainen ***@***.***> wrote:
@osschar <https://github.com/osschar> I finally got to complete the test, and https://github.com/trackreco/mkFit/tree/seed-fit-fix <https://github.com/trackreco/mkFit/tree/seed-fit-fix> combined with this PR seems to work fine with pixel triplet seeds. How sould we proceed? (cherry-pick your commit here, two separate PRs, something else?)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#242?email_source=notifications&email_token=ABDVH4NY7YL3SN3ONJGHSTLQNX5UZA5CNFSM4IVMK5TKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAYMNEA#issuecomment-540067472>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABDVH4LEXV7TN3JAFYAQJCLQNX5UZANCNFSM4IVMK5TA>.
|
@makortel your call, I can make my PR right away ... but it probably makes more sense for you to cherry-pick the commit and do the full validation. |
@osschar Thanks. I'll then cherry-pick it here and provide the full validation. |
Matti,
Can you please provide the precise definitions of numerator and denominator for these as now that we have two types of seeds its not clear to me if this is for 3+4 hitters or just 3 etc..
Thanks, Avi
… On Oct 9, 2019, at 9:27 AM, Matti Kortelainen ***@***.*** ***@***.***>> wrote:
@osschar <https://github.com/osschar> Thanks. I'll then cherry-pick it here and provide the full validation.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#242?email_source=notifications&email_token=ABDVH4PFUUSOOVSUD4OM6T3QNYA5NA5CNFSM4IVMK5TKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAYPIPQ#issuecomment-540079166>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABDVH4KBYFKZOW5JJ2FRVQ3QNYA5NANCNFSM4IVMK5TA>.
|
Both have 3-hit seeds (made from 4-hit pixel tracks by dropping the outermost hit). |
Thank you!
This makes the pt>10GeV note interesting. I am wondering what this may be telling us.
… On Oct 9, 2019, at 9:39 AM, Matti Kortelainen ***@***.*** ***@***.***>> wrote:
Both have 3-hit seeds (made from 4-hit pixel tracks by dropping the outermost hit).
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#242?email_source=notifications&email_token=ABDVH4N2EL6LMHJYQA6XQGDQNYCL5A5CNFSM4IVMK5TKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAYQQJA#issuecomment-540084260>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABDVH4L4NPLUYAVJGCD5ZCDQNYCL5ANCNFSM4IVMK5TA>.
|
4c6e25c
to
298b66e
Compare
I cherry-picked the commit of @osschar. Here are the standalone benchmarks Here is MTV for 10k ttbar+PU50 events comparing HLT iter0 from CMSSW with HLT iter0 with mkFit |
this actually looks quite depressing. |
right. The interesting question is why.
Given the size of the effect, one hopes it would not be hard to pin down.
… On Oct 9, 2019, at 11:52 AM, Slava Krutelyov ***@***.***> wrote:
This makes the pt>10GeV note interesting. I am wondering what this may be telling us.
this actually looks quite depressing.
Single percent efficiency drop is usually a big deal. Here we have 20% by 100GeV.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#242?email_source=notifications&email_token=ABDVH4IML5RHAORUQT6G4NTQNYR6DA5CNFSM4IVMK5TKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAY554Q#issuecomment-540139250>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABDVH4KERD3EKIKVW2GXKN3QNYR6DANCNFSM4IVMK5TA>.
|
maybe it is not the cause, but are we using the duplicate seed removal? we should disable it for triplets. |
The duplicate seed cleaning is enabled. I can test again disabling it. |
I updated the plots in https://mkortela.web.cern.ch/mkortela/tracking/mkfit/PR/PR242_mtv/ and the seed cleaning seems to have no effect on efficiency (black=disabled, red=enabled) |
About the hit selection windows for pixel layers:
That is, all was done for pixel layers was to slightly reduce the phi selection window for the pixel endcaps. Naively, I am not sure whether this could be the reason for the inefficiency at large pT. |
Yup, now I have everything, thanks! :) |
I went over comparing seed/sim/cmssw/mkfit tracks of first 20 events (about 100 seeds). I found 3 tracks for which mkfit only finds about 50% of hits (cms finds about 75%) and will look into those in detail. Then there are a couple more that are not so dramatic and I might look into later. All in all, this is about 10 tracks out of 100, so a 10% efficiency drop which seems consistent within +-10% error bar here :) Another thing I noticed is that about 1% of seeds get rejected in nan_and_silly check because they have negative values on element (5,5) of covariance (theta), e.g.:
There are also two cases with similar values on (4,4) (phi). pT is also rather poor but the error is not negative. Right now our settings are to discard such seeds ( Do we want to go into this business tight now? It's a 1% effect at most. |
I went through one track, pT=8.49, eta=-2.39 To get mkfit to find a track of about the same quality as cmssw (10 found hits, pT 8.25, ch2 ~ 60) I had to:
And still, a short track with 3 seed hits and a wrong hit picked on the 4th layer, followed by three missed hits wins at the end, mostly because of the chi2 penalty to candidate score (the "good" track has 6 more hits but chi2 of 60 makes it worse). Clearly, our scoring isn't optimal for noisy 3-hit seeds :) It also seems our detection of hit/miss of a layer is not fuzzy enough ... we might have to consider extending the might-have-missed-the-layer region or increasing N_max_holes. I will look at two more tracks tomorrow. Then I might have a recommendation for a couple of test runs to see how this shows on a larger statistics. |
was this still from the muon gun sample? |
Yes, muons. i was able to trace it to another track. You can get full event dump by uncommenting all the DUMP_XXX defines in Event.cc:456 and run
and then the hit that gets picked up is the only hit on layer 54, belogning to track label 1. The track that I was looking at had label 6. |
I looked at three more tracks, here's a short report on what I had to fix to improve mkFit track.
So, there isn't a single thing that's wrong but rather our whole scoring and best-short-track handling seems to be optimized for 4-hit seeds with better initial parameters. We need to figure out how to handle this but it seems we need:
|
I'm not sure I understand the numbers before chi2. Please clarify what is printed. |
EventNo Label hits_found/total_hits
|
is the difference between hits_found and total_hits only in the number of holes between the innermost and the outermost found hit? |
Here are MTV plots comparing CMSSW with triplet seeds (blue), mkFit with quadruplet seeds (red), and mkFit with triplet seeds (black) (and to remind, in this specific case the triplet seeds contain the first three hits of the quadruplet seeds) |
So sadly, we seem to suck at every level above 10 GeV…
Even our quads seem worse at high pT than CMSSW w only three pixel hits used in the seed. This means that CMSSW manages to pick the 4th pixel hit and/or extend the track correctly based on the three inner most hits while we don’t.
Moreover, even when we do have them, we still manage to get confused at high(er) pT.
This may again hint at errors shrinking and possible chi2 exploding for high pT tracks?
Does this make sense??
… On Oct 25, 2019, at 11:52 AM, Matti Kortelainen ***@***.*** ***@***.***>> wrote:
Here are MTV plots comparing CMSSW with triplet seeds (blue), mkFit with quadruplet seeds (red), and mkFit with triplet seeds (black) (and to remind, in this specific case the triplet seeds contain the first three hits of the quadruplet seeds)
https://mkortela.web.cern.ch/mkortela/tracking/mkfit/PR/PR242_mtv_quad/ <https://mkortela.web.cern.ch/mkortela/tracking/mkfit/PR/PR242_mtv_quad/>
<https://user-images.githubusercontent.com/33993/67596567-9f9e0380-f72e-11e9-80a2-d085d3420155.png>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#242?email_source=notifications&email_token=ABDVH4MIB3QHDPQZOGURR5DQQM6ATA5CNFSM4IVMK5TKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECJHWVI#issuecomment-546470741>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABDVH4KPEE4NJVZSHPUVDYLQQM6ATANCNFSM4IVMK5TA>.
|
No, it's all the not found hits, including the trailing ones. In this case this just tells you the best selected track was the first short track we found, with 4 not found hits at the end. |
One issue we can have at high pT is that our track parameters does not allow to change the charge, while CMSSW allows it (Matti just verified it). So if the seed has the wrong charge we are lost. This is something we should fix in the code (remove the charge as a standalone parameter and get it is the sign of the 1/pt parameter, now q/pt). |
Interesting!!
Can we check for the size of this effect by counting the tracks found in cmssw w opposite sign to seed?
…Sent from my iPhone
On Oct 25, 2019, at 14:30, cerati ***@***.***> wrote:
One issue we can have at high pT is that our track parameters does not allow to change the charge, while CMSSW allows it (Matti just verified it). So if the seed has the wrong charge we are lost. This is something we should fix in the code (remove the charge as a standalone parameter and get it is the sign of the 1/pt parameter, now q/pt).
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
we know that on the 10 mu sample (0.5-10 GeV) it ~1% overall but ~8% when looking only at pT>7 GeV and |eta|>2.3. We are now trying to assess the effect on the efficiency, but may need a larger sample to get final conclusions... |
("we" are FNAL+Slava having a small f2f instead of beer on Friday) |
Thanks!
…Sent from my iPhone
On Oct 25, 2019, at 15:03, cerati ***@***.***> wrote:
("we" are FNAL+Slava having a small f2f instead of beer on Friday)
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
We had some discussion whether we had seen the high-pT inefficiency before (#242 (comment), red (mkFit with quads). vs blue (CMSSW with triplets) I ran both CMSSW and mkFit in the offline initialStep with the full 10kevent ttbar+PU50 sample, MTV plots here There is some inefficiency in mkFit vs. CMSSW starting around 80 GeV, but the effect is smaller than in the HLT case. Thinking out loud, one possible cause for the difference is the duplicate cleaning within pixel tracking. I could try what happens in the HLT case for mkFit if I disable that. |
Plots are here
Disabling the pixel track cleaning improves the efficiency closer to offline quadruplets, but at the cost of increasing fakes and duplicates. (mkFit's seed cleaning was enabled for both red and black) |
Hi Matti, I have a hard time reconciling the efficiency plot vs pT with the one vs eta for HLT: |
Thanks @cerati for pointing it out, there is clearly something wrong in the blue reference. I re-ran it and made a new set of plots
|
This commit is mainly to work around a "feature" of CMSSW EDProducer that converts pixel reco::Tracks to TrajectorySeeds that ignores the fourth hit of the seed. On ther other hand, any seed selection (that depends purely on the seed properties) should really be done by an independent producer in CMSSW (or by non-CMSSW-called code within the standalone application) as that would be much more flexible.
- Do not even enter seed-fit for cmssw seeds. Before the check if we are using cmssw seeds was within the inner seed-fit loop after loading of seeds into Matriplexes (which bombed for non-quadruplet seeds). - Note: the seed-fit is still not fixed and expects Config::nlayers_per_seed hits to be present in incoming seeds. - Extend steering-parametrs to start seed-candidate pickup on the third layer.
c7dbb7e
to
02160db
Compare
Rebased on top of HEAD of |
This PR is to add support for pixel triplet seeds in order to work around a "feature" of CMSSW EDProducer that converts pixel reco::Tracks to TrajectorySeeds (used in the HLT menu today) that ignores the fourth hit of the seed (in the context of #236).
First step is to remove the "minimum number of hits in a seed" cut from the seed cleaner, but that turned out to be insufficient. The story continues in the comments.