This repository has been archived by the owner on Dec 9, 2024. It is now read-only.

Optimization for pT5 building and for pLS-T5 cross cleaning needed #364

Open
VourMa opened this issue Jan 25, 2024 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

Contributor

VourMa commented Jan 25, 2024

Lately, I have been working on central configurations in CMSSW (|η| < 2.5). In this region, we expect LST to do the full building job on its own, i.e. to deliver high efficiency and a low fake+duplicate rate, without any additional CKF aid. The following plot shows 4 different configurations, all of which apply duplicate cleaning for pLS TCs and include only quad pLS TCs:

  • BLUE: LST-only building without T5s
  • RED: LST-only building with T5s
  • BLACK: LST+CKF building without T5s
  • ORANGE: LST+CKF building with T5s

Screenshot from 2024-01-25 11-16-09

Comparing RED to ORANGE, the fake+duplicate rate increases sharply when LST-only building is used. The breakdown shows that this is due to an increased duplicate rate. Pixel track plots show that the quad pLSs, which are the only ones used here, have a very low duplicate rate. That means that the increased duplicate rate of LST comes from imperfect cross cleaning with OT objects.

The first suspects are the T5s, so BLUE and BLACK compare configurations without T5s. Setting aside the loss of efficiency, the fake+duplicate rate (specifically the duplicate rate, according to the breakdown) drops substantially in this case, and in fact becomes lower for LST-only building (BLUE) than for LST+CKF building (BLACK). This leads me to the following hypothesis:
We have a lot of pLS-T5 duplicates. When CKF is used to complement the building, the pLSs are merged into a single object with their duplicate T5s, which dramatically reduces the duplicate rate. On top of that, even though the efficiency is similar in the LST-only and LST+CKF configurations, tracks are on average longer and have better resolution in the latter. All of this is supported by the MTV plots. This implies that the pT5 building in LST can be improved.
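To make the hypothesis concrete, here is a toy sketch of how merging a pLS with its duplicate T5 changes the duplicate rate. It assumes a simplified definition (the fraction of reco tracks whose matched sim track is matched by more than one reco track), which may differ in detail from what MTV computes. The function name `duplicateRate` and the input layout are hypothetical and not part of LST or CMSSW.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy duplicate rate under a simplified definition (an assumption, not the
// exact MTV one): the fraction of reco tracks whose matched sim track is
// matched by more than one reco track.
// recoToSim[i] is the sim-track index matched to reco track i, or -1 if the
// reco track is unmatched (a fake).
double duplicateRate(const std::vector<int>& recoToSim) {
  if (recoToSim.empty())
    return 0.0;

  // Count how many reco tracks point at each sim track.
  std::map<int, int> nRecoPerSim;
  for (int sim : recoToSim)
    if (sim >= 0)
      ++nRecoPerSim[sim];

  // A reco track counts as a duplicate if its sim track is matched more than once.
  std::size_t nDuplicates = 0;
  for (int sim : recoToSim)
    if (sim >= 0 && nRecoPerSim[sim] > 1)
      ++nDuplicates;

  return static_cast<double>(nDuplicates) / static_cast<double>(recoToSim.size());
}
```

With three sim tracks each found once as a pLS and once as a T5, the input `{0, 0, 1, 1, 2, 2}` marks every reco track as a duplicate. If each pair is merged into a single object, the input becomes `{0, 1, 2}` and no duplicates remain, while the number of matched sim tracks (and hence the efficiency) is unchanged.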

The following plot shows how the physics performance changes when the ΔR^2 value used for the pLS-T5 cross cleaning changes from the default of 0.001 to 0.01. That implies that there is possibly some room for optimization also in the cross cleaning step.

image

image
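For reference, the effect of the threshold can be explored offline with a small host-side sketch like the one below. It is not the LST kernel: `EtaPhi`, `deltaPhi`, `deltaR2` and `countFlagged` are hypothetical names, and `deltaPhi` is a stand-in written for this example, since the implementation of `SDL::calculate_dPhi` is not shown in this thread.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the (eta, phi) of a pLS or a T5.
struct EtaPhi {
  float eta;
  float phi;
};

// Stand-in for a wrapped azimuthal difference, folded into [-pi, pi].
inline float deltaPhi(float phi1, float phi2) {
  constexpr float kPi = 3.14159265358979f;
  float d = std::fmod(phi1 - phi2, 2.0f * kPi);  // now in (-2pi, 2pi)
  if (d > kPi)
    d -= 2.0f * kPi;
  else if (d < -kPi)
    d += 2.0f * kPi;
  return d;
}

// Squared angular distance, as used in the cross-cleaning comparison.
inline float deltaR2(const EtaPhi& a, const EtaPhi& b) {
  const float dEta = a.eta - b.eta;
  const float dPhi = deltaPhi(a.phi, b.phi);
  return dEta * dEta + dPhi * dPhi;
}

// Count the pLSs that lie within maxDR2 of at least one T5, i.e. the pLSs
// that a cut of this size would flag as duplicates.
std::size_t countFlagged(const std::vector<EtaPhi>& pLSs,
                         const std::vector<EtaPhi>& t5s,
                         float maxDR2) {
  std::size_t nFlagged = 0;
  for (const EtaPhi& pls : pLSs) {
    for (const EtaPhi& t5 : t5s) {
      if (deltaR2(pls, t5) < maxDR2) {
        ++nFlagged;
        break;  // one close T5 is enough to flag this pLS
      }
    }
  }
  return nFlagged;
}
```

Calling `countFlagged` on the same pLS and T5 collections with `1e-3f` and then `1e-2f` shows how many additional pLSs the looser cut would remove. Pairs that straddle the ±π boundary in phi are handled by the wrapping in `deltaPhi`.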

@VourMa VourMa added the enhancement New feature or request label Jan 25, 2024
Contributor

slava77 commented Jan 25, 2024

> The following plot shows how the physics performance changes when the ΔR^2 value used for the pLS-T5 cross cleaning changes from the default of 0.001 to 0.01. That implies that there is possibly some room for optimization also in the cross cleaning step.

What are the definitions of eta and phi in each case ... and where is the cross-cleaning code (just for a quick reference)? For the ΔR comparison to be meaningful, the eta and phi of the two objects have to be evaluated at the same reference point.
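To illustrate why the reference point matters, the sketch below estimates how much the azimuth of the momentum direction rotates between two transverse radii. It assumes a uniform solenoidal field, a track originating on the beam line, and phi defined from the momentum direction. Whether the pLS and T5 phi are actually defined that way is exactly the open question here. The function names and the radii mentioned afterwards are illustrative assumptions, not LST values.

```cpp
#include <cmath>

// Radius of curvature in meters for a track with transverse momentum ptGeV
// (GeV) in a uniform field bTesla (T): R = pT / (0.2998 * B).
inline double curvatureRadius(double ptGeV, double bTesla) {
  return ptGeV / (0.299792458 * bTesla);
}

// Magnitude of the rotation of the momentum direction between the beam line
// and transverse radius r (meters), for a circle of radius R through the
// origin. The chord of length r subtends a central angle 2 * asin(r / (2R)).
// Valid only for r <= 2R, i.e. when the track actually reaches radius r.
inline double momentumPhiShift(double r, double R) {
  return 2.0 * std::asin(r / (2.0 * R));
}

// Difference in momentum-direction phi when the same track is evaluated at
// two different transverse radii r1 and r2 (meters).
inline double phiMismatch(double r1, double r2, double ptGeV, double bTesla) {
  const double R = curvatureRadius(ptGeV, bTesla);
  return momentumPhiShift(r2, R) - momentumPhiShift(r1, R);
}
```

Under these assumptions, for a 1 GeV track in a 3.8 T field evaluated at 5 cm and at 50 cm, `phiMismatch(0.05, 0.5, 1.0, 3.8)` comes out at a few tenths of a radian, so the squared mismatch alone is far above a ΔR^2 cut of 0.001. For a 100 GeV track the same mismatch squared is well below that cut. A fixed ΔR^2 threshold applied to quantities taken at different reference points would therefore behave very differently at low and high pT.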

Contributor Author

VourMa commented Jan 25, 2024

> where is the cross-cleaning code (just for a quick reference)?

struct crossCleanpLS {
  template <typename TAcc>
  ALPAKA_FN_ACC void operator()(TAcc const& acc,
                                struct SDL::modules modulesInGPU,
                                struct SDL::objectRanges rangesInGPU,
                                struct SDL::pixelTriplets pixelTripletsInGPU,
                                struct SDL::trackCandidates trackCandidatesInGPU,
                                struct SDL::segments segmentsInGPU,
                                struct SDL::miniDoublets mdsInGPU,
                                struct SDL::hits hitsInGPU,
                                struct SDL::quintuplets quintupletsInGPU) const {
    using Dim = alpaka::Dim<TAcc>;
    using Idx = alpaka::Idx<TAcc>;
    using Vec = alpaka::Vec<Dim, Idx>;

    Vec const globalThreadIdx = alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
    Vec const gridThreadExtent = alpaka::getWorkDiv<alpaka::Grid, alpaka::Threads>(acc);

    int pixelModuleIndex = *modulesInGPU.nLowerModules;
    unsigned int nPixels = segmentsInGPU.nSegments[pixelModuleIndex];
    for (int pixelArrayIndex = globalThreadIdx[2]; pixelArrayIndex < nPixels;
         pixelArrayIndex += gridThreadExtent[2]) {
      if (!segmentsInGPU.isQuad[pixelArrayIndex] || segmentsInGPU.isDup[pixelArrayIndex])
        continue;

      float eta1 = segmentsInGPU.eta[pixelArrayIndex];
      float phi1 = segmentsInGPU.phi[pixelArrayIndex];
      unsigned int prefix = rangesInGPU.segmentModuleIndices[pixelModuleIndex];

      int nTrackCandidates = *(trackCandidatesInGPU.nTrackCandidates);
      for (int trackCandidateIndex = globalThreadIdx[1]; trackCandidateIndex < nTrackCandidates;
           trackCandidateIndex += gridThreadExtent[1]) {
        short type = trackCandidatesInGPU.trackCandidateType[trackCandidateIndex];
        unsigned int innerTrackletIdx = trackCandidatesInGPU.objectIndices[2 * trackCandidateIndex];
        if (type == 4)  // T5
        {
          unsigned int quintupletIndex = innerTrackletIdx;  // T5 index
          float eta2 = __H2F(quintupletsInGPU.eta[quintupletIndex]);
          float phi2 = __H2F(quintupletsInGPU.phi[quintupletIndex]);
          float dEta = alpaka::math::abs(acc, eta1 - eta2);
          float dPhi = SDL::calculate_dPhi(phi1, phi2);

          float dR2 = dEta * dEta + dPhi * dPhi;
          if (dR2 < 1e-3f)
            segmentsInGPU.isDup[pixelArrayIndex] = true;
        }
        if (type == 5)  // pT3
        {
          int pLSIndex = pixelTripletsInGPU.pixelSegmentIndices[innerTrackletIdx];
          int npMatched = checkPixelHits(prefix + pixelArrayIndex, pLSIndex, mdsInGPU, segmentsInGPU, hitsInGPU);
          if (npMatched > 0)
            segmentsInGPU.isDup[pixelArrayIndex] = true;

          int pT3Index = innerTrackletIdx;
          float eta2 = __H2F(pixelTripletsInGPU.eta_pix[pT3Index]);
          float phi2 = __H2F(pixelTripletsInGPU.phi_pix[pT3Index]);
          float dEta = alpaka::math::abs(acc, eta1 - eta2);
          float dPhi = SDL::calculate_dPhi(phi1, phi2);

          float dR2 = dEta * dEta + dPhi * dPhi;
          if (dR2 < 0.000001f)
            segmentsInGPU.isDup[pixelArrayIndex] = true;
        }
        if (type == 7)  // pT5
        {
          unsigned int pLSIndex = innerTrackletIdx;
          int npMatched = checkPixelHits(prefix + pixelArrayIndex, pLSIndex, mdsInGPU, segmentsInGPU, hitsInGPU);
          if (npMatched > 0) {
            segmentsInGPU.isDup[pixelArrayIndex] = true;
          }

          float eta2 = segmentsInGPU.eta[pLSIndex - prefix];
          float phi2 = segmentsInGPU.phi[pLSIndex - prefix];
          float dEta = alpaka::math::abs(acc, eta1 - eta2);
          float dPhi = SDL::calculate_dPhi(phi1, phi2);

          float dR2 = dEta * dEta + dPhi * dPhi;
          if (dR2 < 0.000001f)
            segmentsInGPU.isDup[pixelArrayIndex] = true;
        }
      }
    }
  }
};

The ΔR^2 I changed is here:
if (dR2 < 1e-3f)

> What are the definitions of eta and phi in each case?

That's a good question, and one I will need to dig around to answer, as the relevant code is probably scattered across multiple files. If anyone knows off-hand, please come to the rescue; otherwise I will look for it later today.

@VourMa VourMa added the good first issue Good for newcomers label Apr 3, 2024
Contributor Author

VourMa commented Apr 3, 2024

Another way to deal with this issue would be to run an additional linking iteration between the T5s and the pLSs that are about to be added to the TC collection, using looser selections than those applied in the original pT5 creation.
