-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A new digitization model for the ETL detector #43762
Conversation
cms-bot internal usage |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43762/38482
|
A new Pull Request was created by @parbol for master. It involves the following packages:
@srimanob, @civanch, @mdhildreth, @mandrenguyen, @subirsarkar, @jfernan2, @cmsbuild can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test workflow 27634.0 |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43762/38483
|
+1 |
+1 |
@cms-sw/upgrade-l2 any objection to the integration of this code? It missed unfortunately 14_0_X, it would be useful to validate it at the first occasion |
+Upgrade |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @antoniovilela, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
It seems like this PR would be causing crashes in some multithreaded Phase2 workflows, see #43942 |
if (!topo.isInPixel(simscaled)) { | ||
LogDebug("ETLDeviceSim") << "Skipping hit ouf of pixel # " << hitidx << " DetId " << etlid.rawId() << " tof " | ||
<< toa; | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the removal of this continue
when topo.isInPixel(simscaled)
could be causing the out-of-bounds crash investigated in #43942
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@makortel thanks for the investigation. ETL row-colum is assigned here, based on the https://github.com/cms-sw/cmssw/blob/master/Geometry/MTDGeometryBuilder/src/RectangularMTDTopology.cc class. On the other hand the statement was modified so as to take into account a possible signal also in the interpixel gaps, i.e. the non-fully-active part of the logical partition of the surface of the detector. BTW the wrong definition of the gap size was responsible for a reduced efficiency, and was recently fixed (.i.e. too may simhits were skipped).
The point is to my mind s whether removing that continue
can produce unphysical row-column values when topo.pixel(simscaled)
is called. Triggered by #43942 (comment) , I added an assert and printed the maximum allowed range in the MTD clusterizer https://github.com/cms-sw/cmssw/blob/master/RecoLocalFastTime/FTLClusterizer/src/MTDThresholdClusterizer.cc#L73 , and this shows
Buffer size [33,17]
i.e. 32,16 should be an allowed value that does not trigger the assert. Anyway since the assertion fires, there is clearly something wrong, by reproducing the crash on PU jobs we should be able to finally catch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an assert and printed the maximum allowed range in the MTD clusterizer https://github.com/cms-sw/cmssw/blob/master/RecoLocalFastTime/FTLClusterizer/src/MTDThresholdClusterizer.cc#L73 , and this shows
Buffer size [33,17]
i.e. 32,16 should be an allowed value that does not trigger the assert.
The printout
LogDebug("MTDThresholdClusterizer") << "Buffer size [" << theNumOfRows + 1 << "," << theNumOfCols + 1 << "]"; |
adds 1 to both the number of rows and columns. The
MTDArrayBuffer::setSize()
, however, uses the number of rows and columns directly for the spacecmssw/RecoLocalFastTime/FTLClusterizer/interface/MTDArrayBuffer.h
Lines 134 to 143 in db9aceb
void MTDArrayBuffer::setSize(uint rows, uint cols) { | |
hitSubDet_vec.resize(rows * cols, GeomDetEnumerators::invalidLoc); | |
hitEnergy_vec.resize(rows * cols, 0); | |
hitTime_vec.resize(rows * cols, 0); | |
hitTimeError_vec.resize(rows * cols, 0); | |
hitGP_vec.resize(rows * cols); | |
hitLE_vec.resize(rows * cols); | |
xpos_vec.resize(rows * cols), nrows = rows; | |
ncols = cols; | |
} |
and otherwise seems to use the usual 0-based indexing
cmssw/RecoLocalFastTime/FTLClusterizer/interface/MTDArrayBuffer.h
Lines 170 to 179 in db9aceb
void MTDArrayBuffer::set(uint row, | |
uint col, | |
GeomDetEnumerators::Location subDet, | |
float energy, | |
float time, | |
float time_error, | |
const LocalError& local_error, | |
const LocalPoint& local_point, | |
float xpos) { | |
hitSubDet_vec[index(row, col)] = subDet; |
uint index(uint row, uint col) const { return col * nrows + row; } |
so I'd be tempted to conclude the buffer has size for 32 x 16 elements, i.e. ranges of
[0, 31] x [0, 15]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@makortel you are correct and I was wrong. My bad, I should have cross checked earlier with the geometry tests I have setup, in particular https://github.com/cms-sw/cmssw/blob/master/Geometry/MTDGeometryBuilder/test/MTDDigiGeometryAnalyzer.cc
The output for scenario D098 is quite clear:
Subdetector 19 MTD Det EModule_Timingactive
Rows 32 Columns 16 ROCS X 2 ROCS Y 1 Rows/ROC 16 Cols/ROC 16 Pitch X 0.13125 Pitch Y 0.1325 Sensor Interpad X 0.005 Sensor Interpad Y 0.005 Sensor Border X 0.005 Sensor Border Y 0.005
Pixel center location:
0 0 (-2.03437,-0.99375,0)
0 1 (-2.03437,-0.86125,0)
0 2 (-2.03437,-0.72875,0)
0 3 (-2.03437,-0.59625,0)
0 4 (-2.03437,-0.46375,0)
0 5 (-2.03437,-0.33125,0)
0 6 (-2.03437,-0.19875,0)
0 7 (-2.03437,-0.06625,0)
0 8 (-2.03437,0.06625,0)
0 9 (-2.03437,0.19875,0)
0 10 (-2.03437,0.33125,0)
0 11 (-2.03437,0.46375,0)
0 12 (-2.03437,0.59625,0)
0 13 (-2.03437,0.72875,0)
0 14 (-2.03437,0.86125,0)
0 15 (-2.03437,0.99375,0)
1 0 (-1.90312,-0.99375,0)
1 1 (-1.90312,-0.86125,0)
1 2 (-1.90312,-0.72875,0)
1 3 (-1.90312,-0.59625,0)
1 4 (-1.90312,-0.46375,0)
1 5 (-1.90312,-0.33125,0)
1 6 (-1.90312,-0.19875,0)
1 7 (-1.90312,-0.06625,0)
1 8 (-1.90312,0.06625,0)
1 9 (-1.90312,0.19875,0)
1 10 (-1.90312,0.33125,0)
1 11 (-1.90312,0.46375,0)
1 12 (-1.90312,0.59625,0)
1 13 (-1.90312,0.72875,0)
1 14 (-1.90312,0.86125,0)
1 15 (-1.90312,0.99375,0)
2 0 (-1.77187,-0.99375,0)
2 1 (-1.77187,-0.86125,0)
(...)
30 14 (1.90312,0.86125,0)
30 15 (1.90312,0.99375,0)
31 0 (2.03437,-0.99375,0)
31 1 (2.03437,-0.86125,0)
31 2 (2.03437,-0.72875,0)
31 3 (2.03437,-0.59625,0)
31 4 (2.03437,-0.46375,0)
31 5 (2.03437,-0.33125,0)
31 6 (2.03437,-0.19875,0)
31 7 (2.03437,-0.06625,0)
31 8 (2.03437,0.06625,0)
31 9 (2.03437,0.19875,0)
31 10 (2.03437,0.33125,0)
31 11 (2.03437,0.46375,0)
31 12 (2.03437,0.59625,0)
31 13 (2.03437,0.72875,0)
31 14 (2.03437,0.86125,0)
31 15 (2.03437,0.99375,0)
as I was initially suggesting in #43942 (comment) . The clusterizer buffer is larger than the allowed real range, this will have to be possibly reviewed, and in any case adding an assert check is useful to catch immediately possible issues. Even in the RectangularMTDTopology
in first place.
I'll check how to preserve the treatment of the inter-pixel active area gap that @parbol has added without spoiling the index of the pixel itself.
PR description:
This PR implements a new digitization model for the MTD ETL detector. This new model has been presented several times in the MTD DPG meetings. You can find a summary at [1].
This PR introduces potential changes in the digi efficiency (since thresholds have been slightly changed), and in the time of arrival of the charged particles (since the time resolution model has been updated).
The Digi structure (ETLSample) has been also updated to accomodate for one additional parameter: the time over threshold. The ETLUncalibratedRecHit has also been modified maintaining backwards compatibility. If the time over threshold is zero, meaning that it's not present in the DIGI, then the old local reconstruction is run. On the contrary, if the time over threshold is not zero, a new local reconstruction is run, including the time-walk calibration.
[1] https://indico.cern.ch/event/1254512/contributions/5270127/attachments/2595708/4480662/DPG-MTD-Feb17.pdf
PR validation:
1.- A full ttbar workshop has been run up to the DQM stage and the output of the MTD DQM has been inspected.
2.- The same has been done starting from old digis to make sure the code is backwards compatible.
3.- The tests suggested in PRWorkflow.html have been run.