Add SiStripApproximateClusterCollection as a simple format for RAW (re-vamped) #42495

Merged: 7 commits into cms-sw:master from mm_ApproxCluster_dataformat on Aug 15, 2023

Conversation

mmusich
Contributor

@mmusich mmusich commented Aug 7, 2023

Originally in #42022 from @jeongeun.

PR description:

Based on the issue "Dataformat compatibility issue for HI SiStrip cluster in RAW" (#39106)

Aim: change the data format in RAW (currently edmNew::DetSetVector) to one simple enough for indefinite backwards compatibility, i.e. it has to be readable by all future CMSSW releases.
The corresponding final data types are re-defined directly in the ApproximatedClusters.
The new format needs to be straightforward to convert from edmNew::DetSetVector.
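
To illustrate what "simple enough" can mean here, below is a minimal Python sketch of a flat layout: one list of detector ids, one list of start offsets, and one list of cluster payloads. It is an illustration only and not the C++ class added in this PR; the names `FlatClusterCollection`, `begin_det_set`, `push_back`, `det_sets` and `from_nested`, the integer payloads, and the detector ids are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class FlatClusterCollection:
    """Hypothetical flat stand-in for a per-detector cluster container."""

    det_ids: List[int] = field(default_factory=list)
    # begin_indices[i] is the offset in `clusters` where detector i starts.
    begin_indices: List[int] = field(default_factory=list)
    # Cluster payloads; plain integers here, purely for illustration.
    clusters: List[int] = field(default_factory=list)

    def begin_det_set(self, det_id: int) -> None:
        """Open a new detector block starting at the current end of `clusters`."""
        self.det_ids.append(det_id)
        self.begin_indices.append(len(self.clusters))

    def push_back(self, cluster: int) -> None:
        """Append a cluster to the most recently opened detector block."""
        self.clusters.append(cluster)

    def det_sets(self) -> Iterator[Tuple[int, List[int]]]:
        """Yield (det_id, clusters) pairs reconstructed from the flat lists."""
        for i, det_id in enumerate(self.det_ids):
            begin = self.begin_indices[i]
            is_last = i + 1 == len(self.det_ids)
            end = len(self.clusters) if is_last else self.begin_indices[i + 1]
            yield det_id, self.clusters[begin:end]


def from_nested(nested: Dict[int, List[int]]) -> FlatClusterCollection:
    """Convert a nested {det_id: [clusters]} mapping into the flat layout."""
    flat = FlatClusterCollection()
    for det_id, det_clusters in nested.items():
        flat.begin_det_set(det_id)
        for cluster in det_clusters:
            flat.push_back(cluster)
    return flat


# Made-up detector ids and payloads; the second detector has no clusters.
nested = {1001: [10, 11], 1002: [], 1003: [42]}
flat = from_nested(nested)
round_trip = dict(flat.det_sets())
```

The point of such a layout is that it contains nothing but vectors of simple values, so reading it back does not depend on the internals of a more complex container.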

The simplified data format has been updated (as recommended by Matti in Sep 2022)
(master...makortel:cmssw:siStripApproximateClusterCollection_v2)

Target : 13_2_X release for the 2023 HeavyIon data-taking.

PR validation:

Tested in CMSSW_13_3_X_2023-08-02-1100; the basic test from the CMSSW PR instructions passes:

  • passes runTheMatrix.py -l 140.58 -t 4 -j 8 --ibeos

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Not a backport, but needs to be backported to CMSSW_13_2_X.

@cmsbuild
Contributor

cmsbuild commented Aug 7, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42495/36499

@cmsbuild
Contributor

cmsbuild commented Aug 7, 2023

A new Pull Request was created by @mmusich (Marco Musich) for master.

It involves the following packages:

  • DQM/SiStripMonitorApproximateCluster (dqm)
  • DataFormats/SiStripCluster (reconstruction)
  • RecoLocalTracker/SiStripClusterizer (reconstruction)

@micsucmed, @nothingface0, @emanueleusai, @clacaputo, @cmsbuild, @pmandrik, @syuvivida, @tjavaid, @mandrenguyen, @rvenditti can you please review it and eventually sign? Thanks.
@echabert, @felicepantaleo, @VourMa, @gbenelli, @missirol, @GiacomoSguazzoni, @yduhm, @robervalwalsh, @JanFSchulte, @rovere, @VinInn, @alesaggio, @gpetruc, @fioriNTU, @jandrea, @mtosi, @idebruyn, @mmusich, @threus, @jlidrych this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor Author

mmusich commented Aug 7, 2023

test parameters:

  • workflow = 140.58

@mmusich
Contributor Author

mmusich commented Aug 7, 2023

@cmsbuild, please test

@mmusich
Contributor Author

mmusich commented Aug 7, 2023

As expected, step2 of 140.6 fails because the RAW' data format in its 2022 layout differs from the one proposed in this PR. I am wondering what the plans are for that (i.e. discard the RAW' data from 2022 entirely, or try to support it)?

@mandrenguyen
Contributor

@mmusich As far as I'm concerned the 2022 test data was only for testing. We have all of that data in the regular RAW format. So IMO we do not need to support it.

@cmsbuild
Contributor

cmsbuild commented Aug 7, 2023

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-186115/34140/summary.html
COMMIT: cef181a
CMSSW: CMSSW_13_3_X_2023-08-07-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/42495/34140/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 140.6: 140.6_RunHI2022/step2_RunHI2022.log

Comparison Summary

Summary:

  • You potentially added 175 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 48 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3212500
  • DQMHistoTests: Total failures: 278
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3212200
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 211 log files, 161 edm output root files, 49 DQM output files
  • TriggerResults: no differences found
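
As a quick sanity check on the DQMHistoTests counts above, the short Python sketch below verifies that the reported categories add up to the total and computes the failure fraction. It assumes that failures, nulls, successes, skipped and missing objects partition the set of compared histograms; the bot output does not state this explicitly, and the dictionary keys are just illustrative labels.

```python
# Counts copied from the DQMHistoTests lines of the comparison summary.
summary = {
    "compared": 3212500,
    "failures": 278,
    "nulls": 0,
    "successes": 3212200,
    "skipped": 22,
    "missing": 0,
}

# Assumption: these five categories together cover every compared histogram.
categories = ("failures", "nulls", "successes", "skipped", "missing")
accounted = sum(summary[name] for name in categories)

# Fraction of compared histograms that were flagged as failures.
failure_fraction = summary["failures"] / summary["compared"]

print(f"accounted for {accounted} of {summary['compared']} histograms")
print(f"failure fraction: {failure_fraction:.2e}")
```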

@mmusich
Contributor Author

mmusich commented Aug 8, 2023

@mandrenguyen

As far as I'm concerned the 2022 test data was only for testing. We have all of that data in the regular RAW format. So IMO we do not need to support it.

shall I just remove 140.6 from the matrix?

@cmsbuild
Contributor

cmsbuild commented Aug 8, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42495/36512

@srimanob
Contributor

+Upgrade

From the upgrade side, only removing workflow and step.

@emanueleusai
Member

+1

@sunilUIET
Contributor

+pdmv

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Contributor

+1

@cmsbuild cmsbuild merged commit 9876a9d into cms-sw:master Aug 15, 2023
@mmusich mmusich deleted the mm_ApproxCluster_dataformat branch August 15, 2023 15:06
@mandrenguyen
Contributor

@mandrenguyen

As far as I'm concerned the 2022 test data was only for testing. We have all of that data in the regular RAW format. So IMO we do not need to support it.

shall I just remove 140.6 from the matrix?

@mmusich Coming back to this issue, I've managed to confuse myself about this incompatibility.
Indeed, as I said, I don't think we'll care about reading rawprime from 2022 in new releases.
However 140.6 is actually reading RAW and running REPACK:DigiToApproxClusterRaw to turn it into RAW'.
Since RAW is unchanged, shouldn't we still be able to do this in 13_2_X?
I think we need some kind of offline workflow that will mimic what the HLT will be doing in order to make sure we're ready for the new data.
We do seem to have that in place for MC (160.02), but not for real data.

@mandrenguyen
Contributor

I just realized that wf 140.6 did REPACK:DigiToApproxClusterRaw in CMSSW_12_5_X, but not in newer releases.
In any case, we have a 140.58 which runs REPACK:DigiToApproxClusterRaw on 2018 PbPb data.
We should have a Run 3 wf that does the same.
However, if I try to run the Run 3 equivalent of 140.58:
cmsDriver.py repack --scenario pp --conditions auto:run3_data_prompt -s REPACK:DigiToApproxClusterRaw --datatier GEN-SIM-DIGI-RAW-HLTDEBUG --eventcontent REPACKRAW --era Run3_pp_on_PbPb_approxSiStripClusters -n 10 --procModifiers approxSiStripClusters --repacked --process REHLT --filein /store/hidata/HIRun2022A/HITestRaw6/RAW/v1/000/362/321/00000/76c3e7a8-896e-4671-9563-d6c596da5252.root --customise_commands "process.rawPrimeDataRepacker.src='rawDataRepacker'"

I'm getting the following errors:
`
%MSG-w SiStripRawToDigi: SiStripRawToDigiModule:siStripDigisHLT 18-Aug-2023 01:55:35 CEST Run: 362321 Event: 79195409
NULL pointer to FEDRawData for FED: id 135
Note: further warnings of this type will be suppressed (this can be changed by enabling debugging printout)
%MSG
----- Begin Fatal Exception 18-Aug-2023 01:55:35 CEST-----------------------
An exception of category 'NoRecord' occurred while
[0] Processing Event run: 362321 lumi: 165 event: 79195409 stream: 0
[1] Running path 'REPACKRAWoutput_step'
[2] Prefetching for module PoolOutputModule/'REPACKRAWoutput'
[3] Prefetching for module SiStripClusters2ApproxClusters/'hltSiStripClusters2ApproxClusters'
[4] Calling method for module BeamSpotOnlineProducer/'hltBeamSpotProducer'
Exception Message:
No "BeamSpotTransientObjectsRcd" record found in the EventSetup.

Please add an ESSource or ESProducer that delivers such a record.
`
Any idea how to fix this?
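
The cmsDriver.py invocation above is easier to audit with one option per line. The following Python sketch only assembles the same command string from a list of arguments taken verbatim from the comment; it does not run cmsDriver.py, and the variable names `args` and `command` are illustrative.

```python
import shlex

# Arguments copied from the command in the comment above, one option per line.
args = [
    "cmsDriver.py", "repack",
    "--scenario", "pp",
    "--conditions", "auto:run3_data_prompt",
    "-s", "REPACK:DigiToApproxClusterRaw",
    "--datatier", "GEN-SIM-DIGI-RAW-HLTDEBUG",
    "--eventcontent", "REPACKRAW",
    "--era", "Run3_pp_on_PbPb_approxSiStripClusters",
    "-n", "10",
    "--procModifiers", "approxSiStripClusters",
    "--repacked",
    "--process", "REHLT",
    "--filein",
    "/store/hidata/HIRun2022A/HITestRaw6/RAW/v1/000/362/321/00000/"
    "76c3e7a8-896e-4671-9563-d6c596da5252.root",
    "--customise_commands",
    "process.rawPrimeDataRepacker.src='rawDataRepacker'",
]

# shlex.join quotes each argument so the result is safe to paste into a shell.
command = shlex.join(args)
print(command)
```

Keeping the arguments in a list avoids quoting mistakes in the `--customise_commands` value, which itself contains single quotes.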

@mmusich
Contributor Author

mmusich commented Aug 18, 2023

@mandrenguyen

Any idea how to fix this?

yes. The following patch

diff --git a/Configuration/StandardSequences/python/DigiToRaw_Repack_cff.py b/Configuration/StandardSequences/python/DigiToRaw_Repack_cff.py
index a712ecaf6fc..6adc8c873da 100644
--- a/Configuration/StandardSequences/python/DigiToRaw_Repack_cff.py
+++ b/Configuration/StandardSequences/python/DigiToRaw_Repack_cff.py
@@ -84,5 +84,10 @@ hltScalersRawToDigi =  cms.EDProducer( "ScalersRawToDigi",
    scalersInputTag = cms.InputTag( "rawDataRepacker" )
 )
 
+import RecoVertex.BeamSpotProducer.onlineBeamSpotESProducer_cfi as _mod
+BeamSpotESProducer = _mod.onlineBeamSpotESProducer.clone(
+    timeThreshold = 999999 # for express allow >48h old payloads for replays. DO NOT CHANGE
+)
+
 DigiToApproxClusterRawTask = cms.Task(hltSiStripRawToDigi,siStripZeroSuppressionHLT,hltScalersRawToDigi,hltBeamSpotProducer,siStripClustersHLT,hltSiStripClusters2ApproxClusters,rawPrimeDataRepacker)
 DigiToApproxClusterRaw = cms.Sequence(DigiToApproxClusterRawTask)

allows:

cmsDriver.py repack --scenario pp --conditions auto:run3_data_prompt -s REPACK:DigiToApproxClusterRaw --datatier GEN-SIM-DIGI-RAW-HLTDEBUG --eventcontent REPACKRAW --era Run3_pp_on_PbPb_approxSiStripClusters -n 10 --procModifiers approxSiStripClusters --repacked --process REHLT --filein /store/hidata/HIRun2022A/HITestRaw6/RAW/v1/000/362/321/00000/76c3e7a8-896e-4671-9563-d6c596da5252.root --customise_commands "process.rawPrimeDataRepacker.src='rawDataRepacker'"

to run to completion.
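
For local testing without patching DigiToRaw_Repack_cff.py, the same ESProducer could in principle be attached through a customisation function. The sketch below is untested and only mirrors the clone and parameter from the patch above; the function name `customise_beamspot_es` is hypothetical, and it assumes `process` is the `cms.Process` built by cmsDriver.py.

```python
def customise_beamspot_es(process):
    """Attach an online beam spot ESProducer to an existing cms.Process.

    Hypothetical helper: it repeats the clone from the patch above instead
    of editing the standard sequence file.
    """
    # Imported lazily so this file can be parsed outside a CMSSW environment.
    import RecoVertex.BeamSpotProducer.onlineBeamSpotESProducer_cfi as _mod

    # Same module and parameter value as in the patch.
    process.BeamSpotESProducer = _mod.onlineBeamSpotESProducer.clone(
        timeThreshold=999999
    )
    return process
```

Such a function would typically be passed to cmsDriver.py through its `--customise` option, with the package path depending on where the file is placed.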

@mmusich
Contributor Author

mmusich commented Aug 18, 2023

We should have a Run 3 wf that does the same.

hopefully #42600 should address this.

@fwyzard
Contributor

fwyzard commented Aug 24, 2023

The EDCollection, on the other hand, looks like something that could be easily replaced with just std::vector. Therefore it may potentially be removed in the future, and I'm hesitant to extend the backwards-compatibility guarantee for that. An std::vector<DetId> would be fine (we just need to document its inclusion to "RAW umbrella" and add the necessary reading ability tests).

Why does the EDCollection template even exist ?
Issues with ROOT supporting std::vector as a top level data format (17 years ago) ?

@mmusich
Contributor Author

mmusich commented Aug 25, 2023

Why does the EDCollection template even exist ?

I am not sure, but (also discussing with @missirol) I think we could just get rid of it entirely, drop-in replace it with an std::vector<DetId> in the Strip unpacker, and use that in all clients instead of creating conversions back and forth to persist the data in the RAW' samples.

@missirol
Contributor

drop-in replace it with an std::vector<DetId> in the Strip unpacker, and use that in all clients

This is attempted in #42662 (thanks to pointers I got from @mmusich).
