Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Alpaka implementation of PFRecHitProducer #42807

Merged
merged 6 commits into from
Oct 27, 2023

Conversation

fllor
Copy link
Contributor

@fllor fllor commented Sep 18, 2023

PR description:

This pull-request is the first step towards making the ParticleFlow algorithm portable and heterogeneous using Alpaka. It adds a PFRecHitSoA data format and a producer that constructs it. The module can process HCAL and ECAL hits and runs on the CPU and CUDA backends of Alpaka. The ROCm backend compiles, but is untested.
Since calorimeter reconstruction is not available using Alpaka yet, this PR also introduces a CaloRecHitSoA format and a producer that converts calorimeter hits from the CPU modules to the SoA. This format and producer will become obsolete, when then HCAL/ECAL groups directly provide their own SoA formats.
A producer that converts to new PFRecHitSoA format back to the existing PFRecHit collection format is included. It can be used for validation and to run down-stream modules.
This PR does not alter any existing workflow in any way. The Alpaka module is provided but not executed for HLT reconstruction. For now, it can be used for testing and for down-stream development. The transition from the CPU implementation to Alpaka is intended to happen later, when an Alpaka implementation of the PFClusterProducer is also available.

PR validation:

Extensive validation of the Alpaka implementation is provided, see commit message of 3f4d123.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

This is not a backport.

@hatakeyamak @jsamudio

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42807/36921

  • This PR adds an extra 204KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @fllor for master.

It involves the following packages:

  • DataFormats/ParticleFlowReco (reconstruction)
  • RecoParticleFlow/PFRecHitProducer (****)

The following packages do not have a category, yet:

RecoParticleFlow/PFRecHitProducer
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@cmsbuild, @jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@mmarionncern, @rchatter, @wang0jin, @rovere, @argiro, @missirol, @thomreis, @hatakeyamak, @lgray, @seemasharmafnal this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

fllor added a commit to fllor/cms-bot that referenced this pull request Sep 18, 2023
@fllor
Copy link
Contributor Author

fllor commented Sep 18, 2023

The following packages do not have a category, yet:
RecoParticleFlow/PFRecHitProducer Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

PR created: cms-sw/cms-bot/pull/2070

@makortel
Copy link
Contributor

assign heterogeneous

@cmsbuild
Copy link
Contributor

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

@hatakeyamak
Copy link
Contributor

It may be worthwhile to mention that @filor will talk about the work that went into this PR tomorrow in the particle flow group meeting.
https://indico.cern.ch/event/1326067/
@swagata87 [I don't know Riccardo's github username.]

DataFormats/ParticleFlowReco/BuildFile.xml Show resolved Hide resolved
DataFormats/ParticleFlowReco/BuildFile.xml Outdated Show resolved Hide resolved
DataFormats/ParticleFlowReco/src/alpaka/classes_serial.h Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a new package is warranted. IIUC the current PFRecHitProducer lives in RecoParticleFlow/PFClusterProducer. On the other hand, the set of dependencies is quite different between the packages.

@swagata87
Copy link
Contributor

FYI @cms-sw/pf-l2 @bellan

@swagata87
Copy link
Contributor

type pf

fllor added a commit to fllor/cmssw that referenced this pull request Oct 1, 2023
1) Merge files src/alpaka/classes_serial{.h,_def.xml} into src/classes{.h,_def.xml} and update BuildFile.xml
2) Split class definitions into two files, undoing commit 42fd4a7. This is possible due to PR cms-sw/cmsdist/pull/8705. Switching to this version required a small change to the Python configuration.
These two changes were suggested for PR cms-sw#42807.
@hatakeyamak
Copy link
Contributor

hatakeyamak commented Oct 16, 2023

@fllor Just checking in where we stand.
Do you think you addressed all suggestions from @makortel et al? If so, it will be good to explicitly clarify that.

@hatakeyamak
Copy link
Contributor

BTW, we should also adopt GT based HCAL threshold in the alpaka version too. Maybe in a follow-up PR.
@fllor @jsamudio

@fwyzard
Copy link
Contributor

fwyzard commented Oct 17, 2023

@fllor I will send you some commits to address some technical and stylish issues:

  • fix the dictionaries under DataFormats/ParticleFlowReco;
  • when the host types are in the reco namespace, move the device types to the ALPAKA_ACCELERATOR_NAMESPACE::reco namespace;
  • make the host types available in the ALPAKA_ACCELERATOR_NAMESPACE::reco namespace;
  • clean up the header guards;
  • update the names of structs and constants to follow the CMS coding rules (types are in Uppercase, variables and constants are in lowercase).

@fwyzard
Copy link
Contributor

fwyzard commented Oct 17, 2023

Please pick the three four commits from https://github.com/fwyzard/cmssw/tree/pfRecHitProducerAlpaka .

fllor added a commit to fllor/cmssw that referenced this pull request Oct 17, 2023
…package

1) Allocate buffers used for PFRecHitProducerKernel per-event. This allows making the producer global.
2) Change ::Grids synchronisation to ::Blocks synchronisation for GPU backend. This removes the need to distinguish between CPU and GPU backends in the kernel code.
3) Simplify code; add comments; more meaningful names; adapt CMSSW code style
These two changes were suggested for PR cms-sw#42807.
@cmsbuild
Copy link
Contributor

Pull request #42807 was updated. @mandrenguyen, @fwyzard, @jfernan2, @makortel, @cmsbuild can you please check and sign again.

@fllor
Copy link
Contributor Author

fllor commented Oct 25, 2023

I combined all the commits from the review into the original ones. I tried to put everything into a logical sequence, with different parts of the module being introduced in separate commits. Hopefully this can help to understand the code, in case someone looks at it in the future.
I think I have now addressed every suggestion from the review. Please let me know if I missed anything.

@fwyzard
Copy link
Contributor

fwyzard commented Oct 25, 2023

+heterogeneous

@fwyzard
Copy link
Contributor

fwyzard commented Oct 25, 2023

please test

@hatakeyamak
Copy link
Contributor

Thank you. So, in terms of signatures, we still need it from reconstruction and pf?

@fwyzard
Copy link
Contributor

fwyzard commented Oct 25, 2023

Thank you. So, in terms of signatures, we still need it from reconstruction and pf?

Technically, only from @cms-sw/reconstruction-l2, while @cms-sw/pf-l2 is only "for information" and does not require a signature.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cb4096/35413/summary.html
COMMIT: 3f4d123
CMSSW: CMSSW_13_3_X_2023-10-24-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42807/35413/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 69 lines to the logs
  • Reco comparison results: 29 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3351458
  • DQMHistoTests: Total failures: 1073
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3350363
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 166 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

jsamudio pushed a commit to jsamudio/cmssw that referenced this pull request Oct 25, 2023
1) Merge files src/alpaka/classes_serial{.h,_def.xml} into src/classes{.h,_def.xml} and update BuildFile.xml
2) Split class definitions into two files, undoing commit 42fd4a7. This is possible due to PR cms-sw/cmsdist/pull/8705. Switching to this version required a small change to the Python configuration.
These two changes were suggested for PR cms-sw#42807.
jsamudio pushed a commit to jsamudio/cmssw that referenced this pull request Oct 25, 2023
…package

1) Allocate buffers used for PFRecHitProducerKernel per-event. This allows making the producer global.
2) Change ::Grids synchronisation to ::Blocks synchronisation for GPU backend. This removes the need to distinguish between CPU and GPU backends in the kernel code.
3) Simplify code; add comments; more meaningful names; adapt CMSSW code style
These two changes were suggested for PR cms-sw#42807.
@mandrenguyen
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @rappoccio, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

jsamudio added a commit to jsamudio/cmssw that referenced this pull request Oct 26, 2023
@antoniovilela
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants