Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support manual addition/exclusion of volumes in HitManager #772

Merged
merged 14 commits into from
Jun 15, 2023

Conversation

sethrj
Copy link
Member

@sethrj sethrj commented May 23, 2023

This is an attempt to address the multithreading problem where CMSSW defines SDs only on workers but not on the master thread. This prevents the export of the SD tag by the GDML exporter, and it also prevents our HitManager from automatically detecting the LVs that correspond to sensitive volumes.

This MR attempts to replicate the behavior with the skipMasterSD option. However, defining SDs on workers but not master is unfortunately forbidden by the regular Geant4 run manager so we can't reproduce it in-house: the SDs defined on the worker thread are subsequently deleted.

Even though we can't reproduce the exact CMSSW behavior, this MR should provide an alternative path: there is a new SD setup option that lets you define a set of LV pointers which will be later queried for SDs on the worker threads.

Additionally I've added a "skip" option so that we can selectively turn off hit mapping on a per-detector basis, so that we could e.g. have a custom HGCal GPU-based detector but send any other hits back to the CPU.

@drbenmorgan this is the issue I was discussing in today's meeting.

@sethrj sethrj added enhancement New feature or request external Dependencies and framework-oriented features labels May 23, 2023
@sethrj sethrj requested a review from whokion May 23, 2023 16:25
@whokion
Copy link
Contributor

whokion commented May 24, 2023

Can you clear compilation errors from HitProcessor.test.cc?

@sethrj
Copy link
Member Author

sethrj commented May 24, 2023

@whokion I pushed a fix just minutes before you asked ;)

@whokion
Copy link
Contributor

whokion commented May 24, 2023

@sethrj I failed to test this PR with CMSSW/cmsRun as there is a crash from VecGeom (unrelated to this MR, see
worker-only-sensdet-cmssw.log with skipMasterSD=false/true, expecting a crash from HitManager with skipMasterSD=false). Testing the standalone app/demo-geant-integration with both this branch and develop, I see a same crash (but somewhat different one from CMSSW), from GeantGeoConverter::convert (see
run3-cms-test.log). So, I guess that we need to address the GeantGeoConverter error first in the develop branch (which I thought that all issues were fixed by @mrguilima). We may merge this now without testing the new feature with CMSSW, but test it later once the Convert related issue is resolved or postpone to merge until then - please advise! @mrguilima, can you confirm that you are able to run demo-geant-integration with cms-run3.gdml or reproduce the crash (you mentioned that some of "!" in volume names were fixed in the gdml. If it is a real issue and related to the crash, can you update the gdml file in the repository?)?

@sethrj
Copy link
Member Author

sethrj commented Jun 14, 2023

WTF CI:

/var/jenkins/workspace/Celeritas_PR-772/src/celeritas/user/DetectorSteps.cu -o src/celeritas/CMakeFiles/celeritas.dir/user/DetectorSteps.cu.o
Warning: Function too large, generated debug information may not be accurate.
Error: warning treated as error

@whokion
Copy link
Contributor

whokion commented Jun 14, 2023

@sethrj Is this branch rebased with the remote develop? I want to test this PR with the sd option on as CMSSW+celeritas with the run3 geometry is now working again with all recent updates in VecGeom and celeritas.

Defining SDs on workers but not master is unfortunately forbidden by the
regular Geant4 run manager so we can't reproduce it in-house: the SDs
defined on the worker thread are subsequently deleted.
@sethrj
Copy link
Member Author

sethrj commented Jun 14, 2023

@whokion Yes, it's got all the features that develop has. It would be great if you could test it! I'm writing a unit test now but that can happen asynchronously with your testing. Thanks!

@whokion
Copy link
Contributor

whokion commented Jun 14, 2023

accel/detail/HitManager.cc:144 should follow a similar recipe like if-else blocks in SharedParams.cc as CMS does not use G4RunManager.

@sethrj
Copy link
Member Author

sethrj commented Jun 14, 2023

Ooh, good point. I'll fix that right away. Thanks!

Copy link
Contributor

@whokion whokion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this for v0.3. Thanks for all good works!

test/accel/SDTestBase.hh Outdated Show resolved Hide resolved
Co-authored-by: Soon Yung Jun <syjun@fnal.gov>
@sethrj sethrj merged commit e6d67ac into celeritas-project:develop Jun 15, 2023
@sethrj sethrj deleted the worker-only-sensdet branch June 15, 2023 02:26
@@ -35,12 +35,26 @@ class HitProcessor;
* exclusions for SDs that are implemented natively on GPU)
* - Maps those volumes to VecGeom geometry
* - Creates a HitProcessor for each Geant4 thread
*
* \warning Because of low-level problems with Geant4 allocators, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐿️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll fix it in the v0.3 branch.

@pcanal
Copy link
Contributor

pcanal commented Jun 15, 2023

/var/jenkins/workspace/Celeritas_PR-772/src/celeritas/user/DetectorSteps.cu -o src/celeritas/CMakeFiles/celeritas.dir/user/DetectorSteps.cu.o
Warning: Function too large, generated debug information may not be accurate.
Error: warning treated as error

Is the function really large or is it indirectly the case due to in-lining?

@sethrj
Copy link
Member Author

sethrj commented Jun 15, 2023

@pcanal I'm guessing it's inlining with debug code that's doing it? The function itself is super short and I was surprised that it's the only one that triggered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request external Dependencies and framework-oriented features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants