In Concurrent*Filters, upgrade to allow a stream to skip a lumi #43816
Conversation
cms-bot internal usage

+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43816/38628
A new Pull Request was created by @wddgit (W. David Dagenhart) for master. It involves the following packages:
@bbilin, @menglu21, @GurpreetSinghChahal, @SiewYan, @cmsbuild, @alberto-sanchez, @mkirsano can you please review it and eventually sign? Thanks. cms-bot commands are listed here

enable threading

please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-644838/37118/summary.html Comparison Summary
@makortel @Dr15Jones Since this is more a technical concurrency change than a real generator issue, it might be worth one or both of you reviewing this one in addition to the generator expert. It involves only about 20 lines of code...
@@ -158,12 +158,13 @@ namespace edm {
    void initLumi(gen::GenStreamCache<HAD, DEC>* cache, LuminosityBlock const& index, EventSetup const& es) const;
    ParameterSet config_;
    mutable std::atomic<gen::GenStreamCache<HAD, DEC>*> useInLumi_{nullptr};
    mutable std::atomic<unsigned long long> greatestNStreamEndLumis_{0};
    mutable std::atomic<unsigned long long> nextNGlobalBeginLumis_{1};
    mutable std::atomic<bool> streamEndRunComplete_{true};
    // The next two data members are thread safe and can be safely mutable because
Should this become

- // The next two data members are thread safe and can be safely mutable because
+ // The next three data members are thread safe and can be safely mutable because

?
Yes. Fixed the comment in both files. Thanks. I will push the changes momentarily.
if (greatestNStreamEndLumis_.compare_exchange_strong(expected, streamCachePtr->nStreamEndLumis_)) {
unsigned long long expected = this->luminosityBlockCache(lumi.index())->nGlobalBeginLumis_;
unsigned long long nextValue = expected + 1;
if (nextNGlobalBeginLumis_.compare_exchange_strong(expected, nextValue)) {
  streamEndRunComplete_ = false;
  useInLumi_ = streamCachePtr;
}
Just to make sure I understood, is the idea that the first streamEndLuminosityBlock (of a Lumi) to arrive here would change the streamEndRunComplete_ and useInLumi_?
Correct
I added a comment at this point in both files.
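To spell out the pattern under discussion, here is a minimal standalone sketch in plain C++ (hypothetical names and signature, simplified from the snippet above; not the actual CMSSW code):

#include <atomic>

struct StreamCache {};  // stands in for gen::GenStreamCache<HAD, DEC>

std::atomic<unsigned long long> nextNGlobalBeginLumis{1};
std::atomic<StreamCache*> useInLumi{nullptr};
std::atomic<bool> streamEndRunComplete{true};

// Called from each stream's end-lumi transition. Every stream finishing a
// given lumi computes the same 'expected' value, so the compare-exchange
// succeeds exactly once per lumi: the first caller publishes its cache for
// the next global begin-lumi transition, and all later callers see the
// compare-exchange fail and do nothing.
void onStreamEndLuminosityBlock(StreamCache* streamCachePtr,
                                unsigned long long lumisNGlobalBeginLumis) {
  unsigned long long expected = lumisNGlobalBeginLumis;
  unsigned long long nextValue = expected + 1;
  if (nextNGlobalBeginLumis.compare_exchange_strong(expected, nextValue)) {
    streamEndRunComplete = false;
    useInLumi = streamCachePtr;
  }
}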
@@ -158,12 +158,13 @@ namespace edm {
    void initLumi(gen::GenStreamCache<HAD, DEC>* cache, LuminosityBlock const& index, EventSetup const& es) const;
    ParameterSet config_;
    mutable std::atomic<gen::GenStreamCache<HAD, DEC>*> useInLumi_{nullptr};
    mutable std::atomic<unsigned long long> greatestNStreamEndLumis_{0};
    mutable std::atomic<unsigned long long> nextNGlobalBeginLumis_{1};
    mutable std::atomic<bool> streamEndRunComplete_{true};
    // The next two data members are thread safe and can be safely mutable because
    // they are only modified/read in globalBeginRun and globalBeginLuminosityBlock.
I'm beginning to wonder if this is actually a safe assumption going forward. Previously, because every stream had to see every luminosity block, we could never have concurrent global LuminosityBlock and/or Run transitions: the global transition had to fully finish, and the stream transitions had to start, before we would ask the Source what the next transition is.
But now that we allow skipping a stream begin LuminosityBlock transition, it might be possible to change the scheduler to start the global begin LuminosityBlock transition, check the source, see there are no Events in that LuminosityBlock, and then start another global begin LuminosityBlock transition concurrently.
I came to this conclusion after re-reading our global module documentation
https://twiki.cern.ch/twiki/bin/view/CMSPublic/FWMultithreadedFrameworkGlobalModuleInterface#edm_LuminosityBlockCacheT
where no guarantees are made that concurrent calls to global* transitions will not happen (except that globalBegin always finishes before streamBegin happens, and the reverse for the End transitions, as well as some guarantees related to the Summary calls).
My first thought on this comment is that we should go ahead with this PR as is. This PR is relatively small and accomplishes the immediate goal with minimal changes.
If we decide to add the ability to handle concurrent global begin transitions, then we should handle that as a separate effort. Supporting that additional concurrency in this module would require significantly more changes than the 20 or so lines changed here. For example, the useInLumi_ pointer would not work as is, because multiple global lumi transitions could be using it concurrently. So it would be significantly more development work to allow this module to support concurrent global begin lumi transitions.
And I think that assumption is likely built into other things as well. I know that while implementing run concurrency in the EventProcessor I was relying on it. Just figuring out what we would need to modify would take some non-trivial effort.
I also think we would need to consider whether that additional concurrency gives us a performance improvement significant enough to justify the effort to develop it.
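To make the useInLumi_ point concrete, a small sketch (assumed shape with a hypothetical claimCacheForLumi helper; not the actual module code) of why a single published pointer can serve only one global begin-lumi transition at a time:

#include <atomic>

struct GenStreamCache {};  // generator state owned by one stream

std::atomic<GenStreamCache*> useInLumi{nullptr};

// A global begin-lumi transition claims the published stream cache by
// swapping the slot back to nullptr. If two such transitions ran
// concurrently, only one would receive the pointer; the other would get
// nullptr and have no stream cache to initialize its lumi with. Supporting
// concurrent global begin-lumi transitions would therefore need per-lumi
// bookkeeping instead of this single slot.
GenStreamCache* claimCacheForLumi() { return useInLumi.exchange(nullptr); }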
At the least I think the code should document its assumptions about what it expects the framework's behavior to be.
I added a comment on top of that section of six member variables in a row and removed the two-line comment that started this discussion. Let me know if there is a wording that you would like better. Pushed and squashed.
Note that this module forces globalBeginLuminosityBlock to wait until one stream has proceeded all the way through the end stream lumi transition for the previous lumi before it starts...
Also there is a long comment at the top of the file describing the unusual behavior, why it is necessary, and a couple of alternatives.
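A rough sketch of that forced ordering (assumed shape, hypothetical helper name; the real logic lives in the long comment and code mentioned above): globalBeginLuminosityBlock effectively spins until some stream's end-lumi transition for the previous lumi has published its cache:

#include <atomic>
#include <thread>

struct GenStreamCache {};

std::atomic<GenStreamCache*> useInLumi{nullptr};

// globalBeginLuminosityBlock blocks here until one stream has run its
// end-lumi transition for the previous lumi and published its cache pointer.
GenStreamCache* waitForPreviousLumiStreamCache() {
  GenStreamCache* cache = nullptr;
  while ((cache = useInLumi.load(std::memory_order_acquire)) == nullptr) {
    std::this_thread::yield();  // previous lumi's stream-end has not run yet
  }
  return cache;
}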
Force-pushed from e1ace0a to 249f066
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43816/38688

+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43816/38690
Pull request #43816 was updated. @mkirsano, @SiewYan, @GurpreetSinghChahal, @menglu21, @bbilin, @cmsbuild, @alberto-sanchez can you please check and sign again.

please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-644838/37194/summary.html Comparison Summary
Milestone for this pull request has been moved to CMSSW_14_1_X. Please open a backport if it should also go into CMSSW_14_0_X.

+1

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

+1
PR description:
Improve ConcurrentHadronizerFilter and ConcurrentGeneratorFilter to handle the new Framework behavior where a stream can skip a lumi if, before that stream starts the lumi, at least one other stream has already processed it and other streams have already started processing all the events in the lumi.
PR #43522 implements the new Framework behavior. It is waiting for this and other pull requests that prepare modules outside the Framework for this behavior to be merged.
PR validation:
Relies on existing tests. The behavior of the modules should not change. The unit test
TestGeneratorInterfacePythia8ConcurrentGeneratorFilter
is the most relevant because it has multiple lumis. It passes, and I stepped through it with a debugger; the correct things appear to be happening. These modules are also used in many runTheMatrix tests (although usually with one lumi, which is less useful as a test). The changes are almost identical in the two modules.