-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StreamId to allow thread-safe data access in Actions #693
Conversation
src/celeritas/global/CoreParams.hh
Outdated
//! True if all params are assigned | ||
explicit operator bool() const | ||
{ | ||
return geometry && material && geomaterial && particle && cutoff | ||
&& physics && rng && init && action_reg; | ||
&& physics && rng && init && action_reg && max_streams; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the default assigned value of max_streams
is 1
, the doc is now slightly off compare to the implementation. I.e. it sounds like it is now "True if all params are assigned except max_streams, where true means user has either not assigned it or assigned it to a value not zero" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll change from "assigned" to "valid".
#undef CP_VALIDATE_INPUT | ||
|
||
CELER_EXPECT(input_); | ||
|
||
CoreScalars scalars; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a potential usecase of IIFE:
auto const scalars{[this]() -> CoreScalars {
// Construct geometry action
auto const boundary_action{input_.action_reg->next_id()};
input_.action_reg->insert(
std::make_shared<celeritas::generated::BoundaryAction>(
boundary_action, "geo-boundary", "Boundary crossing"));
// Construct implicit limit for propagator pausing midstep
auto const propagation_limit_action{input_.action_reg->next_id()};
input_.action_reg->insert(std::make_shared<ImplicitGeometryAction>(
propagation_limit_action,
"geo-propagation-limit",
"Propagation substep/range limit"));
return {boundary_action, propagation_limit_action, input_.max_streams};
}()};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the use of initializers here reduces readability and increases the chance of an error: if someone switches the order of the boundary/propagation in the struct this will cause a nasty subtle bug. Also, one of the main reasons I like IIFE is to limit the scope of temporaries that are just used to calculate another parameter; and the original code doesn't have any temporaries (it saves directly to the scalars).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, while we don't have temporaries here I guess other advantages are that you can declare scalars const and I would say it also clearly outlines the code initializing scalars. This could be a more readable, less error-prone middle-ground:
auto const scalars = [this] {
CoreScalars scalars;
// Construct geometry action
scalars.boundary_action = input_.action_reg->next_id();
input_.action_reg->insert(
std::make_shared<celeritas::generated::BoundaryAction>(
scalars.boundary_action, "geo-boundary", "Boundary crossing"));
// Construct implicit limit for propagator pausing midstep
scalars.propagation_limit_action = input_.action_reg->next_id();
input_.action_reg->insert(std::make_shared<ImplicitGeometryAction>(
scalars.propagation_limit_action,
"geo-propagation-limit",
"Propagation substep/range limit"));
// Construct action for killed looping tracks
scalars.abandon_looping_action = input_.action_reg->next_id();
input_.action_reg->insert(
std::make_shared<ImplicitSimAction>(scalars.abandon_looping_action,
"abandon-looping",
"Abandoned looping track"));
// Save maximum number of streams
scalars.max_streams = input_.max_streams;
return scalars;
}();
StepperInput input; | ||
input.params = input_.params; | ||
input.num_track_slots = input_.num_track_slots; | ||
// TODO: change when doing multithreading on the front end | ||
input.stream_id = StreamId{0}; | ||
input.sync = input_.sync; | ||
Stepper<M> step(std::move(input)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Philippe's comment could apply here as well. I think we get copy elision here instead of ctor/move (not that it matters much)
StepperInput input; | |
input.params = input_.params; | |
input.num_track_slots = input_.num_track_slots; | |
// TODO: change when doing multithreading on the front end | |
input.stream_id = StreamId{0}; | |
input.sync = input_.sync; | |
Stepper<M> step(std::move(input)); | |
// TODO: change StreamId{0} when doing multithreading on the front end | |
Stepper<M> step( | |
{input_.params, StreamId{0}, input_.num_track_slots, input_.sync}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could save a handful of inexpensive CPU instructions that are called once per code execution 🙃 but I think the reduced code clarity doesn't compensate.
Looks like avoid the move saves two instructions out of ~50, a mov
and an punpcklqdq
: https://godbolt.org/z/9PozcGYza so definitely not worth it IMHO.
(hip-ndebug test failure is from #685) |
This is the first infrastructure step (no meaningful thread safety changes) to fixing the massive thread unsafety in the HitCollector (see #691) and enabling CUDA streams etc. from multiple threads (see #418).