Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StreamId to allow thread-safe data access in Actions #693

Merged
merged 7 commits into from
Mar 24, 2023

Conversation

sethrj
Copy link
Member

@sethrj sethrj commented Mar 21, 2023

This is the first infrastructure step (no meaningful thread safety changes) to fixing the massive thread unsafety in the HitCollector (see #691) and enabling CUDA streams etc. from multiple threads (see #418).

@sethrj sethrj added enhancement New feature or request core Software engineering infrastructure labels Mar 21, 2023
@sethrj sethrj requested review from esseivaju and amandalund March 21, 2023 22:11
//! True if all params are assigned
explicit operator bool() const
{
return geometry && material && geomaterial && particle && cutoff
&& physics && rng && init && action_reg;
&& physics && rng && init && action_reg && max_streams;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the default assigned value of max_streams is 1, the doc is now slightly off compare to the implementation. I.e. it sounds like it is now "True if all params are assigned except max_streams, where true means user has either not assigned it or assigned it to a value not zero" :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll change from "assigned" to "valid".

#undef CP_VALIDATE_INPUT

CELER_EXPECT(input_);

CoreScalars scalars;
Copy link
Contributor

@esseivaju esseivaju Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a potential usecase of IIFE:

    auto const scalars{[this]() -> CoreScalars {
        // Construct geometry action
        auto const boundary_action{input_.action_reg->next_id()};
        input_.action_reg->insert(
            std::make_shared<celeritas::generated::BoundaryAction>(
                boundary_action, "geo-boundary", "Boundary crossing"));

        // Construct implicit limit for propagator pausing midstep
        auto const propagation_limit_action{input_.action_reg->next_id()};
        input_.action_reg->insert(std::make_shared<ImplicitGeometryAction>(
            propagation_limit_action,
            "geo-propagation-limit",
            "Propagation substep/range limit"));

        return {boundary_action, propagation_limit_action, input_.max_streams};
    }()};

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of initializers here reduces readability and increases the chance of an error: if someone switches the order of the boundary/propagation in the struct this will cause a nasty subtle bug. Also, one of the main reasons I like IIFE is to limit the scope of temporaries that are just used to calculate another parameter; and the original code doesn't have any temporaries (it saves directly to the scalars).

Copy link
Contributor

@esseivaju esseivaju Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, while we don't have temporaries here I guess other advantages are that you can declare scalars const and I would say it also clearly outlines the code initializing scalars. This could be a more readable, less error-prone middle-ground:

    auto const scalars = [this] {
        CoreScalars scalars;
        // Construct geometry action
        scalars.boundary_action = input_.action_reg->next_id();
        input_.action_reg->insert(
            std::make_shared<celeritas::generated::BoundaryAction>(
                scalars.boundary_action, "geo-boundary", "Boundary crossing"));

        // Construct implicit limit for propagator pausing midstep
        scalars.propagation_limit_action = input_.action_reg->next_id();
        input_.action_reg->insert(std::make_shared<ImplicitGeometryAction>(
            scalars.propagation_limit_action,
            "geo-propagation-limit",
            "Propagation substep/range limit"));

        // Construct action for killed looping tracks
        scalars.abandon_looping_action = input_.action_reg->next_id();
        input_.action_reg->insert(
            std::make_shared<ImplicitSimAction>(scalars.abandon_looping_action,
                                                "abandon-looping",
                                                "Abandoned looping track"));

        // Save maximum number of streams
        scalars.max_streams = input_.max_streams;
        return scalars;
    }();

Comment on lines 185 to 191
StepperInput input;
input.params = input_.params;
input.num_track_slots = input_.num_track_slots;
// TODO: change when doing multithreading on the front end
input.stream_id = StreamId{0};
input.sync = input_.sync;
Stepper<M> step(std::move(input));
Copy link
Contributor

@esseivaju esseivaju Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Philippe's comment could apply here as well. I think we get copy elision here instead of ctor/move (not that it matters much)

Suggested change
StepperInput input;
input.params = input_.params;
input.num_track_slots = input_.num_track_slots;
// TODO: change when doing multithreading on the front end
input.stream_id = StreamId{0};
input.sync = input_.sync;
Stepper<M> step(std::move(input));
// TODO: change StreamId{0} when doing multithreading on the front end
Stepper<M> step(
{input_.params, StreamId{0}, input_.num_track_slots, input_.sync});

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this could save a handful of inexpensive CPU instructions that are called once per code execution 🙃 but I think the reduced code clarity doesn't compensate.

Looks like avoid the move saves two instructions out of ~50, a mov and an punpcklqdq: https://godbolt.org/z/9PozcGYza so definitely not worth it IMHO.

@amandalund
Copy link
Contributor

(hip-ndebug test failure is from #685)

@sethrj sethrj merged commit 19440db into celeritas-project:develop Mar 24, 2023
@sethrj sethrj deleted the stream-id branch May 27, 2023 10:18
@sethrj sethrj added performance Changes for performance optimization and removed core Software engineering infrastructure labels Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance Changes for performance optimization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants