Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log and kill geometry/propagation errors #1290

Closed
wants to merge 3 commits into from

Conversation

sethrj
Copy link
Member

@sethrj sethrj commented Jun 24, 2024

This replaces the "kill looping tracks" with an explicit kernel for killing tracks, and replaces a host-only "validate" for tracks being inside the geometry and having a valid material ID. It logs descriptions and error messages for each of these cases.

A follow-on PR will add internal error states to the geometry tracking so we can catch issues with the current ORANGE implementation.

See #687 .

@sethrj sethrj added enhancement New feature or request physics Particles, processes, and stepping algorithms labels Jun 24, 2024
@sethrj sethrj requested review from amandalund and esseivaju June 24, 2024 17:29
Copy link
Contributor

@amandalund amandalund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sethrj! I think this looks good overall. I'm not sure I would classify looping tracks as a "geometry/propagation error", except in the (hopefully rare) case a track that is actually stuck is marked as looping. I think it's really more of a tracking cut: there's nothing wrong with their behavior and we could continue transporting them to completion, but we choose to kill them because they make progress so slowly and the computational cost of transporting them would be too high. But anyway, nomenclature aside the logic for killing looping tracks or tracks with geometry errors should be the same.

{
//---------------------------------------------------------------------------//
/*!
* Launch the boundary action on device.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Launch the boundary action on device.
* Launch the geometry error action on device.


//---------------------------------------------------------------------------//
/*!
* Launch the boundary action on host.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Launch the boundary action on host.
* Launch the geometry error action on host.

}
else
{
msg << "lost " << deposited << " energy";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg << "lost " << deposited << " energy";
msg << "lost " << deposited << " " << Energy::unit_type::label() << " energy";

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm apparently I forgot to push!

Comment on lines +64 to +65
CELER_LOG_LOCAL(error) << "Track entered a volume without an "
"associated material";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances might this occur?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our GeoMaterial input takes a map of volume name/material IDs and fills the rest with invalid IDs. Those IDs can be legitimate if the volume isn't reachable by the tracking routine (e.g., the [EXTERIOR] volume, or other "imaginary" volumes defined for convenience by vecgeom/g4). However if there's an error in the input or importing or something, you can end up with undefined materials...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, makes sense. In what cases do you think we should be asserting/validating vs. logging an error and killing on the CPU/silently killing the track on the GPU?

# define CELER_CHECK_POSITION 0
# if CELERITAS_DEBUG
# undef CELER_CHECK_POSITION
# define CELER_CHECK_POSITION 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be defined in this case?

Suggested change
# define CELER_CHECK_POSITION 0
# define CELER_CHECK_POSITION 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️

*/
CELER_FUNCTION void SimTrackView::status(TrackStatus status)
{
CELER_EXPECT(status != this->status());
CELER_EXPECT(status != this->status() || status == TrackStatus::killed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When might we set the status to killed more than once?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happens if the track is killed during initialization; we can't leave it as "alive" if it has an undefined volume/material so I have to kill it there. This is kind of messy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so if it's killed during initialization and then killed again sometime later (I guess in either the interaction or eloss applier)?

@sethrj
Copy link
Member Author

sethrj commented Jun 24, 2024

This was a bit of a half-baked idea, you're perfectly right about the looping track not being the same as a geometry error. Maybe we should just make this a "tracking cut" action? Should we print log messages immediately when the error occurs?

It also exposes some of the fragility in our implementation... there's still a test failure due to a hardcoded condition not being met...

@amandalund
Copy link
Contributor

Yeah, a "tracking cut" or more generic kind of "track killer" action might make more sense, same with printing the whole error message immediately.

@sethrj sethrj marked this pull request as draft June 25, 2024 12:48
@sethrj
Copy link
Member Author

sethrj commented Jun 25, 2024

OK I think I'm going to rework this: maybe a combination of

  • a tracking cut action, plus
  • a helper function that can choose between logging a message and throwing an exception on host

@sethrj
Copy link
Member Author

sethrj commented Jul 3, 2024

I'm going to close this in favor of a fresh PR.

@sethrj sethrj closed this Jul 3, 2024
@sethrj sethrj deleted the geo-error branch July 3, 2024 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request physics Particles, processes, and stepping algorithms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants