Log and kill geometry/propagation errors #1290

sethrj · 2024-06-24T17:29:16Z

This replaces the implicit "kill looping tracks" logic with an explicit kernel for killing tracks, and replaces the host-only "validate" check that tracks are inside the geometry and have a valid material ID. It logs descriptions and error messages for each of these cases.

A follow-on PR will add internal error states to the geometry tracking so we can catch issues with the current ORANGE implementation.

See #687.

amandalund

Thanks @sethrj! I think this looks good overall. I'm not sure I would classify looping tracks as a "geometry/propagation error", except in the (hopefully rare) case where a track that is actually stuck is marked as looping. I think it's really more of a tracking cut: there's nothing wrong with their behavior and we could continue transporting them to completion, but we choose to kill them because they make progress so slowly that the computational cost of transporting them would be too high. But anyway, nomenclature aside, the logic for killing looping tracks and tracks with geometry errors should be the same.

amandalund · 2024-06-24T18:59:51Z

src/celeritas/geo/detail/GeoErrorAction.cu

+{
+//---------------------------------------------------------------------------//
+/*!
+ * Launch the boundary action on device.


Suggested change

- * Launch the boundary action on device.
+ * Launch the geometry error action on device.

amandalund · 2024-06-24T19:00:09Z

src/celeritas/geo/detail/GeoErrorAction.cc

+
+//---------------------------------------------------------------------------//
+/*!
+ * Launch the boundary action on host.


Suggested change

- * Launch the boundary action on host.
+ * Launch the geometry error action on host.

amandalund · 2024-06-24T19:27:56Z

src/celeritas/geo/detail/GeoErrorExecutor.hh

+    }
+    else
+    {
+        msg << "lost " << deposited << " energy";


Suggested change

-        msg << "lost " << deposited << " energy";
+        msg << "lost " << deposited << " " << Energy::unit_type::label() << " energy";

Hmmm apparently I forgot to push!
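For anyone reading this later, here is a minimal standalone sketch of what the suggested message looks like once the unit label is included. `MevUnit`, the `Energy` struct, and the "MeV" label are hypothetical stand-ins and not the real Celeritas quantity types; only the `Energy::unit_type::label()` call comes from the suggestion above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a unit tag type: assumed to expose a label.
struct MevUnit
{
    static char const* label() { return "MeV"; }
};

// Hypothetical stand-in for an energy quantity (not the real class).
struct Energy
{
    using unit_type = MevUnit;
    double value;
};

// Build the "lost ... energy" message with an explicit unit label.
std::string describe_lost_energy(Energy deposited)
{
    assert(deposited.value >= 0);
    std::ostringstream msg;
    msg << "lost " << deposited.value << ' ' << Energy::unit_type::label()
        << " energy";
    return msg.str();
}
```

With these stand-ins, `describe_lost_energy(Energy{2.5})` yields `lost 2.5 MeV energy`, so the log no longer leaves the reader guessing about units.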

amandalund · 2024-06-24T19:40:14Z

src/celeritas/geo/detail/BoundaryExecutor.hh

+            CELER_LOG_LOCAL(error) << "Track entered a volume without an "
+                                      "associated material";


Under what circumstances might this occur?

Our GeoMaterial input takes a map of volume name/material IDs and fills the rest with invalid IDs. Those IDs can be legitimate if the volume isn't reachable by the tracking routine (e.g., the [EXTERIOR] volume, or other "imaginary" volumes defined for convenience by vecgeom/g4). However, if there's an error in the input or the import or something, you can end up with undefined materials...

Thanks, makes sense. In what cases do you think we should be asserting/validating vs. logging an error and killing on the CPU/silently killing the track on the GPU?
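The fill behavior described in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: `invalid_material`, `build_volume_materials`, and `has_material` are hypothetical names, and plain `int` IDs stand in for the real ID types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical sentinel meaning "no material assigned to this volume".
constexpr int invalid_material = -1;

// Assign materials by volume name; unlisted volumes keep the sentinel.
std::vector<int>
build_volume_materials(std::vector<std::string> const& volume_names,
                       std::map<std::string, int> const& assigned)
{
    std::vector<int> result(volume_names.size(), invalid_material);
    for (std::size_t i = 0; i < volume_names.size(); ++i)
    {
        auto iter = assigned.find(volume_names[i]);
        if (iter != assigned.end())
        {
            assert(iter->second >= 0);
            result[i] = iter->second;
        }
    }
    return result;
}

// Check made when a track enters a volume: false means "log and kill".
bool has_material(std::vector<int> const& materials, std::size_t volume)
{
    assert(volume < materials.size());
    return materials[volume] != invalid_material;
}
```

An unreachable volume such as `[EXTERIOR]` legitimately keeps the sentinel, while a reachable volume that is missing from the input map looks identical in the table. That is why the check can only happen when a track actually enters the volume.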

amandalund · 2024-06-24T19:45:06Z

src/celeritas/global/alongstep/detail/PropagationApplier.hh

-#    define CELER_CHECK_POSITION 0
+#    if CELERITAS_DEBUG
+#        undef CELER_CHECK_POSITION
+#        define CELER_CHECK_POSITION 0


Should this be defined in this case?

Suggested change

-#        define CELER_CHECK_POSITION 0
+#        define CELER_CHECK_POSITION 1

🤦‍♂️
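A small sketch of the intended pattern, where the check is off by default and debug builds switch it on. `SKETCH_DEBUG` and `SKETCH_CHECK_POSITION` are hypothetical names standing in for the macros in the diff, and the comparison inside `position_unchanged` is only a placeholder, since the thread does not show what the real check does.

```cpp
#include <cassert>

// Hypothetical stand-in for the build system's debug flag.
#ifndef SKETCH_DEBUG
#    define SKETCH_DEBUG 1
#endif

// Off by default; debug builds redefine the flag to enable the check.
#define SKETCH_CHECK_POSITION 0
#if SKETCH_DEBUG
#    undef SKETCH_CHECK_POSITION
#    define SKETCH_CHECK_POSITION 1
#endif

// Placeholder check: passes trivially when the flag is off.
bool position_unchanged(double before, double after)
{
#if SKETCH_CHECK_POSITION
    return before == after;
#else
    (void)before;
    (void)after;
    return true;
#endif
}
```

Redefining the macro to `0` inside the debug branch, as in the original hunk, would leave the check compiled out in every build, which is the issue the suggestion points at.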

amandalund · 2024-06-24T20:20:36Z

src/celeritas/track/SimTrackView.hh

 */
 CELER_FUNCTION void SimTrackView::status(TrackStatus status)
 {
-    CELER_EXPECT(status != this->status());
+    CELER_EXPECT(status != this->status() || status == TrackStatus::killed);


When might we set the status to killed more than once?

It happens if the track is killed during initialization; we can't leave it as "alive" if it has an undefined volume/material so I have to kill it there. This is kind of messy.

Ok, so if it's killed during initialization and then killed again sometime later (I guess in either the interaction or eloss applier)?
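To make the relaxed precondition concrete, here is a minimal sketch using a plain `assert` in place of `CELER_EXPECT`. `SimStateSketch` and the three enum values are hypothetical simplifications, not the real `SimTrackView` or its full status list.

```cpp
#include <cassert>

// Hypothetical subset of track states, for illustration only.
enum class TrackStatus
{
    inactive,
    alive,
    killed
};

class SimStateSketch
{
  public:
    TrackStatus status() const { return status_; }

    // Require a real state change, except that "killed" may be repeated.
    void status(TrackStatus status)
    {
        assert(status != status_ || status == TrackStatus::killed);
        status_ = status;
    }

  private:
    TrackStatus status_{TrackStatus::inactive};
};
```

Under this rule a track killed during initialization can be killed again by a later action without tripping the precondition, while any other redundant assignment (for example `alive` to `alive`) still fails in debug builds.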

sethrj · 2024-06-24T21:37:43Z

This was a bit of a half-baked idea; you're perfectly right about the looping track not being the same as a geometry error. Maybe we should just make this a "tracking cut" action? Should we print log messages immediately when the error occurs?

It also exposes some of the fragility in our implementation... there's still a test failure due to a hardcoded condition not being met...

amandalund · 2024-06-24T22:44:23Z

Yeah, a "tracking cut" or more generic kind of "track killer" action might make more sense, same with printing the whole error message immediately.

sethrj · 2024-06-25T18:15:05Z

OK I think I'm going to rework this: maybe a combination of

- a tracking cut action, plus
- a helper function that can choose between logging a message and throwing an exception on host
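The second item could look something like the sketch below. This is purely an assumption about the shape of such a helper: `ErrorPolicy` and `handle_track_error` are hypothetical names, and `std::cerr` stands in for the real logger.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical policy switch; not an existing option in the code base.
enum class ErrorPolicy
{
    log_and_kill,
    throw_exception
};

// Either throw on host or log the message and tell the caller to
// apply the tracking cut (return value true).
bool handle_track_error(ErrorPolicy policy, std::string const& what)
{
    assert(!what.empty());
    if (policy == ErrorPolicy::throw_exception)
    {
        throw std::runtime_error(what);
    }
    std::cerr << "error: " << what << std::endl;
    return true;
}
```

Keeping the decision in one function means each call site only reports what went wrong, and whether a run aborts or continues with the track killed is chosen in a single place.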

sethrj · 2024-07-03T21:17:39Z

I'm going to close this in favor of a fresh PR.

sethrj added 3 commits June 24, 2024 13:05

Replace implicit "kill looping" with explicit "geo error" action

63d8323

Kill incorrectly initialized tracks

1894c6f

Log locally, fix tests, and also error when entering undefined material

90ba4b3

sethrj added the enhancement (New feature or request) and physics (Particles, processes, and stepping algorithms) labels Jun 24, 2024

sethrj requested review from amandalund and esseivaju June 24, 2024 17:29

amandalund reviewed Jun 24, 2024


sethrj marked this pull request as draft June 25, 2024 12:48

sethrj mentioned this pull request Jun 27, 2024

Add new track status and support user "initialization" #1294

Merged

sethrj closed this Jul 3, 2024

sethrj deleted the geo-error branch July 3, 2024 21:18

sethrj mentioned this pull request Jul 6, 2024

Add "tracking cut" to handle errors and kill tracks #1311

Merged
