-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add debugging output to the propagation kernel #902
base: main
Are you sure you want to change the base?
Conversation
8dd1c48
to
0490746
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the PR in general except the naming of failure_mode
. Couldn't this be exit_mode
or sth like that?
device/common/include/traccc/finding/device/impl/propagate_to_next_surface.ipp
Outdated
Show resolved
Hide resolved
0490746
to
e6d7f70
Compare
I've renamed it to exit mode as you requested! |
This commit adds some optional debugging output to the propagation kernel, allowing it to output more precise statistics on why tracks were killed.
e6d7f70
to
96a87d3
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could imagine a setup in which some shared helper functions would be used by both the CUDA and SYCL code for allocating the debug object in memory, getting it back, and printing it. Since I'm a bit worried about only adding this feature to the CUDA code at the moment. (Not even the host code...?)
void log_propagate_to_next_surface_debug( | ||
const propagate_to_next_surface_debug& obj, const Logger& logger); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what our convention should be for printing custom types. 🤔
I believe if you define
std::ostream& operator<<(std::ostream&, const propagate_to_next_surface_debug&);
, that could be used to print such objects to a traccc::Logger
(Acts::Logger
) object. Like we do here for instance:
- https://github.com/acts-project/traccc/blob/main/performance/include/traccc/performance/timing_info.hpp#L39
- https://github.com/acts-project/traccc/blob/main/performance/include/traccc/performance/throughput.hpp#L37
- https://github.com/acts-project/traccc/blob/main/examples/run/common/throughput_st.ipp#L214-L225
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, defining operator<<
for a type is an antipattern unless there is a single, ubiquitous way to turn a value of the type into a string. In this case, there are many different wordings and ways to print this data, so operator<<
is a no-go. Of course, we can debate that there might be better names for this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's pretty safe to define an operator for the "standard way" of printing the object. Which will also allow clients to use the code more flexibly. Printing not just to a logger, but to any kind of stream. And also not just at a DEBUG
level, but also any other that they want. (Very realistically we may end up using such a printout in a warning or error message as well. In case the debug information points at some issue.)
So this I feel fairly strongly about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider what happens if we want to print some exit modes at a different severity level than others; then your approach breaks down. I think it's much better to have a function which calls the logger itself, rather than producing a string for the whole object.
enum class propagate_to_next_surface_exit_mode { | ||
HOLE_LIMIT_REACHED, | ||
BRANCH_LIMIT_REACHED, | ||
MAX | ||
}; | ||
|
||
struct propagate_to_next_surface_debug { | ||
void reset(); | ||
|
||
unsigned int m_failure_count[static_cast<std::size_t>( | ||
propagate_to_next_surface_exit_mode::MAX)]; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fairly subjective, but I'd prefer to do this:
namespace traccc::details {
struct propagate_to_next_surface_debug_info {
enum class exit_mode {...};
unsigned int m_failure_count...;
void reset();
};
}
While talking about this... I assume reset()
is not meant to be called in device code. But if we explicitly want to prevent users from doing it, it might be helpful to use TRACCC_HOST
on it. If not, if technically it may be usable in device code in some circumstances, I would put TRACCC_HOST_DEVICE
on it right away. 🤔
@@ -48,6 +48,9 @@ struct finding_config { | |||
/// Particle hypothesis | |||
detray::pdg_particle<traccc::scalar> ptc_hypothesis = | |||
detray::muon<traccc::scalar>(); | |||
|
|||
/// Provide more verbose output for debugging; may reduce performance | |||
bool verbose = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if debug
may be a better name here. But I'm really not sure...
namespace traccc { | ||
void propagate_to_next_surface_debug::reset() { | ||
for (std::size_t i = 0; | ||
i < static_cast<std::size_t>(propagate_to_next_surface_exit_mode::MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't std::size(m_failure_count)
be more robust here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mwah, given that the size of m_failure_count
is taken directly from exit_mode::MAX
I'd say these are equivalent in terms of robustness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly to reduce the number of characters needed to express the same. I agree that it's not more robust code. But it's certainly shorter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay sure, I can use std::size
. 👍
if (unsigned int n = obj.m_failure_count[static_cast<std::size_t>( | ||
propagate_to_next_surface_exit_mode::HOLE_LIMIT_REACHED)]; | ||
n > 0) { | ||
TRACCC_DEBUG(n << " tracks were killed due to reaching the hole limit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an output operator would also allow us to not hardcode the output level here... 🤔
/** | ||
* @brief Optional debug object | ||
*/ | ||
propagate_to_next_surface_debug* debug; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
propagate_to_next_surface_debug* debug; | |
propagate_to_next_surface_debug* debug = nullptr; |
Then you wouldn't need to touch the SYCL code if you don't want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. 👍
@@ -102,6 +107,8 @@ std::unique_ptr<configuration_printable> track_finding::as_printable() const { | |||
std::to_string(m_config.max_num_skipping_per_cand))); | |||
cat->add_child(std::make_unique<configuration_kv_pair>( | |||
"PDG number", std::to_string(m_pdg_number))); | |||
cat->add_child(std::make_unique<configuration_kv_pair>( | |||
"Debug output", std::format("{}", m_config.verbose))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call it "debug" here as well, then let's make the variable name debug
...
This commit adds some optional debugging output to the propagation kernel, allowing it to output more precise statistics on why tracks were killed.