Rename max_num_tracks to num_track_slots and divide by num_streams #785

Merged: 7 commits, Jun 1, 2023

@sethrj sethrj commented May 31, 2023

Since #774, the total number of track slots allocated has been the product of the number of streams and max_num_tracks. This PR replaces that option with the more accurate and descriptive num_track_slots, which gives the total number of slots per process rather than per thread. The per-stream allocation is num_track_slots divided by num_streams.
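
As a rough illustration of the change in meaning, the arithmetic can be sketched as below. The struct and function names are hypothetical and are not taken from the Celeritas code base; the sketch also assumes plain truncating integer division, since this description does not say how a remainder is handled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; not the Celeritas API.
struct RunConfig
{
    std::size_t num_track_slots;  // total track slots for the whole process
    std::size_t num_streams;      // number of concurrent streams (threads)
};

// New convention: the option is a per-process total that is split across
// streams. Integer division truncates, so a remainder goes unused here
// (an assumption of this sketch).
inline std::size_t slots_per_stream(RunConfig const& cfg)
{
    assert(cfg.num_streams > 0);
    return cfg.num_track_slots / cfg.num_streams;
}

// Old convention, for comparison: the option was a per-stream count, so the
// total allocation grew with the number of streams.
inline std::size_t total_slots_old(std::size_t max_num_tracks,
                                   std::size_t num_streams)
{
    return max_num_tracks * num_streams;
}
```

With these definitions, a process-wide setting of 4096 slots on 4 streams gives each stream 1024 slots, whereas the old per-stream setting of 1024 on 4 streams implied 4096 slots in total.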

Also, following CUDA best practices, we call cudaSetDevice inside the #pragma omp parallel loop since the device context is a thread-local variable.
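
The reason for setting the device in every thread can be shown without a GPU. The sketch below is not CUDA code: `fake_cuda` is a hypothetical stand-in that mimics the runtime's per-host-thread current device with a `thread_local` variable, and `std::thread` is used in place of OpenMP so the example stays self-contained.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the CUDA runtime's per-host-thread current
// device. As in the real runtime, every thread starts on device 0.
namespace fake_cuda
{
thread_local int current_device = 0;
inline void set_device(int id) { current_device = id; }
inline int get_device() { return current_device; }
}  // namespace fake_cuda

// Launch `num_threads` workers and record the device each one sees.
// If `set_in_worker` is false, only the launching thread selects the device.
inline std::vector<int>
devices_seen(int device_id, int num_threads, bool set_in_worker)
{
    assert(num_threads >= 0);
    fake_cuda::set_device(device_id);  // affects the calling thread only

    std::vector<int> seen(num_threads, -1);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i)
    {
        // Each worker writes only its own element, so there is no data race.
        workers.emplace_back([&seen, i, device_id, set_in_worker] {
            if (set_in_worker)
            {
                fake_cuda::set_device(device_id);
            }
            seen[i] = fake_cuda::get_device();
        });
    }
    for (auto& t : workers)
    {
        t.join();
    }
    return seen;
}
```

Calling `devices_seen(1, 3, false)` leaves every worker on the default device 0 even though the launching thread selected device 1, while `devices_seen(1, 3, true)` puts every worker on device 1. In the real code the same effect is obtained by calling cudaSetDevice inside the parallel region.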

@sethrj sethrj added enhancement New feature or request internal labels May 31, 2023
@sethrj sethrj requested a review from amandalund May 31, 2023 17:57

@amandalund amandalund left a comment

Thanks @sethrj!

@sethrj sethrj merged commit 8ddd87b into celeritas-project:develop Jun 1, 2023
@sethrj sethrj deleted the track-slots branch June 1, 2023 19:22
@sethrj sethrj added performance Changes for performance optimization app Application front ends and removed performance Changes for performance optimization labels Nov 14, 2023
Labels
app Application front ends enhancement New feature or request