Profile (and speed up) the madevent scalar overhead to parallel ME calculations #495

Open
valassi opened this issue Jun 21, 2022 · 0 comments

valassi commented Jun 21, 2022

In PR #494 a first summary table of madevent+cudacpp results is being assembled.

For a complex process like ggttggg, speeding up the cpp ME calculation by a factor 4 results in almost a factor 4 overall speedup, because the scalar madevent part takes only a small fraction of the total time. In CUDA, however, the overhead from the scalar madevent part is the limiting factor: most of the CUDA speedup is lost because the overall workflow time is dominated by that overhead.
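The behaviour described above is what Amdahl's law predicts. The sketch below is only an illustration: the scalar fraction (1%) and the ME speedup factors (4 for cpp, 1000 for CUDA) are made-up assumptions, not measurements from PR #494, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(scalar_fraction: float, me_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the ME part is accelerated.

    scalar_fraction: fraction of the original run time spent in the scalar
                     (non-accelerated) madevent part, between 0 and 1.
    me_speedup:      speedup factor applied to the ME calculation only.
    """
    if not 0.0 <= scalar_fraction <= 1.0:
        raise ValueError("scalar_fraction must be in [0, 1]")
    if me_speedup <= 0.0:
        raise ValueError("me_speedup must be positive")
    parallel_fraction = 1.0 - scalar_fraction
    return 1.0 / (scalar_fraction + parallel_fraction / me_speedup)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the effect.
    scalar_fraction = 0.01
    for label, k in [("cpp (ME x4)", 4.0), ("CUDA (ME x1000)", 1000.0)]:
        s = overall_speedup(scalar_fraction, k)
        print(f"{label}: overall speedup {s:.1f}")
    # The scalar part caps the overall speedup at 1 / scalar_fraction.
    print(f"upper bound: {1.0 / scalar_fraction:.0f}")
```

With these assumed inputs, a factor 4 in the ME gives an overall speedup of about 3.9, while a factor 1000 gives only about 91, close to the upper bound of 100 set by the scalar part. Shrinking the scalar fraction is then the only way to raise that bound.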

We should profile this (e.g. with flamegraphs, as Stefan suggested) and reduce it. Parts of this may be related to the random choice of color #402 and helicity #403, so we should reassess when those are done. But the time may be spent elsewhere, e.g. in phase space sampling (in which case one option would be to move parts of this to the GPU, as we do with rambo in the standalone part?)...
