Restore some type-erasure to `transform_mpi` to avoid debug bloat #1390
base: main
```cpp
auto f_completion = [f = std::forward<F>(f), mode, completions_inline, p](
                        auto&... args) mutable -> unique_any_sender<> {
    unique_any_sender<> s = just(std::forward_as_tuple(args...)) | unpack() |
        dispatch_mpi(std::move(f)) | trigger_mpi(mode);
    if (completions_inline) { return s; }
    else { return std::move(s) | continues_on(default_pool_scheduler(p)); }
};
```
The inline branch returns `s` as a `unique_any_sender`, while the non-inline branch returns a `unique_any_sender` wrapping another `unique_any_sender`. Do the two layers get collapsed into one, or can/should we expand the other branch so that it returns just one?
In `if (completions_inline) { return s; }`, `s` is not going to get wrapped again and even a move construction should be avoided because of NRVO.
In the second branch there will indeed be two `unique_any_sender`s. It's a tradeoff between debug bloat and wrapping. The first commit on this PR (aae6046) doesn't type-erase `s`; it only type-erases whatever is returned. That on its own already significantly reduces bloat, and then type-erasing `s` reduces bloat a bit more.
Looks good! Thanks a lot for the investigation! Have you checked if it solves the linking problem with DLA-F?
That is indeed the important question, and on top of
#1346 simplified `transform_mpi`, but at the same time it removed almost all type-erasure of senders. This increased the debug symbol sizes caused by `transform_mpi` quite significantly. This PR proposes to restore some of the type-erasure, which further simplifies the implementation a bit and reduces debug bloat.

For reference, here are the binary sizes of the `transform_mpi` test (with build type `Debug`) at different points in history starting from just before #1321:

- (… `transform_mpi` values #1321 was merged): 10,910,456 bytes
- (… `transform_mpi` values #1321): 18,794,288 bytes
- (… `set_value` CPO to use member functions instead of `tag_invoke` #1295): 19,029,672 bytes
- (… `PIKA_MOVE` and `PIKA_FORWARD` #1325): 20,175,936 bytes
- (… `transform_mpi` refactoring #1346): 35,766,456 bytes
- (… `any_operation_state` to avoid dynamic allocation for receiver #1354): 40,082,936 bytes

I've included a few selected commits above that also affect the binary size. #1321 is a bugfix, so we can't directly fix or revert the big increase from that PR. A few other PRs also increase the binary size, but not as much as #1346, so I'm targeting that first. With the changes in this PR we get back to the same size we had after #1321.