ch4/am: Caching buffer attribute in request and use typerep fast path for H2H #7082
base: main
Add a new flag, MPIR_TYPEREP_FLAG_H2H, that can be passed to choose the H2H fast path. If the typerep user knows the buffers are all on the host, they can use this flag to bypass buffer attribute checking and the related branches. This can be useful for SHM pipelined transfers, where the send/recv buffer attributes can be cached in the request.
The check for whether a buffer is on the host can be simplified to a single check because the enum values for unregistered host and registered host memory can be combined.
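As a minimal sketch of one way to read the point above: if the two host kinds sit next to each other at the low end of the enum, a single comparison covers both. The enum `ptr_type_t` and both helper functions are hypothetical illustrations, not the MPL definitions, and the ordering of the enum values is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum (not the MPL definition). Assumption: both host
 * kinds are ordered before every non-host kind. */
typedef enum {
    PTR_UNREGISTERED_HOST = 0,
    PTR_REGISTERED_HOST,
    PTR_DEVICE,
    PTR_MANAGED
} ptr_type_t;

/* Guard the ordering assumption at compile time. */
static_assert(PTR_REGISTERED_HOST == PTR_UNREGISTERED_HOST + 1,
              "host kinds must be adjacent at the start of the enum");

/* Two-comparison form: test each host kind separately. */
static bool is_host_two_checks(ptr_type_t t)
{
    return t == PTR_UNREGISTERED_HOST || t == PTR_REGISTERED_HOST;
}

/* One-comparison form: relies on the enum ordering guarded above. */
static bool is_host_one_check(ptr_type_t t)
{
    return t <= PTR_REGISTERED_HOST;
}
```

Both helpers agree for every value of this hypothetical enum; the second one simply leaves a single branch on the hot path.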
We check and cache the recv buffer attribute when posting the recv request. This information is then used to select the H2H fast path in typerep.
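The caching idea can be sketched roughly as follows. Everything here is a hypothetical stand-in rather than MPICH code: `recv_req_t`, `post_recv`, `copy_bytes`, `recv_chunk`, `COPY_FLAG_H2H`, and `query_ptr_type` (which only counts how often the attribute query runs and always reports host memory). The sketch also assumes the source chunk lives in host shared memory, so knowing that the receive buffer is on the host is enough to pass the H2H flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not MPICH symbols. */
typedef enum { PTR_UNREGISTERED_HOST = 0, PTR_REGISTERED_HOST, PTR_DEVICE } ptr_type_t;

#define COPY_FLAG_NONE 0x0u
#define COPY_FLAG_H2H  0x1u     /* caller asserts both buffers are on the host */

/* Stand-in for a pointer-attribute query; counts how often it runs. */
static int query_count = 0;
static ptr_type_t query_ptr_type(const void *ptr)
{
    (void) ptr;
    query_count++;
    return PTR_UNREGISTERED_HOST;       /* this sketch only has host memory */
}

typedef struct {
    void *buffer;
    size_t size;
    ptr_type_t buf_type;        /* cached once, when the recv is posted */
} recv_req_t;

static void post_recv(recv_req_t *req, void *buf, size_t size)
{
    req->buffer = buf;
    req->size = size;
    req->buf_type = query_ptr_type(buf);
}

static void copy_bytes(void *dst, const void *src, size_t n, unsigned flags)
{
    assert(dst != NULL && src != NULL);
    if (!(flags & COPY_FLAG_H2H)) {
        /* Slow path: query both pointers on every call. */
        ptr_type_t s = query_ptr_type(src);
        ptr_type_t d = query_ptr_type(dst);
        /* A GPU-aware build would branch to a device copy here. */
        assert(s <= PTR_REGISTERED_HOST && d <= PTR_REGISTERED_HOST);
        (void) s;
        (void) d;
    }
    memcpy(dst, src, n);
}

/* Copy one pipeline chunk (assumed to be in host shared memory). */
static void recv_chunk(recv_req_t *req, size_t offset, const void *chunk, size_t n)
{
    unsigned flags = (req->buf_type <= PTR_REGISTERED_HOST) ? COPY_FLAG_H2H : COPY_FLAG_NONE;
    copy_bytes((char *) req->buffer + offset, chunk, n, flags);
}
```

In this sketch the attribute query runs once per posted receive instead of twice per chunk, which is the kind of per-message overhead the cached attribute is meant to avoid.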
test:mpich/ch4/ofi
Here is the context of this PR. I was testing the performance of fbox and found that building with yaksa vs. dataloop makes a 0.02-0.03 us difference in latency for a 1-byte message. Since no derived datatype is involved, the data copying should mostly go through the memcpy path in typerep_copy/pack. The difference between typerep_yaksa and typerep_dataloop is the checks on buffer pointer attributes. I have not checked the GPU build yet, but I think it would be a good thing to cache this information instead of checking it every time.
Check if source buffer is on host and choose typerep fast path for H2H.
2d50266 to c4d1e02
test:mpich/ch4/ofi
@@ -209,6 +210,7 @@ typedef struct MPIDIG_req_t {
     void *buffer;
     MPI_Aint count;
     MPI_Datatype datatype;
+    MPL_pointer_attr_t buf_attr;
This causes the compile-time assertion about the size of the extended request object to fail in Jenkins tests. Do we need the whole attribute struct or can we get by with just the type?
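To illustrate the trade-off raised here, a rough sketch with made-up types: embedding a whole attribute struct grows the request object, while storing only the pointer type adds a single enum. The structs, the contents of `ptr_attr_t`, and the 32-byte `REQ_SIZE_BUDGET` are all hypothetical and do not reflect the actual MPICH or MPL layouts or the real assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types and size budget; the real definitions differ. */
typedef enum { PTR_UNREGISTERED_HOST = 0, PTR_REGISTERED_HOST, PTR_DEVICE } ptr_type_t;

typedef struct {
    ptr_type_t type;
    int device_id;
    void *backend_info[4];      /* stands in for backend-specific fields */
} ptr_attr_t;

/* Variant A: cache the whole attribute struct in the request. */
typedef struct {
    void *buffer;
    long count;
    int datatype;
    ptr_attr_t buf_attr;
} req_full_attr_t;

/* Variant B: cache only the pointer type. */
typedef struct {
    void *buffer;
    long count;
    int datatype;
    ptr_type_t buf_type;
} req_type_only_t;

#define REQ_SIZE_BUDGET 32      /* hypothetical limit on the request size */

/* Variant B stays within the budget. The same assertion on
 * req_full_attr_t would fail to compile on common 32- and 64-bit ABIs. */
static_assert(sizeof(req_type_only_t) <= REQ_SIZE_BUDGET,
              "type-only request must fit the size budget");
```

If the fast path only needs to know whether the buffer is on the host, the type-only variant carries enough information while keeping the size check satisfied in this sketch.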
Pull Request Description
This PR adds three things
Request for comments (@hzhou @raffenet ):
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.