coll: add coll_attr and comm subgroups #6590
Closed
hzhou force-pushed the 2301_collattr branch 2 times, most recently from c63d810 to 7ad8696 (August 13, 2024 00:56)
test:mpich/ch3/tcp All ✔️
hzhou force-pushed the 2301_collattr branch 2 times, most recently from 8366cd2 to 48cff65 (August 13, 2024 14:09)
hzhou changed the title from "coll: add coll_attr parameter" to "coll: add coll_attr and comm subgroups" (Aug 13, 2024)
My desktop caught more spelling errors than the Jenkins check, likely due to a newer version of the codespell package.
Make the bit patterns of these two error flags independent of the actual values of MPI_ERR_OTHER and MPIX_ERR_PROC_FAILED. This allows the errflags to fit easily into attribute bits. We'll fix a few usages in the next commit.
They are no longer the same as MPI_ERR_OTHER and MPI_ERR_PROC_FAILED.
Define MPIR_ERR_NONE, MPIR_ERR_PROC_FAILED, and MPIR_ERR_OTHER as macros and remove the definition of MPIR_Errflag_t enum. We will replace MPIR_Errflag_t with "int coll_attr" in the next commit. Using coll_attr gives us more flexibility in extending the implementation with additional attributes such as sub-group, memory kinds, and algorithm hints. Sub-group can be an index pointing to a group list in comm.
As the title says: trivial but messy. Collectives use coll_attr, but pt2pt APIs use "int errflag". Notably, MPIR_ERR_COLL_CHECKANDCONT works directly since it only does a bitwise OR. Reviewers: pay attention to changes to files outside src/mpi/coll/.
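To make the bitwise-OR point concrete, here is a minimal sketch of how error flags defined as bit patterns can be accumulated into an attribute integer. The `EX_` names and the specific values are hypothetical; the actual MPIR_ERR_* values are defined in the MPICH source and are not shown in this conversation.

```c
#include <assert.h>

/* Hypothetical values for illustration only; the real MPIR_ERR_* macros
 * live in the MPICH source and may use different bit patterns. */
#define EX_ERR_NONE         0x0
#define EX_ERR_PROC_FAILED  0x1
#define EX_ERR_OTHER        0x2
#define EX_ERRFLAG_MASK     0xff        /* assumption: error bits occupy the low 8 bits */

/* Accumulate an error flag into a coll_attr-style integer with a bitwise OR.
 * Bits outside the error mask are left untouched. */
static int ex_accumulate_errflag(int coll_attr, int errflag)
{
    assert((errflag & ~EX_ERRFLAG_MASK) == 0);
    return coll_attr | errflag;
}

/* Extract only the error bits, ignoring any other attribute bits. */
static int ex_get_errflag(int coll_attr)
{
    return coll_attr & EX_ERRFLAG_MASK;
}
```

Because the flags are plain bits rather than MPI error class values, the same OR works whether the integer carries only an errflag (pt2pt) or an errflag plus other attribute bits (collectives).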
The multi-lead algorithm not only assumes the same number of ranks per node, it also requires the ranks to be laid out in exact round-robin order. In particular, MPII_Comm_is_node_balanced is not a sufficient check.
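The difference between "balanced" and "round-robin" can be sketched as follows. This assumes that round-robin means rank r lives on node r mod num_nodes, which is one reading of the commit message; the function names are hypothetical and are not the MPICH routines.

```c
#include <assert.h>
#include <stdbool.h>

/* node_ids[r] is the node index of rank r, with 0 <= node_ids[r] < num_nodes. */

/* True if every node hosts the same number of ranks. */
static bool ex_is_node_balanced(const int *node_ids, int size, int num_nodes)
{
    assert(size > 0 && num_nodes > 0);
    if (size % num_nodes != 0)
        return false;
    for (int n = 0; n < num_nodes; n++) {
        int count = 0;
        for (int r = 0; r < size; r++) {
            if (node_ids[r] == n)
                count++;
        }
        if (count != size / num_nodes)
            return false;
    }
    return true;
}

/* True only if rank r is on node (r % num_nodes), i.e. a cyclic layout.
 * This is the assumed meaning of "exact round-robin order". */
static bool ex_is_round_robin(const int *node_ids, int size, int num_nodes)
{
    assert(size > 0 && num_nodes > 0);
    if (size % num_nodes != 0)
        return false;
    for (int r = 0; r < size; r++) {
        if (node_ids[r] != r % num_nodes)
            return false;
    }
    return true;
}
```

A block layout such as `{0, 0, 1, 1}` passes the balance check but fails the round-robin check, which is the case the commit message warns about.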
Store num_local and num_external in MPIR_Comm. Along with internode_table, they help construct internode subgroups.
This is the same as num_external.
Lightweight struct to describe sub-groups of a communicator. They are intended to replace the subcomms.
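As a rough illustration of what "lightweight" could mean here, the sketch below describes a subgroup as a view over a rank table, with no allocation and no communication needed to create it. The struct layout and every `ex_` name are assumptions for illustration; the actual MPIR_Subgroup definition is in the PR diff, not in this conversation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subgroup descriptor; not the real MPIR_Subgroup layout. */
typedef struct {
    int size;               /* number of processes in the subgroup */
    int rank;               /* caller's rank within the subgroup, or -1 if not a member */
    const int *proc_table;  /* maps subgroup rank -> rank in the parent communicator */
} ex_subgroup_t;

/* Build a subgroup view over an existing rank table. */
static ex_subgroup_t ex_subgroup_make(const int *proc_table, int size, int my_comm_rank)
{
    assert(proc_table != NULL && size > 0);
    ex_subgroup_t sg = { size, -1, proc_table };
    for (int i = 0; i < size; i++) {
        if (proc_table[i] == my_comm_rank)
            sg.rank = i;
    }
    return sg;
}

/* Translate a subgroup rank to the parent communicator rank. */
static int ex_subgroup_to_comm_rank(const ex_subgroup_t *sg, int sub_rank)
{
    assert(sg != NULL && sub_rank >= 0 && sub_rank < sg->size);
    return sg->proc_table[sub_rank];
}
```

Compared with a full subcommunicator, a descriptor like this carries no context id or per-communicator state, which is the maintenance cost the PR aims to avoid.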
Add a coll_attr parameter to MPIC_Recv and MPIC_Irecv so that we can enable subgroup collectives later.
Let MPIC Send/Recv routines check coll_attr for potential subgroup attributes, effectively enabling group collectives.
Enhance the macro MPIR_THREADCOMM_RANK_SIZE to check coll_attr for rank and size.
A copy of MPIR_Bcast_intra_smp (to MPIR_Bcast_intra_smp_new) that uses MPIR_Subgroup instead of subcomms.
When root is not local rank 0, instead of adding an extra intra-node send/recv or bcast, construct an inter group that includes the root process.
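One way to picture such an inter group is sketched below: pick the lowest rank on each node as the default leader, then let the root stand in as the leader of its own node. This is an illustrative assumption about the construction, with hypothetical names; the actual code in the PR may build the group differently.

```c
#include <assert.h>

/* node_ids[r] is the node index of rank r, with 0 <= node_ids[r] < num_nodes.
 * On return, leaders[n] holds the communicator rank chosen to represent node n,
 * or -1 if no rank lives on that node. The caller provides leaders[num_nodes]. */
static void ex_build_inter_group(const int *node_ids, int size, int num_nodes,
                                 int root, int *leaders)
{
    assert(size > 0 && num_nodes > 0 && root >= 0 && root < size);

    for (int n = 0; n < num_nodes; n++)
        leaders[n] = -1;

    /* Default leader: the lowest rank on each node. */
    for (int r = 0; r < size; r++) {
        int n = node_ids[r];
        assert(n >= 0 && n < num_nodes);
        if (leaders[n] == -1)
            leaders[n] = r;
    }

    /* The root represents its own node, so it can send to the other
     * leaders directly without a preliminary intra-node hop. */
    leaders[node_ids[root]] = root;
}
```

With six ranks on two nodes (`{0, 0, 0, 1, 1, 1}`) and root 4, the leaders become ranks 0 and 4, so the root is already a member of the inter-node group.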
In MPIR_nodeid_init, use MPIR_Allgather_fallback and MPIR_Bcast_fallback to avoid the complication of collective algorithm selection. Algorithm selection causes issues here because the bcast smp_new algorithm does not have a proper CVAR fallback check yet. The proper fix needs to add coll_attr to most communicator checking routines, and will need coll_attr to be universally added to all collective interfaces, including nonblocking and persistent collectives. Let's postpone that big change for now.
test:mpich/ch3/most
Superseded by #7103
Pull Request Description

Add a `coll_attr` parameter to replace the `errflag` parameter in internal collective interfaces. Make the lower 8 bits of `coll_attr` compatible with the lower 8 bits of the pt2pt `attr`, which avoids extra code to translate bits such as `errflags` when passing from collective to point-to-point. The next 8 bits are used for subgroup indexes, enabling group collectives without extra subcomms (which are expensive to maintain). In the future, we may extend `coll_attr` to pass hints such as memory alloc kinds and algorithm choices.

Add a bcast `smp_new` algorithm that is similar to bcast `smp` but uses comm subgroups instead. Because we can construct lightweight custom subgroups, we can avoid the extra local send/recv or bcast step when `root` is not one of the "node roots". Instead of the `node_roots_comm`, we can construct an `inter_group` made of the actual local roots.

NOTE: the bcast `smp_new` is covered in the collective cvar tests.

[skip warnings]
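The bit layout described above (low 8 bits for error flags, next 8 bits for a subgroup index) could be packed and unpacked roughly as follows. The macro and function names are hypothetical, and the exact encoding in the PR may differ; only the 8-bit/8-bit split comes from the description.

```c
#include <assert.h>

/* Assumed layout, following the description: bits 0-7 carry error flags
 * (shared with the pt2pt attr), bits 8-15 carry a subgroup index. */
#define EX_ATTR_ERRFLAG_MASK    0x00ff
#define EX_ATTR_SUBGROUP_SHIFT  8
#define EX_ATTR_SUBGROUP_MASK   0xff00

/* Store a subgroup index in bits 8-15 without disturbing the other bits. */
static int ex_attr_set_subgroup(int coll_attr, int subgroup_idx)
{
    assert(subgroup_idx >= 0 && subgroup_idx <= 0xff);
    return (coll_attr & ~EX_ATTR_SUBGROUP_MASK) |
           (subgroup_idx << EX_ATTR_SUBGROUP_SHIFT);
}

/* Read the subgroup index back out of bits 8-15. */
static int ex_attr_get_subgroup(int coll_attr)
{
    return (coll_attr & EX_ATTR_SUBGROUP_MASK) >> EX_ATTR_SUBGROUP_SHIFT;
}
```

Since the low 8 bits keep the same meaning in both interfaces, a collective can hand its `coll_attr` to a point-to-point call without translating the error bits, which is the saving the description refers to.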
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.