comm/cid: use ibcast to distribute result in intercomm case #2061
@ggouaillardet This should resolve the hang. Can you verify?
Thanks, will try. Btw, why do we use iallreduce?
I think I understand better now... I am confident ireduce can be used instead of iallreduce, so we only need to allocate tmpbuf on the roots. Also, the allreduce_fn functions take a count parameter, but they are only invoked with count=1, so we might want to simplify that too. I will test all of this tomorrow.
@ggouaillardet I agree. I think the count was used at some point in the past. I agree about the reduce vs allreduce. Will make that change.
I don't think that using bcast on an inter-communicator is equivalent to using allgatherv. The result of this operation is that only one group will get the data.
@bosilca We first do a reduce on the intercommunicator. This gives the leaders (rank 0 in each comm) the result from the other group. Then we exchange the info between the leaders and do the reduction. At this point both leaders have the same result. They then bcast to their local groups.
@bosilca Just to be clear the ibcast is on the intra-communicator we keep with every inter-communicator.
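The flow described in the two comments above can be sketched as a small simulation. This is not Open MPI code: `intercomm_allreduce` is a hypothetical name, each list stands in for the per-rank values of one group, and the reduction operator is assumed to be commutative and associative (as `max` and `min` are).

```python
from functools import reduce


def intercomm_allreduce(group_a, group_b, op):
    """Simulate the three-phase scheme on two lists of per-rank values."""
    # Phase 1: each group reduces its own values to its local leader (rank 0).
    leader_a = reduce(op, group_a)
    leader_b = reduce(op, group_b)

    # Phase 2: the two leaders exchange their partial results and each
    # combines them, so both leaders end up holding the same value.
    combined_a = op(leader_a, leader_b)
    combined_b = op(leader_b, leader_a)

    # Phase 3: each leader broadcasts on its local intra-communicator,
    # so every rank in its own group receives the combined value.
    return [combined_a] * len(group_a), [combined_b] * len(group_b)


if __name__ == "__main__":
    a, b = intercomm_allreduce([3, 7, 5], [4, 9], max)
    print(a, b)  # [9, 9, 9] [9, 9]
```

Because the final step is a broadcast inside each group, every rank in both groups ends up with the same value, which is the property the discussion is about.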
@hjelmn The bcast on an inter-communicator is not symmetric. Only one of the leaders (the process providing MPI_ROOT as the root) will broadcast its data to the remote group.
If the ibcast is done on an intra-communicator, then the result should be correct.
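To illustrate the asymmetry, here is a minimal simulation of the two broadcast flavors. The function names and the list-of-buffers model are assumptions made for illustration only; the inter-communicator behavior modeled is the one the MPI standard defines, where data flows from the root to the remote group only.

```python
def intercomm_bcast(local_bufs, remote_bufs, root_rank):
    """Buffers after a broadcast on an inter-communicator.

    The root (which would pass MPI_ROOT) sends its buffer to every rank of
    the remote group. The other ranks of the root's group (which would pass
    MPI_PROC_NULL) do not take part, so their buffers are unchanged.
    """
    local_after = list(local_bufs)
    remote_after = [local_bufs[root_rank]] * len(remote_bufs)
    return local_after, remote_after


def intracomm_bcast(bufs, root_rank):
    """Buffers after a broadcast on an intra-communicator: all ranks match the root."""
    return [bufs[root_rank]] * len(bufs)


if __name__ == "__main__":
    print(intercomm_bcast(["x", "a1", "a2"], ["b0", "b1"], 0))
    print(intracomm_bcast(["x", "a1", "a2"], 0))
```

In the inter-communicator case the root's own group never sees the data, which is why a single such broadcast cannot stand in for an allgather-style exchange.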
If we have an intra-communicator, then why all the pain to deal with the reduce on an inter-communicator?
@bosilca Not sure why the code was so complicated before. Once we added the c_local_comm we should have updated the algorithm to do what this PR makes it do. This is much less complicated than we made it out to be :).
@ggouaillardet Never mind about not using reduce. I forgot that I had already changed the initial reduce to be on the intra-communicator. Made the change from allreduce to reduce.
This commit updates the intercomm allgather to do a local comm bcast as the final step. This should resolve a hang seen in intercomm tests. Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Might go ahead and merge this before 9pm EDT to get it into MTT tonight.
It works great for me, thanks!
I don't see any change in MTT, though - still a ton of hangs. However, the tests that are now hanging are comm_dup.
use MPI_MIN instead of MPI_MAX when appropriate, otherwise a currently used CID can be reused, and bad things will likely happen. Refs #2061
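Why the choice of reduction operator matters can be shown with a small simulation. It assumes an agreement step in which every rank votes 1 if the candidate CID is free locally and 0 if it is already in use; the thread does not show the exact reduction the commit changed, so the voting scheme, `cid_accepted`, and the sample data are all hypothetical.

```python
def cid_accepted(candidate, used_cids_per_rank, op):
    """Return True if the ranks agree to use `candidate` under reduction `op`.

    Each rank votes 1 when the candidate CID is free locally, 0 otherwise.
    `op` plays the role of the reduction operator (min or max) on the votes.
    """
    votes = [0 if candidate in used else 1 for used in used_cids_per_rank]
    return op(votes) == 1


if __name__ == "__main__":
    # Rank 1 already uses CID 3; ranks 0 and 2 do not.
    used = [{0, 1, 2}, {0, 1, 2, 3}, {0, 1, 2}]
    print(cid_accepted(3, used, min))  # False: one "in use" vote vetoes it
    print(cid_accepted(3, used, max))  # True: the veto is lost, CID 3 is reused
    print(cid_accepted(4, used, min))  # True: free on every rank
```

With a minimum, a single rank that already uses the CID is enough to reject it; with a maximum, a single rank that sees it as free is enough to accept it, which is how an in-use CID could be handed out again.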
3b968ec fixes
Just wondering: are these some of the same problems we are seeing in 2.x? I don't know if these fixes need to go over or not.
These two issues came from the refactor, which was not brought over to v2.x.
several hangs occur with the that should be fixed by open-mpi/ompi-tests@53ad6f22e1f9ceda580a06bf8e00324a5f15c6c0
use MPI_MIN instead of MPI_MAX when appropriate, otherwise a currently used CID can be reused, and bad things will likely happen. Refs open-mpi#2061 (cherry picked from commit 3b968ec) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>