v4.2 GDS issue with singleton spawn #2705
Comments
Is it the singleton or the child that is complaining?
Oh, and what key are they looking for?
Just off the top of my head: I'd guess that the error comes from the child and that it is looking for a modex key. The problem looks to be in the dpm code - once you have spawned the
It looks like modex information (if I force
It looks like we are calling
Okay, here's the easiest solution. Once you detect that you are a singleton, push See if that works.
Yeah, that works. I'll post a PR to OMPI to make that change. It bugged me that it worked with OpenPMIx
Correct. The problem with dstore is that it is way behind in how to handle anything other than the original job info. The hash component is far more versatile. Once we get to PMIx v5, dstore will go away and we will have a shmem hash component instead - will solve many problems.
* See openpmix/openpmix#2705 Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* See openpmix/openpmix#2705 Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> (cherry picked from commit 245424c)
Background information
What version of the PMIx Reference Library are you using?
* v4.2 branch at 8cb6f58
* master branch at openpmix/prrte@8d735f5 and v3.0 branch at openpmix/prrte@b29abde
* main branch at open-mpi/ompi@5575c86 (with both "Fix Singletons and Singleton Spawn" open-mpi/ompi#10688 and "Fix the dpm to fwd IO to the spawning parent" open-mpi/ompi#10695 cherry-picked into the branch since they have not been merged in yet)
Testing note on this comment
Describe how PMIx was installed
Built with Open MPI main and manually adjusted the submodule pointers.
Please describe the system on which you are running
Details of the problem
Using PRRTE master and OpenPMIx master we have been able to get singleton MPI_Comm_spawn working. However, if we move to OpenPMIx v4.2 then it fails. See open-mpi/ompi#10688
The MPI tests I'm using are located here
I noticed that if I force the hash GDS component, it works correctly.
So this seems to only impact the singleton spawn case, and is related to the GDS component (verbose output indicates that it is using ds21).
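For anyone who wants to repeat the comparison between the hash and ds21 components, here is a minimal sketch of one way to force a GDS component for a singleton run. It assumes the component is selected through the environment variable PMIX_MCA_gds (the usual PMIX_MCA_ prefix for PMIx MCA parameters) and that the verbose output comes from PMIX_MCA_gds_base_verbose; the issue does not say exactly how either was set, so treat both as assumptions. The helper names gds_env and run_singleton, and the test binary path, are hypothetical.

```python
import os
import subprocess
import sys


def gds_env(component="hash", verbose=0, base=None):
    """Return a copy of the environment that selects one PMIx GDS component.

    Assumption: PMIx reads MCA parameters from PMIX_MCA_<name> variables.
    """
    env = dict(os.environ if base is None else base)
    env["PMIX_MCA_gds"] = component
    if verbose:
        # Assumption: framework verbosity follows the <framework>_base_verbose pattern.
        env["PMIX_MCA_gds_base_verbose"] = str(verbose)
    return env


def run_singleton(binary, component="hash", verbose=0):
    """Run the test binary directly (no mpirun/prterun) so it starts as a singleton."""
    result = subprocess.run([binary], env=gds_env(component, verbose))
    return result.returncode


if __name__ == "__main__":
    # Usage: python force_gds.py ./path/to/spawn_test [component]
    if len(sys.argv) > 1:
        chosen = sys.argv[2] if len(sys.argv) > 2 else "hash"
        sys.exit(run_singleton(sys.argv[1], component=chosen, verbose=5))
```

Running the same binary once with hash and once with ds21 should show whether the failure follows the GDS component, which is the comparison described above.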