Pass oversubscribe status to MPI layer #8998
@jsquyres I believe this resolves your question on openpmix/openpmix#2192. PMIx and PRRTE are correctly passing the oversubscribe flag to OMPI. I have made an attempt at modifying the MPI layer logic to properly handle the flag - it seems to be working, but I defer to you to verify it. Feel free to modify it as necessary.
Looks like this had little effect on mpi4py testsuite running under GitHub Actions workers. The VMs have only 2 cores, the case At this point I think I should run these tests locally setting CPU cores offline to try to reproduce.
@rhc54 I confirm what @dalcinl was seeing: I pushed a 2nd commit to this PR: I changed the logic of how If my 2nd commit is correct, let's squash it to the first commit and merge. As of 1 June 2021, the PMIX and PRTE git submodule pointers on this PR are still newer than what are on
@jsquyres It looks fine to me. I was trying to allow the user to override it if they chose to do so, though I can't imagine why someone would do that. I'll squash and re-push.
The default value is passed up by the threads module. For pthreads, the default is false and the initial value is false, so the code never used the value that came in from PMIx. Since this PR removed the MCA param (which was internal anyway), there's no exposed mechanism for the user to change this value. I think that's ok.
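The ordering discussed here (check the oversubscribe flag first, then decide whether to yield when idle) could be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: the `sketch_*` types and function are hypothetical, the PMIx query is replaced by a plain struct, and the rule that an oversubscribed job always yields is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the job-level info that would come from PMIx. */
typedef struct {
    bool have_oversubscribe_info; /* did the runtime supply the flag at all? */
    bool oversubscribed;          /* value supplied by the runtime */
} sketch_job_info_t;

/* Hypothetical stand-in for the two MPI-layer settings discussed here. */
typedef struct {
    bool oversubscribed;
    bool yield_when_idle;
} sketch_params_t;

/*
 * Step 1: resolve the oversubscribe flag from the runtime-provided info.
 * Step 2: only afterwards decide yield_when_idle, falling back to the
 *         threads-module default when the job is not oversubscribed.
 * ASSUMPTION: oversubscription forces yielding; the real rule may differ.
 */
static sketch_params_t sketch_init_params(const sketch_job_info_t *info,
                                          bool threads_default_yield)
{
    sketch_params_t p;

    p.oversubscribed = info->have_oversubscribe_info && info->oversubscribed;
    p.yield_when_idle = p.oversubscribed ? true : threads_default_yield;
    return p;
}
```

With a threads default of false (as described for pthreads), an oversubscribed job in this sketch ends up yielding when idle, while a job without the flag keeps the default.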
Update PMIx/PRRTE pointers to pass oversubscribe status to child processes. Update OMPI to check for PMIx attribute and set `ompi_mpi_oversubscribe` accordingly.

Move logic for setting yield_when_idle to a place after the oversubscribe flag has been checked.

- change logic of setting ompi_mpi_yield_when_idle
- nit: change `ompi_mpi_oversubscribe` to `ompi_mpi_oversubscribed`
- add comment in ompi/runtime/params.h

Signed-off-by: Ralph Castain <rhc@pmix.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
@dalcinl Can you try this PR again?
@rhc54 I'm unsure how to cherry pick this to the v5.0.x branch, because I think the Open MPI v5.0.x branch is tracking different PMIx / PRRTE branches than Open MPI master...?
Add the MPI part to #9026
Update PMIx/PRRTE pointers to pass oversubscribe status to child processes. Update OMPI to check for PMIx attribute and set `ompi_mpi_oversubscribe` accordingly.

Move logic for setting yield_when_idle to a place after the oversubscribe flag has been checked.

Signed-off-by: Ralph Castain <rhc@pmix.org>