developers: Doc how to build against external PMIx/PRTE #12946
base: main
Signed-off-by: Jeff Squyres <jeff@squyres.com>
I think there is some confusion about PRRTE - see comments.
MPI this way, you may need to install the package manager's
"developer" Hwloc, Libevent, OpenPMIx, and/or PRRTE packages.

1. Open MPI and PRRTE must be built against the **same** installation
Actually, not true - PMIx is designed to handle cross-version messaging. Only issue would be ensuring that the PMIx being used by PRRTE is new enough to support any OMPI-used features. Given the minimums we set, that shouldn't be a problem.
Ok, fair point here. I was really thinking about Hwloc and Libevent when I was writing this bullet. I'll update.

Open MPI, OpenPMIx, and PRRTE must all use the same Hwloc and
Libevent libraries at run time (e.g., they must all resolve to
the same run-time loadable libraries at run time).
Not totally correct - PRRTE doesn't have to use the same ones, since OMPI never loads "libprrte", so there is no potential for confusion.
By the transitive property, though, isn't this true? OMPI must use the same hwloc + libevent as PMIx, and PRRTE must use the same hwloc + libevent as PMIx, so don't they all have to be the same?
You can have two versions of PMIx installed. I don't think this is super common for a developer, but given how Slurm is packaged, it may be more common there.
PBS does it too now - in fact, they include PRRTE and PMIx, so not uncommon to see multiple installs there.

1. Open MPI, OpenPMIx, and PRRTE must all be built against the
**same** installation of Hwloc and Libevent. Meaning:
Not fully correct - PRRTE doesn't need to have the same.
You can (will) run into super weird errors if PMIx, PRTE, and/or Open MPI are using different libevents and/or hwlocs at run-time. They should be completely orthogonal, but we have definitely seen cases where they are not. I'm ok documenting it as a must even if there are some cases where it actually does work to -- for example -- have PMIx run-time link against hwloc version X and PRTE run-time link against hwloc version Y.
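For anyone wanting to check which Hwloc and Libevent a given library actually resolves to at run time, one option is to compare `ldd` output for the libraries involved. The sketch below is illustrative only: it assumes the usual glibc `ldd` line format, and the helper names (`resolved_deps`, `differing`) and all sample paths are hypothetical, not taken from this PR.

```python
import re

# Matches ldd lines such as:
#   "libhwloc.so.15 => /usr/lib64/libhwloc.so.15 (0x00007f...)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def resolved_deps(ldd_output, prefixes=("libhwloc", "libevent")):
    """Map each matching soname to the path ldd resolved it to.

    Unresolved entries ("=> not found") map to None.
    """
    deps = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if not match:
            continue
        soname, path = match.group(1), match.group(2)
        if soname.startswith(prefixes):
            deps[soname] = None if path == "not" else path
    return deps


def differing(a, b):
    """Return sonames that both sides need but resolve to different paths."""
    return {name: (a[name], b[name])
            for name in a.keys() & b.keys()
            if a[name] != b[name]}


# Hypothetical ldd output for a PMIx library and a PRRTE library.
SAMPLE_PMIX = (
    "\tlinux-vdso.so.1 (0x00007ffd0000)\n"
    "\tlibhwloc.so.15 => /usr/lib64/libhwloc.so.15 (0x00007f100000)\n"
    "\tlibevent_core-2.1.so.7 => /usr/lib64/libevent_core-2.1.so.7 (0x00007f200000)\n"
)
SAMPLE_PRTE = (
    "\tlibhwloc.so.15 => /opt/hwloc/lib/libhwloc.so.15 (0x00007f300000)\n"
    "\tlibevent_core-2.1.so.7 => /usr/lib64/libevent_core-2.1.so.7 (0x00007f400000)\n"
)

if __name__ == "__main__":
    print(differing(resolved_deps(SAMPLE_PMIX), resolved_deps(SAMPLE_PRTE)))
```

In practice the input strings would come from running `ldd` on the installed libraries; a non-empty result for a pair that is supposed to share Hwloc and Libevent is the kind of mismatch described above.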
I think this is where you are getting into trouble. Whichever PMIx PRRTE is using, that PMIx and PRRTE must use the same hwloc and libevent.
However, as I said elsewhere, there is no requirement that PRRTE and OMPI use the same PMIx. So there is no transitive property involved here - there is a complete airbreak between PRRTE and OMPI.
Yeah, so something like:
- Open MPI and the OpenPMIx library that Open MPI links against must be built against the same installation of hwloc and Libevent. It is not required that the runtime and Open MPI be built with the same version of PMIx, but the same hwloc/libevent linking rules also apply to PRRTE and its OpenPMIx library.
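To make the proposed rule concrete, here is a minimal sketch of it as a pairwise check. It assumes each component can be described by the install prefix of the Hwloc and Libevent it was built against; the `Build` type, the `PAIRS` table, the `violations` helper, and every path are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Build(NamedTuple):
    """Install prefixes a component was built against (illustrative only)."""
    hwloc: str
    libevent: str


# Only these pairs must agree. There is deliberately no pair linking
# Open MPI to PRRTE, so no transitive requirement arises.
PAIRS = [
    ("ompi", "pmix_for_ompi"),
    ("prrte", "pmix_for_prrte"),
]


def violations(builds):
    """Return (left, right, dependency) for every pair that disagrees."""
    problems = []
    for left, right in PAIRS:
        for dep in Build._fields:
            if getattr(builds[left], dep) != getattr(builds[right], dep):
                problems.append((left, right, dep))
    return problems


# Hypothetical setup: Open MPI and its PMIx come from /opt/dev, while
# PRRTE and its PMIx come from a package manager under /usr.
builds = {
    "ompi": Build(hwloc="/opt/dev", libevent="/opt/dev"),
    "pmix_for_ompi": Build(hwloc="/opt/dev", libevent="/opt/dev"),
    "prrte": Build(hwloc="/usr", libevent="/usr"),
    "pmix_for_prrte": Build(hwloc="/usr", libevent="/usr"),
}

if __name__ == "__main__":
    print(violations(builds))
```

Under this reading, the setup above passes even though the Open MPI side and the PRRTE side use different Hwloc and Libevent installations; a violation appears only when a component and the PMIx it links against disagree.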

manager. Assuming that the package-manager installs of OpenPMIx
and PRRTE were built against the package-manager-provided Hwloc
and Libevent, then Open MPI will *also* need to be built against
the package-manager-provided Hwloc and Libevent. To build Open
Again, true for PMIx but not for PRRTE
This has been on my to-do list for a while.
Reviewers: you can read the rendered version of this PR here: https://ompi--12946.org.readthedocs.build/en/12946/developers/building-open-mpi.html#building-against-external-openpmix-prrte