Fix finding "prte" for singleton comm_spawn #12363
Is there a way to restart the NVIDIA CI when it fails to deploy? |
cd57e0a to 397f9f1
After looking at PRRTE code, it looks like the |
You are overthinking this - just leave it be, please. It is correct and will work correctly as written. |
ompi/dpm/dpm.c
Outdated
@@ -1974,6 +1975,46 @@ static void set_handler_default(int sig)
    sigaction(sig, &act, (struct sigaction *)0);
}

static char *find_prte(void) |
We already have a function doing the exact same thing in ompi/tools/mpirun/main.c. Instead of duplicating code, maybe we should unify the two and have a consistent approach.
I will look forward to your PR. This seemed a rather trivial duplication for a very limited purpose.
I have mixed feelings about this solution. Basically, I think that ignoring |
Ummm...that isn't actually true. Ignoring |
Correct, |
I'm happy to simply withdraw the PR - it will leave singleton comm-spawn broken when the installation is relocated, but I guess you folks can figure out how you want to resolve that problem (or not). |
We need to discuss this seriously and come up with a sane approach (maybe not allowing singletons). |
I'm happy to leave this here for your consideration. If it is of any help, I believe it was Brian who pointed out one time that the Standard doesn't say you have to support singleton comm_spawn in the absence of a supporting RTE. If so, it could be sufficient for comm_spawn to return an error message indicating that a supporting RTE wasn't found, perhaps suggesting that the user set up a PRRTE DVM and try again. |
And then all the Python folks using mpi4py.futures within an IPython notebook running in the browser will hit my inbox asking why the thing does not work. @bosilca What if the OPAL_PREFIX to PMIX/PRRTE translation business is done at MPI_Init time for the singleton init case? At least the thing will happen consistently and irrespective of whether you use spawn or not. Of course, modifying a process's environment mid-flight (i.e., at MPI_Init time) is maybe not great design, but IMHO that's better than just not supporting a feature or leaving it broken as it is now. |
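To make the proposal above concrete, here is a minimal sketch of what translating OPAL_PREFIX into the PMIx and PRRTE prefixes could look like. It is not the Open MPI implementation: the helper name `propagate_opal_prefix` and the `force` switch are hypothetical, and the target variable names `PMIX_PREFIX` and `PRTE_PREFIX` are assumed from the "PRTE/PMIX_PREFIX" wording later in this thread. The `force` flag is there only to show the two behaviors being debated: forcibly overriding an existing value versus leaving a user-provided value alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not OMPI code: copy the value of OPAL_PREFIX into
 * PMIX_PREFIX and PRTE_PREFIX (variable names assumed). With force == false,
 * a value the user already set is left alone. Returns 0 on success or when
 * OPAL_PREFIX is unset, -1 on allocation or setenv failure. */
static int propagate_opal_prefix(bool force)
{
    static const char *const targets[] = { "PMIX_PREFIX", "PRTE_PREFIX" };
    const char *env = getenv("OPAL_PREFIX");
    int rc = 0;

    if (NULL == env || '\0' == env[0]) {
        return 0; /* nothing to propagate */
    }
    /* Copy first: setenv() may invalidate the pointer getenv() returned. */
    char *prefix = strdup(env);
    if (NULL == prefix) {
        return -1;
    }
    for (size_t i = 0; i < sizeof(targets) / sizeof(targets[0]); ++i) {
        if (0 != setenv(targets[i], prefix, force ? 1 : 0)) {
            rc = -1;
            break;
        }
        /* Whether or not we overwrote it, the variable is now defined. */
        assert(NULL != getenv(targets[i]));
    }
    free(prefix);
    return rc;
}
```

Calling `propagate_opal_prefix(false)` early in initialization would fill in only the missing variables, while `propagate_opal_prefix(true)` corresponds to the forced override that some participants are uneasy about.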
Hmmm...I don't think that will help, I'm afraid. However, I'm puzzling over the actual impact of setting Which means that the application process is getting the exact same behavior as the singleton. They both only see Perhaps even more interesting is that PRRTE doesn't even look for |
Actually, I think I have to correct myself here - had to look further back in the codepath. For OMPI v5, the key is that code which translates Sadly, if using an external PRRTE, |
So I got curious and did some sleuthing, and here is what is going on. In order for a singleton to start, it has to find In that instance, then this code is doing all the right things. PRRTE and PMIx will likewise use The only case where there is a potential problem is when (a) there are multiple OMPI installations on the system, and (b) So perhaps the correct solution to this problem is to have an internal check of |
In my particular use case, I'm not setting
Except that being able to override the internal PMIX/PRTE with an external install may be a useful feature, allowing full reuse of pre-built binaries. However, perhaps at that point it is better to just override the whole thing, including |
In my last comment, I forgot to say that, with my limited knowledge of ompi internals, I agree with Ralph's observations that the changes in this PR are good. However, I'm still thinking whether forcibly overriding PRTE/PMIX_PREFIX with |
I think we are getting a tad confused here, so let's separate out the two concerns. One: how to ensure that the MPI library is self-consistent. Whether you manually set Two: what to do about singleton comm-spawn's need to find So what the OMPI folks need to do is figure out how they want to deal with the first concern. |
Sorry, but I have to insist, although this time with a question.
Would singleton init still work as expected, despite the fact that the tool's basename is not |
Then they set it incorrectly, and it will fail. One does need to follow the documentation and do things correctly - it is impossible to deal with every user error. Setting it this way would cause Nobody wants to expose the details of OMPI's internals to the user. We have a documented procedure that works - let's just stick to it, please? I'd rather not continue the debate. |
For you OMPI folks: please note that the direct launchers out there (e.g. On the plus side, all the launched procs will see the same conflict. 🤷♂️ |
After discussion at the OMPI RM meeting today, it was decided that this PR fixes the identified issue and really doesn't create any new ones. The expressed concerns over Accordingly, this should be good to go. |
I disagree, this is absolutely not the same thing. The issue we have today is screwing ALL processes, while this patch will only screw up all newly spawned processes creating a distributed run with potentially multiple versions of OMPI. |
I invite you to walk thru the logic of that statement and explain to the OMPI community how either situation results in a correctly operating application. Whether the error causes the entire job to collapse, or only causes the interaction between the singleton and its children to fail, it still fails. You are free to argue this at tomorrow's devel meeting - I have no further opinion on the matter. |
When it affects the entire job, the entire user application will be using a different MPI library than expected, but it will not fail, or at least I do not see any obvious reason for this to fail, as they will all use the same wire protocols. Yes, it is possible to load components from a different installation, but as long as they are binary compatible (which we ensure in the MCA loading scheme) all is good. When it affects only the spawnees, it will create a heterogeneous setup where the singleton and the rest of the processes will use a different MPI library. Not components, but a different MPI library. This is more than unexpected and can lead to weird errors as the wire protocols might differ. Now, if the user knows what she's doing and the LD_LIBRARY_PATH (or the rpath) and the OPAL_PREFIX match, all will be good in all cases. These users don't need protections, the others do. |
The feeling was that the documentation explains this in detail, and has been sufficient for nearly 20 years. What you appear to be advocating for is a change in that position by adding internal detection that the prefix and library path are in conflict. I don't have any opinion on that, but it is beyond the scope of this PR. I'm only trying to fix the filed issue where we don't find |
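For readers wondering what "internal detection that the prefix and library path are in conflict" might involve, the following is a deliberately naive sketch, not anything proposed in this PR. The function name `libpath_mentions_prefix` is hypothetical. It only checks whether some colon-separated entry of a library search path lies under a given prefix; a real check would presumably also have to account for rpath and for where the loaded library actually resides.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not OMPI code: return true if any colon-separated
 * entry of `libpath` equals `prefix` or lies in a directory below it.
 * Trailing slashes on `prefix` are ignored. */
static bool libpath_mentions_prefix(const char *prefix, const char *libpath)
{
    assert(NULL != prefix && NULL != libpath);

    size_t plen = strlen(prefix);
    while (plen > 1 && '/' == prefix[plen - 1]) {
        --plen; /* "/opt/ompi/" behaves like "/opt/ompi" */
    }
    if (0 == plen) {
        return false; /* an empty prefix matches nothing */
    }

    const char *entry = libpath;
    while ('\0' != *entry) {
        size_t elen = strcspn(entry, ":");
        /* Match on a path-component boundary so that "/opt/ompi" does not
         * match "/opt/ompi-4.1/lib". */
        if (elen >= plen && 0 == strncmp(entry, prefix, plen) &&
            (elen == plen || '/' == entry[plen] || '/' == prefix[plen - 1])) {
            return true;
        }
        entry += elen;
        if (':' == *entry) {
            ++entry;
        }
    }
    return false;
}
```

Under these assumptions, a warning could be raised when OPAL_PREFIX is set but `libpath_mentions_prefix(getenv("OPAL_PREFIX"), getenv("LD_LIBRARY_PATH"))` is false, after checking both pointers for NULL. Whether such a warning is desirable is exactly the open question in this exchange.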
The "old" method relies on PMIx publish/lookup followed by a call to PMIx_Connect. It then does a "next cid" method to get the next communicator ID, which has multiple algorithms including one that calls PMIx_Group.

Simplify this by using PMIx_Group_construct in place of PMIx_Connect, and have it return the next communicator ID. This is more scalable and reliable than the prior method.

Retain the "old" method for now as this is new code. Create a new MCA param "OMPI_MCA_dpm_enable_new_method" to switch between the two approaches. Default it to "true" for now for ease of debugging.

NOTE: this includes an update to the submodule pointers for PMIx and PRRTE to obtain the required updates to those code bases.

Signed-off-by: Ralph Castain <rhc@pmix.org>
Oops - accidentally overwrote this branch. Oh well - nobody seemed inclined to take it anyway. |
@rhc54 Any chance you can recover it? IIUC, the objection was mostly about forcibly setting |
Yeah, I had it lying around in another place, so easy to recover it. Still, if it doesn't get committed in the next week, I'll probably kill it anyway. I really dislike letting PRs sit around for months - or in some cases, years. |
Reopen it for consideration - will remove it permanently on Mar 11 if not committed. |
Guess I can't reopen it because it was recreated - so I opened #12390 as a replacement |
Copy the search code from mpirun to use when starting "prte" in support of comm_spawn. This respects things like OPAL_PREFIX.
Refs #12349
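The commit message above describes a prefix-aware search for the "prte" executable. The actual `find_prte` body is not shown in this thread, so the following is only a sketch of the general technique under stated assumptions: if a prefix (for example the value of OPAL_PREFIX) is given, look in `<prefix>/bin`; otherwise walk a PATH-style list. The names `find_tool` and `try_dir` are hypothetical, and the choice not to fall back to PATH when a prefix is given is an assumption of this sketch, not something the thread confirms.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return a malloc'd "<dir>/<name>" if that path passes access(X_OK),
 * else NULL. `dir` need not be NUL-terminated; `dirlen` gives its length. */
static char *try_dir(const char *dir, size_t dirlen, const char *name)
{
    size_t len = dirlen + 1 + strlen(name) + 1;
    char *path = malloc(len);
    if (NULL == path) {
        return NULL;
    }
    snprintf(path, len, "%.*s/%s", (int) dirlen, dir, name);
    if (0 == access(path, X_OK)) {
        return path;
    }
    free(path);
    return NULL;
}

/* Hypothetical search, not the PR's code. If `prefix` is non-empty, only
 * "<prefix>/bin/<name>" is tried. Otherwise each entry of `path_env` is
 * tried in order. The caller frees the result. */
static char *find_tool(const char *name, const char *prefix, const char *path_env)
{
    assert(NULL != name);

    if (NULL != prefix && '\0' != prefix[0]) {
        size_t len = strlen(prefix) + strlen("/bin") + 1;
        char *bindir = malloc(len);
        if (NULL == bindir) {
            return NULL;
        }
        snprintf(bindir, len, "%s/bin", prefix);
        char *found = try_dir(bindir, strlen(bindir), name);
        free(bindir);
        return found; /* assumption: no fallback to PATH when a prefix is set */
    }
    if (NULL == path_env) {
        return NULL;
    }
    const char *entry = path_env;
    while ('\0' != *entry) {
        size_t elen = strcspn(entry, ":");
        if (elen > 0) { /* empty entries (current directory) are skipped here */
            char *found = try_dir(entry, elen, name);
            if (NULL != found) {
                return found;
            }
        }
        entry += elen;
        if (':' == *entry) {
            ++entry;
        }
    }
    return NULL;
}
```

A caller in the spirit of this PR would use something like `find_tool("prte", getenv("OPAL_PREFIX"), getenv("PATH"))` and report an error if the result is NULL. Taking the prefix and path as parameters rather than reading the environment inside the function keeps the sketch easy to test.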