-
Notifications
You must be signed in to change notification settings - Fork 878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing dlopen() the MPI shared library with RTLD_LOCAL #3705
Comments
It looks like mca_patcher_overwrite.so is really looking for libopen-pal, so perhaps we should add the dependency to THAT rather than libmpi which has the sideeffect of loading libopen-pal. |
Well, yes, of course. Ideally, each plugin should link to the exact libs they use. But IIUC, some other components do depend on libmpi. As I don'k now well the plugin dependencies, I abused a little and linked libmpi to all plugins. This is not the cleanest way of doing it, but certainly the easiest to maintain as start add new plugins. Please note the output I pasted is just the fist line. Fixing just mca_patcher_overwrite.so will not solve the problems, you will get an error later, at the point some other plugin is loaded. If you guys decide to fix this issue in any way (either linking libmpi to all plugins, or linking the exact lib each plugin depends on), I can contribute a pure Python script with no external dependencies to be added to your test suite to prevent from any regressions in the future. |
@dalcinl I have added this to our discussion topics for our July developer's conference. We seem to recall there was a reason we didn't do the linkage, but maybe we can provide a configure flag to make your life easier. |
@rhc54 This is not about making my life easier, but the life of all users that typically have a pre-installed Open MPI they don't have control on. If the configure flag is not on by default, installers of Open MPI will likely miss to turn on the flag, and this issue will continue harming end users ad eternum. In general, I see two possible ways to fix things:
I would really focus on option (2) and try to understand under which scenarios it would break. Moreover, experienced users and sysadmins have a chance to hotfix things by using ELF or Mach-O binary editing tools like |
I believe you misunderstood me. I was not minimizing your concern, but only echoing a conversation we had on the weekly telecon where I raised this issue. My point was that there was a reason for not doing the linkage, and so we have to be careful here that we don't break other things. Hence, I added it to the conference agenda to ensure we give adequate consideration to the problem before deciding on a path forward. |
I just did a little spelunking to dig into the history of this a bit (as @rhc54 noted, we talked about this on Tuesday at our weekly webex, where @bwbarrett and I felt sure that we made it so that components do not link against their respective project libraries a while ago -- so I dug into the past to find out why we did this). Here's what I found:
|
After working thru that page, I'm wondering if we can extract a combination that works. If we are in a Linux environment and are not building static, then it appears (if I read the tables correctly) that linking the components to their base library might work and resolve the issue. As I said, we can ponder this more at the conference. |
@jsquyres From your tables, in the second one, entry 15, result (I), is this "other linkers will not work" just speculation or you actually have a concrete example about a platform where things are broken? BTW, I'm almost sure this sould also work on macOS, though I did not actually tried (do you want me to try it and confirm?) |
@dalcinl Yeah, I noticed that same phrase again this morning when I was re-reading that table. I had two thoughts about it:
|
We talked about this in detail yesterday at the face to face meeting in Chicago. The prevailing thought was: we don't remember the system(s) that were a problem back in Oct 2010. Doh! We updated https://github.com/open-mpi/ompi/wiki/Linkers yesterday to say:
So the thought was that we should (re)write the test(s) that were used to generate the tables on that wiki page (e.g., main() calls a library function in a .so that dlopen's a DSO and then calls something in the DSO, and we do some checks to make sure that the shared library doesn't exist in the process memory more than once). Do this with and without linking the DSO to the .so. And then re-run this on all the platforms that we care about today, and see if it's a problem anywhere. I uploaded a first cut of this test to ompi-tests/simple/dlopen. Prelim results:
|
Wanted to figure out which platform caused problems in the past. Other platforms, please test and update. |
Per 1 Aug webex: I asked @siegmargross to run on his various flavors of Solaris/Linux on x86/SPARC. @bwbarrett will be testing on all the variety of platforms available at Amazon. |
I tested on Amazon Linux 17.03 (newish kernel), FreeBSD 11, and NetBSD 7.0, and all three worked. |
@shamisp confirms it works on two different Linux/ARM platforms. |
Works on ppc64le on Red Hat Enterprise 7.3 |
Works on
|
@siegmargross confirms:
|
In off-issue discussion (i.e., email), @kawashima-fj mentions the following is necessary to make it work with their compilers (I just want to capture all this data in one place):
For the moment, it may be sufficient to document this somewhere. @ggouaillardet didn't think it was worth adding additional logic into our |
@hppritcha Points out that if we link the Java bindings against lib OMPI / ORTE / OPAL (i.e., whichever is necessary), the hinkyness of ompi/mpi/java/c/mpi_MPI.c |
@jsquyres I mentioned Java when I opened this issue, and indeed the Java issues should be fixed. Linking |
How much work would it be for someone well versed in configure to add the -lmpi to the MPC components to make this "just work"? That's all that's needed correct? |
I have a branch with some changes that is passing the posted test. Let me clean it up a bit this morning and I'll post a PR for review today. I might need help cleaning up the Java bindings (I just haven't looked at what needs to happen there yet). |
@jjhursey I think you just need to fix Can you point me to your branch? I would like to give it a try with mpi4py. |
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Different project levels link to different sets of libraries by using the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`. ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la \ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la ```
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Different project levels link to different sets of libraries by using the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`. ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la \ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la ``` Note: The changes in this commit were automated. Some components were not included because they are staticly built only.
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Different project levels link to different sets of libraries by using the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`. ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la \ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la \ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la ``` Note: The changes in this commit were automated. Some components were not included because they are staticly built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
PR #4121 merged into master. Re-opening so we can consider it for v3.0 release |
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> Local application of the LIBADD script: `./contrib/libadd_mca_comp_update.py` Reference `master` commit e1d0795
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> (cherry picked from commit 49c40f0) Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Resolves open-mpi#3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> Local application of the LIBADD script: `./contrib/libadd_mca_comp_update.py` Reference `master` commit e1d0795
* See discussion on Issue open-mpi#3705 regarding why this is no longer needed. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> (cherry picked from commit 49c40f0) Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
@jsquyres @hppritcha Is this something that you would consider for the v2.x series? If so then I can create a PR there. Otherwise we can just work on getting it in the v3.0.x series. |
I think going with v3.0.x should be fine. |
This is now done in v3.0.x and forward. |
So -mpi not required for nrniv launch (but -nobanner useful) Can use with python without NEURON_INIT_MPI enviornment variable. Dynamic loading of libnrnmpi.so on linux does not need LD_LIBRARY_PATH Note that on linux, openmpi prior to version 3 but not configured with --with-paranrn=dynamic will produce an error with pc.mpi_init() when python was launched that begins with: mca_base_component_repository_open: unable to open mca_patcher_overwrite: folowed by lots of output. Solution is to get openmpi 3 or above or follow the patch instructions for openmpi at open-mpi/ompi#3705
It is not needed anymore since open-mpi/ompi#3705. Related: #3812, where we document this trick.
The lack of support for dlopen()ing the MPI shared library within a local namespace has been a recurrent issue for implementors of MPI bindings in dynamic languages like R, Julia, Python. Even the Java JNI bindings you guys distribute have to resort to the hack of re-dlopening the library with
RTLD_GLOBAL
before the invocation ofMPI_Init()
. I complained about this ages ago, it was never fixed, and eventually I rolled my own re-delopen hack for mpi4py. However, I really HATE it because is quite fragile. For example, macOS changed at some point the rules. Also, in Linux distros, if the user does not install theopenmpi-devel
package, the symlinklibmpi.so -> libmpi.so.<version>
is not available, then the code implementing the hackery has to be updated from time to time every time the library version is bumped. BTW, you Java JNI bindings are broken because of this, you should notdlopen("libmpi.so",...)
, you have todlopen("libmpi.so.20",...)
in Linux to not depend on theopenmpi-devel
package at runtime.As a reproducer, you have this piece of Python code.
Running it of course fails:
$ python ompi-dlopen.py [kw14821:30046] mca_base_component_repository_open: unable to open mca_patcher_overwrite: /home/devel/mpi/openmpi/2.1.1/lib/openmpi/mca_patcher_overwrite.so: undefined symbol: mca_patcher_base_patch_t_class (ignored) ... lots of output ... [kw14821:30046] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
The root problem is that all the
*.so
files in$prefix/lib/openmpi/
do not explicitly link to the MPI library. The following script hot-fix the issues usingpatchelf
(version 0.9 required):Please note the special rpath I'm adding
$ORIGIN\..
, for macOS it should be@loader_path/..
and the script should be based ininstall_name_tool
.After running the shell script above and hot-fixing the plugins, the Python script runs just fine:
In short, I think fixing the issue once and for all (at least for Linux, macOS, and Solaris) is to link the plugins in
$prefix/lib/openmpi
with-L$BUILDDIR/lib -lmpi -Wl,-rpath,\$ORIGIN/..
in Linux/Solaris and-L$BUILDDIR/lib -lmpi -Wl,-rpath,@loader_path/..
in macOS.Unfortunately, I cannot offer a patch, I'm not an expert on autotools and I have no idea how to implement these changes. However, I can offer my help to review changes and test them in Linux and macOS.
The text was updated successfully, but these errors were encountered: