Open MPI MCA Param file not applied. #7737
Fixed by #7744 |
Hi @rhc54, maybe this is a newbie question, but |
It went into what is now the |
Thank you @rhc54. Will this be backported to v4.1? The default OpenMPI on Rocky Linux 8 and 9 is v4.1, and I used the following workaround. It is not ideal, though, especially when we want to add or modify settings system-wide. How do you apply system-wide settings on RHEL 8/9 equivalent systems? I understand this is not the place for Q&A, though. for new users
existing user
|
It doesn't look like it can be cleanly backported, but if it's broken it probably should be fixed. I don't know if it will be, though, as the v4.1.x series has slowed down a lot. @jsquyres @bwbarrett are the RMs for v4.1.x |
Thank you @wckzhang! It would be ideal if this were fixed upstream rather than us applying a patch ourselves. |
@panda1100 Hi, I want to clarify the behavior you're seeing before investigating this issue. What problem are you seeing? Are the user/system-level MCA params not being read from the files on the 4.1.x branch? What commit are you working off of? |
Thank you @wckzhang! We are on https://www.open-mpi.org/software/ompi/v4.1/downloads/openmpi-4.1.1.tar.bz2 , inside https://download.rockylinux.org/vault/rocky/9.0/AppStream/source/tree/Packages/o/openmpi-4.1.1-5.el9.src.rpm . The problem I faced is The background context here is that I faced this issue when I used OpenMPI and Apptainer (an HPC-focused container solution formerly known as Singularity) like this |
What was your configure cmd line? Suspect your system default parameter file location isn't where OMPI expects it |
Thank you @rhc54, Let me check. |
@rhc54 This is what we used to build OpenMPI.
and actual
https://docs.fedoraproject.org/en-US/packaging-guidelines/RPMMacros/ |
@rhc54 https://github.com/open-mpi/ompi/pull/7744/files uses
|
I installed the HEAD of the v4.1.x branch, added an MCA param to the default param file, and it worked fine. I therefore expect that the problem lies in your use of those hieroglyphics to set the install directory for the default param file. One way to check: an example |
Thank you @rhc54 for your support. Our package uses v4.1.1; I will test against other 4.1.x point releases and get back to you. Thank you again for your cooperation. |
@rhc54 I checked
|
Nothing particularly special:
$ ./configure --prefix=<foo>
$ make install
--- edit <foo>/etc/openmpi-mca-params.conf ---
$ mpirun -n 1 ./hello |
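The edit step in the sequence above can be made concrete with a short shell sketch. It assumes the parameter being set is `pml = ob1` (the value from the original report) and uses a throwaway directory as a stand-in for `<foo>`. `DEMO_PREFIX` is a hypothetical variable name, and none of this is claimed to be what was actually run.

```shell
# Stand-in for <foo>; DEMO_PREFIX is a hypothetical override, and the
# temporary directory keeps the sketch harmless to run without an install.
prefix="${DEMO_PREFIX:-$(mktemp -d)}"
conf="$prefix/etc/openmpi-mca-params.conf"

# Create the etc directory in case no real install exists under $prefix.
mkdir -p "$prefix/etc"

# Param files hold one "name = value" pair per line; '#' starts a comment.
printf '%s\n' '# force the ob1 PML (illustrative setting)' 'pml = ob1' >> "$conf"

# Show the non-comment assignments the file now contains.
grep -v '^[[:space:]]*#' "$conf"
```

With a real install, pointing `DEMO_PREFIX` at the configured prefix before running the sketch would append the setting to the file that the build is expected to read.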
@rhc54 @wckzhang The root cause in my environment is that the host OpenMPI and the OpenMPI inside the container were built with different prefixes (sysconfdir). In that case, My apologies for the false alarm, and thank you both for your kind support. |
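A mismatch like the one described here can be checked by comparing the sysconfdir that each build reports. The following is a minimal sketch: `same_param_file` is a hypothetical helper, the two example directories are made up, and the commented commands assume that `ompi_info --path sysconfdir` is available in both builds and that the container is entered with `apptainer exec` (the image name is a placeholder).

```shell
# Hypothetical helper: given the sysconfdir of two Open MPI builds, report
# whether both would look for the system-wide param file in the same place.
same_param_file() {
    host_conf="${1%/}/openmpi-mca-params.conf"   # strip one trailing slash
    cont_conf="${2%/}/openmpi-mca-params.conf"
    echo "host:      $host_conf"
    echo "container: $cont_conf"
    [ "$host_conf" = "$cont_conf" ]
}

# Assumed way to obtain the two directories (not run here):
#   ompi_info --path sysconfdir
#   apptainer exec image.sif ompi_info --path sysconfdir
# The paths below are illustrative only.
if same_param_file /etc/openmpi-x86_64 /usr/local/etc; then
    echo "both builds expect the same default param file"
else
    echo "the builds expect different default param files"
fi
```

When the two paths differ, a setting has to go into the file that the Open MPI build actually running the job expects.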
Changing MCA parameters through openmpi-mca-params.conf does not work (no effect). It shows in ompi_info, though.
Details of the problem
$installdir/etc/openmpi-mca-params.conf
file changes are not applied to MPI processes. The file has been modified so that
$installdir/bin/ompi_info
reports such changes. Running
$installdir/bin/mpirun -n 3 --omca pml_base_verbose=10 hello
reports
The PML ucx is selected, showing that pml=ob1 has been ignored.
Note how pml_base_verbose is passed on the command line, as otherwise it would have no effect either.
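Besides the command line, Open MPI also reads MCA parameters from environment variables named `OMPI_MCA_<name>`. A minimal sketch of setting the two parameters from this report that way follows. `set_mca_env` is a hypothetical helper, and the thread does not say whether this path is affected by the problem described here.

```shell
# Hypothetical helper: export an MCA parameter as an environment variable.
set_mca_env() {
    # $1 = MCA parameter name, $2 = value
    export "OMPI_MCA_$1=$2"
}

set_mca_env pml ob1
set_mca_env pml_base_verbose 10

# List the MCA-related variables now present in the environment.
env | grep '^OMPI_MCA_'

# A subsequent launch would then inherit them, e.g.:
#   $installdir/bin/mpirun -n 3 hello
```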
Configuration info
master #0dc23252
git clone
Edit