CMake MPI preset: update logic and extend documentation #1300
368dc83 to a1cf1c5
On ALPS Daint (an excerpt of the full list, just for reference)
cscs-ci run
cmake/DLAF_AddTest.cmake
Outdated
"Flag used by MPI to specify the number of cores per rank for mpiexec. If not empty, you have to specify also the number of cores available per node in MPIEXEC_NUMCORES_PER_RANK."
FORCE
)
set(MPIEXEC_NUMCORES_PER_RANK "1" CACHE STRING "Number of cores used by each MPI rank.")
Is this a good default?
4af1881 to 1b8f898
cscs-ci run
@albestro you may already have noticed, but in case not: something in the cmake config does not behave as before for CI: https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4700071344751697/7514005670787789/-/jobs/9430728485.
Co-authored-by: Rocco Meli <rocco.meli@cscs.ch>
Co-authored-by: Mikael Simberg <mikael.simberg@iki.fi>
0de4aaa to aaccfa3
cscs-ci run
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1300   +/-   ##
=======================================
  Coverage   95.07%   95.07%
=======================================
  Files         141      141
  Lines        8654     8654
  Branches     1110     1110
=======================================
  Hits         8228     8228
  Misses        239      239
  Partials      187      187
@albestro just because it's difficult to see from the pushed changes, what was the issue in CI? What did you change to fix it?
I think I got bitten by CMake lists, since I forgot to quote docstrings. The commit failing in CI was 3bbb283, so we can look at the state of the code at that point. In particular, the problem was that this docstring was defined as a "list" of strings (for formatting convenience) instead of a simple string: DLA-Future/cmake/DLAF_AddTest.cmake, lines 15 to 18 in 3bbb283.
It was then used unquoted: DLA-Future/cmake/DLAF_AddTest.cmake, line 56 in 3bbb283.
As a result, CMake set the variable to a list instead of picking the CACHE version of the command, so as far as the check is concerned the variable could not be considered empty. A curiosity: the check would not have triggered if both variables in the check had a docstring hard-coded as a list of strings instead of a simple string (that wasn't the case, because the other one didn't need multiple strings for formatting convenience). I fixed it by quoting the docstrings, so that CMake turns the list into a space-separated string (as we need).
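The pitfall described above can be reproduced with a small sketch. The variable names (DOCSTRING_AS_LIST, EXAMPLE_FLAG_UNQUOTED, EXAMPLE_FLAG_QUOTED) and the docstring text are hypothetical and not taken from DLAF_AddTest.cmake; the comments reflect my understanding of how set() matches its CACHE signature.

```cmake
# Hypothetical docstring, split into two strings for formatting convenience.
# set() stores the two strings as a list.
set(DOCSTRING_AS_LIST
    "First half of the docstring."
    "Second half of the docstring."
)

# Unquoted: the list expands into two separate arguments, so the call no
# longer matches set(<var> <value> CACHE <type> <docstring> [FORCE]).
# A normal variable is set instead, and its value is a non-empty list.
set(EXAMPLE_FLAG_UNQUOTED "" CACHE STRING ${DOCSTRING_AS_LIST} FORCE)

# Quoted: the docstring is passed as a single argument, the CACHE
# signature matches, and the value stays empty.
set(EXAMPLE_FLAG_QUOTED "" CACHE STRING "${DOCSTRING_AS_LIST}" FORCE)

if(NOT "${EXAMPLE_FLAG_UNQUOTED}" STREQUAL "")
  message(STATUS "unquoted variant is NOT empty: '${EXAMPLE_FLAG_UNQUOTED}'")
endif()
if("${EXAMPLE_FLAG_QUOTED}" STREQUAL "")
  message(STATUS "quoted variant is empty")
endif()
```

Printing both variables this way is a quick check when an "is this flag empty?" test in a CMake script starts behaving unexpectedly.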
Changelog:
- Replace MPIEXEC_NUMCORES with MPIEXEC_NUMCORES_PER_RANK (see details below)
- MPIEXEC_MAX_NUMPROCS is considered just for plain-mpi
- MPIEXEC_MAX_NUMPROCS/MPIEXEC_NUMCORES_PER_RANK usage
- Move dlaf_setup_mpi_preset to the DLAF_AddTest cmake script (it seems a more appropriate place)

Depending on the DLAF_MPI_PRESET we have two different behaviours.
| | plain-mpi | slurm/custom |
|---|---|---|
| command | --pika:bind=none --pika:threads=N | srun -n MPIRANKS ${MPIEXEC_NUMCORE_FLAG}=N |
| N before | MPIEXEC_MAX_NUMPROCS/MPIRANKS | MPIEXEC_NUMCORES/MPIRANKS |
| N after | MPIEXEC_MAX_NUMPROCS/MPIRANKS | MPIEXEC_NUMCORES_PER_RANK |

The behaviour will change just for the slurm/custom presets.

Some notes/reasons for the change are:
- MPIEXEC_MAX_NUMPROCS is used just for plain-mpi and not in other presets. I'm not sure how helpful it would be to set it for other presets too, or what kind of additional check we could add (e.g. some kind of upper bound check on the configuration).
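As a rough illustration of the two behaviours described in this PR, here is a minimal sketch. It assumes that plain-mpi is the preset that passes the pika flags and that slurm/custom use srun, which is my reading of the description rather than the actual DLAF_AddTest.cmake code. All values, including the one given to MPIEXEC_NUMCORE_FLAG, are hypothetical.

```cmake
# Hypothetical values, for illustration only.
set(DLAF_MPI_PRESET "slurm")
set(MPIEXEC_MAX_NUMPROCS 8)                  # considered only for plain-mpi
set(MPIEXEC_NUMCORE_FLAG "--cpus-per-task")  # assumed value of the flag
set(MPIEXEC_NUMCORES_PER_RANK 2)
set(MPIRANKS 4)

if(DLAF_MPI_PRESET STREQUAL "plain-mpi")
  # Threads per rank: integer division of the available processes by the ranks.
  math(EXPR N "${MPIEXEC_MAX_NUMPROCS} / ${MPIRANKS}")
  message(STATUS "test args: --pika:bind=none --pika:threads=${N}")
else()
  # slurm/custom: the number of cores per rank is given directly.
  set(N "${MPIEXEC_NUMCORES_PER_RANK}")
  message(STATUS "launcher: srun -n ${MPIRANKS} ${MPIEXEC_NUMCORE_FLAG}=${N}")
endif()
```

Changing DLAF_MPI_PRESET to "plain-mpi" switches to the first branch, where N is derived from MPIEXEC_MAX_NUMPROCS instead of being read from MPIEXEC_NUMCORES_PER_RANK.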