Add MPI HALOEXCHANGE kernels #332
Conversation
Change corresponding pack and unpack extents to be on opposite sides of the domain so they make sense with periodic boundary conditions. Also refactor extent generation to be easier to understand.
Only Base_Seq is implemented
Add support for separate buffers, where for example you pack into one buffer, copy that to the MPI buffer, and then call MPI, in addition to a single-buffer mode, where you pack into the one buffer and then call MPI directly (see the sketch after these notes).
Note that this is incorrect for periodic boundaries
Remove all variants of MPI_HALOEXCHANGE when the number of ranks is not a perfect cube of an integer. Restore periodic boundaries.
Note that this may result in ranks communicating mostly with themselves if the number of MPI ranks does not have at least 3 prime factors.
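A minimal sketch of the single-buffer vs. separate-buffer packing modes mentioned above (illustrative names and signatures, not the RAJAPerf implementation):

```cpp
// Sketch only: contrasts the single-buffer mode (the packed buffer is handed
// to MPI directly) with the separate-buffer mode (packed data is first copied
// into a dedicated MPI buffer, standing in for e.g. a device-to-host copy).
#include <mpi.h>
#include <cstring>
#include <vector>

// Gather the halo values selected by 'indices' from 'var' into 'buf'.
void pack(double* buf, const double* var, const int* indices, int len)
{
  for (int i = 0; i < len; ++i) {
    buf[i] = var[indices[i]];
  }
}

// Pack one halo and start the send. The caller must wait on 'req' before
// reusing the buffers.
void send_halo(const double* var, const int* indices, int len,
               int dest, MPI_Comm comm, bool separate_buffers,
               std::vector<double>& pack_buf, std::vector<double>& mpi_buf,
               MPI_Request& req)
{
  pack_buf.resize(len);
  pack(pack_buf.data(), var, indices, len);

  double* send_ptr = pack_buf.data();
  if (separate_buffers) {
    // Separate-buffer mode: copy the packed data into the buffer that is
    // actually given to MPI.
    mpi_buf.resize(len);
    std::memcpy(mpi_buf.data(), pack_buf.data(), len * sizeof(double));
    send_ptr = mpi_buf.data();
  }
  // Single-buffer mode falls through and sends the packed buffer directly.
  MPI_Isend(send_ptr, len, MPI_DOUBLE, dest, 0, comm, &req);
}
```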
Do we do any MPI builds in the CI?
Not yet. We could add in this PR or wait until CI stuff is synced up between projects. We're in the process of moving to a new Spack. Then, we can switch over to newer compiler versions.
To keep things simple let's worry about it in a different PR.
Force-pushed from d3700a0 to 2b5c5a9
@adrienbernede @rhornung67 Do either of you know if other projects with MPI tests actually enable MPI in CI testing?
@MrBurmark Serac, MFEM, Umpire do. Now, Umpire does not test with MPI on every machine...
Ok, I'll try adding MPI tests on every machine.
Move ruby mpi CI job into list of extras
@adrienbernede I am seeing failures on lassen due to what appears to be invalid arguments to mpirun. Have you seen this before?
@MrBurmark, yes, it means that MPIEXEC_NUMPROC_FLAG is not set correctly...
Thanks for working through the Caliper instrumentation, all looks great now.
* Allow controlled overlapping in flux allocations
* Update to RAJA@develop (grab improved MPI test handling in CI)
* From RAJA: From RSC: fix MPIEXEC flag with spectrum-mpi
* Fix typo
* From RAJA: From RSC: Import the full logic instead of overriding
* From RAJA: From RSC: Fix syntax
* From RAJA: From RSC: Fix missing import
* Remove extra layer of allocation nesting on lassen
* Passing flag to sub allocation
* Attempt to fix quotes usage
* Move to RADIUSS Shared CI 2023.12.0 release branch
* Simplify build_and_test script further, remove alloc option
* From RAJA: From RSC: Do not change flag for spectrum-mpi (fixed error elsewhere) but change it for cray-mpich
* From RAJA: From RSC: Fix typo
* From RAJA: From RSC: Fix syntax
@MrBurmark @rhornung67 I solved both the lassen and tioga issues. The CI should pass once corona is back.
echo "~~~~~~~~~ Run Command: ~~~~~~~~~~~~~~~~~~~~~" | ||
echo "ctest --output-on-failure -T test 2>&1 | tee tests_output.txt" | ||
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" | ||
date | ||
ctest --output-on-failure -T test 2>&1 | tee tests_output.txt | ||
date |
@MrBurmark I don’t know what --smpiargs="-disable_gpu_hooks" is used for. Tests are running without it. Attempts to define it in lalloc failed.
That is something you need on lassen if you're not running through lrun/mpi.
I'm working on the merge conflicts and I'll push an updated branch shortly.
…AJAPerf into feature/burmark1/MPI_HALOEXCHANGE
… corona, remove qos on ruby
I’m finally seeing green!
…PI_HALOEXCHANGE
# Conflicts:
#	src/comm/HALO_PACKING_FUSED-Cuda.cpp
#	src/comm/HALO_PACKING_FUSED-Hip.cpp
Force-pushed from b7de33a to ae949e0
Summary
Add halo-exchange kernels with MPI communication.
This PR adds halo-exchange communication kernels with actual MPI communication.
It includes a version of the HALOEXCHANGE kernel and the HALOEXCHANGE_FUSED kernel.
This is a simple version of what is done in Comb.
The MPI_HALOEXCHANGE kernels do the same packing and unpacking as the non-MPI HALOEXCHANGE kernels, and they actually communicate the data via MPI. I made some changes to the non-MPI kernels to ensure consistency with the MPI kernels.
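For context, a minimal sketch of a halo exchange of this shape (illustrative only, not the RAJAPerf code; it assumes neighboring ranks enumerate their shared halos in a consistent order, so a single message tag suffices):

```cpp
#include <mpi.h>
#include <cstddef>
#include <vector>

// One halo of a structured block: which interior cells to send and which
// ghost cells to fill, plus the neighboring rank on the other side.
struct Halo {
  std::vector<int> pack_idx;
  std::vector<int> unpack_idx;
  int neighbor;
};

void halo_exchange(std::vector<double>& var,
                   const std::vector<Halo>& halos, MPI_Comm comm)
{
  const int n = static_cast<int>(halos.size());
  std::vector<std::vector<double>> send_bufs(n), recv_bufs(n);
  std::vector<MPI_Request> send_reqs(n), recv_reqs(n);

  // Post all receives first.
  for (int h = 0; h < n; ++h) {
    recv_bufs[h].resize(halos[h].unpack_idx.size());
    MPI_Irecv(recv_bufs[h].data(), static_cast<int>(recv_bufs[h].size()),
              MPI_DOUBLE, halos[h].neighbor, 0, comm, &recv_reqs[h]);
  }

  // Pack and send each halo.
  for (int h = 0; h < n; ++h) {
    send_bufs[h].resize(halos[h].pack_idx.size());
    for (std::size_t i = 0; i < halos[h].pack_idx.size(); ++i) {
      send_bufs[h][i] = var[halos[h].pack_idx[i]];
    }
    MPI_Isend(send_bufs[h].data(), static_cast<int>(send_bufs[h].size()),
              MPI_DOUBLE, halos[h].neighbor, 0, comm, &send_reqs[h]);
  }

  // Unpack each message as it arrives.
  for (int done = 0; done < n; ++done) {
    int h;
    MPI_Waitany(n, recv_reqs.data(), &h, MPI_STATUS_IGNORE);
    for (std::size_t i = 0; i < halos[h].unpack_idx.size(); ++i) {
      var[halos[h].unpack_idx[i]] = recv_bufs[h][i];
    }
  }
  MPI_Waitall(n, send_reqs.data(), MPI_STATUSES_IGNORE);
}
```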
In all of the HALOEXCHANGE kernels each rank owns an NxNxN structured block. In the MPI versions the number of ranks is factored and the blocks of the MPI processes are connected into a larger 3D block. The number of processes on each side of this block can be set manually via --mpi_3d_division I J K. By default a division is created by factoring the number of MPI processes and trying to make a relatively square grid. If the number of MPI processes has only a single prime factor, the grid will look like a pencil, and if it has only two prime factors it will look like a flat plane. Care should be taken when choosing the number of MPI processes and the MPI 3D division to ensure a realistic distribution of messages.
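A sketch of how such a default division could be computed, assuming a simple greedy assignment of prime factors (the actual RAJAPerf logic may differ):

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Build a roughly cubic I x J x K division of 'num_ranks' by collecting its
// prime factors and assigning each factor, largest first, to the currently
// smallest dimension.
std::array<int, 3> make_3d_division(int num_ranks)
{
  std::vector<int> factors;
  int n = num_ranks;
  for (int f = 2; f * f <= n; ++f) {
    while (n % f == 0) { factors.push_back(f); n /= f; }
  }
  if (n > 1) { factors.push_back(n); }

  std::sort(factors.rbegin(), factors.rend());  // largest factors first

  std::array<int, 3> div = {1, 1, 1};
  for (int f : factors) {
    // Grow the smallest dimension so the grid stays as close to square as possible.
    *std::min_element(div.begin(), div.end()) *= f;
  }
  return div;  // div[0] * div[1] * div[2] == num_ranks
}
```

With 12 ranks this yields a 3x2x2 division; a prime rank count such as 7 degenerates to 7x1x1 (a pencil), and a rank count with only two prime factors such as 6 gives 3x2x1 (a plane), matching the caveats above.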