
MPI #141

Open · wants to merge 28 commits into main from mpi
Conversation

@valeriaRaffuzzi (Member) commented Nov 15, 2024

Finally! I think this is ready to go.

This PR includes MPI support. The main changes are in the tallies and in the dungeon, for population normalisation between cycles and for load balancing (following Paul Romano's PhD thesis).

In the tallies, one key change is that a new report, closeCycle, was added alongside reportCycleEnd. With MPI, there are two options: mpiSync 1 means that the tallies are synchronised every cycle; mpiSync 0 means that the tallies from each rank are combined only at the end of the calculation. All calculations give identical results with mpiSync 1; they agree within statistics with mpiSync 0. NOTE that this option applies to all tallies included in a tally admin rather than to individual clerks. Splitting reportCycleEnd into two procedures (i.e., adding closeCycle) makes reproducibility easier to achieve for most tallyClerks.
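For reference, a minimal sketch of how this could look in a tally admin dictionary (the clerk definition is purely illustrative; only the mpiSync keyword comes from this PR):

    tally {
      mpiSync 1;
      keff { type keffAnalogClerk; }
    }

Here mpiSync 1 requests per-cycle synchronisation, while mpiSync 0 defers combining the per-rank tallies until the end of the calculation.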

In the dungeon, population normalisation was implemented using a new data structure, heapQueue. Note that, to ensure reproducibility, particles have to be sorted before and after sampling. Load balancing is then performed by transferring particles between processes.
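As a rough illustration of the kind of structure involved (a sketch only, not the heapQueue in this PR; the procedure name, the choice of a max-heap, and the use of SCONE's defReal/shortInt kinds are assumptions):

    !! Minimal max-heap push: keeps the largest key at index 1
    !! Assumes keys(1:n) already satisfies the heap property and has spare capacity
    subroutine heapPush(keys, n, val)
      real(defReal), dimension(:), intent(inout) :: keys
      integer(shortInt), intent(inout)           :: n
      real(defReal), intent(in)                  :: val
      integer(shortInt) :: child, parent
      real(defReal)     :: temp

      n = n + 1
      keys(n) = val
      child = n

      ! Sift the new entry up until its parent is not smaller
      do while (child > 1)
        parent = child / 2
        if (keys(parent) >= keys(child)) exit
        temp         = keys(parent)
        keys(parent) = keys(child)
        keys(child)  = temp
        child        = parent
      end do

    end subroutine heapPush

A heap like this makes it cheap to repeatedly pick out the extreme key, which is useful both for sampling without replacement and for processing particles in a deterministic order.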

Results seem to be reproducible for all tallies, and all the tests pass successfully.

@Mikolaj-A-Kowalski However, the GitHub tests seem to fail during compilation because the MPI library is not available. Do you know if there's an easy solution to this? Otherwise I'll have a look.

Mikolaj-A-Kowalski and others added 26 commits September 30, 2024 12:11
Determine the maximum value of the key without implicit assumptions.
We were reusing the first few random numbers following the source generation
(the state of the RNG was the same at the beginning of sampling the source
particle and at its first flight). This commit moves the RNG back before
the source generation is performed, thus preventing the reuse.
No communication takes place at this stage.
Will be reproducible if fixes to source_init are merged.
Allows limiting the console I/O to the master process only. Applied
to fixed source calculations only at the moment.
It will be used for sampling without replacement.
Results from non-master processes are not combined, hence they are lost
at the moment.
Fixes a bug for synchronised scoreMemory. The buffer value after
transfer in parallelBin was not properly set to 0 again.
It is not reproducible at the moment
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
MPI_DOUBLE, mpi_reduce, MPI_SUM
Collaborator

Suggested change
MPI_DOUBLE, mpi_reduce, MPI_SUM
MPI_DOUBLE, MPI_REDUCE, MPI_SUM

@@ -277,22 +308,29 @@ subroutine cycles(self, tally, tallyAtch, N_cycles)
end_T = real(N_cycles,defReal) * elapsed_T / i
T_toEnd = max(ZERO, end_T - elapsed_T)

#ifdef MPI
! Print the population numbers referred to all processes to screen
call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
Collaborator

Suggested change
call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
call MPI_REDUCE(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)

printSectionEnd, printSeparatorLine
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
Collaborator

Are we consistent with how we write MPI commands? Should they all be in all caps or not? If they should, then we should have

Suggested change
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
use mpi_func, only : MASTER_RANK, MPI_BCAST, MPI_INT, MPI_COMM_WORLD, &

! Print the population numbers referred to all processes to screen
call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
nStart = nTemp
call mpi_reduce(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
Collaborator

Suggested change
call mpi_reduce(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
call MPI_REDUCE(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)

@@ -262,6 +288,11 @@ subroutine cycles(self, tally, tallyAtch, N_cycles)

end select

#ifdef MPI
! Broadcast k_eff obtained in the master to all processes
call MPI_Bcast(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)
Collaborator

Suggested change
call MPI_Bcast(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)
call MPI_BCAST(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)


! Broadcast seed to all processes
#ifdef MPI
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
Collaborator

Suggested change
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
call MPI_BCAST(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)

printSeparatorLine
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD
Collaborator

Suggested change
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD
use mpi_func, only : MASTER_RANK, MPI_BCAST, MPI_INT, MPI_COMM_WORLD


! Broadcast seed to all processes
#ifdef MPI
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
Collaborator

Suggested change
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
call MPI_BCAST(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)

@@ -9,6 +9,7 @@ module errors_mod

use numPrecision
use universalVariables, only : MAX_COL
!use mpi_func, only : getMPIRank
Collaborator

Delete this?

@@ -36,7 +37,9 @@ subroutine fatalError(where, why)

! Upper frame
write(error_unit, *) repeat('<>', MAX_COL / 2)

! #ifdef MPI
Collaborator

Delete?

!! It is necessary for load balancing in the dungeon: particles have to be
!! transferred between processes, and MPI does not allow transferring types with
!! type-bound procedures
type, public :: particleStateDummy
Collaborator

This is so painful :( Are there no alternatives around this? It feels like a nasty bug waiting to happen!

Member Author

I don't think so. We need to pass particle states between ranks. The alternative is to pass individual properties, but I don't see how that's less painful.
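For what it's worth, a plain type like this can at least be registered with MPI once and then sent/received whole. A minimal sketch of that approach, with purely illustrative field names (this is not the particleStateDummy definition or the transfer code from this PR):

    module particleStateMPI_sketch
      use mpi
      use numPrecision, only : defReal, shortInt
      implicit none

      !! Illustrative stand-in for particleStateDummy: data only, no type-bound procedures
      type :: stateDummy
        real(defReal)     :: r(3), dir(3), E, wgt
        integer(shortInt) :: matIdx
      end type stateDummy

    contains

      !! Build and commit an MPI datatype matching stateDummy, so whole states
      !! can be transferred between ranks during load balancing
      subroutine createStateType(stateType)
        integer, intent(out)      :: stateType
        integer                   :: blockLen(2), types(2), ierr
        integer(MPI_ADDRESS_KIND) :: disp(2), base
        type(stateDummy)          :: sample

        blockLen = [8, 1]                              ! 8 reals (assumed contiguous), then 1 integer
        types    = [MPI_DOUBLE_PRECISION, MPI_INTEGER]

        call MPI_Get_address(sample % r,      disp(1), ierr)
        call MPI_Get_address(sample % matIdx, disp(2), ierr)
        base = disp(1)
        disp = disp - base                             ! displacements relative to the start of the type

        call MPI_Type_create_struct(2, blockLen, disp, types, stateType, ierr)
        call MPI_Type_commit(stateType, ierr)

      end subroutine createStateType

    end module particleStateMPI_sketch

The dummy type still has to be kept in sync with particleState by hand, though, so the concern about a bug waiting to happen stands.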

#ifdef MPI
integer(shortInt) :: ierr

call mpi_init(ierr)
Collaborator

As before, I think we need to be consistent with how we call MPI functions. Maybe I have missed something.

Member Author

We should. I am not sure if @Mikolaj-A-Kowalski had a plan when starting MPI. We could use full caps for the built-in MPI functions, like MPI_REDUCE and MPI_BCAST, and normal SCONE capitalisation for the SCONE procedures inside mpi_func.

Collaborator

Yep... massive delay here.
I am generally fully on team small_letters with regard to all function calls. Capitals should just indicate constants etc.
I would just argue that ALL CAPS is a bit of a legacy thing from the 80s. We should just treat MPI procedures like we do the intrinsic procedures and keep them lowercase.

Member Author

I think all the procedures should now be in lowercase. The types, however, are all caps, as are the MPI built-in constants like MPI_COMM_WORLD. I could make it all lowercase if we prefer; that's not a problem.

After installation of all the dependencies and the compilation of SCONE,
one can finally run the code. The executable ``scone.out`` is in the ``build``
folder; one can run it from any folder by specifying the path to the executable
and to the input file::
Collaborator

Perhaps it would be good to clearly describe the interaction between OpenMP and MPI?
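For example, something along these lines (the rank and thread counts are arbitrary, the input path is a placeholder, and this assumes the thread count is controlled through OMP_NUM_THREADS):

    OMP_NUM_THREADS=8 mpirun -np 4 ./build/scone.out ./path/to/inputFile

i.e., each of the 4 MPI processes runs 8 OpenMP threads, so together they use 32 cores.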

@ChasingNeutrons (Collaborator)

To get the tests passing, I think it's a case of including MPI installation in the docker image? mikolajkowalski/scone-test
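If so, something like the following in that image's Dockerfile would probably be enough (package names assume a Debian/Ubuntu base image; I haven't checked the actual scone-test Dockerfile):

    RUN apt-get update && \
        apt-get install -y --no-install-recommends openmpi-bin libopenmpi-dev && \
        rm -rf /var/lib/apt/lists/*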

@valeriaRaffuzzi force-pushed the mpi branch 2 times, most recently from a978577 to 3fb146e on December 17, 2024 at 14:49
@valeriaRaffuzzi (Member Author)

After a long time spent fighting with the compiler... it is still winning. The problem is related to the Docker container, though, rather than to SCONE. OMPI doesn't like it when we try to run the tests as the root user in the container (this refers to the unitTests). I managed to force it with gfortran9 and gfortran10 (which, as you can see, run OK). However, I couldn't find a way to get gfortran8 to agree to run!
The latest is that I commented out gfortran8 to make sure that everything else was working correctly, which it is. If we want to reinstate gfortran8, I welcome ideas!

@valeriaRaffuzzi (Member Author)

Another thing to note is that in the particleDungeon, the previous sampling-without-replacement algorithm now lives in a new procedure called samplingWithoutReplacement, and it isn't used.
I didn't want to delete it completely because it's nice code, and it might be useful in other contexts in the future. If you both agree, however, I can delete it, since it is now redundant.

@Mikolaj-A-Kowalski (Collaborator)

> After a long time spent fighting with the compiler... it is still winning. The problem is related to the Docker container, though, rather than to SCONE. OMPI doesn't like it when we try to run the tests as the root user in the container (this refers to the unitTests). I managed to force it with gfortran9 and gfortran10 (which, as you can see, run OK). However, I couldn't find a way to get gfortran8 to agree to run! The latest is that I commented out gfortran8 to make sure that everything else was working correctly, which it is. If we want to reinstate gfortran8, I welcome ideas!

The reason here is most probably that the gfortran-8 image is based on the older Debian (buster), which has a version of OpenMPI (3.1) from before they added the environment-variables option in 4.0.

In general we should probably push up the compiler versions for CI and maybe drop gfortran-8 :-/
I would prefer it done in a separate PR for clarity, and the newer compilers should be added in that same PR.

If we want to keep gfortran-8 we could just try MPICH?
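(For reference, I believe the 4.0+ option in question is the pair of environment variables that let mpirun run as root, which the CI job or Dockerfile could set:

    export OMPI_ALLOW_RUN_AS_ROOT=1
    export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1

whereas with OpenMPI 3.1 one would have to pass --allow-run-as-root on every mpirun command line.)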

@Mikolaj-A-Kowalski (Collaborator)

> Another thing to note is that in the particleDungeon, the previous sampling-without-replacement algorithm now lives in a new procedure called samplingWithoutReplacement, and it isn't used. I didn't want to delete it completely because it's nice code, and it might be useful in other contexts in the future. If you both agree, however, I can delete it, since it is now redundant.

No reason to keep dead code. It will be preserved in the Git history if anyone ever wants to inspect it.

@valeriaRaffuzzi (Member Author)

> The reason here is most probably that the gfortran-8 image is based on the older Debian (buster), which has a version of OpenMPI (3.1) from before they added the environment-variables option in 4.0.
>
> In general we should probably push up the compiler versions for CI and maybe drop gfortran-8 :-/ I would prefer it done in a separate PR for clarity, and the newer compilers should be added in that same PR.
>
> If we want to keep gfortran-8 we could just try MPICH?

Yes, that makes sense about the older OMPI version (annoying!). I agree about dropping gfortran-8 and adding newer versions (all of this in a separate PR). But in that case the problem remains that I can't get the tests to pass in this PR... I will surely manage with more work, but I wonder if it's worth spending time on this.
We could try MPICH, but just to then drop it in the next PR?

@Mikolaj-A-Kowalski (Collaborator)

> > The reason here is most probably that the gfortran-8 image is based on the older Debian (buster), which has a version of OpenMPI (3.1) from before they added the environment-variables option in 4.0.
> > In general we should probably push up the compiler versions for CI and maybe drop gfortran-8 :-/ I would prefer it done in a separate PR for clarity, and the newer compilers should be added in that same PR.
> > If we want to keep gfortran-8 we could just try MPICH?
>
> Yes, that makes sense about the older OMPI version (annoying!). I agree about dropping gfortran-8 and adding newer versions (all of this in a separate PR). But in that case the problem remains that I can't get the tests to pass in this PR... I will surely manage with more work, but I wonder if it's worth spending time on this. We could try MPICH, but just to then drop it in the next PR?

We can just make the new PR quick (yes... I know) and then rebase this one on main with the new compilers once it is merged. This one can be left without gfortran-8 for now.

@valeriaRaffuzzi marked this pull request as draft on December 23, 2024 at 14:03
@valeriaRaffuzzi marked this pull request as ready for review on December 23, 2024 at 15:07