MPI #141
Conversation
Determine the maximum value of the key without implicit assumptions.
We were reusing the first few random numbers following source generation (the state of the RNG was the same at the beginning of sampling the source particle and at its first flight). This commit moves the RNG back before the source generation is performed, thus preventing the reuse.
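A self-contained toy illustration of the problem, using the Fortran intrinsic RNG rather than SCONE's generator (names and mechanism here are illustrative only):

program rng_reuse_demo
  implicit none
  integer :: n
  integer, allocatable :: seed(:)
  real :: src(3), flight(3)

  call random_seed(size=n)
  allocate(seed(n), source=42)

  ! Position the generator and sample the "source"...
  call random_seed(put=seed)
  call random_number(src)
  ! ...then start the "first flight" from the same RNG state:
  call random_seed(put=seed)
  call random_number(flight)

  print *, all(src == flight)   ! T : the two stages share random numbers
end program rng_reuse_demo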
No communication takes place at this stage.
Will be reproducible once the fixes to source_init are merged.
Allows limiting console I/O to the master process only. At the moment this is applied only to fixed source calculations.
It will be used for sampling without replacement.
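For context, this is the classic bounded max-heap trick for sampling without replacement: assign each history a random key and keep only the N smallest; the surviving indices are a uniform sample of N without replacement. A self-contained toy sketch of the idea (not SCONE's actual heapQueue implementation):

program heap_sample_demo
  implicit none
  integer, parameter :: Ntotal = 10, Nkeep = 4
  real    :: heap(Nkeep), key
  integer :: idx(Nkeep), n, i

  ! Keep the Nkeep smallest of Ntotal random keys in a bounded max-heap
  n = 0
  do i = 1, Ntotal
    call random_number(key)
    if (n < Nkeep) then              ! heap not yet full: push
      n = n + 1
      heap(n) = key
      idx(n)  = i
      call siftUp(n)
    else if (key < heap(1)) then     ! smaller than current maximum: replace root
      heap(1) = key
      idx(1)  = i
      call siftDown(n)
    end if
  end do
  print *, 'kept histories:', idx

contains

  subroutine siftUp(pos)
    integer, intent(in) :: pos
    integer :: c, p
    c = pos
    do while (c > 1)
      p = c / 2
      if (heap(p) >= heap(c)) exit   ! max-heap property restored
      call swapNodes(p, c)
      c = p
    end do
  end subroutine siftUp

  subroutine siftDown(m)
    integer, intent(in) :: m
    integer :: p, c
    p = 1
    do
      c = 2 * p
      if (c > m) exit
      if (c < m) then
        if (heap(c+1) > heap(c)) c = c + 1   ! pick the larger child
      end if
      if (heap(p) >= heap(c)) exit
      call swapNodes(p, c)
      p = c
    end do
  end subroutine siftDown

  subroutine swapNodes(a, b)
    integer, intent(in) :: a, b
    real    :: tr
    integer :: ti
    tr = heap(a); heap(a) = heap(b); heap(b) = tr
    ti = idx(a);  idx(a)  = idx(b);  idx(b)  = ti
  end subroutine swapNodes

end program heap_sample_demo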
Results from non-master processes are not yet combined, so they are currently lost.
Fixes a bug in synchronised scoreMemory: the buffer value in parallelBin was not properly reset to 0 after transfer.
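A sketch of the corrected pattern; the names parallelBins, bins, total and the indexing are assumed for illustration, not the actual scoreMemory internals:

#ifdef MPI
  ! Combine the per-rank buffer on the master...
  call mpi_reduce(self % parallelBins(i), total, 1, MPI_DOUBLE, MPI_SUM, &
                  MASTER_RANK, MPI_COMM_WORLD, error)
  if (isMPIMaster()) self % bins(i) = self % bins(i) + total
  ! ...and ALWAYS zero the local buffer, so the next transfer starts clean
  self % parallelBins(i) = ZERO
#endif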
It is not reproducible at the moment.
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
                     MPI_DOUBLE, mpi_reduce, MPI_SUM
Suggested change:
  - MPI_DOUBLE, mpi_reduce, MPI_SUM
  + MPI_DOUBLE, MPI_REDUCE, MPI_SUM
@@ -277,22 +308,29 @@ subroutine cycles(self, tally, tallyAtch, N_cycles)
      end_T = real(N_cycles,defReal) * elapsed_T / i
      T_toEnd = max(ZERO, end_T - elapsed_T)

#ifdef MPI
      ! Print the population numbers referred to all processes to screen
      call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
Suggested change:
  - call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
  + call MPI_REDUCE(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
                     printSectionEnd, printSeparatorLine
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
Are we consistent with how we write MPI commands? Should they all be all caps or not? If they should, then we should have:

Suggested change:
  - use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
  + use mpi_func, only : MASTER_RANK, MPI_BCAST, MPI_INT, MPI_COMM_WORLD, &
      ! Print the population numbers referred to all processes to screen
      call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
      nStart = nTemp
      call mpi_reduce(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
Suggested change:
  - call mpi_reduce(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
  + call MPI_REDUCE(nEnd, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
@@ -262,6 +288,11 @@ subroutine cycles(self, tally, tallyAtch, N_cycles)

      end select

#ifdef MPI
      ! Broadcast k_eff obtained in the master to all processes
      call MPI_Bcast(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)
Suggested change:
  - call MPI_Bcast(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)
  + call MPI_BCAST(k_new, 1, MPI_DOUBLE, MASTER_RANK, MPI_COMM_WORLD)
! Broadcast seed to all processes
#ifdef MPI
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
Suggested change:
  - call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
  + call MPI_BCAST(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
                     printSeparatorLine
use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD
Suggested change:
  - use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD
  + use mpi_func, only : MASTER_RANK, MPI_BCAST, MPI_INT, MPI_COMM_WORLD
! Broadcast seed to all processes
#ifdef MPI
call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
Suggested change:
  - call MPI_Bcast(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
  + call MPI_BCAST(seed_temp, 1, MPI_INT, MASTER_RANK, MPI_COMM_WORLD)
SharedModules/errors_mod.f90
Outdated
@@ -9,6 +9,7 @@ module errors_mod

  use numPrecision
  use universalVariables, only : MAX_COL
  !use mpi_func, only : getMPIRank
Delete this?
SharedModules/errors_mod.f90
Outdated
@@ -36,7 +37,9 @@ subroutine fatalError(where, why)

  ! Upper frame
  write(error_unit, *) repeat('<>', MAX_COL / 2)

  ! #ifdef MPI
Delete?
SharedModules/mpi_func.f90
Outdated
!! It is necessary for load balancing in the dungeon: particles have to be
!! transferred between processes, and MPI doesn't allow transferring types with
!! type-bound procedures
type, public :: particleStateDummy
This is so painful :( Are there no alternatives around this? It feels like a nasty bug waiting to happen!
I don't think so. We need to pass particle states between ranks. The alternative is to pass individual properties but I don't see how that's less painful.
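For reference, the usual workaround is to describe the plain type to MPI once and then ship whole arrays of it. A hypothetical sketch, assuming `use mpi` (or the mpi_f08 equivalents); the field list and kinds are illustrative, not the actual particleStateDummy:

type :: particleStateDummy
  real(defReal)     :: wgt, E
  real(defReal)     :: r(3), dir(3)
  integer(shortInt) :: G
end type

type(particleStateDummy)  :: sample
integer                   :: stateType, ierr
integer                   :: blockLen(2), types(2)
integer(MPI_ADDRESS_KIND) :: disp(2)

! Describe the memory layout: 8 reals followed by 1 integer
call MPI_Get_address(sample % wgt, disp(1), ierr)
call MPI_Get_address(sample % G,   disp(2), ierr)
disp = disp - disp(1)
blockLen = [8, 1]
types    = [MPI_DOUBLE_PRECISION, MPI_INTEGER]
call MPI_Type_create_struct(2, blockLen, disp, types, stateType, ierr)
call MPI_Type_commit(stateType, ierr)
! stateType can now be used in MPI_Send/MPI_Recv of particleStateDummy arrays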
#ifdef MPI
    integer(shortInt) :: ierr

    call mpi_init(ierr)
As before, I think we need to be consistent with how we call MPI functions. Maybe I have missed something.
We should. I am not sure if @Mikolaj-A-Kowalski had a plan when starting MPI. We could use full caps for the built-in MPI functions, like MPI_REDUCE and MPI_BCAST, and normal SCONE capitalisation for the SCONE procedures inside mpi_func.
Yep... massive delay here.
I am generally fully in the team small_letters with regards to all function calls. Capitals should just indicate constants etc.
I would just argue that the ALL CAPS is a bit of a legacy thing from the 80s. We should just treat MPI procedures like we do the intrinsic procedures and keep them small.
I think all the procedures should now be in small letters. The types however are all caps, as well as MPI inbuilt stuff like MPI_COMM_WORLD. I could make it all small if we prefer, that's not a problem.
After installation of all the dependencies and the compilation of SCONE,
one can finally run the code. The executable ``scone.out`` is in the ``build``
folder; one can run it from any folder by specifying the path to the executable
and to the input file::
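For example, assuming the repository root as the working directory and an input file called ``inputFile`` (both paths are placeholders)::

    ./build/scone.out ./inputFile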
Perhaps it would be good to clearly describe the interaction between OpenMP and MPI?
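E.g. something along these lines (hypothetical; OMP_NUM_THREADS and mpirun -np are the standard OpenMP and MPI controls, and the paths are placeholders):

    OMP_NUM_THREADS=8 mpirun -np 4 ./build/scone.out ./inputFile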
To get the tests passing, I think it's a case of including an MPI installation in the Docker image? mikolajkowalski/scone-test
Force-pushed from a978577 to 3fb146e.
After a long time spent fighting with the compiler... it is still winning. The problem is related to the Docker container though, rather than to SCONE. OMPI doesn't like it when we try to run the tests as the root user in the container (this refers to the unitTests). I managed to force it with gfortran9 and gfortran10 (which, as you can see, run ok). However I couldn't find a way to get gfortran8 to agree to run!
Another thing to notice is that in the particleDungeon, the previous sampling w/o replacement algorithm is in a new procedure called samplingWithoutReplacement, and it isn't used.
The reason here is most probably the older OMPI version in that image. In general we should probably push up the compiler versions for CI and maybe drop gfortran-8 :-/ If we want to keep gfortran-8 we could just try MPICH?
No reason to keep dead code. It will be preserved in the Git history if anyone ever wants to inspect it.
Yes that makes sense, about the older OMPI version (annoying!). I agree about dropping gfortran 8 and adding newer versions (all this in a separate PR). But in this case, the problem remains that I can't get the tests to pass in this PR... I will surely manage with more work, but I wonder if it's worth spending time on this.
We can just make the new PR quick (yes... I know) and then rebase this one on the updated main.
Finally! I think this is ready to go.
This PR adds MPI support. The main changes are in the tallies, and in the dungeon for population normalisation between cycles and load balancing (following the algorithm from Paul Romano's PhD thesis).
In the tallies, one key change is that a new report, closeCycle, was added in addition to reportCycleEnd. With MPI, there are two options: mpiSync 1 means that the tallies are synchronised every cycle; mpiSync 0 means that the tallies from each rank are collected only at the end of the calculation. All calculations give identical results with mpiSync 1; they agree within statistics with mpiSync 0. NOTE that this option applies to all tallies included in a tally admin, rather than to individual clerks. Splitting reportCycleEnd into two procedures (i.e., adding closeCycle) makes reproducibility easier for most tallyClerks.
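For illustration, a hypothetical input fragment (the exact keyword placement is assumed from the description above, and clerk definitions are elided):

tally {
  mpiSync 1;
  ...
}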
In the dungeon, population normalisation was implemented using a new data structure, heapQueue. Note that to ensure reproducibility, particles have to be sorted before and after sampling. Load balancing is then performed by transferring particles between processes.
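A rough sketch of the balancing step (getWorkshare and getOffset are the mpi_func utilities imported in the diffs above; their signatures and the surrounding variable names are assumptions):

#ifdef MPI
  ! Global population and this rank's ideal slice of it
  call mpi_allreduce(nLocal, nGlobal, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD, error)
  targetSize  = getWorkshare(nGlobal)
  targetStart = getOffset(nGlobal)

  ! Inclusive prefix sum locates this rank's particles in the global ordering
  call mpi_scan(nLocal, myEnd, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD, error)
  myStart = myEnd - nLocal

  ! Particles falling outside [targetStart, targetStart + targetSize) are
  ! packed into particleStateDummy buffers and sent to the rank that owns
  ! that slice of the global population
#endif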
Results seem to be reproducible for all tallies, and all the tests pass successfully.
@Mikolaj-A-Kowalski However the GitHub tests seem to fail during compilation because the MPI library is not available. Do you know if there's an easy solution to this? Otherwise I'll have a look.