Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Particle tiling and OpenMP threading #551

Merged
merged 41 commits into from
Jul 27, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
d8d53ff
first tiling
MaxThevenet Jul 2, 2021
6e64c28
particle tiling for shape factor 0, both deposition and FG+PP
MaxThevenet Jul 2, 2021
4dbac90
little cleaning
MaxThevenet Jul 2, 2021
8f37c12
add a few missing files
MaxThevenet Jul 2, 2021
fa58841
instrument sorting functions
MaxThevenet Jul 2, 2021
f1b0f3a
minor cleaning and indentation
MaxThevenet Jul 2, 2021
8e3c141
first draft to fix race conditions in deposition
MaxThevenet Jul 2, 2021
d3cda2f
deposit in tmp arrays to avoid race conditions
MaxThevenet Jul 3, 2021
debec1b
a few fixed in alloc and guard cells
MaxThevenet Jul 3, 2021
a4aca6a
setval 0 each tile
MaxThevenet Jul 3, 2021
d9da854
attempt of cleaning, it broke the code
MaxThevenet Jul 3, 2021
668989d
fix error in number of tiles
MaxThevenet Jul 3, 2021
7e44f1f
minor typo in doc
MaxThevenet Jul 3, 2021
a83443a
revert spurious change
MaxThevenet Jul 3, 2021
04a9d20
typo in doc
MaxThevenet Jul 3, 2021
a039040
typo
MaxThevenet Jul 3, 2021
9b84773
revert spurious changes in input files
MaxThevenet Jul 3, 2021
3a7393f
better assert
MaxThevenet Jul 3, 2021
1550db5
eol
MaxThevenet Jul 3, 2021
0c7812e
a few more sorting for PC
MaxThevenet Jul 5, 2021
f7be859
fix code without openMP
MaxThevenet Jul 21, 2021
c100ae7
most tests pass with do_tiling = 1
MaxThevenet Jul 21, 2021
9b35e12
all tests pass with do_tiling, except slice_IO.1Rank which has a weir…
MaxThevenet Jul 21, 2021
ee382ba
attempt to fix ionization
MaxThevenet Jul 22, 2021
696fb8e
merge dev and fix conflicts
MaxThevenet Jul 22, 2021
364e7af
eol
MaxThevenet Jul 22, 2021
8e63276
fix compilation warnings
MaxThevenet Jul 22, 2021
a9c81e2
temporarily more verbose CI
MaxThevenet Jul 22, 2021
67dbbe5
avoid looping over empty vect
MaxThevenet Jul 22, 2021
ff5612b
initialize variables
MaxThevenet Jul 22, 2021
0abb79d
reverse verbose CI
MaxThevenet Jul 22, 2021
208d508
Add some comments
MaxThevenet Jul 22, 2021
009c9de
doc
MaxThevenet Jul 22, 2021
fa77098
typo
MaxThevenet Jul 22, 2021
8cc8a27
tiling on by default, and use smaller tile size for CI
MaxThevenet Jul 22, 2021
ca6a1fc
Merge branch 'development' into tiling3
MaxThevenet Jul 22, 2021
4b383e3
forgot one test
MaxThevenet Jul 22, 2021
d177061
Compare Ez, as Bz is too close to 0
MaxThevenet Jul 23, 2021
d97941e
increase tolerance for slice IO
MaxThevenet Jul 23, 2021
3538fb2
remove a few unnecessary tile sorts
MaxThevenet Jul 26, 2021
2b3ae93
Update src/particles/deposition/PlasmaDepositCurrentInner.H
MaxThevenet Jul 27, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions docs/source/run/parameters.rst
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,11 @@ General parameters
available. If both Adios2 and HDF5 are available, `h5` is used. Note that `json` is extremely
slow and is not recommended for production runs.

* ``hipace.do_tiling`` (`bool`) optional (default `true`)
Whether to use tiling, when running on CPU.
Currently, this option only affects plasma operations (gather, push and deposition).
The tile size can be set with ``plasmas.sort_bin_size``.

Field solver parameters
-----------------------

Expand Down Expand Up @@ -203,6 +208,10 @@ plasma parameters for each plasma are specified via `<plasma name>.plasma_proper
The `plasma_name` of the plasma that contains the new electrons that are produced
when this plasma gets ionized. Only needed if this plasma is ionizable.

* ``plasmas.sort_bin_size`` (`int`) optional (default `32`)
Tile size for plasma current deposition, when running on CPU.
When tiling is activated (``hipace.do_tiling = 1``), the current deposition is done in temporary arrays of size ``sort_bin_size`` (+ guard cells) that are atomic-added to the main current arrays.

Beam parameters
---------------

Expand Down
6 changes: 3 additions & 3 deletions examples/blowout_wake/analysis_slice_IO.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
from openpmd_viewer import OpenPMDTimeSeries

do_plot = False
field = 'Bz'
field = 'Ez'

ts1 = OpenPMDTimeSeries('full_io')
F_full = ts1.get_field(field=field, iteration=ts1.iterations[-1])[0]
Expand Down Expand Up @@ -69,5 +69,5 @@
print("error_xz", error_xz)
print("error_yz", error_yz)

assert(error_xz < 1.e-14)
assert(error_yz < 1.e-14)
assert(error_xz < 3.e-14)
assert(error_yz < 3.e-14)
2 changes: 2 additions & 0 deletions src/Hipace.H
Original file line number Diff line number Diff line change
Expand Up @@ -237,6 +237,8 @@ public:
static bool m_do_beam_jx_jy_deposition;
/** Whether to call amrex::Gpu::synchronize() around all profiler region */
static bool m_do_device_synchronize;
/** Whether to use tiling for particle operations */
static bool m_do_tiling;
/** How much the box is coarsened for beam injection, to avoid exceeding max int in cell count.
* Otherwise, changing this parameter only will not affect the simulation results. */
static int m_beam_injection_cr;
Expand Down
18 changes: 13 additions & 5 deletions src/Hipace.cpp
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#include "Hipace.H"
#include "utils/HipaceProfilerWrapper.H"
#include "particles/BinSort.H"
#include "particles/SliceSort.H"
#include "particles/BoxSort.H"
#include "utils/IOUtil.H"
#include "particles/pusher/GetAndSetPosition.H"
Expand Down Expand Up @@ -45,6 +45,7 @@ amrex::Real Hipace::m_external_Ez_slope = 0.;
amrex::Real Hipace::m_external_Ez_uniform = 0.;
amrex::Real Hipace::m_MG_tolerance_rel = 1.e-4;
amrex::Real Hipace::m_MG_tolerance_abs = 0.;
bool Hipace::m_do_tiling = true;

Hipace&
Hipace::GetInstance ()
Expand Down Expand Up @@ -118,6 +119,7 @@ Hipace::Hipace () :

pph.query("MG_tolerance_rel", m_MG_tolerance_rel);
pph.query("MG_tolerance_abs", m_MG_tolerance_abs);
pph.query("do_tiling", m_do_tiling);

if (maxLevel() > 0) {
AMREX_ALWAYS_ASSERT(maxLevel() < 2);
Expand Down Expand Up @@ -279,7 +281,8 @@ Hipace::MakeNewLevelFromScratch (
DefineSliceGDB(lev, ba, dm);
// Note: we pass ba[0] as a dummy box, it will be resized properly in the loop over boxes in Evolve
m_diags.AllocData(lev, ba[0], Comps[WhichSlice::This]["N"], Geom(lev));
m_fields.AllocData(lev, Geom(), m_slice_ba[lev], m_slice_dm[lev], ref_ratio);
m_fields.AllocData(lev, Geom(), m_slice_ba[lev], m_slice_dm[lev],
m_multi_plasma.m_sort_bin_size, ref_ratio);
}

void
Expand Down Expand Up @@ -351,7 +354,6 @@ Hipace::Evolve ()
HIPACE_PROFILE("Hipace::Evolve()");
const int rank = amrex::ParallelDescriptor::MyProc();
int const lev = 0;

m_box_sorters.clear();
m_multi_beam.sortParticlesByBox(m_box_sorters, boxArray(lev), geom[lev]);

Expand All @@ -367,6 +369,7 @@ Hipace::Evolve ()
ResetAllQuantities();

/* Store charge density of (immobile) ions into WhichSlice::RhoIons */
if (m_do_tiling) m_multi_plasma.TileSort(boxArray(lev)[0], geom[lev]);
m_multi_plasma.DepositNeutralizingBackground(m_fields, WhichSlice::RhoIons, geom[lev],
finestLevel()+1);

Expand Down Expand Up @@ -475,12 +478,15 @@ Hipace::SolveOneSlice (int islice, const int ibox, amrex::Vector<BeamBins>& bins
m_fields.getSlices(lev, WhichSlice::This).setVal(0.);
}

if (!m_explicit) m_multi_plasma.AdvanceParticles(m_fields, geom[lev], false,
true, false, false, lev);
if (!m_explicit) {
m_multi_plasma.AdvanceParticles(m_fields, geom[lev], false,
true, false, false, lev);
}

amrex::MultiFab rho(m_fields.getSlices(lev, WhichSlice::This), amrex::make_alias,
Comps[WhichSlice::This]["rho"], 1);

if (m_do_tiling) m_multi_plasma.TileSort(bx, geom[lev]);
m_multi_plasma.DepositCurrent(
m_fields, WhichSlice::This, false, true, true, true, m_explicit, geom[lev], lev);

Expand Down Expand Up @@ -546,6 +552,7 @@ Hipace::SolveOneSlice (int islice, const int ibox, amrex::Vector<BeamBins>& bins
m_fields.ShiftSlices(lev);

m_multi_plasma.DoFieldIonization(lev, geom[lev], m_fields);
if (m_multi_plasma.IonizationOn() && m_do_tiling) m_multi_plasma.TileSort(bx, geom[lev]);

// After this, the parallel context is the full 3D communicator again
amrex::ParallelContext::pop();
Expand Down Expand Up @@ -865,6 +872,7 @@ Hipace::PredictorCorrectorLoopToSolveBxBy (const int islice, const int lev, cons
/* Push particles to the next slice */
m_multi_plasma.AdvanceParticles(m_fields, geom[lev], true, true, false, false, lev);

if (m_do_tiling) m_multi_plasma.TileSort(boxArray(lev)[0], geom[lev]);
/* deposit current to next slice */
m_multi_plasma.DepositCurrent(
m_fields, WhichSlice::Next, true, true, false, false, false, geom[lev], lev);
Expand Down
9 changes: 7 additions & 2 deletions src/fields/Fields.H
Original file line number Diff line number Diff line change
Expand Up @@ -77,11 +77,13 @@ public:
* \param[in] geom Geometry
* \param[in] slice_ba BoxArray for the slice
* \param[in] slice_dm DistributionMapping for the slice
* \param[in] bin_size size of plasma tiles, for tmp arrays
* \param[in] ref_ratio mesh refinement ratio
*/
void AllocData (
int lev, amrex::Vector<amrex::Geometry> const& geom, const amrex::BoxArray& slice_ba,
const amrex::DistributionMapping& slice_dm, amrex::Vector<amrex::IntVect> const& ref_ratio);
const amrex::DistributionMapping& slice_dm, int bin_size,
amrex::Vector<amrex::IntVect> const& ref_ratio);

/** Class to handle transverse FFT Poisson solver on 1 slice */
amrex::Vector<std::unique_ptr<FFTPoissonSolver>> m_poisson_solver;
Expand All @@ -96,7 +98,8 @@ public:
* \param[in] islice slice index
*/
amrex::MultiFab& getSlices (int lev, int islice) {return m_slices[lev][islice]; }

/** Return reference to density tile arrays */
amrex::Vector<amrex::FArrayBox>& getTmpDensities() { return m_tmp_densities; }
/** \brief Copy between the full FArrayBox and slice MultiFab.
*
* \param[in] lev MR level
Expand Down Expand Up @@ -259,6 +262,8 @@ private:
amrex::Vector<std::array<amrex::MultiFab, m_nslices>> m_slices;
/** Whether to use Dirichlet BC for the Poisson solver. Otherwise, periodic */
bool m_do_dirichlet_poisson = true;
/** Temporary density arrays. one per OpenMP thread, used when tiling is on. */
amrex::Vector<amrex::FArrayBox> m_tmp_densities;
amrex::IntVect m_ref_ratio; /**< refinement ratio in {x, y, z} */
};

Expand Down
22 changes: 21 additions & 1 deletion src/fields/Fields.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ Fields::Fields (Hipace const* a_hipace)
void
Fields::AllocData (
int lev, amrex::Vector<amrex::Geometry> const& geom, const amrex::BoxArray& slice_ba,
const amrex::DistributionMapping& slice_dm, amrex::Vector<amrex::IntVect> const& ref_ratio)
const amrex::DistributionMapping& slice_dm, int bin_size, amrex::Vector<amrex::IntVect> const& ref_ratio)
{
HIPACE_PROFILE("Fields::AllocData()");
// Need at least 1 guard cell transversally for transverse derivative
Expand Down Expand Up @@ -46,6 +46,26 @@ Fields::AllocData (
getSlices(lev, WhichSlice::This).DistributionMap(),
geom[lev])) );
}
int num_threads = 1;
#ifdef AMREX_USE_OMP
#pragma omp parallel
{
num_threads = omp_get_num_threads();
}
#endif
if (Hipace::m_do_tiling) {
const amrex::Box dom_box = slice_ba[0];
const amrex::IntVect ncell = dom_box.bigEnd() - dom_box.smallEnd() + 1;
AMREX_ALWAYS_ASSERT(ncell[0] % bin_size == 0 && ncell[1] % bin_size == 0);

m_tmp_densities.resize(num_threads);
for (int i=0; i<num_threads; i++){
amrex::Box bx = {{0, 0, 0}, {bin_size-1, bin_size-1, 0}};
bx.grow(m_slices_nguards);
// jx jy jz rho jxx jxy jyy
m_tmp_densities[i].resize(bx, 7);
}
}
}

void
Expand Down
3 changes: 2 additions & 1 deletion src/particles/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@ target_sources(HiPACE
BeamParticleContainerInit.cpp
PlasmaParticleContainer.cpp
PlasmaParticleContainerInit.cpp
BinSort.cpp
SliceSort.cpp
TileSort.cpp
BoxSort.cpp
MultiBeam.cpp
MultiPlasma.cpp
Expand Down
2 changes: 1 addition & 1 deletion src/particles/MultiBeam.H
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@

#include "BeamParticleContainer.H"
#include "fields/Fields.H"
#include "particles/BinSort.H"
#include "particles/SliceSort.H"
#include "particles/BoxSort.H"

class MultiBeam
Expand Down
2 changes: 1 addition & 1 deletion src/particles/MultiBeam.cpp
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#include "MultiBeam.H"
#include "deposition/BeamDepositCurrent.H"
#include "particles/BinSort.H"
#include "particles/SliceSort.H"
#include "pusher/BeamParticleAdvance.H"
#include "utils/HipaceProfilerWrapper.H"

Expand Down
15 changes: 14 additions & 1 deletion src/particles/MultiPlasma.H
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define MULTIPLASMA_H_

#include "PlasmaParticleContainer.H"
#include "TileSort.H"
#include "fields/Fields.H"

class MultiPlasma
Expand Down Expand Up @@ -91,13 +92,25 @@ public:
*/
void DoFieldIonization (const int lev, const amrex::Geometry& geom, Fields& fields);

bool IonizationOn () const;
/** \brief whether all plasma species use a neutralizing background, e.g. no ion motion */
bool AllSpeciesNeutralizeBackground () const;

/** \brief sort particles of all containers by tile logically, and store results in m_all_bins
*
* \param[in] bx transverse box on which the particles are sorted
* \param[in] geom Geometry object
*/
void TileSort (amrex::Box bx, amrex::Geometry geom);

int m_sort_bin_size {32}; /**< Tile size to sort plasma particles */

private:

amrex::Vector<PlasmaParticleContainer> m_all_plasmas; /**< contains all plasma containers */
amrex::Vector<PlasmaBins> m_all_bins; /**< Logical tile bins for all plasma containers */
amrex::Vector<std::string> m_names; /**< names of all plasma containers */
int m_nplasmas; /**< number of plasma containers */
int m_nplasmas = 0; /**< number of plasma containers */
/** Background (hypothetical) density, used to compute the adaptive time step */
amrex::Real m_adaptive_density = 0.;
};
Expand Down
47 changes: 38 additions & 9 deletions src/particles/MultiPlasma.cpp
Original file line number Diff line number Diff line change
@@ -1,13 +1,17 @@
#include "MultiPlasma.H"
#include "particles/deposition/PlasmaDepositCurrent.H"
#include "particles/pusher/PlasmaParticleAdvance.H"
#include "TileSort.H"
#include "utils/HipaceProfilerWrapper.H"

MultiPlasma::MultiPlasma (amrex::AmrCore* amr_core)
{

amrex::ParmParse pp("plasmas");
pp.getarr("names", m_names);
pp.query("adaptive_density", m_adaptive_density);
pp.query("sort_bin_size", m_sort_bin_size);

if (m_names[0] == "no_plasma") return;
m_nplasmas = m_names.size();
for (int i = 0; i < m_nplasmas; ++i) {
Expand All @@ -20,6 +24,7 @@ MultiPlasma::InitData (amrex::Vector<amrex::BoxArray> slice_ba,
amrex::Vector<amrex::DistributionMapping> slice_dm,
amrex::Vector<amrex::Geometry> slice_gm, amrex::Vector<amrex::Geometry> gm)
{
HIPACE_PROFILE("MultiPlasma::InitData()");
for (auto& plasma : m_all_plasmas) {
const int lev = plasma.m_level;
plasma.SetParticleBoxArray(lev, slice_ba[lev]);
Expand All @@ -39,6 +44,7 @@ MultiPlasma::InitData (amrex::Vector<amrex::BoxArray> slice_ba,
plasma.InitIonizationModule(gm[lev], plasma_product);
}
}
if (m_nplasmas > 0) m_all_bins.resize(m_nplasmas);
}

amrex::Real
Expand All @@ -56,9 +62,10 @@ MultiPlasma::DepositCurrent (
Fields & fields, int which_slice, bool temp_slice, bool deposit_jx_jy, bool deposit_jz,
bool deposit_rho, bool deposit_j_squared, amrex::Geometry const& gm, int const lev)
{
for (auto& plasma : m_all_plasmas) {
::DepositCurrent(plasma, fields, which_slice, temp_slice, deposit_jx_jy, deposit_jz,
deposit_rho, deposit_j_squared, gm, lev);
for (int i=0; i<m_nplasmas; i++) {
::DepositCurrent(m_all_plasmas[i], fields, which_slice, temp_slice,
deposit_jx_jy, deposit_jz, deposit_rho, deposit_j_squared,
gm, lev, m_all_bins[i], m_sort_bin_size);
}
}

Expand All @@ -67,14 +74,16 @@ MultiPlasma::AdvanceParticles (
Fields & fields, amrex::Geometry const& gm, bool temp_slice, bool do_push,
bool do_update, bool do_shift, int lev)
{
for (auto& plasma : m_all_plasmas) {
AdvancePlasmaParticles(plasma, fields, gm, temp_slice, do_push, do_update, do_shift, lev);
for (int i=0; i<m_nplasmas; i++) {
AdvancePlasmaParticles(m_all_plasmas[i], fields, gm, temp_slice,
do_push, do_update, do_shift, lev, m_all_bins[i]);
}
}

void
MultiPlasma::ResetParticles (int lev, bool initial)
{
if (m_nplasmas < 1) return;
for (auto& plasma : m_all_plasmas) {
ResetPlasmaParticles(plasma, lev, initial);
}
Expand All @@ -85,11 +94,11 @@ MultiPlasma::DepositNeutralizingBackground (
Fields & fields, int which_slice, amrex::Geometry const& gm, int const nlev)
{
for (int lev = 0; lev < nlev; ++lev) {
for (auto& plasma : m_all_plasmas) {
if (plasma.m_neutralize_background){
for (int i=0; i<m_nplasmas; i++) {
if (m_all_plasmas[i].m_neutralize_background){
// current of ions is zero, so they are not deposited.
::DepositCurrent(plasma, fields, which_slice, false,
false, false, true, false, gm, lev);
::DepositCurrent(m_all_plasmas[i], fields, which_slice, false, false, false,
true, false, gm, lev, m_all_bins[i], m_sort_bin_size);
}
}
}
Expand All @@ -102,7 +111,16 @@ MultiPlasma::DoFieldIonization (
for (auto& plasma : m_all_plasmas) {
plasma.IonizationModule(lev, geom, fields);
}
}

bool
MultiPlasma::IonizationOn () const
{
bool ionization_on = false;
for (auto& plasma : m_all_plasmas) {
if (plasma.m_can_ionize) ionization_on = true;
}
return ionization_on;
}

bool
Expand All @@ -114,3 +132,14 @@ MultiPlasma::AllSpeciesNeutralizeBackground () const
}
return all_species_neutralize;
}

void
MultiPlasma::TileSort (amrex::Box bx, amrex::Geometry geom)
{
constexpr int lev = 0;
m_all_bins.clear();
for (auto& plasma : m_all_plasmas) {
m_all_bins.emplace_back(
findParticlesInEachTile(lev, bx, m_sort_bin_size, plasma, geom));
}
}
6 changes: 3 additions & 3 deletions src/particles/BinSort.H → src/particles/SliceSort.H
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#ifndef HIPACE_BinSort_H_
#define HIPACE_BinSort_H_
#ifndef HIPACE_SLICESORT_H_
#define HIPACE_SLICESORT_H_

#include "BeamParticleContainer.H"
#include "BoxSort.H"
Expand All @@ -25,4 +25,4 @@ findParticlesInEachSlice (
BeamParticleContainer& beam, const amrex::Geometry& geom,
const BoxSorter& a_box_sorter);

#endif // HIPACE_BinSort_H_
#endif // HIPACE_SLICESORT_H_
Loading