Particle tiling and OpenMP threading #551

MaxThevenet · 2021-07-03T13:46:54Z

This PR adds logical tiling for plasma particles when running on CPU. The option is controlled with hipace.do_tiling (default true) and affects field gather (FG) + push and current deposition (CD). The tile size is controlled with plasmas.sort_bin_size (default 32). When tiling is activated, plasma particles are sorted logically: they are not re-ordered in memory, but an index mask is built that gives access to the particles in each tile. Particle operations are then done on a per-tile basis:

- For field gather and particle push, the main operations are encapsulated in a loop over tiles.
- The current deposition is done in temporary arrays, which are atomic-added to the main array.
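As an illustration of the logical sort described above, here is a minimal sketch of building a per-tile index with a counting sort. It is not the code in TileSort.H/cpp: `Particle`, `TileIndex`, `tile_of` and `build_tile_index` are hypothetical names, and the sketch assumes a 2D transverse grid with lower corner at (0, 0) and square tiles of `bin_size` cells.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle type: only the transverse position matters here.
struct Particle { double x, y; };

// Result of a logical sort. Particles are NOT moved in memory; tile t owns
// the particle ids indices[offsets[t]] .. indices[offsets[t + 1] - 1].
struct TileIndex {
    std::vector<int> offsets;  // size ntiles + 1
    std::vector<int> indices;  // size nparticles
};

// Tile id of a particle, for cells of size dx * dy grouped into square
// tiles of bin_size cells (assumed layout, lower corner at (0, 0)).
inline int tile_of(const Particle& p, double dx, double dy,
                   int bin_size, int ntiles_x, int ntiles_y)
{
    const int ix = static_cast<int>(p.x / dx) / bin_size;
    const int iy = static_cast<int>(p.y / dy) / bin_size;
    assert(ix >= 0 && ix < ntiles_x && iy >= 0 && iy < ntiles_y);
    return ix + ntiles_x * iy;
}

TileIndex build_tile_index(const std::vector<Particle>& particles,
                           double dx, double dy, int bin_size,
                           int ntiles_x, int ntiles_y)
{
    const int ntiles = ntiles_x * ntiles_y;
    TileIndex ti;
    ti.offsets.assign(ntiles + 1, 0);
    ti.indices.resize(particles.size());

    // Pass 1: count the particles in each tile (shifted by one slot).
    for (const Particle& p : particles)
        ++ti.offsets[tile_of(p, dx, dy, bin_size, ntiles_x, ntiles_y) + 1];

    // Inclusive prefix sum turns the counts into start offsets.
    for (int t = 0; t < ntiles; ++t)
        ti.offsets[t + 1] += ti.offsets[t];

    // Pass 2: write each particle id into the next free slot of its tile.
    std::vector<int> cursor(ti.offsets.begin(), ti.offsets.end() - 1);
    for (std::size_t ip = 0; ip < particles.size(); ++ip) {
        const int t = tile_of(particles[ip], dx, dy, bin_size, ntiles_x, ntiles_y);
        ti.indices[cursor[t]++] = static_cast<int>(ip);
    }
    return ti;
}
```

A per-tile operation then loops over `t` and, inside each tile, over `k` from `offsets[t]` to `offsets[t + 1]`, accessing particle `indices[k]`. This indirection is what makes the particle access non-contiguous in memory.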

Effect on performance:

- good: improves cache re-use in CD and FG.
- bad: plasma particles are accessed randomly in memory.
- good: enables OpenMP threading, to parallelise the simulation transversely.

In practice, the main changes are:

- Create TileSort.H/cpp, which does the logical tile sort (similar to BinSort.H/cpp, renamed SliceSort.H/cpp).
- Create temporary arrays (in class Fields) for the current on 1 tile.
- Add a loop over tiles for plasma particle operations (Advance and Deposition).
- Call the tile sort after each plasma particle advance.
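To make the role of the temporary per-tile arrays concrete, here is a minimal sketch of a tiled deposition, under simplifying assumptions that are not taken from the PR: 1D grid, unit cell size, linear (order-1) shape factor, and one guard cell on the upper side of each tile. The function name `deposit_tiled` and its arguments are hypothetical. Each tile accumulates into its own buffer, and only the final reduction into the shared array uses atomics, which is what avoids the race between neighbouring tiles.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch only: 1D, positions in cell units, linear shape factor.
// Tile t owns particle ids indices[offsets[t]] .. indices[offsets[t + 1] - 1]
// and cells [t * bin_size, (t + 1) * bin_size).
void deposit_tiled(const std::vector<double>& x,   // particle positions
                   const std::vector<double>& w,   // particle weights
                   const std::vector<int>& offsets,
                   const std::vector<int>& indices,
                   int bin_size,
                   std::vector<double>& rho)       // shared (main) array
{
    const int ntiles = static_cast<int>(offsets.size()) - 1;
    // The last tile needs one extra cell for its upper guard cell.
    assert(static_cast<int>(rho.size()) >= ntiles * bin_size + 1);

#pragma omp parallel for
    for (int t = 0; t < ntiles; ++t) {
        const int lo = t * bin_size;
        // Temporary tile-local array: bin_size cells plus one guard cell.
        std::vector<double> tmp(bin_size + 1, 0.0);

        for (int k = offsets[t]; k < offsets[t + 1]; ++k) {
            const int ip = indices[k];
            const int i = static_cast<int>(std::floor(x[ip]));
            const double frac = x[ip] - i;
            assert(i >= lo && i < lo + bin_size);
            tmp[i - lo]     += w[ip] * (1.0 - frac);
            tmp[i - lo + 1] += w[ip] * frac;
        }

        // Reduce into the main array. Guard cells overlap the neighbouring
        // tile, so the addition must be atomic when tiles run concurrently.
        for (int j = 0; j <= bin_size; ++j) {
#pragma omp atomic
            rho[lo + j] += tmp[j];
        }
    }
}
```

Without OpenMP enabled the pragmas are ignored and the function runs serially with the same result, up to floating-point summation order.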


MaxThevenet · 2021-07-23T07:43:57Z

This PR is ready for review.
@atmyers If I understand correctly, findParticlesInEachTile in src/particles/TileSort.cpp is not threaded, which has a noticeable effect on scaling. Do you see an easy way to make it OpenMP-parallel?

atmyers · 2021-07-23T15:46:38Z

I think we want to use a different algorithm there for the OpenMP case. Instead of using a parallel prefix sum, we can use data duplication. It will probably be easier to explain this via a quick Zoom meeting.
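The comment does not spell out the algorithm, so the following is only one common reading of "data duplication", not necessarily what was discussed or implemented: each thread gets a private copy of the per-tile counters, counts its own chunk of particles, and a short serial merge turns the private counts into write positions. All names (`TileIndex`, `build_tile_index_duplicated`, `nchunks`) are hypothetical, and the tile id of each particle is assumed to be precomputed.

```cpp
#include <cassert>
#include <vector>

struct TileIndex {
    std::vector<int> offsets;  // size ntiles + 1
    std::vector<int> indices;  // size nparticles
};

// tile_id[ip] is the tile of particle ip. nchunks plays the role of the
// number of threads: each chunk of particles has its own histogram.
TileIndex build_tile_index_duplicated(const std::vector<int>& tile_id,
                                      int ntiles, int nchunks)
{
    assert(ntiles > 0 && nchunks > 0);
    const int np = static_cast<int>(tile_id.size());
    auto chunk_begin = [=](int c) {
        return static_cast<int>(static_cast<long long>(np) * c / nchunks);
    };

    // Duplicated data: one private histogram per chunk, so no two
    // iterations of the parallel loop write to the same counter.
    std::vector<std::vector<int>> count(nchunks, std::vector<int>(ntiles, 0));

#pragma omp parallel for
    for (int c = 0; c < nchunks; ++c)
        for (int ip = chunk_begin(c); ip < chunk_begin(c + 1); ++ip)
            ++count[c][tile_id[ip]];

    // Serial merge (ntiles * nchunks steps): convert each private count
    // into the first write position of that chunk inside that tile.
    TileIndex ti;
    ti.offsets.assign(ntiles + 1, 0);
    int running = 0;
    for (int t = 0; t < ntiles; ++t) {
        ti.offsets[t] = running;
        for (int c = 0; c < nchunks; ++c) {
            const int n = count[c][t];
            count[c][t] = running;  // reused as a write cursor
            running += n;
        }
    }
    ti.offsets[ntiles] = running;

    // Parallel scatter: each chunk writes into its own disjoint slots.
    ti.indices.resize(np);
#pragma omp parallel for
    for (int c = 0; c < nchunks; ++c)
        for (int ip = chunk_begin(c); ip < chunk_begin(c + 1); ++ip)
            ti.indices[count[c][tile_id[ip]]++] = ip;

    return ti;
}
```

The trade-off in this sketch is memory: the counters are replicated `nchunks` times, in exchange for a counting pass and a scatter pass that need no atomics and no parallel prefix sum.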

src/particles/deposition/PlasmaDepositCurrentInner.H

SeverinDiederichs

Awesome, thanks for this PR!

MaxThevenet added 17 commits July 2, 2021 13:00

first tiling

d8d53ff

particle tiling for shape factor 0, both deposition and FG+PP

6e64c28

little cleaning

4dbac90

add a few missing files

8f37c12

instrument sorting functions

fa58841

minor cleaning and indentation

f1b0f3a

first draft to fix race conditions in deposition

8e3c141

deposit in tmp arrays to avoid race conditions

d3cda2f

a few fixed in alloc and guard cells

debec1b

setval 0 each tile

a4aca6a

attempt of cleaning, it broke the code

d9da854

fix error in number of tiles

668989d

minor typo in doc

7e44f1f

revert spurious change

a83443a

typo in doc

04a9d20

typo

a039040

revert spurious changes in input files

9b84773

MaxThevenet added the labels component: plasma (about the plasma species) and performance (optimization, benchmark, profiling, etc.) Jul 3, 2021

MaxThevenet added 11 commits July 3, 2021 17:18

better assert

3a7393f

eol

1550db5

a few more sorting for PC

0c7812e

fix code without openMP

f7be859

most tests pass with do_tiling = 1

c100ae7

all tests pass with do_tiling, except slice_IO.1Rank which has a weird problem size

9b35e12

attempt to fix ionization

ee382ba

merge dev and fix conflicts

696fb8e

eol

364e7af

fix compilation warnings

8e63276

temporarily more verbose CI

a9c81e2

MaxThevenet added 11 commits July 22, 2021 16:00

avoid looping over empty vect

67dbbe5

initialize variables

ff5612b

reverse verbose CI

0abb79d

Add some comments

208d508

doc

009c9de

typo

fa77098

tiling on by default, and use smaller tile size for CI

8cc8a27

Merge branch 'development' into tiling3

ca6a1fc

forgot one test

4b383e3

Compare Ez, as Bz is too close to 0

d177061

increase tolerance for slice IO

d97941e

remove a few unnecessary tile sorts

3538fb2

MaxThevenet commented Jul 27, 2021


src/particles/deposition/PlasmaDepositCurrentInner.H (resolved)

Update src/particles/deposition/PlasmaDepositCurrentInner.H

2b3ae93

SeverinDiederichs approved these changes Jul 27, 2021


MaxThevenet merged commit 877b138 into Hi-PACE:development Jul 27, 2021

MaxThevenet deleted the tiling3 branch July 27, 2021 15:55
