
Reduce the number of Hankel transforms per iteration #161

Merged
3 commits merged into fbpic:dev on Dec 8, 2017

Conversation

@RemiLehe (Member) commented Dec 6, 2017

In the current dev branch, we perform 16 Hankel transforms and 16 Fourier transforms per azimuthal mode and per iteration. Of those 16 Hankel transforms, 12 are due to the back-and-forth spect2interp and interp2spect of the fields E and B (2 directions * 2 fields * 3 components).

The reason for performing these back-and-forth transformations is that we have to perform some operations on the fields in spectral space (Maxwell push) and some operations in real space (MPI exchanges and damping of the open boundaries).

However, because the MPI exchanges and the damping of the open boundaries act purely along z, the spectral fields can be updated with just a succession of an inverse and a forward Fourier transform (no Hankel transform involved). Updating the fields in interpolation space then requires an additional Hankel + Fourier transform.

This effectively replaces the 16 Hankel and 16 Fourier transforms by 10 Hankel and 22 Fourier transforms (per mode, per iteration). Because the Fourier transforms are almost always much faster than the Hankel transforms, this results in a speedup of the simulation.
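To make the trick concrete, here is a minimal numpy sketch, not fbpic's actual API (`spectral_field`, `damp_profile` and the function name are illustrative), of how a purely-z operation can be applied to a spectral field with two 1D FFTs and no Hankel transform:

```python
import numpy as np

def apply_z_operation_in_spectral_space(spectral_field, damp_profile):
    """Apply a z-only operation (e.g. open-boundary damping) to one
    spectral field component of shape (Nz, Nr), without any Hankel
    transform.

    Since the operation acts purely along z, an inverse FFT along
    axis 0 brings the field to real space in z (while staying in the
    Hankel representation along r); a forward FFT then returns it to
    full spectral space.
    """
    partial = np.fft.ifft(spectral_field, axis=0)  # (kz, r-spectral) -> (z, r-spectral)
    partial *= damp_profile[:, np.newaxis]         # damping / exchange, along z only
    return np.fft.fft(partial, axis=0)             # (z, r-spectral) -> (kz, r-spectral)
```

Only when the particles need the fields does a Hankel + Fourier transform per component bring E and B back to the interpolation grid; the return trip (interp2spect of E and B) is eliminated, which is where the 6 saved Hankel transforms come from.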

Note: I am definitely not the first to use this trick! @Hightower has been using it for a while in his code chimera.

@MKirchen (Contributor) commented Dec 7, 2017

Nice, looking forward to this PR!

@RemiLehe changed the title from "[WIP] Reduce the number of Hankel transforms per iteration" to "Reduce the number of Hankel transforms per iteration" on Dec 8, 2017
@@ -705,10 +705,10 @@ def exchange_particles_aperiodic_subdomain(self, species, fld, time ):
add_buffers_to_particles( species, float_recv_left, float_recv_right,
uint_recv_left, uint_recv_right )

def damp_guard_EB( self, interp ):
@RemiLehe (Member, Author) commented on this change:

Note: I renamed this function to be more explicit about what it does (especially because the guard cells in between two MPI processes are not damped by this function)
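A hypothetical sketch of the distinction (not fbpic's actual implementation; the function signature and names are illustrative): only guard cells at *open* boundaries get damped, while guard cells shared with a neighboring MPI rank are left alone, since the exchange overwrites them anyway.

```python
import numpy as np

def damp_open_boundary_guards(fields, damp_array, left_is_open, right_is_open):
    """fields: list of 2D (z, r) arrays; damp_array: 1D profile
    over the guard cells, rising from 0 to 1 towards the interior."""
    n_guard = len(damp_array)
    for f in fields:
        if left_is_open:   # physical open boundary on the left
            f[:n_guard, :] *= damp_array[:, np.newaxis]
        if right_is_open:  # physical open boundary on the right
            f[-n_guard:, :] *= damp_array[::-1, np.newaxis]
        # guard cells facing another MPI rank are deliberately untouched
```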

@RemiLehe (Member, Author) commented Dec 8, 2017

I performed the usual 2000 x 400 benchmark (timing the complete PIC loop, not just the transforms) on a GTX 1080 Ti GPU:

  • Current dev: 325 ms/iteration
  • This PR: 275 ms/iteration
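(That is a factor of 325/275 ≈ 1.18 in throughput, i.e. roughly an 18% speedup.)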

@MKirchen (Contributor) commented Dec 8, 2017

Nice!

@MKirchen merged commit 1690d84 into fbpic:dev on Dec 8, 2017
@hightower8083 (Contributor) commented Dec 8, 2017

Wow, 18 percent, cool!
But I'm not sure I get how you got it down to 10 DHTs: 6 to get in J and rho (I guess each mode of rho also needs the two + and - components) + 6 to get out E and B after the Maxwell push..?

UPD: oops, my bad -- I'd forgotten that you don't need the + and - components for the scalars.
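Spelled out, the corrected tally (a reconstruction from the numbers in this thread, assuming one Hankel transform per field component per direction):

```python
# Per-mode, per-iteration Hankel-transform count implied by the
# discussion above (illustrative arithmetic only):
J_in   = 3      # interp2spect of J: its three components
rho_in = 1      # interp2spect of rho: a scalar, so no +/- components
EB_out = 2 * 3  # spect2interp of E and B after the Maxwell push
print(J_in + rho_in + EB_out)  # 10 -- versus 16 when E and B also made
                               # the return trip (the 2 * 2 * 3 = 12
                               # counted in the opening comment, plus
                               # the same 4 source transforms)
```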

@RemiLehe deleted the partial_transforms branch on December 12, 2017