Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add second flush for beam IO #480

Merged
merged 2 commits into from
May 6, 2021
Merged

add second flush for beam IO #480

merged 2 commits into from
May 6, 2021

Conversation

SeverinDiederichs
Copy link
Member

@SeverinDiederichs SeverinDiederichs commented May 5, 2021

This PR solves a bug we found in the diagnostics:

In rare occasions (since commit 5d4bb74, but not on the commit before), the beam IO would show undefined behaviour in parallel runs. Although in the print statement in SaveRealProperty showed the correct values, the openPMD file had NaNs or 0s in there.

This was presumably caused by altering the data before it was flushed. Before #453, beam and field data were dumped together at the end of the step. Now, the beam is written to file before the plasma loop. However, there was still only one flushing to file at the end of the time step (so after the beam was altered).

This is fixed in this PR and the bug does not appear anymore.

Using the following input script, the bug may be reproduced (reproducible with GCC, but not with Clang):

amr.n_cell = 16 16 100

hipace.normalized_units = 1
hipace.verbose = 1
hipace.output_period = 1
random_seed = 4
amr.blocking_factor = 2
amr.max_level = 1

max_step = 9

hipace.numprocs_x = 1
hipace.numprocs_y = 1
hipace.dt=1.

hipace.depos_order_xy = 2

geometry.coord_sys   = 0                  # 0: Cartesian
geometry.is_periodic = 1     1     0      # Is periodic?
geometry.prob_lo     = -10.   -10.   -10.    # physical domain
geometry.prob_hi     =  10.    10.    10.

beams.names = beam beam2

beam.injection_type = fixed_weight
beam.num_particles = 100000
beam.density = 200.
beam.u_mean = 20. 10. 20
beam.u_std = 100. 100. 15.
beam.position_mean = 0. 0. 0
beam.position_std = 0.1 0.1 1.41
beam.dx_per_dzeta = 0.2

beam2.injection_type = fixed_weight
beam2.num_particles = 30
beam2.density = 200.
beam2.u_mean = 8. 23. 21
beam2.u_std = 80. 120. 14.
beam2.position_mean = 0. 0. 0
beam2.position_std = 8. 0.3 1.41

plasmas.names = no_plasma
diagnostic.diag_type = xz
  • Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
  • Tested (describe the tests in the PR description)
  • Runs on GPU (basic: the code compiles and run well with the new module)
  • Contains an automated test (checksum and/or comparison with theory)
  • Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
  • Constified (All that can be const is const)
  • Code is clean (no unwanted comments, )
  • Style and code conventions are respected at the bottom of https://github.com/Hi-PACE/hipace
  • Proper label and GitHub project, if applicable

@SeverinDiederichs SeverinDiederichs added bug Something isn't working component: beam About the beam species component: diagnostics About any types of diagnostics labels May 5, 2021
Copy link
Member

@MaxThevenet MaxThevenet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(discussed offline) Look great, thanks for this fix! A few tests fail. Could you check that the code is correct now for these tests (meaning it was incorrect before)? For instance, you could compute the checksum (or print all values to std::out or something) directly in the code right before flushing. You could also run a simulation where the checksum is obvious (e.g. with 1000 particles having uz=100, the checksum for uz defined as np.sum(np.abs(Q)) should be exactly 1.e5). If the results agree with the new version, this PR is a fix indeed. If it agrees with the old, there's still something we're missing.

@SeverinDiederichs
Copy link
Member Author

SeverinDiederichs commented May 6, 2021

I did manage to generate a test showing that this PR is correct.
Adapting the beam_evolution.1Rank.sh test to

# Run the simulation
mpiexec -n 1 $HIPACE_EXECUTABLE $HIPACE_EXAMPLE_DIR/inputs_normalized \
        amr.n_cell = 32 32 10 \
        max_step = 20 \
        geometry.prob_lo = -2. -2. -2. \
        geometry.prob_hi =  2.  2.  2. \
        hipace.dt = 10. \
        hipace.output_period = 20 \
        beam.density = 1.e-8 \
        beam.radius = 1. \
        beam.injection_type = fixed_weight \
        beam.num_particles = 10000 \
        beam.position_mean = 0 0 0 \
        beam.position_std = 0.1 0.1 0.1 \
        beam.ppc = 4 4 1 \
        hipace.external_ExmBy_slope = .0 \
        hipace.external_Ez_uniform = 1.0 \
        hipace.file_prefix = $TEST_NAME

Now the uz benchmark should be npart * (uz_init - dt*max_step*Ez) = 8e6. uz_init is 1e3 as in the input file. In this PR, I do get the correct result. On development, I get 7.9e6. This shows exactly what goes wrong: In the last time step, the beam is assigned to be written to file, then pushed once more, and then it is finally flushed to file, so the output is npart * (uz_init - dt*(max_step + 1)*Ez)

I will now reset the benchmarks, which were incorrect previously and then we can merge

@MaxThevenet
Copy link
Member

Sounds great! Could you reset the benchmarks then?

@MaxThevenet MaxThevenet merged commit 51c4bb3 into development May 6, 2021
@MaxThevenet MaxThevenet deleted the fix_IO_bug branch May 6, 2021 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working component: beam About the beam species component: diagnostics About any types of diagnostics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants