Enable GPU execution of mpas_reconstruct_2d via OpenACC #1289

Merged
merged 2 commits into from
May 3, 2025

Conversation

gdicker1
Collaborator

This PR makes small modifications to mpas_reconstruct_2d and adds OpenACC directives to it so that the routine can execute on GPU(s).

Timing for the OpenACC data transfers in this routine is captured in the log file by a new timer: mpas_reconstruct_2d [ACC_data_xfer].

NOTE two things about this PR:

@gdicker1
Collaborator Author

To characterize the answer differences, I ran the compare_netcdf.py script and compared the log.atmosphere.0000.out files from each run. I used the 6-timestep regional test case.

Running the loop near mpas_vector_reconstruct.F L273-283 on the CPU, in otherwise unmodified code, produced no answer differences relative to a GPU run of the commit I started this PR branch from.

When running this code on the GPU, I observed differences in the u, w, and scalars [1-8] values reported in the log file. Unlike some other answer differences, the locations of the mins and maxes did not change for u and w. Selected output from compare_netcdf.py baseline_acc/restart.2019-09-01_00.06.00.nc port_att/restart.2019-09-01_00.06.00.nc [1]:

            Variable        Min        Max
==========================================
# ... omitting lines ...
                   u  -0.041306   0.041719
                   w  -0.040645   0.030554
              rho_zz  -0.000086   0.000116
             theta_m  -0.054626   0.081970
          pressure_p -16.194092  11.606445
# ... omitting lines ...

Footnotes

  1. Full output saved on Derecho within "/glade/work/gdicker/mpas-work/2025Feb06_PortMPASReconstruct/port_att1739582865/diff.baselineVtest.restart.txt"
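
For readers unfamiliar with this kind of comparison, the following is a minimal sketch of how a per-variable min/max difference summary like the one above could be produced. It is not the compare_netcdf.py script: the function names `diff_summary` and `format_summary` are hypothetical, plain dictionaries of lists stand in for fields read from the two restart files, the values are made up, and the sign convention (test minus baseline) is an assumption.

```python
# Hypothetical sketch of a min/max difference summary between two runs.
# NOT the actual compare_netcdf.py; dicts of lists stand in for the
# variables that would be read from the baseline and test restart files.


def diff_summary(baseline, test):
    """Return {name: (min_diff, max_diff)} for variables found in both inputs.

    Differences are computed as test - baseline (an assumed convention).
    """
    summary = {}
    for name in sorted(set(baseline) & set(test)):
        a, b = baseline[name], test[name]
        if len(a) != len(b):
            raise ValueError(f"size mismatch for variable {name!r}")
        diffs = [y - x for x, y in zip(a, b)]
        summary[name] = (min(diffs), max(diffs))
    return summary


def format_summary(summary):
    """Render the summary as a fixed-width text table."""
    lines = [f"{'Variable':>20} {'Min':>10} {'Max':>10}", "=" * 42]
    for name, (lo, hi) in summary.items():
        lines.append(f"{name:>20} {lo:>10.6f} {hi:>10.6f}")
    return "\n".join(lines)


# Made-up values, for illustration only.
baseline = {"u": [1.0, 2.0, 3.0], "w": [0.5, 0.25, 0.125]}
test = {"u": [1.5, 2.0, 2.75], "w": [0.5, 0.25, 0.125]}

summary = diff_summary(baseline, test)
print(format_summary(summary))
```

A variable whose min and max differences are both zero (here `w`) is bitwise identical between the two runs under this measure; any nonzero entry flags an answer difference worth investigating.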

@gdicker1
Collaborator Author

More on the answer differences: they do seem to be due to how the default CPU and GPU math implementations differ. I get no answer differences if I add the flags described in #1287 to my baseline_acc and PR branch builds (namely -gpu=math_uniform; -Mnofma is already in GPU builds).
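
As general background on why such flags can change answers, here is a small sketch of one well-known mechanism: a fused multiply-add rounds once, while a separate multiply and add round twice. This is only an illustration under stated assumptions. It does not reproduce anything in MPAS, it says nothing about what -gpu=math_uniform changes internally, and the helper names are hypothetical. The fused result is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    # Two roundings: the product is rounded to a double, then the sum is.
    return a * b + c


def fused_multiply_add(a, b, c):
    # One rounding: a*b + c is formed exactly with rationals, then
    # converted to the nearest double. This emulates an FMA in software.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so the exact product, 1 - 2**-60, is not representable
# as a double and rounds to 1.0 when the multiply is done separately.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = unfused_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
print("unfused:", unfused)
print("fused:  ", fused)
```

With these inputs the two-rounding path returns exactly 0.0, while the single-rounding path returns -2**-60. Differences of this size can grow over many timesteps, which is why builds that must match bit for bit need the same floating-point behavior on both sides.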

@mgduda mgduda added the OpenACC Work related to OpenACC acceleration of code label Feb 19, 2025
@gdicker1
Collaborator Author

@mgduda This should be ready for re-review now!

@mgduda
Contributor

mgduda commented May 2, 2025

@gdicker1 Other than one more whitespace change request, I think this PR is good to go. After adjusting the whitespace, please feel free to clean up the commit history, and I'll approve the PR. Thanks!

gdicker1 added 2 commits May 2, 2025 17:05
Add nVertLevels, dereference integer pointers to loop bounds so they
transfer to the GPU correctly, and make loops in the vertical dimension
explicit for OpenACC parallel loop directives.

Also ensure that, after initialization finishes, the invariant fields
used in this routine will be on the device.

Ensures the data needed for the mpas_reconstruct_2d routine has
been fetched onto the device (GPU) at the beginning and end of the
routine. This is enforced by the default(present) clauses. The time
for these transfers is captured in a new timer
'mpas_reconstruct_2d [ACC_data_xfer]'.

NOTE: coeffs_reconstruct, nEdgesOnCell, edgesOnCell, latCell, and
lonCell are also fetched in mpas_reconstruct_2d. This is because this
routine is called before these variables would be uploaded to the device
during mpas_atm_dynamics_init as part of atmosphere_core initialization.
Once the model starts timestepping, the copyins no longer transfer data
because the OpenACC runtime sees that the variables are already present
on the device.
@gdicker1 gdicker1 force-pushed the framework/acc_mpas_reconstruct_2d branch from 5ed3990 to 607d858 May 2, 2025 23:17
@gdicker1
Collaborator Author

gdicker1 commented May 2, 2025

@mgduda this should be good to go now!

@mgduda mgduda self-requested a review May 3, 2025 00:46
Contributor

@mgduda mgduda left a comment

Looks great -- thanks!

Labels
OpenACC Work related to OpenACC acceleration of code