Fix omp update directive in trgtol_mod (no MPI code path) #212

thomasgibson

This is a very minor PR against PR #211. It corrects the OMP update directive in trgtol_mod on the code path used when ecTrans is built without MPI.

I used the test program from @samhatfield's comment to verify the OpenMP offload implementation on the MI300A GPU, with and without unified memory enabled. I am getting identical results in both cases!

Here are the results running on MI300A in "discrete GPU" mode (HSA_XNACK=0):
single precision:

ecTrans at version: 1.5.1
commit: 59c7ff9739dac8e1332f7c788dbbe3f67a929eb1

 R%NTMAX= 79
 R%NSMAX= 79
 setup_trans: sizes1 NUMP= 80
 Using OpenMP offloading
    FG%ZAS:       611840B
    FG%ZAA:       611840B
   FG%ZAS0:        26240B
   FG%ZAA0:        25600B
 FG%ZEPSNM:        26240B
 ===GPU arrays successfully allocated
 GRID_POINT_FIELD =  -3.66075754,  3.66075754
 TEST_PROGRAM 1
 Error =  2.612723335E-7

double precision:

ecTrans at version: 1.5.1
commit: 59c7ff9739dac8e1332f7c788dbbe3f67a929eb1

 R%NTMAX= 79
 R%NSMAX= 79
 setup_trans: sizes1 NUMP= 80
 Using OpenMP offloading
    FG%ZAS:      1223680B
    FG%ZAA:      1223680B
   FG%ZAS0:        26240B
   FG%ZAA0:        25600B
 FG%ZEPSNM:        52480B
 ===GPU arrays successfully allocated
 GRID_POINT_FIELD =  -3.6607575710598272,  3.6607575710598272
 TEST_PROGRAM 1
 Error =  8.00660226648715344E-15

With unified memory enabled (HSA_XNACK=1):
single precision:

ecTrans at version: 1.5.1
commit: 59c7ff9739dac8e1332f7c788dbbe3f67a929eb1

 R%NTMAX= 79
 R%NSMAX= 79
 setup_trans: sizes1 NUMP= 80
 Using OpenMP offloading
    FG%ZAS:       611840B
    FG%ZAA:       611840B
   FG%ZAS0:        26240B
   FG%ZAA0:        25600B
 FG%ZEPSNM:        26240B
 ===GPU arrays successfully allocated
 GRID_POINT_FIELD =  -3.66075754,  3.66075754
 TEST_PROGRAM 1
 Error =  2.612723335E-7

double precision:

ecTrans at version: 1.5.1
commit: 59c7ff9739dac8e1332f7c788dbbe3f67a929eb1

 R%NTMAX= 79
 R%NSMAX= 79
 setup_trans: sizes1 NUMP= 80
 Using OpenMP offloading
    FG%ZAS:      1223680B
    FG%ZAA:      1223680B
   FG%ZAS0:        26240B
   FG%ZAA0:        25600B
 FG%ZEPSNM:        52480B
 ===GPU arrays successfully allocated
 GRID_POINT_FIELD =  -3.6607575710598272,  3.6607575710598272
 TEST_PROGRAM 1
 Error =  8.00660226648715344E-15
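
One way to read the numbers above is to express each reported error as a multiple of machine epsilon and to compare the array sizes between precisions. The sketch below is only an illustration and is not part of this PR or of ecTrans. The values are copied from the logs above (both HSA_XNACK settings reported the same numbers), and the names `reported`, `error_in_eps` and `size_ratio` are hypothetical.

```python
import sys

# Values copied from the logs above; HSA_XNACK=0 and HSA_XNACK=1 gave the same numbers.
reported = {
    "single": {"error": 2.612723335e-7, "zas_bytes": 611840, "zepsnm_bytes": 26240},
    "double": {"error": 8.00660226648715344e-15, "zas_bytes": 1223680, "zepsnm_bytes": 52480},
}

# Machine epsilon for IEEE 754 binary32 and binary64.
EPS = {"single": 2.0 ** -23, "double": sys.float_info.epsilon}


def error_in_eps(precision):
    """Return the reported error as a multiple of machine epsilon."""
    return reported[precision]["error"] / EPS[precision]


def size_ratio(field):
    """Return the double/single size ratio for a reported GPU array."""
    return reported["double"][field] / reported["single"][field]


if __name__ == "__main__":
    for precision in ("single", "double"):
        print(f"{precision}: error = {error_in_eps(precision):.1f} x machine epsilon")
    for field in ("zas_bytes", "zepsnm_bytes"):
        print(f"{field}: double/single = {size_ratio(field):.1f}")
```

Under these assumptions, the single-precision error is about 2 times machine epsilon and the double-precision error is about 36 times machine epsilon. FG%ZAS and FG%ZEPSNM are exactly twice as large in double precision as in single precision.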

@samhatfield samhatfield marked this pull request as ready for review February 12, 2025 17:30
@samhatfield samhatfield merged commit f7f3551 into ecmwf-ifs:refresh_openmp_dir_trans Feb 12, 2025
1 check passed
samhatfield added a commit that referenced this pull request Feb 13, 2025
Fix omp update directive in trgtol_mod (no MPI code path)