Permit DIR_TRANS with OpenMP offload #211
Open: samhatfield wants to merge 48 commits into develop from refresh_openmp_dir_trans
These variables should be inherited from the parent scope and shared between all threads. This was a bug.
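For illustration only, here is a minimal sketch of the data-sharing pattern this commit message describes. It is not the code touched by the commit: the program and the names `SCALE_FACTOR` and `OUT` are hypothetical, and a host-side `PARALLEL DO` is assumed. With `DEFAULT(NONE)`, every variable used in the region must be given an explicit data-sharing attribute, so a variable that should be inherited from the parent scope cannot silently end up private and undefined inside the region.

```fortran
PROGRAM SHARED_CLAUSE_SKETCH
  IMPLICIT NONE

  INTEGER, PARAMETER :: N = 8
  INTEGER :: J
  REAL :: SCALE_FACTOR   ! hypothetical input, set once in the parent scope
  REAL :: OUT(N)         ! hypothetical output, one element per iteration

  SCALE_FACTOR = 2.0

  ! SCALE_FACTOR and OUT are inherited from the enclosing scope and shared
  ! between all threads; only the loop index J is private to each thread.
  ! Listing SCALE_FACTOR as PRIVATE instead would leave it undefined here.
  !$OMP PARALLEL DO DEFAULT(NONE) SHARED(SCALE_FACTOR, OUT) PRIVATE(J)
  DO J = 1, N
    OUT(J) = SCALE_FACTOR * REAL(J)
  END DO
  !$OMP END PARALLEL DO

  WRITE(6,*) "OUT = ", OUT
END PROGRAM SHARED_CLAUSE_SKETCH
```

The directives are ordinary comments when the program is built without OpenMP, so the sketch gives the same values either way.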
Otherwise this will not compile with CCE and -acc.
We already put this in LEINV, so why is it needed? No idea.
Co-authored-by: Paul Mullowney <Paul.Mullowney@amd.com>
Co-authored-by: Sam Hatfield <samuel.hatfield@ecmwf.int>
Co-authored-by: Thomas Gibson <Thomas.Gibson@amd.com>
I'm testing this PR with the following test program:

PROGRAM TEST_PROGRAM
USE PARKIND1, ONLY: JPIM, JPRB
USE MPL_MODULE
IMPLICIT NONE
! Spectral truncation
INTEGER(JPIM), PARAMETER :: TRUNC = 79
INTEGER(JPIM) :: verbosity = 0
! Arrays for storing our field in spectral space and grid point space
REAL(KIND=JPRB), ALLOCATABLE :: SPECTRAL_FIELD(:,:)
REAL(KIND=JPRB), ALLOCATABLE :: SPECTRAL_FIELD_2(:,:)
REAL(KIND=JPRB), ALLOCATABLE :: GRID_POINT_FIELD(:,:,:)
! Dimensions of our arrays in spectral space and grid point space
INTEGER(KIND=JPIM) :: NSPEC2
INTEGER(KIND=JPIM) :: NGPTOT
INTEGER(KIND=JPIM) :: SPECTRAL_INDICES(0:TRUNC)
#include "setup_trans0.h"
#include "setup_trans.h"
#include "trans_inq.h"
#include "inv_trans.h"
#include "dir_trans.h"
#include "trans_end.h"
CALL MPL_INIT(ldinfo=(verbosity>=1))
CALL DR_HOOK_INIT()
! Initialise ecTrans (resolution-agnostic aspects)
CALL SETUP_TRANS0(LDMPOFF=.TRUE., KPRINTLEV=VERBOSITY)
! Initialise ecTrans (resolution-specific aspects)
CALL SETUP_TRANS(KSMAX=TRUNC, KDGL=2 * (TRUNC + 1))
! Inquire about the dimensions in spectral space and grid point space
CALL TRANS_INQ(KSPEC2=NSPEC2, KGPTOT=NGPTOT, KASM0=SPECTRAL_INDICES)
! Allocate our work arrays
ALLOCATE(SPECTRAL_FIELD(1,NSPEC2))
ALLOCATE(SPECTRAL_FIELD_2(1,NSPEC2))
ALLOCATE(GRID_POINT_FIELD(NGPTOT,1,1))
! Initialise our spectral field arrays
SPECTRAL_FIELD(:,:) = 0.0_JPRB
SPECTRAL_FIELD(1,SPECTRAL_INDICES(3) + 2 * 5 + 1) = 1.0_JPRB
! Perform an inverse transform
CALL INV_TRANS(PSPSCALAR=SPECTRAL_FIELD, PGP=GRID_POINT_FIELD)
WRITE(6,*) "GRID_POINT_FIELD = ", MINVAL(GRID_POINT_FIELD), MAXVAL(GRID_POINT_FIELD)
FLUSH(6)
! Perform a direct transform
CALL DIR_TRANS(PGP=GRID_POINT_FIELD, PSPSCALAR=SPECTRAL_FIELD_2)
WRITE(6,*) "TEST_PROGRAM 1"
FLUSH(6)
! Compute error between before and after fields
WRITE(6,*) "Error = ", NORM2(SPECTRAL_FIELD_2 - SPECTRAL_FIELD)
FLUSH(6)
CALL TRANS_END
CALL MPL_END(ldmeminfo=.false.)
END PROGRAM TEST_PROGRAM

Currently it gives the following output on CPU:
On GPU with OpenACC:
On GPU with OpenMP:
8aaf383 to f9a2644
Bug found in
This is needed so that device-side arrays are correctly deallocated at the end (especially important for CCE builds).
The first test will take a while but the subsequent tests should be significantly faster.
2f279fa to 20e6bd2
Fix omp update directive in trgtol_mod (no MPI code path)
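As background for the commit above, this is a minimal sketch of how an OpenMP `TARGET UPDATE` directive is typically used. It is not the code in trgtol_mod: the program and the buffer name `BUF` are hypothetical. Once an array has been mapped with `TARGET ENTER DATA`, the host copy is not refreshed automatically at the end of a target region, so an explicit update is needed before the host reads the result.

```fortran
PROGRAM TARGET_UPDATE_SKETCH
  IMPLICIT NONE

  INTEGER, PARAMETER :: N = 4
  INTEGER :: J
  REAL :: BUF(N)   ! hypothetical buffer, not an ecTrans array

  BUF(:) = 1.0

  ! Create a device copy of BUF and initialise it from the host
  !$OMP TARGET ENTER DATA MAP(TO: BUF)

  ! BUF is already present on the device, so this region works on the
  ! device copy and does not copy it back to the host when it ends
  !$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO
  DO J = 1, N
    BUF(J) = BUF(J) + REAL(J)
  END DO
  !$OMP END TARGET TEAMS DISTRIBUTE PARALLEL DO

  ! Explicitly refresh the host copy before reading it on the host
  !$OMP TARGET UPDATE FROM(BUF)

  ! Release the device copy once it is no longer needed
  !$OMP TARGET EXIT DATA MAP(DELETE: BUF)

  ! Each element should now hold 1 + J
  WRITE(6,*) "BUF = ", BUF
END PROGRAM TARGET_UPDATE_SKETCH
```

Built without OpenMP, or run with host fallback, the loop runs on the host and the values are the same. The update direction (`FROM` versus `TO`) is the detail that most often goes wrong.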
f7f3551 to 6e0f184
26 tasks