Skip to content

Commit

Permalink
Merge pull request #161 from jimmielin/bugfix/wrf-cesm
Browse files Browse the repository at this point in the history
Fix masking issues in MPI (WRF, CESM) environment
lizziel authored Aug 29, 2022
2 parents e0affc5 + 99cebf2 commit 6c314f2
Showing 1 changed file with 56 additions and 22 deletions.
78 changes: 56 additions & 22 deletions src/Core/hco_config_mod.F90
Original file line number Diff line number Diff line change
@@ -2262,31 +2262,65 @@ SUBROUTINE RegisterPrepare( HcoState, RC )
ELSE IF ( Lct%Dct%DctType == HCO_DCTTYPE_MASK .AND. &
Lct%Dct%Dta%Cover == -999 ) THEN

If (HcoState%Options%isESMF) Then
ThisCover = -1
Else
! Get mask edges
lon1 = Lct%Dct%Dta%ncYrs(1)
lat1 = Lct%Dct%Dta%ncYrs(2)
lon2 = Lct%Dct%Dta%ncMts(1)
lat2 = Lct%Dct%Dta%ncMts(2)

ThisCover = CALC_COVERAGE( lon1, lon2, &
lat1, lat2, &
cpux1, cpux2, &
cpuy1, cpuy2 )

! There appear to be some issues with full masks coverages
! when working in an MPI environment. Specifically, masks
! can be seen as fully covering a given CPU even though in
! reality it may only cover parts of it. Thus, in ESMF mode
! always set coverage to zero or partial (ckeller, 3/17/16).
! The mask coverage calculation (which only has three values)
! is used to simplify I/O and CPU operations in code below.
!
! However, there are two distinct bugs related to this:
!
! (1)
! There appear to be some issues with full masks coverages
! when working in an MPI environment. Specifically, masks
! can be seen as fully covering a given CPU even though in
! reality it may only cover parts of it. Thus, in ESMF mode
! always set coverage to zero or partial (ckeller, 3/17/16).
!
! This appears to be related to masking for two inventories
! with overlapping temporal coverage. For example, if inventory
! A is 2013-2015, and B is 2010-2018 with higher hierarchy,
! but not the same mask (maybe A covers regions that B does not).
! If both masks are set to ThisCover == 1 (full coverage), because
! a certain CPU might be overlapped by lon1/lat1/lon2/lat2 even
! though the actual netCDF shape of the mask is different,
! then a simulation running 2013-2015 will see inventory B on that CPU
! decide it has full coverage (only through lon1/..), and skip
! inventory A altogether, resulting in missing emissions.
! This behavior is in the line
! IF ( (tmpLct%Dct%Hier > Hier) .AND. (tmpCov==1) ) THEN below.
!
! (2)
! Another artifact caused by MPI environments:
! where lon1/lat1/... is set too small, resulting in certain CPUs not
! having overlap (defined by cpux/y) with lon1/lat1/..., and thus
! skipping the base inventory as a bug. This behavior is in the line
! IF ( (mskLct%Dct%DctType == HCO_DCTTYPE_MASK ) .AND. &
! (mskLct%Dct%Dta%Cover == 0 ) ) THEN
!
! Because the code only distinguishes between full/partial and zero coverage,
! and skips reading the base field if coverage is zero, this may cause
! issues with MPI environments in WRF and CESM where the mask lon1/lat1/lon2/lat2
! boundaries are set too small compared to the mask, and result in the
! base field being skipped over small CPU decompositions where it should not have
! been. The above fix does not fix the issue where ThisCover == 0,
! which is the root cause in WRF and CESM. Thus, always set to partial coverage
! (hplin, 8/19/22)
!
! Thus, the following fix needs to be applied for ESMF environments, skipping
! a lot of the calculations below.
#if defined ( ESMF_ ) || defined( MODEL_WRF ) || defined( MODEL_CESM )
IF ( ThisCover == 1 ) ThisCover = -1
ThisCover = -1
#else
! Get mask edges
lon1 = Lct%Dct%Dta%ncYrs(1)
lat1 = Lct%Dct%Dta%ncYrs(2)
lon2 = Lct%Dct%Dta%ncMts(1)
lat2 = Lct%Dct%Dta%ncMts(2)

ThisCover = CALC_COVERAGE( lon1, lon2, &
lat1, lat2, &
cpux1, cpux2, &
cpuy1, cpuy2 )
#endif

ENDIF

! Update container information
Lct%Dct%Dta%Cover = ThisCover
Lct%Dct%Dta%ncYrs(:) = -999

0 comments on commit 6c314f2

Please sign in to comment.