Bug and Proposed Solution in MERRA Aerosol Code #152

Closed
JacobCarley-NOAA opened this issue Jan 8, 2024 · 0 comments · Fixed by ufs-community/ufs-weather-model#2082
Labels
bug Something isn't working

Comments

@JacobCarley-NOAA

Description

The RRFS team has encountered several model crashes over the past several days. After running the model in debug mode, we think we have traced the source of the bug to aerinterp.F90. We have a possible solution, described below.

The bug: An out-of-bounds array access occurs in the subroutine aerinterpol on line 435, in the second index of the aerpres array. Here is the error:

forrtl: severe (408): fort: (2): Subscript #2 of the array AERPRES has value 503 which is greater than the upper bound of 72

It appears that this array access reuses the indices i1 and i2. These indices were used earlier in the same routine, where they were defined by values that can be much larger than 72 (see lines 392-395). Specifically, they appear to take on values that are likely on the scale of the horizontal dimension of the MERRA file (nlons*nlats, which is 576*361) and are unrelated to the vertical dimension (which is up to 72). We noticed an edge case where i1 and i2 may not get reassigned later in the code, so a value larger than 72 can reach line 435. We think this is causing the problem. Jili Dong found the following issue in an if-statement inside the loop that begins on line 428:

    DO k = 1, levsaer-1      !! from sfc to toa
      IF (prsl(j,L) < aerpres(j,k) .and. prsl(j,L) > aerpres(j,k+1)) then
        i1 = k
        i2 = min(k+1,levsaer)
        exit
      ENDIF
    ENDDO

Here is how Jili explains it: When searching for the level k where prsl falls between aerpres(j,k) and aerpres(j,k+1), the code uses strict < and > comparisons. If prsl is exactly equal to aerpres(j,k) or aerpres(j,k+1), the condition is false for every k, so the body of the if block never runs. In that case line 435 uses index values that were set for a different purpose earlier in the code.
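To make the edge case concrete, here is a minimal standalone sketch of the same search. It is not the code from aerinterp.F90: it uses a single 1-D column, the pressure values are made up, and the starting values of i1 and i2 are stand-ins for the stale horizontal indices described above.

```fortran
program strict_search_demo
  implicit none
  integer, parameter :: levsaer = 4
  ! Hypothetical pressure column ordered from surface to top (made-up values)
  real :: aerpres(levsaer) = (/ 1000.0, 850.0, 500.0, 100.0 /)
  real :: prsl
  integer :: i1, i2, k

  ! Stand-ins for indices left over from an earlier, unrelated use
  i1 = 503
  i2 = 504

  ! Pressure that is exactly equal to aerpres(2)
  prsl = 850.0

  ! Same strict comparisons as the loop quoted in this issue
  do k = 1, levsaer - 1
    if (prsl < aerpres(k) .and. prsl > aerpres(k+1)) then
      i1 = k
      i2 = min(k+1, levsaer)
      exit
    end if
  end do

  print *, 'i1 =', i1, ' i2 =', i2
  if (i1 > levsaer .or. i2 > levsaer) then
    print *, 'no level matched; i1 and i2 still hold their earlier values'
  end if
end program strict_search_demo
```

With prsl equal to aerpres(2), the test fails at k=1 because prsl is not greater than aerpres(2), and fails at k=2 because prsl is not less than aerpres(2). The loop finishes without assigning i1 or i2, so they keep the values 503 and 504.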

The proposed solution from Jili:

Should this "if" statement be changed to the following?

IF(prsl(j,L) <= aerpres(j,k) .and. prsl(j,L)>aerpres(j,k+1)) then

Note that the only change is the first < to <=. We have tested this, and so far it has fixed three crashes that we had, so it seems to be working.
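As a possible complement to the one-character fix, the search could also report when no level was found instead of relying on previously assigned indices. The sketch below is only an illustration under assumptions: find_level is a hypothetical helper that does not exist in aerinterp.F90, the column is 1-D, and the pressure values are made up. It uses the <= comparison proposed above and returns 0 as a "not found" sentinel.

```fortran
module level_search
  implicit none
contains
  ! Hypothetical helper (not part of aerinterp.F90).
  ! Returns k such that pres(k) >= p > pres(k+1), or 0 if no such k exists.
  pure function find_level(p, pres) result(klev)
    real, intent(in) :: p
    real, intent(in) :: pres(:)   ! ordered from surface to top (decreasing)
    integer :: klev
    integer :: k
    klev = 0                      ! sentinel: nothing found yet
    do k = 1, size(pres) - 1
      if (p <= pres(k) .and. p > pres(k+1)) then
        klev = k
        exit
      end if
    end do
  end function find_level
end module level_search

program inclusive_search_demo
  use level_search
  implicit none
  real :: aerpres(4) = (/ 1000.0, 850.0, 500.0, 100.0 /)

  call report(850.0)   ! equal to an interior level
  call report(100.0)   ! equal to the top level
contains
  subroutine report(p)
    real, intent(in) :: p
    integer :: k1
    k1 = find_level(p, aerpres)
    if (k1 == 0) then
      print *, 'p =', p, ': no bracketing level found'
    else
      print *, 'p =', p, ': bracketed by levels', k1, 'and', k1 + 1
    end if
  end subroutine report
end program inclusive_search_demo
```

In this sketch a pressure equal to an interior level is matched at k=2, while a pressure exactly equal to the top level is still not matched by the loop and returns the sentinel. Whether the code surrounding line 428 already handles pressures at or beyond the ends of the column is not shown in this issue, so that part remains an open question here.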

Steps to Reproduce

Run RRFS cycle 202401080000 in debug mode.

Additional Context

  • Machine: WCOSS2
  • Compiler:
  • Suite Definition File or Scheme: RRFS
  • Reference other issues or PRs in other repositories that this is related to, and how they are related.

Output

See above
