Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to Logging and Initialization of the CLM Lake Model (Including PR#1863) #1844

Merged
merged 24 commits into from
Aug 22, 2023

Conversation

SamuelTrahanNOAA
Copy link
Collaborator

@SamuelTrahanNOAA SamuelTrahanNOAA commented Jul 27, 2023

Description

This is the work of @tanyasmirnova to fix some initialization issues in the CLM Lake Model. Initialization appears to work correctly without fractional ice. There are some lingering issues with fractional ice initialization that we're working on.

Changes:

  1. Use ice thickness hice(i) to find the level in the lake where ice is zero.
  2. Do not allow lake temperature to be below freezing point if there is no ice.
  3. If there is no snow or ice, do not allow surface lake temperature to be below freezing point. These changes fixed the problem with large errors in the energy budget at the beginning of the cold-start run with lakes.
  4. Added flag to turn on debug print statements in the CLM lake model.
  5. Turned on fractional ice by default. Users running flake must turn it off explicitly, using frac_ice=.false. in the gfs_physics namelist.
  6. Corrections to metadata of lake_q2m in GFS_diagnostics.F90

Input data additions/changes

  • No changes are expected to input data.

Anticipated changes to regression tests:

  • Changes are expected to the following tests:

Most of the tests with "hrrr" or "clm_lake" in their name. Here's a list.

List of tests whose results changed.
COMPILE | rrfs | intel | -DAPP=ATM -DCCPP_SUITES=FV3_RAP,FV3_RAP_sfcdiff,FV3_HRRR,FV3_RRFS_v1beta,FV3_RRFS_v1nssl -D32BIT=ON | | fv3 |
RUN | hrrr_control                                      | - wcoss2                   | baseline |
RUN | hrrr_control_qr                                   | - wcoss2                   |          |
RUN | hrrr_control_decomp                               | - wcoss2                   |          |
RUN | hrrr_control_2threads                             | - wcoss2                   |          |
RUN | hrrr_control_restart                              | - wcoss2                   |          | hrrr_control
RUN | hrrr_control_restart_qr                           | - wcoss2                   |          | hrrr_control_qr

COMPILE | atm_debug_dyn32 | intel | -DAPP=ATM -DDEBUG=ON -D32BIT=ON -DCCPP_SUITES=FV3_HRRR,FV3_GFS_v16,FV3_GFS_v16_csawmg,FV3_GFS_v16_ras,FV3_GFS_v17_p8,FV3_GFS_v15_thompson_mynn_lam3km,FV3_RAP,FV3_RAP_unified_ugwp,FV3_RAP_cires_ugwp,FV3_RAP_flake,FV3_RAP_clm_lake,FV3_RAP_noah,FV3_RAP_sfcdiff,FV3_RAP_noah_sfcdiff_cires_ugwp,FV3_RRFS_v1beta | | fv3 |
RUN | hrrr_control_debug                                |                            | baseline |
RUN | rap_clm_lake_debug                                |                            | baseline |

COMPILE | rrfs_dyn32_phy32 | intel | -DAPP=ATM -DCCPP_SUITES=FV3_RAP,FV3_HRRR -D32BIT=ON -DCCPP_32BIT=ON | | fv3 |
RUN | hrrr_control_dyn32_phy32                          |                            | baseline |
RUN | hrrr_control_qr_dyn32_phy32                       |                            | baseline |
RUN | hrrr_control_2threads_dyn32_phy32                 |                            |          |
RUN | hrrr_control_decomp_dyn32_phy32                   |                            |          |
RUN | hrrr_control_restart_dyn32_phy32                  |                            |          | hrrr_control_dyn32_phy32
RUN | hrrr_control_restart_qr_dyn32_phy32               |                            |          | hrrr_control_qr_dyn32_phy32

COMPILE | rrfs_dyn32_phy32_debug | intel | -DAPP=ATM -DCCPP_SUITES=FV3_RAP,FV3_HRRR -D32BIT=ON -DCCPP_32BIT=ON -DDEBUG=ON | | fv3 |
RUN | hrrr_control_debug_dyn32_phy32                    |                            | baseline |

COMPILE | rrfs | gnu | -DAPP=ATM -DCCPP_SUITES=FV3_RAP,FV3_RAP_sfcdiff,FV3_HRRR,FV3_RRFS_v1beta -D32BIT=ON | + hera cheyenne | fv3 |
RUN | hrrr_control                                      | + hera cheyenne            | baseline |
RUN | hrrr_control_qr                                   | + hera cheyenne            |          |
RUN | hrrr_control_2threads                             | + hera cheyenne            |          |
RUN | hrrr_control_decomp                               | + hera cheyenne            |          |
RUN | hrrr_control_restart                              | + hera cheyenne            |          | hrrr_control
RUN | hrrr_control_restart_qr                           | + hera cheyenne            |          | hrrr_control_qr

COMPILE | atm_dyn32_debug | gnu | -DAPP=ATM -D32BIT=ON -DDEBUG=ON | + hera cheyenne | fv3 |
RUN | hrrr_control_debug                                | + hera cheyenne            | baseline |
RUN | rap_clm_lake_debug                                | + hera cheyenne            | baseline |

COMPILE | rrfs_dyn32_phy32 | gnu | -DAPP=ATM -DCCPP_SUITES=FV3_RAP,FV3_HRRR -D32BIT=ON -DCCPP_32BIT=ON | + hera cheyenne | fv3 |
RUN | hrrr_control_dyn32_phy32                          | + hera cheyenne            | baseline |
RUN | hrrr_control_qr_dyn32_phy32                       | + hera cheyenne            | baseline |
RUN | hrrr_control_2threads_dyn32_phy32                 | + hera cheyenne            |          |
RUN | hrrr_control_decomp_dyn32_phy32                   | + hera cheyenne            |          |
RUN | hrrr_control_restart_dyn32_phy32                  | + hera cheyenne            |          | hrrr_control_dyn32_phy32
RUN | hrrr_control_restart_qr_dyn32_phy32               | + hera cheyenne            |          | hrrr_control_qr_dyn32_phy32

COMPILE | atm_dyn32_phy32_debug | gnu | -DAPP=ATM -D32BIT=ON -DCCPP_32BIT=ON -DDEBUG=ON | + hera cheyenne | fv3 |
RUN | hrrr_control_debug_dyn32_phy32                    | + hera cheyenne            | baseline |

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Library Updates/Changes

  • Not Needed

Combined with PR's (If Applicable):

Commit Queue Checklist:

  • Link PR's from all sub-components involved in section below
  • Confirm reviews completed in ALL sub-component PR's
  • Add all appropriate labels to this PR.
  • Run full RT suite on either Hera/Cheyenne AND attach log to a PR comment.
  • Add list of any failed regression tests to "Anticipated changes to regression tests" section.

Linked PR's and Issues:

Testing Day Checklist:

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.

Testing Log (for CM's):

  • RDHPCS
    • Hera
    • Orion
    • Jet
    • Gaea
    • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

1. Use ice thickness hice(i) to find the level in the lake where ice is
   zero.
2. Do not allow lake temperature to be below freezing point if there is
   no ice.
3. If there is no snow or ice, do not allow surface lake temperature to
   be below freezing point.
   These changes fixed the problem with large errors in the energy budget
   at the beginning of the cold-start run with lakes.
4. Added flag to turn on debug print statements in the CLM lake model.
@SamuelTrahanNOAA
Copy link
Collaborator Author

I've updated this branch and I'm retesting it on Hera.

@SamuelTrahanNOAA
Copy link
Collaborator Author

SamuelTrahanNOAA commented Aug 12, 2023

Hera tests passed.

Edit: I committed the log.

@SamuelTrahanNOAA
Copy link
Collaborator Author

I have synced the components and added one more bug fix from tanya. It corrects the metadata for lake_q2m. That variable isn't in the diag_table of any regression test, so no tests are affected.

Hera tests passed again. The list of tests with new baselines is the same.

@jkbk2004 jkbk2004 mentioned this pull request Aug 18, 2023
41 tasks
@FernandoAndrade-NOAA
Copy link
Collaborator

FernandoAndrade-NOAA commented Aug 18, 2023

Hi @SamuelTrahanNOAA, we'd like to get started on testing this PR, please go ahead and resolve conflicts / sync up your weather model and sub components with their latest dev branches. If you could include the changes from Grant's #1863 as well it would be appreciated. It should not change any baselines as they are documentation related updates for CCPP. Thank you

@jkbk2004
Copy link
Collaborator

@SamuelTrahanNOAA Do you think it's possible to combine #1863 srw doc update pr to this pr at ccpp pr level? If yes, a conversation with @mkavulich @grantfirl might be helpful to make combining go smoothly. Otherwise, once this gets synced up, we can start working on regression tests.

@jkbk2004
Copy link
Collaborator

Oh! @FernandoAndrade-NOAA is following up faster than me!

@SamuelTrahanNOAA
Copy link
Collaborator Author

Yes, I can combine them. The merge will be delicite due to the large number of changes to clm_lake.f90 from the other PR.

@SamuelTrahanNOAA
Copy link
Collaborator Author

Where are the Hera baselines? They appear to be missing:

cat bl_date.conf
export BL_DATE=20230816

ls /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20230816
ls: cannot access /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20230816: No such file or directory

@grantfirl
Copy link
Collaborator

Where are the Hera baselines? They appear to be missing:

cat bl_date.conf export BL_DATE=20230816

ls /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20230816 ls: cannot access /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20230816: No such file or directory

I think that the baselines have moved. If you look at DISKNM for hera, I see /scratch2/NAGAPE/epic/UFS-WM_RT.

@SamuelTrahanNOAA
Copy link
Collaborator Author

In my local working copy, I've merged the two PRs. I generated a baseline before merging 1863. Now I'm testing against that baseline with 1863 merged. If all goes according to plan, I'll push it in a couple hours, and you can start testing.

@jkbk2004
Copy link
Collaborator

In my local working copy, I've merged the two PRs. I generated a baseline before merging 1863. Now I'm testing against that baseline with 1863 merged. If all goes according to plan, I'll push it in a couple hours, and you can start testing.

@SamuelTrahanNOAA Thanks so much!

@SamuelTrahanNOAA
Copy link
Collaborator Author

Apologies for the delay. I have merged #1863 into this PR and re-run the Hera tests. The list of tests with new baselines is the same as before.

@jkbk2004
Copy link
Collaborator

@zach1221 Thanks for the explanation.

I do think, as has been suggested, that it is important that a CM be assigned to each PR so that we have a point-person who knows the exact readiness of each PR.

@DeniseWorthen Hera is already a pre-test machine. What is the problem?

@DeniseWorthen
Copy link
Collaborator

The Hera log you posted was not a "testing" log---it was a log showing you had changed the baseline date, ran the tests against the new baseline and committed the log.

@SamuelTrahanNOAA
Copy link
Collaborator Author

What I did on Hera was this:

  1. Generate a new baseline just for the tests whose results I expect will change.
  2. Copy the existing baselines, and replace just those tests' baselines.
  3. Run only the tests whose results should change, against those baselines.
  4. Merge PR 1863.
  5. Run the full test suite.

This tells me that:

  1. No additional tests have changed results beyond the set I expect.
  2. Tests with changed results, that don't have their own baselines, pass.
  3. 1863 does not change the results at all.
  4. Everything still works on Hera.

I committed that regression test log. Maybe that's confusing your commit process?

@jkbk2004
Copy link
Collaborator

What I did on Hera was this:

  1. Generate a new baseline just for the tests whose results I expect will change.
  2. Copy the existing baselines, and replace just those tests' baselines.
  3. Run only the tests whose results should change, against those baselines.
  4. Merge PR 1863.
  5. Run the full test suite.

This tells me that:

  1. No additional tests have changed results beyond the set I expect.
  2. Tests with changed results, that don't have their own baselines, pass.
  3. 1863 does not change the results at all.
  4. Everything still works on Hera.

I committed that regression test log. Maybe that's confusing your commit process?

@SamuelTrahanNOAA No problem. All tests look good on hera. The log is still valid.

@DeniseWorthen
Copy link
Collaborator

@SamuelTrahanNOAA As far as I can see, you did everything we expect of a PR author. The comments were about how the PR was managed on the CM side.

@FernandoAndrade-NOAA
Copy link
Collaborator

Jenkins ci logs are attached, ORTs passed.
ufs-weather-model » ort-docker-pipeline » PR-1844 #1 Console [Jenkins].txt

@zach1221
Copy link
Collaborator

@SamuelTrahanNOAA Testing looks like it's complete. We can begin the merging process of the sub-component PRs.

@DeniseWorthen
Copy link
Collaborator

@zach1221 Do we have jet and wcoss2 logs?

@SamuelTrahanNOAA
Copy link
Collaborator Author

WCOSS2 was inaccessible to developers until about an hour and a half ago.

@DeniseWorthen
Copy link
Collaborator

@SamuelTrahanNOAA We usually see a comment from @BrianCurtis-NOAA if he is unable to run or is not planning to run on wcoss2.

@zach1221
Copy link
Collaborator

wcoss2 was down today and jobs couldn't run on Jet apparently. @DeniseWorthen

@DeniseWorthen
Copy link
Collaborator

@zach1221 What was the issue on jet?

@zach1221
Copy link
Collaborator

@zach1221 What was the issue on jet?

I'll defer to @jkbk2004 on this, as he attempted Jet and was unable to run tests today.

@jkbk2004
Copy link
Collaborator

On jet, hafs realtime experiment is on-going. Jobs are sitting in queue due to the experiment.

@SamuelTrahanNOAA
Copy link
Collaborator Author

The fv3atm PR has been merged. I've updated this PR's branch to point to the head of the NOAA-EMC develop branch of fv3atm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Baseline Updates Current baselines will be updated. jenkins-ci Jenkins CI: ORT build/test on docker container Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants