Assimilate GMI in GSI (#689) #692
@RussTreadon-NOAA Could you add @emilyhcliu, @ADCollard, and @azadeh-gh as reviewers? I ran the regression tests on Hera and all 7 tests passed.
Please post ctest results in this PR or the originating issue, #689. Ctests need to be run on WCOSS2, Hera, Orion, and Hercules. This PR will be passed on for merger into develop once at least two peer reviews with approvals and ctests are completed.
The following tests FAILED:
It seems a few files do not exist, e.g.
amsuabufr_db -> /lfs/h2/emc/da/noscrub/russ.treadon/CASES/regtest/gfs/prod/gdas.20221109/00/atmos/gdas.t00z.amuadb.tm00.bufr_d
The tmpdir is /lfs/h2/emc/ptmp/xin.c.jin/GSI/tmpreg_global_4denvar/global_4denvar_loproc_updat>
The two ctest failures on Hercules are:
The runtime for hafs_3denvar_hybens_loproc_updat is 254.941863 seconds and is within the allowable threshold time of 413.289465 seconds, continuing with regression test.
The runtime for hafs_3denvar_hybens_hiproc_updat is 200.579084 seconds and is within the allowable threshold time of 331.662442 seconds, continuing with regression test.
The memory for hafs_3denvar_hybens_loproc_updat is 2532516 KBs and is within the maximum allowable memory of 2780342 KBs, continuing with regression test.
The results (penalty) between the two runs (hafs_3denvar_hybens_loproc_updat and hafs_3denvar_hybens_loproc_contrl) are reproducible.
The fv3_dynvars are reproducible
The results (penalty) between the two runs (hafs_3denvar_hybens_loproc_updat and hafs_3denvar_hybens_hiproc_updat) are reproducible
The fv3_sfcdata are reproducible
The runtime for hafs_4denvar_glbens_loproc_updat is 328.091246 seconds and is within the maximum allowable operational time of 1200 seconds, continuing with regression test.
The runtime for hafs_4denvar_glbens_loproc_updat is 328.091246 seconds. This has exceeded maximum allowable threshold time of 326.818903 seconds, resulting in Failure time-thresh of the regression test.
The runtime for hafs_4denvar_glbens_hiproc_updat is 254.226137 seconds and is within the allowable threshold time of 282.596399 second, continuing with regression test.
The memory for hafs_4denvar_glbens_loproc_updat is 2858764 KBs and is within the maximum allowable memory of 3199819 KBs, continuing with regression test.
The results (penalty) between the two runs (hafs_4denvar_glbens_loproc_updat and hafs_4denvar_glbens_loproc_contrl) are reproducible.
The fv3_dynvars are reproducible
The results (penalty) between the two runs (hafs_4denvar_glbens_loproc_updat and hafs_4denvar_glbens_hiproc_updat) are reproducible
The fv3_dynvars are reproducible
Any comments on how to deal with these failures?
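Editorial aside: to see at a glance how far each run sits from its timing threshold, the runtime messages quoted above can be parsed mechanically. The following is a minimal sketch, not part of the GSI regression scripts. The function name `runtime_margins` and the regular expression are hypothetical, and the sketch assumes one message per line with the wording shown in this thread.

```python
import re

# Matches the runtime messages quoted in this thread that mention a
# "threshold time". Assumption: this wording is stable; the pattern is
# not taken from the GSI regression scripts.
RUNTIME_RE = re.compile(
    r"The runtime for (?P<name>\S+) is (?P<runtime>\d+\.\d+) seconds"
    r".*?threshold time of (?P<limit>\d+\.\d+) second"
)


def runtime_margins(log_text):
    """Return {test_name: limit - runtime}; a negative value means the
    threshold was exceeded."""
    margins = {}
    for line in log_text.splitlines():
        m = RUNTIME_RE.search(line)
        if m:
            margins[m["name"]] = float(m["limit"]) - float(m["runtime"])
    return margins


# Two lines copied from the Hercules hafs_4denvar_glbens results above.
sample = (
    "The runtime for hafs_4denvar_glbens_loproc_updat is 328.091246 seconds. "
    "This has exceeded maximum allowable threshold time of 326.818903 seconds, "
    "resulting in Failure time-thresh of the regression test.\n"
    "The runtime for hafs_4denvar_glbens_hiproc_updat is 254.226137 seconds "
    "and is within the allowable threshold time of 282.596399 second, "
    "continuing with regression test.\n"
)

for name, margin in runtime_margins(sample).items():
    status = "within" if margin >= 0 else "exceeded"
    print(f"{name}: {status} threshold by {abs(margin):.3f} s")
```

For the Hercules numbers above, the loproc_updat run misses its threshold by about 1.27 seconds (328.091246 versus 326.818903), which is a small margin relative to a runtime of more than 300 seconds.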
WCOSS2 failure
File
@xincjin-NOAA , you do not belong to the WCOSS2 rstprod group.
You need to request rstprod access. Please visit NCO's Restricted Data Information page to learn how to request rstprod access.
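Editorial aside: before rerunning ctests it can help to confirm whether the current account already belongs to the group. The sketch below uses only Python's standard `os` and `grp` modules on a Unix-like system. The helper names are hypothetical, and whether a site exposes group membership this way is an assumption.

```python
import grp
import os


def current_group_names():
    """Names of the groups the current process belongs to (Unix only)."""
    gids = set(os.getgroups())
    gids.add(os.getegid())  # the effective group may not be in getgroups()
    names = set()
    for gid in gids:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid with no entry in the group database; skip it.
            pass
    return names


def in_group(name, groups=None):
    """True if `name` is among `groups` (defaults to the current user's groups)."""
    if groups is None:
        groups = current_group_names()
    return name in groups


if __name__ == "__main__":
    print("rstprod member:", in_group("rstprod"))
```

The command-line equivalent is to look for the group name in the output of `id -Gn`.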
Hercules failure The The
The loproc_updat ran 31 seconds longer than the loproc_contrl. The |
Please post Hera ctests results when they are available. Since this PR adds code to assimilate GMI, we need confirmation either in this PR or in the originating issue, #689 , that the changes in this PR result in |
Results from two GMI experiments, which were run for 60 days, can be found at:
https://www.emc.ncep.noaa.gov/users/xjin/v16_gmi_ens_exp/
https://www.emc.ncep.noaa.gov/users/xjin/v16_gmi_test_09_20/
https://www.emc.ncep.noaa.gov/gmb/gdas/radiance/xjin/gmi/gmi_ens_exp/
https://www.emc.ncep.noaa.gov/gmb/gdas/radiance/xjin/gmi/gmi_test_09_20/
Other details can be found on the attached poster.
Ctest on Hera passed 6 tests. The test that did not pass reported the following:
[Xin.C.Jin@hfe08 regression]$ more hafs_4denvar_glbens_regression_results.txt
The runtime for hafs_4denvar_glbens_loproc_updat is 360.988536 seconds. This has exceeded maximum allowable threshold time of 329.516453 seconds, resulting in Failure time-thresh of the regression test.
The runtime for hafs_4denvar_glbens_hiproc_updat is 263.256873 seconds. This has exceeded maximum allowable threshold time of 260.885010 seconds, resulting in Failure of timethresh2 the regression test.
The memory for hafs_4denvar_glbens_loproc_updat is 2882748 KBs and is within the maximum allowable memory of 3244072 KBs, continuing with regression test.
The results (penalty) between the two runs (hafs_4denvar_glbens_loproc_updat and hafs_4denvar_glbens_loproc_contrl) are reproducible.
The fv3_dynvars are reproducible
The results (penalty) between the two runs (hafs_4denvar_glbens_loproc_updat and hafs_4denvar_glbens_hiproc_updat) are reproducible
The fv3_dynvars are reproducible
@xincjin-NOAA , both of the failed checks for the Hera |
The wall times are as below:
The updat runs are slower than the contrl ones.
The breakdown for the loproc runs is as below:
hafs_4denvar_glbens_loproc_contrl/stdout
hafs_4denvar_glbens_loproc_updat/stdout
I don't have enough knowledge to judge whether these are normal.
Some options to consider
hafs_4denvar_glbens_hiproc_contrl/stdout:The total amount of wall time = 267.771050
These are the results from a new ctest in which I EXCHANGED the locations of contrl and updat, so updat now represents the GSI code from the develop branch. I am not sure whether you can check the git branch in these directories:
/scratch1/NCEPDEV/da/Xin.C.Jin/git/GSI
/scratch1/NCEPDEV/da/Xin.C.Jin/git/develop
It seems that the wall time is related to the order of the test runs.
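Editorial aside: the updat versus contrl comparison discussed here can be tabulated from grep-style output of the form quoted above (`<run>/stdout:The total amount of wall time = <seconds>`). This is a minimal sketch with hypothetical helper names. Only the hiproc_contrl value below comes from this thread; the hiproc_updat value is a made-up placeholder for illustration.

```python
import re

# One grep-style line per run, as quoted in the comment above.
WALL_RE = re.compile(
    r"^(?P<run>[^/\s]+)/stdout:The total amount of wall time\s*=\s*"
    r"(?P<secs>\d+(?:\.\d+)?)"
)


def wall_times(grep_output):
    """Map run name -> wall time in seconds."""
    times = {}
    for line in grep_output.splitlines():
        m = WALL_RE.match(line.strip())
        if m:
            times[m["run"]] = float(m["secs"])
    return times


def updat_minus_contrl(times):
    """For each <prefix>_updat with a matching <prefix>_contrl, return
    {prefix: updat_seconds - contrl_seconds}."""
    diffs = {}
    for run, secs in times.items():
        if run.endswith("_updat"):
            prefix = run[: -len("_updat")]
            partner = prefix + "_contrl"
            if partner in times:
                diffs[prefix] = secs - times[partner]
    return diffs


example = (
    # Value quoted in this thread:
    "hafs_4denvar_glbens_hiproc_contrl/stdout:The total amount of wall time = 267.771050\n"
    # PLACEHOLDER value, not a measured result:
    "hafs_4denvar_glbens_hiproc_updat/stdout:The total amount of wall time = 260.000000\n"
)

for prefix, diff in updat_minus_contrl(wall_times(example)).items():
    print(f"{prefix}: updat - contrl = {diff:+.3f} s")
```

Repeating the comparison with the run order swapped, as done in this comment, helps separate a real code slowdown from run-order or node-to-node variability.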
Ctest on WCOSS2 passed.
As for updating the global ctests to include assimilating GMI, do we need to make ctest after this update? Because the ctest will fail
Reminder: Due date for merger of this PR into |
ctest note The ObsProc team may be able to generate GMI bufr dump files for portions of February 2024. If this is possible, we should update ctests |
@xincjin-NOAA , please bring |
@RussTreadon-NOAA, I updated xincjin-NOAA:gmi_new. Thanks for the reminder.
Looks good
Update global ctest case to 2024022300 C96C48L127 background files to run The following files were modified in a working copy of
The
Files needed to run the 20240223 global case have been rsync'd to the @xincjin-NOAA , I recommend that you update
in your Tagging @ADCollard since you asked about updating the GSI global ctests. We may want to tweak global ctest namelist variables in |
@xincjin-NOAA , I looked at |
@RussTreadon-NOAA Yes, the directory is correct. I am changing this file and running some other tests now.
After refactoring the code and reapplying the reverted commit, all ctests on WCOSS2 passed (develop installed at fca6bea and gmi_new at 8078902).
Test project /lfs/h2/emc/da/noscrub/xin.c.jin/gmi_new/build
I will test on the other platforms next.
@RussTreadon-NOAA @ADCollard @emilyhcliu @TingLei-NOAA After the latest update to this PR, all ctests pass on WCOSS2, Hera, and Orion. There is one failure on Hercules:
Test project /work/noaa/da/xinjin/git/gmi_new/build
86% tests passed, 1 tests failed out of 7
Total Test time (real) = 1682.79 sec
The following tests FAILED:
The results between the two runs (hafs_3denvar_hybens_loproc_updat and hafs_3denvar_hybens_hiproc_updat) are not reproducible
I am not sure if this is a known issue.
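Editorial aside: when results from four platforms are posted in one thread, the ctest summary line can be reduced to pass and fail counts for a quick cross-platform table. This is a minimal sketch that assumes the summary wording quoted above. `parse_ctest_summary` is a hypothetical helper, not a CTest API.

```python
import re

# Summary line as quoted above, e.g.
# "86% tests passed, 1 tests failed out of 7"
SUMMARY_RE = re.compile(
    r"(?P<pct>\d+)% tests passed, (?P<failed>\d+) tests failed out of (?P<total>\d+)"
)


def parse_ctest_summary(text):
    """Return pass/fail counts from a ctest summary, or None if absent."""
    m = SUMMARY_RE.search(text)
    if m is None:
        return None
    failed = int(m["failed"])
    total = int(m["total"])
    return {"passed": total - failed, "failed": failed, "total": total}


hercules = "86% tests passed, 1 tests failed out of 7"
print(parse_ctest_summary(hercules))
```

For the Hercules line this gives 6 passed and 1 failed out of 7, consistent with the single non-reproducible hafs_3denvar_hybens comparison reported here.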
Thank you @xincjin-NOAA for refactoring the code. It's great to see reproducible results once again on WCOSS2. @TingLei-NOAA is working on the Hercules |
@xincjin-NOAA, please request a re-review from the peer reviewers for this PR. Your refactored changes need to be reviewed.
Only minor comments, otherwise the code changes look good from a coding perspective.
What about the science perspective? If gmi is assimilated, does the refactored code yield the intended results? The global_4denvar test only processes gmi in monitor mode. It does not yet assimilate gmi data.
@RussTreadon-NOAA, from the science perspective, if gmi is assimilated, the refactored code will yield the intended results.
Thank you @xincjin-NOAA for removing unnecessary computations.
@xincjin-NOAA, have you requested re-reviews from Emily, Andrew, and Azadeh? If not, please do so. This PR has passed its due date.
The code changes due to GMI look good, and they do (should) not change regression results. Approved!
Given two peer review approvals and documentation of ctest results on WCOSS2, Hera, Orion, and Hercules, approve and pass this PR on to the GSI Handling Review team for merger into develop.
@ALL Thanks, everyone, for helping bring this PR to a close!
DUE DATE for merger of this PR into develop is 3/6/2024 (six weeks after PR creation).
Description
This pull request is related to #689.
Resolves #689
The original code for assimilating GMI in GSI is not working properly.
The main changes are:
Type of change
How Has This Been Tested?
The changes have been verified by a few experiments with more than two months of running time.
Checklist