Make consistent use of restart_interval across UFS applications #1700

Merged
aerorahul merged 30 commits into NOAA-EMC:develop from feature/restart_interval
Jun 26, 2023

@aerorahul aerorahul commented Jun 18, 2023

Description

Restart output for the model components of the ufs-weather-model is controlled by model_configure for FV3 and by nems.configure for MOM6, CICE6 and CMEPS. WW3 uses an input equivalent to the one in nems.configure to determine its restart stride.

Relevant variables controlling the output frequency of restarts:

  • nems.configure: restart_n
    expected value:
      frequency: every frequency units (the unit is controlled via a different variable in nems.configure)

  • model_configure: restart_interval
    expected values:
      0: end of the forecast
      frequency -1: every frequency hours
      restart_hour1 restart_hour2: at hours restart_hour1 and restart_hour2
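
To make the three model_configure forms concrete, here is a minimal sketch that expands a restart_interval value into the forecast hours at which restarts would be written. The function name restart_hours and its arguments are hypothetical and not part of the workflow or the model; the interpretation follows the description above only, and a single non-zero value is rejected because the description does not cover it.

```bash
# Hypothetical helper (not part of the workflow): list the forecast hours at
# which restarts would be written for a given model_configure restart_interval,
# following the three forms described above.
# Usage: restart_hours "<restart_interval value>" <forecast length in hours>
restart_hours() {
  fhmax="$2"
  # Split the value on whitespace into positional parameters.
  # shellcheck disable=SC2086
  set -- $1
  if [ "$#" -eq 1 ] && [ "$1" -eq 0 ]; then
    # "0": one restart at the end of the forecast
    echo "${fhmax}"
  elif [ "$#" -eq 2 ] && [ "$2" -eq -1 ] && [ "$1" -gt 0 ]; then
    # "N -1": a restart every N hours, up to the end of the forecast
    hours=""
    h="$1"
    while [ "${h}" -le "${fhmax}" ]; do
      hours="${hours:+${hours} }${h}"
      h=$((h + $1))
    done
    echo "${hours}"
  elif [ "$#" -ge 2 ]; then
    # "h1 h2 ...": restarts only at the listed hours
    echo "$*"
  else
    # Assumption: any other form is treated as unsupported in this sketch.
    echo "restart_hours: form not covered by the description" >&2
    return 1
  fi
}
```

For a 12-hour forecast, `restart_hours "3 -1" 12` prints `3 6 9 12`, `restart_hours "3 6" 12` prints `3 6`, and `restart_hours "0" 12` prints `12`.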

This PR:

  • removes all references to a list version of restart_interval. This list is no longer usable in the coupled model.
  • config.fcst and config.efcs will henceforth specify a true restart interval in hours. restart_interval=0 will imply writing out restarts at the end of the forecast, at FHMAX.
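
As an illustration only, one way a single integer restart_interval could be translated into the per-component settings is sketched below. The function name component_restart_settings is hypothetical, and the mapping is an assumption rather than the code in this PR: it assumes the nems.configure unit variable is set to hours and that the "frequency -1" form is used for model_configure.

```bash
# Hypothetical translation of the workflow-level restart_interval into
# per-component values. This is an assumed mapping, not the PR's actual code.
# Usage: component_restart_settings <restart_interval in hours> <FHMAX in hours>
component_restart_settings() {
  interval="$1"
  fhmax="$2"
  # Accept only a single non-negative integer; list values such as "3 6"
  # are rejected, mirroring the removal of the list version.
  case "${interval}" in
    ''|*[!0-9]*)
      echo "restart_interval must be a single non-negative integer, got '${interval}'" >&2
      return 1
      ;;
  esac
  if [ "${interval}" -eq 0 ]; then
    # 0: restarts only at the end of the forecast (FHMAX)
    fv3_restart_interval="0"
    restart_n="${fhmax}"
  else
    # N: restarts every N hours in all components
    fv3_restart_interval="${interval} -1"
    restart_n="${interval}"
  fi
  echo "model_configure restart_interval: ${fv3_restart_interval}"
  echo "nems.configure restart_n: ${restart_n}"
}
```

With these assumptions, `component_restart_settings 3 120` yields `3 -1` for model_configure and `3` for restart_n, while `component_restart_settings 0 120` yields `0` and `120`.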

In addition, this PR:

  • removes static NCO config files. They will be reintroduced when v17 is being prepared. They are no longer in sync and thus not necessary to carry forward in develop.
  • removes links in link_workflow.sh that were missed earlier, for regrid_nemsio.fd, fv3nemsio.fd and enkf_chgres_recenter.fd. These were removed from gfs-utils.
  • does some cleanup in the forecast subscripts.

This work is necessary for enabling coupled ensemble forecasts in cycling mode.

Fixes #496

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?
CI tests

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

@aerorahul aerorahul added the CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera label Jun 18, 2023
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Jun 18, 2023

emcbot commented Jun 18, 2023

Automated global-workflow Testing Results:

Machine: Hera
Start: Sun Jun 18 04:36:56 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Sun Jun 18 04:42:16 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sun Jun 18 05:52:46 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:50 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:54 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:57 UTC 2023 for experiment C96_atm3DVar

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jun 18, 2023

emcbot commented Jun 18, 2023

Automated global-workflow Testing Results:

Machine: Hera
Start: Sun Jun 18 04:36:56 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Sun Jun 18 04:42:16 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sun Jun 18 05:52:46 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:50 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:54 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 05:52:57 UTC 2023 for experiment C96_atm3DVar
Experiment C48_S2S Terminated: *FAILED*
Experiment C48_S2S Terminated with 1 tasks failed at Sun Jun 18 06:07:13 UTC 2023
Error logs:
/scratch1/NCEPDEV/global/Terry.McGuinness/GFS_CI_ROOT/PR/1700/RUNTESTS/C48_S2S/COMROT/C48_S2S/logs/2021032312/gfscoupled_ic.log

@aerorahul aerorahul added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jun 18, 2023
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Jun 18, 2023

emcbot commented Jun 18, 2023

Automated global-workflow Testing Results:

Machine: Hera
Start: Sun Jun 18 13:56:56 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Sun Jun 18 14:01:41 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sun Jun 18 15:12:25 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:29 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:32 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:35 UTC 2023 for experiment C96_atm3DVar

emcbot commented Jun 18, 2023

Automated global-workflow Testing Results:

Machine: Hera
Start: Sun Jun 18 13:56:56 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Sun Jun 18 14:01:41 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sun Jun 18 15:12:25 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:29 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:32 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sun Jun 18 15:12:35 UTC 2023 for experiment C96_atm3DVar
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Sun Jun 18 15:56:10 UTC 2023
with 18 successfully completed jobs

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jun 18, 2023
@emcbot emcbot added the CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress label Jun 24, 2023

emcbot commented Jun 24, 2023

Checkout: FAILED
Checkout: Failed at Fri Jun 23 22:56:15 UTC 2023
Checkout: see output at /scratch1/NCEPDEV/global/Terry.McGuinness/GFS_CI_ROOT/PR/1700/global-workflow/sorc/log.checkout
Automated global-workflow Testing Results:

Machine: Hera
Start: Fri Jun 23 22:56:26 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 23:02:30 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sat Jun 24 00:13:53 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:13:58 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:02 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:06 UTC 2023 for experiment C96_atm3DVar

emcbot commented Jun 24, 2023

Checkout: FAILED
Checkout: Failed at Fri Jun 23 22:56:15 UTC 2023
Checkout: see output at /scratch1/NCEPDEV/global/Terry.McGuinness/GFS_CI_ROOT/PR/1700/global-workflow/sorc/log.checkout
Automated global-workflow Testing Results:

Machine: Hera
Start: Fri Jun 23 22:56:26 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 23:02:30 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sat Jun 24 00:13:53 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:13:58 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:02 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:06 UTC 2023 for experiment C96_atm3DVar
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Sat Jun 24 00:56:08 UTC 2023
with 18 successfully completed jobs

@emcbot emcbot added CI-Orion-Running **Bot use only** CI testing on Orion for this PR is in-progress and removed CI-Orion-Building **Bot use only** CI testing is cloning/building on Orion labels Jun 24, 2023

emcbot commented Jun 24, 2023

Automated global-workflow Testing Results:

Machine: Orion
Start: Fri Jun 23 18:44:21 CDT 2023 on Orion-login-1.HPC.MsState.Edu
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 18:46:35 CDT 2023
Build:                         *SUCCESS*
Build: Completed at Fri Jun 23 20:03:58 CDT 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:06 CDT 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:10 CDT 2023 for experiment C96_atm3DVar
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:14 CDT 2023 for experiment C96C48_hybatmDA

emcbot commented Jun 24, 2023

Automated global-workflow Testing Results:

Machine: Orion
Start: Fri Jun 23 18:44:21 CDT 2023 on Orion-login-1.HPC.MsState.Edu
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 18:46:35 CDT 2023
Build:                         *SUCCESS*
Build: Completed at Fri Jun 23 20:03:58 CDT 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:06 CDT 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:10 CDT 2023 for experiment C96_atm3DVar
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:14 CDT 2023 for experiment C96C48_hybatmDA
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Fri Jun 23 20:49:06 CDT 2023
with 18 successfully completed jobs

emcbot commented Jun 24, 2023

Checkout: FAILED
Checkout: Failed at Fri Jun 23 22:56:15 UTC 2023
Checkout: see output at /scratch1/NCEPDEV/global/Terry.McGuinness/GFS_CI_ROOT/PR/1700/global-workflow/sorc/log.checkout
Automated global-workflow Testing Results:

Machine: Hera
Start: Fri Jun 23 22:56:26 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 23:02:30 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sat Jun 24 00:13:53 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:13:58 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:02 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:06 UTC 2023 for experiment C96_atm3DVar
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Sat Jun 24 00:56:08 UTC 2023
with 18 successfully completed jobs
Experiment C96C48_hybatmDA completed: *SUCCESS*
Experiment C96C48_hybatmDA Completed at Sat Jun 24 02:35:13 UTC 2023
with 148 successfully completed jobs

emcbot commented Jun 24, 2023

Automated global-workflow Testing Results:

Machine: Orion
Start: Fri Jun 23 18:44:21 CDT 2023 on Orion-login-1.HPC.MsState.Edu
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 18:46:35 CDT 2023
Build:                         *SUCCESS*
Build: Completed at Fri Jun 23 20:03:58 CDT 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:06 CDT 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:10 CDT 2023 for experiment C96_atm3DVar
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:14 CDT 2023 for experiment C96C48_hybatmDA
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Fri Jun 23 20:49:06 CDT 2023
with 18 successfully completed jobs
Experiment C96_atm3DVar completed: *SUCCESS*
Experiment C96_atm3DVar Completed at Fri Jun 23 22:00:08 CDT 2023
with 86 successfully completed jobs

emcbot commented Jun 24, 2023

Checkout: FAILED
Checkout: Failed at Fri Jun 23 22:56:15 UTC 2023
Checkout: see output at /scratch1/NCEPDEV/global/Terry.McGuinness/GFS_CI_ROOT/PR/1700/global-workflow/sorc/log.checkout
Automated global-workflow Testing Results:

Machine: Hera
Start: Fri Jun 23 22:56:26 UTC 2023 on hfe05
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 23:02:30 UTC 2023
Build:                         *SUCCESS*
Build: Completed at Sat Jun 24 00:13:53 UTC 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:13:58 UTC 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:02 UTC 2023 for experiment C96C48_hybatmDA
Created experiment:            *SUCCESS*
Case setup: Completed at Sat Jun 24 00:14:06 UTC 2023 for experiment C96_atm3DVar
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Sat Jun 24 00:56:08 UTC 2023
with 18 successfully completed jobs
Experiment C96C48_hybatmDA completed: *SUCCESS*
Experiment C96C48_hybatmDA Completed at Sat Jun 24 02:35:13 UTC 2023
with 148 successfully completed jobs
Experiment C96_atm3DVar completed: *SUCCESS*
Experiment C96_atm3DVar Completed at Sat Jun 24 03:21:16 UTC 2023
with 86 successfully completed jobs

@emcbot emcbot added CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jun 24, 2023
@emcbot emcbot added CI-Orion-Failed **Bot use only** CI testing on Orion for this PR has failed and removed CI-Orion-Running **Bot use only** CI testing on Orion for this PR is in-progress labels Jun 24, 2023

emcbot commented Jun 24, 2023

Automated global-workflow Testing Results:

Machine: Orion
Start: Fri Jun 23 18:44:21 CDT 2023 on Orion-login-1.HPC.MsState.Edu
---------------------------------------------------
Checkout:                      *SUCCESS*
Checkout: Completed at Fri Jun 23 18:46:35 CDT 2023
Build:                         *SUCCESS*
Build: Completed at Fri Jun 23 20:03:58 CDT 2023
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:06 CDT 2023 for experiment C48_S2S
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:10 CDT 2023 for experiment C96_atm3DVar
Created experiment:            *SUCCESS*
Case setup: Completed at Fri Jun 23 20:04:14 CDT 2023 for experiment C96C48_hybatmDA
Experiment C48_S2S completed: *SUCCESS*
Experiment C48_S2S Completed at Fri Jun 23 20:49:06 CDT 2023
with 18 successfully completed jobs
Experiment C96_atm3DVar completed: *SUCCESS*
Experiment C96_atm3DVar Completed at Fri Jun 23 22:00:08 CDT 2023
with 86 successfully completed jobs
Experiment C96C48_hybatmDA Terminated: *FAILED*
Experiment C96C48_hybatmDA Terminated with 1 tasks failed at Fri Jun 23 23:07:07 CDT 2023
Error logs:
/work2/noaa/stmp/GFS_CI_ROOT/PR/1700/RUNTESTS/C96C48_hybatmDA/COMROT/C96C48_hybatmDA/logs/2021122106/gdasvrfy.log

@aerorahul (Contributor Author) commented

The error from the failed Orion vrfy job is:
mv: cannot move ‘radmon_angle.tar.gz’ to a subdirectory of itself, ‘/work/noaa/global/mterry/monitor/radmon/stats/C96C48_hybatmDA/gdas.20211221/06/./radmon_angle.tar.gz’

I thought we fixed this issue? @WalterKolczynski-NOAA
This failure is not caused by the changes in this PR. I suggest we merge, since Hera has passed (and Orion passed before the reviewer requested the shellnorm changes).

Contributor

It looks like the only changes here are for the shellnorm checks, is that correct or am I missing something?

Contributor Author

mostly yes.
Except lines 78-82

Contributor

It looks like the only changes here are for the shellnorm checks, is that correct or am I missing something?

Contributor Author

you are correct.

  if [[ "${DOIAU}" == "YES" ]]; then
-   export restart_interval="3 6"
+   export restart_interval="3"
  else

Contributor

I thought the IAU only rewound to (T - 0.5*DA window)? Further, why this change? This is confusing.

Contributor Author

restart_interval = 3 will write restarts every 3 hours.
restart_interval = "3 6" will write restarts at hours 3 and 6 only.

This change is necessary to address the issue this PR is resolving.

Contributor

Ah, OK got it. Sorry, this wasn't obvious to me. Thank you!

Contributor

Looks like only shellnorm and whitespace corrections.

Contributor Author

yes.

Contributor

shellnorm only corrections.

Contributor Author

yes. And removing all data atm cases.

@HenryRWinterbottom (Contributor) left a comment

Thank you for the clarifications. Approved.

@aerorahul aerorahul merged commit 4f03d8b into NOAA-EMC:develop Jun 26, 2023
@aerorahul aerorahul deleted the feature/restart_interval branch June 26, 2023 18:12
Labels
CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully CI-Orion-Failed **Bot use only** CI testing on Orion for this PR has failed
Development

Successfully merging this pull request may close these issues.

Only use single integer for restart interval
4 participants