Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrapper PR for "Remove optional arguments from CCPP metadata, remove effective radii computation from Thompson MP" and "Fix/improve logic for convective transportable tracers, add GFS_checktracer debugging routine, correct spelling "janic" --> "janjic", fix inconsistent types related with noahmp and log functions", "Modify opnReqTest related scripts for fv3_gsd test" #892

Conversation

climbfuji
Copy link
Collaborator

@climbfuji climbfuji commented Nov 2, 2021

PR Checklist

  • Ths PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes in contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Description

This PR combines #875, #887, #890:

#875:

Update submodule pointers for fv3atm and ccpp-physics for the changes described in the associated PRs below: fix/improve logic for convective transportable tracers in CCPP, add GFS_checktracer debugging routine, correct spelling "janic" --> "janjic" in several CCPP standard names.

#887:

Update the submodule pointers for fv3atm, ccpp-framework and ccpp-physics to:

  • remove support for the optional attribute in the CCPP framework and metadata
  • remove invalid optional and intent attributes from host model metadata (variable or type definitions)
  • remove unused code for calculating cloud effective radii from Thompson MP wrapper (this is done in the radiation interstitials)

#890:

Modify operation requirement test related scripts for the fv3_gsd test. All code changes from @MinsukJi-NOAA (thanks for doing this!)

No changes to the regression test results or the input data.

Issue(s) addressed

Fixes NCAR/ccpp-framework#407
Fixes NCAR/ccpp-physics#751

Testing

Regression tests were run repeatedly for the two sets of PRs combined in this PR. For the combined PRs, regression tests were run on gaea against the existing baseline on 2021/11/02, all tests passed.

Full regression tests will be run on all tier-1 platforms when it is time to commit.

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3
  • CI

Dependencies

Not a dependency, but an associated PR for the CCPP standard names dictionary:

ESCOMP/CCPPStandardNames#23

climbfuji and others added 15 commits October 18, 2021 09:57
@climbfuji climbfuji marked this pull request as ready for review November 2, 2021 17:56
@climbfuji climbfuji changed the title DRAFT - Cleanup ccpp and remove optional arguments combined Wrapper PR for "Remove optional arguments from CCPP metadata, remove effective radii computation from Thompson MP" and "Fix/improve logic for convective transportable tracers, add GFS_checktracer debugging routine, correct spelling "janic" --> "janjic", fix inconsistent types related with noahmp and log functions" Nov 2, 2021
@climbfuji climbfuji changed the title Wrapper PR for "Remove optional arguments from CCPP metadata, remove effective radii computation from Thompson MP" and "Fix/improve logic for convective transportable tracers, add GFS_checktracer debugging routine, correct spelling "janic" --> "janjic", fix inconsistent types related with noahmp and log functions" Wrapper PR for "Remove optional arguments from CCPP metadata, remove effective radii computation from Thompson MP" and "Fix/improve logic for convective transportable tracers, add GFS_checktracer debugging routine, correct spelling "janic" --> "janjic", fix inconsistent types related with noahmp and log functions", "Modify opnReqTest related scripts for fv3_gsd test" Nov 2, 2021
@junwang-noaa junwang-noaa added the No Baseline Change No Baseline Change label Nov 3, 2021
@junwang-noaa
Copy link
Collaborator

Code changes look good to me, will approve when all the RT tests are done.

@BrianCurtis-NOAA
Copy link
Collaborator

Automated RT Failure Notification
Machine: gaea
Compiler: intel
Job: RT
Repo location: /lustre/f2/pdata/ncep/emc.nemspara/autort/pr/771598162/20211103154509/ufs-weather-model
Please manually delete: /lustre/f2/scratch/emc.nemspara/FV3_RT/rt_40458
Test cpld_control_c192_p7 006 failed failed
Test cpld_control_c192_p7 006 failed in run_test failed
Test cpld_control_c384_p7 008 failed failed
Test cpld_control_c384_p7 008 failed in run_test failed
Test control_c192 020 failed failed
Test control_c192 020 failed in run_test failed
Test control_c384 021 failed failed
Test control_c384 021 failed in run_test failed
Test control_c384gdas 022 failed failed
Test control_c384gdas 022 failed in run_test failed
Test regional_control 030 failed failed
Test regional_control 030 failed in run_test failed
Test regional_quilt 032 failed failed
Test regional_quilt 032 failed in run_test failed
Test regional_quilt_hafs 034 failed failed
Test regional_quilt_hafs 034 failed in run_test failed
Test regional_quilt_netcdf_parallel 035 failed failed
Test regional_quilt_netcdf_parallel 035 failed in run_test failed
Please make changes and add the following label back:
gaea-intel-RT

@climbfuji
Copy link
Collaborator Author

Automated RT Failure Notification
Machine: gaea
Compiler: intel
Job: RT
Repo location: /lustre/f2/pdata/ncep/emc.nemspara/autort/pr/771598162/20211103154509/ufs-weather-model
Please manually delete: /lustre/f2/scratch/emc.nemspara/FV3_RT/rt_40458
Test cpld_control_c192_p7 006 failed failed
Test cpld_control_c192_p7 006 failed in run_test failed
Test cpld_control_c384_p7 008 failed failed
Test cpld_control_c384_p7 008 failed in run_test failed
Test control_c192 020 failed failed
Test control_c192 020 failed in run_test failed
Test control_c384 021 failed failed
Test control_c384 021 failed in run_test failed
Test control_c384gdas 022 failed failed
Test control_c384gdas 022 failed in run_test failed
Test regional_control 030 failed failed
Test regional_control 030 failed in run_test failed
Test regional_quilt 032 failed failed
Test regional_quilt 032 failed in run_test failed
Test regional_quilt_hafs 034 failed failed
Test regional_quilt_hafs 034 failed in run_test failed
Test regional_quilt_netcdf_parallel 035 failed failed
Test regional_quilt_netcdf_parallel 035 failed in run_test failed
Please make changes and add the following label back:
gaea-intel-RT

All these tests timed out. I will add back a 1hr wclock time for gaea and rerun the failed tests ...

@climbfuji
Copy link
Collaborator Author

@MinsukJi-NOAA Almost all the tests finished, just waiting for snail gaea. Please let me know if you posted those changes somewhere and I didn't see them. Thanks!

@MinsukJi-NOAA
Copy link
Contributor

MinsukJi-NOAA commented Nov 3, 2021

@climbfuji Apologize for the delay. Thanks in advance for making these changes!

diff --git a/tests/opnReqTests/dcp.sh b/tests/opnReqTests/dcp.sh
index b076e8c3..4dae196e 100644
--- a/tests/opnReqTests/dcp.sh
+++ b/tests/opnReqTests/dcp.sh
@@ -2,11 +2,14 @@ set -eu
 source $PATHRT/opnReqTests/std.sh
 
 if [[ $application == 'global' ]]; then
-  #temp=$INPES
-  #INPES=$JNPES
-  #JNPES=$temp
-  INPES=6
-  JNPES=4
+  if [[ $CI_TEST == 'true' ]]; then
+    temp=$INPES
+    INPES=$JNPES
+    JNPES=$temp
+  else
+    INPES=6
+    JNPES=4
+  fi
 elif [[ $application == 'regional' ]]; then
   if [[ $CI_TEST == 'true' ]]; then
     INPES=10
diff --git a/tests/opnReqTests/rst.sh b/tests/opnReqTests/rst.sh
index cdd6ab6f..1ee9d7d4 100644
--- a/tests/opnReqTests/rst.sh
+++ b/tests/opnReqTests/rst.sh
@@ -11,6 +11,14 @@ if [[ $application == 'global' ]]; then
   else
     RESTART_FILE_PREFIX="${SYEAR}${SMONTH}$(printf "%02d" $((SDAY+1))).$(printf "%02d" $(( SHOUR + FHROT - 24 )))0000"
   fi
+
+  if [[ ! -z $NSTF_NAME ]]; then
+    second_value=$(echo $NSTF_NAME | cut -d ',' -f 2)
+    if [[ $second_value -eq 1 ]]; then
+      NSTF_NAME=$(echo $NSTF_NAME | awk 'BEGIN{FS=OFS=","} { $2-=1; print}')
+    fi
+  fi
+
 elif [[ $application == 'regional' ]]; then
   echo "Regional application not yet implemented for restart"
   exit 1

@climbfuji
Copy link
Collaborator Author

@climbfuji Apologize for the delay. Thanks in advance for making these changes!

diff --git a/tests/opnReqTests/dcp.sh b/tests/opnReqTests/dcp.sh
index b076e8c3..4dae196e 100644
--- a/tests/opnReqTests/dcp.sh
+++ b/tests/opnReqTests/dcp.sh
@@ -2,11 +2,14 @@ set -eu
 source $PATHRT/opnReqTests/std.sh
 
 if [[ $application == 'global' ]]; then
-  #temp=$INPES
-  #INPES=$JNPES
-  #JNPES=$temp
-  INPES=6
-  JNPES=4
+  if [[ $CI_TEST == 'true' ]]; then
+    temp=$INPES
+    INPES=$JNPES
+    JNPES=$temp
+  else
+    INPES=6
+    JNPES=4
+  fi
 elif [[ $application == 'regional' ]]; then
   if [[ $CI_TEST == 'true' ]]; then
     INPES=10
diff --git a/tests/opnReqTests/rst.sh b/tests/opnReqTests/rst.sh
index cdd6ab6f..1ee9d7d4 100644
--- a/tests/opnReqTests/rst.sh
+++ b/tests/opnReqTests/rst.sh
@@ -11,6 +11,14 @@ if [[ $application == 'global' ]]; then
  else
    RESTART_FILE_PREFIX="${SYEAR}${SMONTH}$(printf "%02d" $((SDAY+1))).$(printf "%02d" $(( SHOUR + FHROT - 24 )))0000"
  fi
+
+  if [[ ! -z $NSTF_NAME ]]; then
+    second_value=$(echo $NSTF_NAME | cut -d ',' -f 2)
+    if [[ $second_value -eq 1 ]]; then
+      NSTF_NAME=$(echo $NSTF_NAME | awk 'BEGIN{FS=OFS=","} { $2-=1; print}')
+    fi
+  fi
+
elif [[ $application == 'regional' ]]; then
  echo "Regional application not yet implemented for restart"
  exit 1

Thanks, Minsuk. Please check the latest commit.

@github-actions github-actions bot removed the run-ci label Nov 3, 2021
@MinsukJi-NOAA
Copy link
Contributor

MinsukJi-NOAA commented Nov 3, 2021

@climbfuji NSTF_NAME is correct now, but I still see RESTART failing

baseline dir = /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FV3_OPNREQ_TEST/OPNREQ_TEST/fv3_gsd_std_base
working dir  = /scratch1/NCEPDEV/stmp2/Minsuk.Ji/FV3_OPNREQ_TEST/opnReqTest_39402/fv3_gsd_rst
Checking test rst fv3_gsd results ....
 Comparing atmf012.tile1.nc .........OK
 Comparing atmf012.tile2.nc .........OK
 Comparing atmf012.tile3.nc .........OK
 Comparing atmf012.tile4.nc .........OK
 Comparing atmf012.tile5.nc .........OK
 Comparing atmf012.tile6.nc .........OK
 Comparing RESTART/coupler.res .........OK
 Comparing RESTART/fv_core.res.nc .........OK
 Comparing RESTART/fv_core.res.tile1.nc .........OK
 Comparing RESTART/fv_core.res.tile2.nc .........OK
 Comparing RESTART/fv_core.res.tile3.nc .........OK
 Comparing RESTART/fv_core.res.tile4.nc .........OK
 Comparing RESTART/fv_core.res.tile5.nc .........OK
 Comparing RESTART/fv_core.res.tile6.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile1.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile2.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile3.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile4.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile5.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile6.nc .........OK
 Comparing RESTART/fv_tracer.res.tile1.nc .........OK
 Comparing RESTART/fv_tracer.res.tile2.nc .........OK
 Comparing RESTART/fv_tracer.res.tile3.nc .........OK
 Comparing RESTART/fv_tracer.res.tile4.nc .........OK
 Comparing RESTART/fv_tracer.res.tile5.nc .........OK
 Comparing RESTART/fv_tracer.res.tile6.nc .........OK
 Comparing RESTART/phy_data.tile1.nc .........OK
 Comparing RESTART/phy_data.tile2.nc .........OK
 Comparing RESTART/phy_data.tile3.nc .........OK
 Comparing RESTART/phy_data.tile4.nc .........OK
 Comparing RESTART/phy_data.tile5.nc .........OK
 Comparing RESTART/phy_data.tile6.nc .........OK
 Comparing RESTART/sfc_data.tile1.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile2.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile3.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile4.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile5.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile6.nc ............ALT CHECK......NOT OK
 Comparing sfcf012.tile1.nc .........OK
 Comparing sfcf012.tile2.nc .........OK
 Comparing sfcf012.tile3.nc .........OK
 Comparing sfcf012.tile4.nc .........OK
 Comparing sfcf012.tile5.nc .........OK
 Comparing sfcf012.tile6.nc .........OK

  0: The total amount of wall time                        = 32.924707

Test rst fv3_gsd FAIL

@climbfuji
Copy link
Collaborator Author

@climbfuji NSTF_NAME is correct now, but I still see RESTART failing

baseline dir = /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FV3_OPNREQ_TEST/OPNREQ_TEST/fv3_gsd_std_base
working dir  = /scratch1/NCEPDEV/stmp2/Minsuk.Ji/FV3_OPNREQ_TEST/opnReqTest_39402/fv3_gsd_rst
Checking test rst fv3_gsd results ....
 Comparing atmf012.tile1.nc .........OK
 Comparing atmf012.tile2.nc .........OK
 Comparing atmf012.tile3.nc .........OK
 Comparing atmf012.tile4.nc .........OK
 Comparing atmf012.tile5.nc .........OK
 Comparing atmf012.tile6.nc .........OK
 Comparing RESTART/coupler.res .........OK
 Comparing RESTART/fv_core.res.nc .........OK
 Comparing RESTART/fv_core.res.tile1.nc .........OK
 Comparing RESTART/fv_core.res.tile2.nc .........OK
 Comparing RESTART/fv_core.res.tile3.nc .........OK
 Comparing RESTART/fv_core.res.tile4.nc .........OK
 Comparing RESTART/fv_core.res.tile5.nc .........OK
 Comparing RESTART/fv_core.res.tile6.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile1.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile2.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile3.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile4.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile5.nc .........OK
 Comparing RESTART/fv_srf_wnd.res.tile6.nc .........OK
 Comparing RESTART/fv_tracer.res.tile1.nc .........OK
 Comparing RESTART/fv_tracer.res.tile2.nc .........OK
 Comparing RESTART/fv_tracer.res.tile3.nc .........OK
 Comparing RESTART/fv_tracer.res.tile4.nc .........OK
 Comparing RESTART/fv_tracer.res.tile5.nc .........OK
 Comparing RESTART/fv_tracer.res.tile6.nc .........OK
 Comparing RESTART/phy_data.tile1.nc .........OK
 Comparing RESTART/phy_data.tile2.nc .........OK
 Comparing RESTART/phy_data.tile3.nc .........OK
 Comparing RESTART/phy_data.tile4.nc .........OK
 Comparing RESTART/phy_data.tile5.nc .........OK
 Comparing RESTART/phy_data.tile6.nc .........OK
 Comparing RESTART/sfc_data.tile1.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile2.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile3.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile4.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile5.nc ............ALT CHECK......NOT OK
 Comparing RESTART/sfc_data.tile6.nc ............ALT CHECK......NOT OK
 Comparing sfcf012.tile1.nc .........OK
 Comparing sfcf012.tile2.nc .........OK
 Comparing sfcf012.tile3.nc .........OK
 Comparing sfcf012.tile4.nc .........OK
 Comparing sfcf012.tile5.nc .........OK
 Comparing sfcf012.tile6.nc .........OK

  0: The total amount of wall time                        = 32.924707

Test rst fv3_gsd FAIL

That is ok. Development in the authoritative repositories broke the GSL restart tests a while ago, we fixed it in the NOAA-GSL fork, gsl/develop branch, but these changes have not been merged back yet.

@climbfuji climbfuji added the Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. label Nov 3, 2021
Copy link
Collaborator

@junwang-noaa junwang-noaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the fv3 points back to the official repo, the PR can be committed.

@climbfuji
Copy link
Collaborator Author

Submodule pointer for fv3atm looks ok to me, please check and merge if you agree.

@junwang-noaa junwang-noaa merged commit f76123c into ufs-community:develop Nov 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Baseline Change No Baseline Change Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. Waiting for Reviews The PR is waiting for reviews from associated component PR's.
Projects
None yet
6 participants