[develop] Fixing bug: moved placing fix_lam tests' directories from common place (ufs-srweather-app) to each tests' run directory. #977

Merged (5 commits) on Dec 14, 2023

Conversation

RatkoVasic-NOAA
Collaborator

DESCRIPTION OF CHANGES:

While running the comprehensive tests, we noticed that one test (nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR) fails, although the same test runs fine when submitted on its own. The cause is that tests using pre-staged fix files (orography, vegetation, snow, ...) had those files staged in the source directory (ufs-srweather-app/fix) instead of the run directory. This works as long as the tests have different resolutions, but for tests with the same resolution and different domain dimensions (e.g., C403: RRFS_CONUS_25km and RRFS_CONUScompact_25km) the fix files have the same names, so the second test overwrites the files of the first:

/scratch1/NCEPDEV/nems/role.epic/UFS_SRW_data/develop/FV3LAM_pregen/RRFS_CONUS_25km/C403_oro_data.tile7.halo0.nc        ---> fix/fix_lam/C403_oro_data.tile7.halo0.nc
/scratch1/NCEPDEV/nems/role.epic/UFS_SRW_data/develop/FV3LAM_pregen/RRFS_CONUScompact_25km/C403_oro_data.tile7.halo0.nc ---> fix/fix_lam/C403_oro_data.tile7.halo0.nc

This change places the fix_lam directory in each test's run directory, so tests no longer share staged fix files.
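
The collision described above can be reproduced with a small sketch. This is an illustration only, not the workflow's actual staging code: it assumes the fix files are symlinked by file name, and the helper `stage_fix_files`, the `expt_dirs` layout, and the temporary directory tree are all hypothetical stand-ins.

```python
import tempfile
from pathlib import Path

FIX_FILE = "C403_oro_data.tile7.halo0.nc"
GRIDS = ["RRFS_CONUS_25km", "RRFS_CONUScompact_25km"]


def stage_fix_files(pregen_dir: Path, fix_lam_dir: Path) -> None:
    """Hypothetical helper: link every pre-generated file into fix_lam_dir by name."""
    fix_lam_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(pregen_dir.iterdir()):
        link = fix_lam_dir / src.name
        if link.is_symlink() or link.exists():
            link.unlink()  # an existing link with the same name is silently replaced
        link.symlink_to(src)


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pregen = root / "FV3LAM_pregen"
        # Both grids are C403, so their pre-generated files share one file name.
        for grid in GRIDS:
            (pregen / grid).mkdir(parents=True)
            (pregen / grid / FIX_FILE).write_text(grid)

        # Old behaviour: every test stages into one shared source-tree directory.
        shared = root / "ufs-srweather-app" / "fix" / "fix_lam"
        for grid in GRIDS:
            stage_fix_files(pregen / grid, shared)
        shared_source = (shared / FIX_FILE).read_text()

        # New behaviour: each test stages into its own run directory.
        per_test = {}
        for grid in GRIDS:
            run_fix = root / "expt_dirs" / grid / "fix_lam"
            stage_fix_files(pregen / grid, run_fix)
            per_test[grid] = (run_fix / FIX_FILE).read_text()

    return {"shared": shared_source, "per_test": per_test}
```

With the shared directory, the single staged file ends up pointing at the RRFS_CONUScompact_25km data no matter which test reads it. With one fix_lam directory per run, each test resolves its own grid's file.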

Type of change

  • Bug fix (non-breaking change which fixes an issue)

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • hercules.intel
  • cheyenne.intel
  • cheyenne.gnu
  • derecho.intel
  • gaea.intel
  • gaeac5.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (hera-intel, hera-gnu, gaea-c4, gaea-c5, jet, orion, hercules)

ISSUE:

#974

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

@MichaelLueken MichaelLueken changed the title Fixing bug: moved placing fix_lam tests' directories from common place (ufs-srweather-app) to each tests' run directory. [develop] Fixing bug: moved placing fix_lam tests' directories from common place (ufs-srweather-app) to each tests' run directory. Nov 27, 2023
@MichaelLueken MichaelLueken added the bug Something isn't working label Nov 27, 2023
MichaelLueken and others added 2 commits November 29, 2023 15:43
…hera.intel.nco to allow the latter to successfully run without failure.
Switch custom grid WE2E tests for coverage.hera.gnu.com and coverage.hera.intel.nco
Collaborator

@MichaelLueken MichaelLueken left a comment

@RatkoVasic-NOAA -

These changes look good to me! I was able to run the comprehensive tests on Hera and Orion. All tests successfully passed. Also, thank you for merging my modifications to the coverage.hera.gnu.com and coverage.hera.intel.nco suites. Once this PR is merged, all coverage and comprehensive WE2E tests should run and pass without issue.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label Dec 7, 2023
@MichaelLueken
Collaborator

The WE2E coverage tests were manually run on Derecho and all successfully passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_IndianOcean_6km                                     COMPLETE              23.22
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot     COMPLETE              36.99
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16                COMPLETE              44.67
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR           COMPLETE              28.77
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta    COMPLETE              17.53
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR                COMPLETE              40.37
nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_  COMPLETE              23.79
pregen_grid_orog_sfc_climo                                         COMPLETE              15.12
specify_template_filenames                                         COMPLETE              14.33
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             244.79

The Jenkins tests failed to build on Jet because the disk quota on the machine was exceeded. Both Hera Intel and Hera GNU lost communication with Jenkins while attempting to run the Functional Workflow Task Tests. Both the Gaea and Orion tests encountered system-bus-related errors. The Gaea C5 and Hercules tests all passed. Right now, all Jenkins runners are down, so I'm unable to requeue the pipeline. Hopefully the issue will be resolved soon so that testing can resume and we can move forward with this PR.

@MichaelLueken
Collaborator

The Jenkins tests have successfully run on Gaea:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
community                                                          COMPLETE              24.45
custom_ESGgrid_NewZealand_3km                                      COMPLETE              63.91
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta    COMPLETE              29.74
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP              COMPLETE              32.38
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR             COMPLETE              34.25
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thompson  COMPLETE             358.37
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR          COMPLETE              38.00
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta     COMPLETE             369.92
grid_SUBCONUS_Ind_3km_ics_RAP_lbcs_RAP_suite_RRFS_v1beta_plot      COMPLETE              10.80
nco_ensemble                                                       COMPLETE              80.07
nco_grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thom  COMPLETE             354.19
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            1396.08

Gaea-C5:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
community                                                          COMPLETE              44.12
custom_ESGgrid_NewZealand_3km                                      COMPLETE              48.51
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta    COMPLETE              26.67
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP              COMPLETE              29.67
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR             COMPLETE              29.67
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thompson  COMPLETE             314.03
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR          COMPLETE              30.62
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta     COMPLETE             274.07
grid_SUBCONUS_Ind_3km_ics_RAP_lbcs_RAP_suite_RRFS_v1beta_plot      COMPLETE              16.93
nco_ensemble                                                       COMPLETE              98.94
nco_grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thom  COMPLETE             305.37
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            1218.60

Jet:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
community                                                          COMPLETE              17.61
custom_ESGgrid                                                     COMPLETE              17.63
custom_ESGgrid_Great_Lakes_snow_8km                                COMPLETE              12.28
custom_GFDLgrid                                                    COMPLETE               9.41
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_nemsio_2021032018         COMPLETE               9.20
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_netcdf_2022060112_48h     COMPLETE              49.34
get_from_HPSS_ics_RAP_lbcs_RAP                                     COMPLETE              15.84
grid_RRFS_AK_3km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR                 COMPLETE             223.65
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot     COMPLETE              39.54
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2        COMPLETE               8.37
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta       COMPLETE             500.31
nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR       COMPLETE              10.60
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             913.78

Hercules:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_GFDLgrid__GFDLgrid_USE_NUM_CELLS_IN_FILENAMES_eq_FALSE      COMPLETE               9.12
grid_CONUS_25km_GFDLgrid_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16      COMPLETE              11.38
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta      COMPLETE              28.59
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              18.39
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR             COMPLETE              25.07
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP              COMPLETE              52.91
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   COMPLETE              13.59
grid_RRFS_NA_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP                 COMPLETE              65.32
grid_SUBCONUS_Ind_3km_ics_NAM_lbcs_NAM_suite_GFS_v16               COMPLETE              29.22
MET_verification_only_vx                                           COMPLETE               0.21
specify_EXTRN_MDL_SYSBASEDIR_ICS_LBCS                              COMPLETE               8.47
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             262.27

And Orion:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_SF_1p1km                                            COMPLETE             161.63
deactivate_tasks                                                   COMPLETE               1.49
get_from_AWS_ics_GEFS_lbcs_GEFS_fmt_grib2_2022040400_ensemble_2me  COMPLETE             762.43
grid_CONUS_3km_GFDLgrid_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta   COMPLETE             260.03
grid_RRFS_AK_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot        COMPLETE             142.19
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_RRFS_v1beta            COMPLETE              17.05
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR              COMPLETE             383.55
grid_RRFS_CONUScompact_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   COMPLETE              31.69
grid_RRFS_CONUScompact_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16    COMPLETE             287.85
grid_SUBCONUS_Ind_3km_ics_FV3GFS_lbcs_FV3GFS_suite_WoFS_v0         COMPLETE              16.30
nco                                                                COMPLETE               7.88
2020_CAD                                                           COMPLETE              34.19
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            2106.28

Due to Hera allocation issues, the Hera Intel coverage tests were manually run and all successfully passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_Peru_12km                                           COMPLETE              19.99
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_grib2_2019061200          COMPLETE               6.99
get_from_HPSS_ics_GDAS_lbcs_GDAS_fmt_netcdf_2022040400_ensemble_2  COMPLETE             769.86
get_from_HPSS_ics_HRRR_lbcs_RAP                                    COMPLETE              14.60
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2        COMPLETE               7.95
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot     COMPLETE              13.71
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_RAP_suite_RAP                 COMPLETE              11.11
grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v15p2        COMPLETE               7.30
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2         COMPLETE             235.71
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16           COMPLETE             311.38
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_HRRR            COMPLETE             379.63
pregen_grid_orog_sfc_climo                                         COMPLETE               8.72
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            1786.95

Moving forward with the merge now.
