[develop] Fix failure on warm start option of SRW-AQM #1065

Merged
MichaelLueken merged 6 commits into ufs-community:develop from feature/warm_start on Apr 4, 2024

Conversation

chan-hoo
Collaborator

DESCRIPTION OF CHANGES:

  • Fix failure on the warm start option of SRW-AQM.
  • Change the sample script config.aqm.yaml for running a warm start.
  • Change cpreq to cp because cpreq does not work correctly on machines other than WCOSS2.
  • Add missing exclusion to .gitignore.
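
The cpreq-to-cp change above can be illustrated with a small sketch. It assumes that the property worth keeping from cpreq is that a missing required input stops the job instead of passing silently; that behavior is an assumption here, not something this PR states. The sketch is in Python for illustration only, and the name `copy_required` is hypothetical and not part of the SRW App.

```python
import shutil
from pathlib import Path


def copy_required(src, dst):
    """Copy src to dst, raising an error if src is missing.

    Hypothetical helper: it uses only the standard library, so it does
    not depend on any site-specific copy utility being installed.
    """
    src_path = Path(src)
    if not src_path.is_file():
        # Fail loudly so a missing input stops the task right away.
        raise FileNotFoundError(f"required input file not found: {src_path}")
    # shutil.copy2 copies the data and tries to preserve file metadata.
    return Path(shutil.copy2(src_path, dst))
```

A caller would use it as `copy_required("input.nc", "workdir/")`, where both paths are placeholders.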

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • WE2E fundamental tests on Orion

  • WE2E AQM test on Hera

  • New sample script for a warm start run on Hera

  • hera.intel

  • orion.intel

  • hercules.intel

  • cheyenne.intel

  • cheyenne.gnu

  • derecho.intel

  • gaea.intel

  • gaeac5.intel

  • jet.intel

  • wcoss2.intel

  • NOAA Cloud (indicate which platform)

  • Jenkins

  • fundamental test suite

  • comprehensive tests (specify which if a subset was used)

ISSUE:

Fixes Issue #1061

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

Collaborator

@MichaelLueken MichaelLueken left a comment

@chan-hoo -

Both the aqm_grid_AQM_NA13km_suite_GFS_v16 WE2E test and the aqm_AQMNA13km_warmstart sample config were tested on Hera, and both passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
aqm_grid_AQM_NA13km_suite_GFS_v16_20240329190758                   COMPLETE            4992.21
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            4992.21

       CYCLE                    TASK                       JOBID               STATE         EXIT STATUS     TRIES      DURATION
================================================================================================================================
202311100000               make_grid                    57819410           SUCCEEDED                   0         1          13.0
202311100000               make_orog                    57819464           SUCCEEDED                   0         1         170.0
202311100000          make_sfc_climo                    57819571           SUCCEEDED                   0         1          46.0
202311100000           nexus_gfs_sfc                    57819411           SUCCEEDED                   0         1           7.0
202311100000       nexus_emission_00                    57819465           SUCCEEDED                   0         1         561.0
202311100000       nexus_emission_01                    57819466           SUCCEEDED                   0         1         540.0
202311100000       nexus_emission_02                    57819468           SUCCEEDED                   0         1         594.0
202311100000        nexus_post_split                    57819892           SUCCEEDED                   0         1          86.0
202311100000           fire_emission                    57819412           SUCCEEDED                   0         1          10.0
202311100000            point_source                    57819470           SUCCEEDED                   0         1         217.0
202311100000             aqm_ics_ext                    57819842           SUCCEEDED                   0         1         118.0
202311100000                aqm_lbcs                    57819893           SUCCEEDED                   0         1          64.0
202311100000           get_extrn_ics                    57819413           SUCCEEDED                   0         1          60.0
202311100000          get_extrn_lbcs                    57819414           SUCCEEDED                   0         1         155.0
202311100000         make_ics_mem000                    57819683           SUCCEEDED                   0         1         106.0
202311100000        make_lbcs_mem000                    57819684           SUCCEEDED                   0         1         270.0
202311100000         run_fcst_mem000                    57819960           SUCCEEDED                   0         1        2477.0
202311100000    run_post_mem000_f000                    57820339           SUCCEEDED                   0         1          18.0
202311100000    run_post_mem000_f001                    57820340           SUCCEEDED                   0         1          20.0
202311100000    run_post_mem000_f002                    57820415           SUCCEEDED                   0         1          22.0
202311100000    run_post_mem000_f003                    57820416           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f004                    57820465           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f005                    57820464           SUCCEEDED                   0         1          15.0
202311100000    run_post_mem000_f006                    57820554           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f007                    57820553           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f008                    57820568           SUCCEEDED                   0         1          15.0
202311100000    run_post_mem000_f009                    57820569           SUCCEEDED                   0         1          15.0
202311100000    run_post_mem000_f010                    57820570           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f011                    57820605           SUCCEEDED                   0         1          15.0
202311100000    run_post_mem000_f012                    57820606           SUCCEEDED                   0         1          19.0
202311100000    run_post_mem000_f013                    57820703           SUCCEEDED                   0         1          17.0
202311100000    run_post_mem000_f014                    57820704           SUCCEEDED                   0         1          18.0
202311100000    run_post_mem000_f015                    57820742           SUCCEEDED                   0         1          17.0
202311100000    run_post_mem000_f016                    57820743           SUCCEEDED                   0         1          16.0
202311100000    run_post_mem000_f017                    57820803           SUCCEEDED                   0         1          15.0
202311100000    run_post_mem000_f018                    57820804           SUCCEEDED                   0         1          18.0
202311100000    run_post_mem000_f019                    57820877           SUCCEEDED                   0         1          18.0
202311100000    run_post_mem000_f020                    57820920           SUCCEEDED                   0         1          17.0
202311100000    run_post_mem000_f021                    57820919           SUCCEEDED                   0         1          17.0
202311100000    run_post_mem000_f022                    57821011           SUCCEEDED                   0         1          14.0
202311100000    run_post_mem000_f023                    57821012           SUCCEEDED                   0         1          14.0
202311100000    run_post_mem000_f024                    57821013           SUCCEEDED                   0         1          13.0
================================================================================================================================
202311110000           nexus_gfs_sfc                    57819415           SUCCEEDED                   0         1          49.0
202311110000       nexus_emission_00                    57819467           SUCCEEDED                   0         1         548.0
202311110000       nexus_emission_01                    57819471           SUCCEEDED                   0         1         524.0
202311110000       nexus_emission_02                    57819469           SUCCEEDED                   0         1         591.0
202311110000        nexus_post_split                    57819894           SUCCEEDED                   0         1          84.0
202311110000           fire_emission                    57819416           SUCCEEDED                   0         1          14.0
202311110000            point_source                    57819472           SUCCEEDED                   0         1         217.0
202311110000                 aqm_ics                    57821014           SUCCEEDED                   0         1         115.0
202311110000                aqm_lbcs                    57819895           SUCCEEDED                   0         1          36.0
202311110000           get_extrn_ics                    57819417           SUCCEEDED                   0         1          59.0
202311110000          get_extrn_lbcs                    57819418           SUCCEEDED                   0         1         156.0
202311110000         make_ics_mem000                    57819682           SUCCEEDED                   0         1         105.0
202311110000        make_lbcs_mem000                    57819681           SUCCEEDED                   0         1         271.0
202311110000         run_fcst_mem000                    57821161           SUCCEEDED                   0         1        2675.0
202311110000    run_post_mem000_f000                    57821616           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f001                    57821615           SUCCEEDED                   0         1          21.0
202311110000    run_post_mem000_f002                    57821675           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f003                    57821676           SUCCEEDED                   0         1          17.0
202311110000    run_post_mem000_f004                    57821966           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f005                    57821961           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f006                    57822041           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f007                    57822040           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f008                    57822101           SUCCEEDED                   0         1          15.0
202311110000    run_post_mem000_f009                    57822100           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f010                    57822256           SUCCEEDED                   0         1          19.0
202311110000    run_post_mem000_f011                    57822255           SUCCEEDED                   0         1          14.0
202311110000    run_post_mem000_f012                    57822291           SUCCEEDED                   0         1          19.0
202311110000    run_post_mem000_f013                    57822335           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f014                    57822334           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f015                    57822410           SUCCEEDED                   0         1          18.0
202311110000    run_post_mem000_f016                    57822411           SUCCEEDED                   0         1          16.0
202311110000    run_post_mem000_f017                    57822462           SUCCEEDED                   0         1          14.0
202311110000    run_post_mem000_f018                    57822461           SUCCEEDED                   0         1          13.0
202311110000    run_post_mem000_f019                    57822692           SUCCEEDED                   0         1          17.0
202311110000    run_post_mem000_f020                    57822691           SUCCEEDED                   0         1          13.0
202311110000    run_post_mem000_f021                    57822782           SUCCEEDED                   0         1          17.0
202311110000    run_post_mem000_f022                    57822872           SUCCEEDED                   0         1          19.0
202311110000    run_post_mem000_f023                    57822873           SUCCEEDED                   0         1          17.0
202311110000    run_post_mem000_f024                    57822874           SUCCEEDED                   0         1          15.0
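
A task table like the one above can also be checked mechanically instead of by eye. The following is a minimal sketch, assuming each data row has exactly the seven whitespace-separated columns shown (cycle, task, job ID, state, exit status, tries, duration); the function name `failed_tasks` is hypothetical.

```python
def failed_tasks(status_text):
    """Return (cycle, task, state) for every row whose state is not SUCCEEDED."""
    problems = []
    for line in status_text.splitlines():
        fields = line.split()
        # Data rows have 7 columns and start with a 12-digit cycle (YYYYMMDDHHMM).
        # Header and separator rows do not match and are skipped.
        if len(fields) != 7 or not (fields[0].isdigit() and len(fields[0]) == 12):
            continue
        cycle, task, _jobid, state = fields[:4]
        if state != "SUCCEEDED":
            problems.append((cycle, task, state))
    return problems


# Two rows copied from the table above.
sample = """\
202311100000             aqm_ics_ext                    57819842           SUCCEEDED                   0         1         118.0
202311110000                 aqm_ics                    57821014           SUCCEEDED                   0         1         115.0
"""
print(failed_tasks(sample))  # an empty list means every listed task succeeded
```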

Approving this PR now.

@chan-hoo
Collaborator Author

chan-hoo commented Apr 1, 2024

@MichaelLueken, thank you so much for your quick approval!! :)

@RatkoVasic-NOAA
Collaborator

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
aqm_grid_AQM_NA13km_suite_GFS_v16_20240401183550                   COMPLETE            4868.32
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            4868.32

Approving!

MichaelLueken added the run_we2e_coverage_tests label (Run the coverage set of SRW end-to-end tests) on Apr 1, 2024

@chan-hoo
Collaborator Author

chan-hoo commented Apr 2, 2024

@RatkoVasic-NOAA, thank you soooo much for your approval!!! :)

@MichaelLueken
Collaborator

@chan-hoo -

The Functional WorkflowTaskTests stage failed on Derecho (didn't even make it to the actual test stage). Once the machine returns from maintenance, I will resubmit the Jenkins tests on Derecho.

@chan-hoo
Collaborator Author

chan-hoo commented Apr 4, 2024

@MichaelLueken, the test on Derecho seems to have completed successfully.

@MichaelLueken
Collaborator

The rerun of the Jenkins tests on Derecho passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_IndianOcean_6km_20240403140142                      COMPLETE              27.38
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot_20  COMPLETE              56.90
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024040314014  COMPLETE              51.09
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR_20240403  COMPLETE              48.33
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE              21.03
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024040314015  COMPLETE              46.74
pregen_grid_orog_sfc_climo_20240403140154                          COMPLETE              18.15
specify_template_filenames_20240403140157                          COMPLETE              18.48
2019_hurricane_barry_20240403140158                                COMPLETE              55.54
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             343.64
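
As a cross-check of the summary above, the per-experiment core hours can be summed and compared with the reported total. This is a minimal sketch that assumes the last column of each data row is the core-hour figure and that the final row is labeled `Total`; the function name `core_hours` is hypothetical.

```python
def core_hours(summary_text):
    """Return (sum of per-experiment core hours, reported total)."""
    per_experiment = []
    reported_total = None
    for line in summary_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # separator rules and blank lines
        try:
            hours = float(fields[-1])
        except ValueError:
            continue  # header row: last column is not a number
        if fields[0] == "Total":
            reported_total = hours
        else:
            per_experiment.append(hours)
    return round(sum(per_experiment), 2), reported_total


# Rows copied from the Derecho summary above (column padding shortened).
derecho_summary = """\
custom_ESGgrid_IndianOcean_6km_20240403140142 COMPLETE 27.38
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot_20 COMPLETE 56.90
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024040314014 COMPLETE 51.09
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR_20240403 COMPLETE 48.33
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2 COMPLETE 21.03
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024040314015 COMPLETE 46.74
pregen_grid_orog_sfc_climo_20240403140154 COMPLETE 18.15
specify_template_filenames_20240403140157 COMPLETE 18.48
2019_hurricane_barry_20240403140158 COMPLETE 55.54
Total COMPLETE 343.64
"""
print(core_hours(derecho_summary))  # (343.64, 343.64)
```

The two values agree, so the reported total is consistent with the nine listed experiments.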

Moving forward with merging this work now.

@MichaelLueken MichaelLueken merged commit d1401ec into ufs-community:develop Apr 4, 2024
4 of 5 checks passed
@chan-hoo chan-hoo deleted the feature/warm_start branch May 30, 2024 10:48
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
Development

Successfully merging this pull request may close these issues.

[SRW-AQM] Add warm start to AQM sample script
3 participants