Attribute cleanup, allow coupled executable to run as standalone; update WW3 to use CPP instead of switches (was #880) #868

Merged
merged 126 commits into from
Nov 3, 2021


DeniseWorthen
Collaborator

@DeniseWorthen DeniseWorthen commented Oct 15, 2021

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.

  • If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Instructions: All subsequent sections of text should be filled in as appropriate.

The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.

Description

Updates NEMS to allow the coupled executable to run as the standalone (i.e., app=S2S run as app=ATM).

Issue(s) addressed

Testing

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • The full RTs were run for hera.intel, hera.gnu and cheyenne.intel. All baselines were B4B.

  • A second test was done on cheyenne.intel using the following rt.conf:

###################################################################################################################################################################################
# S2S tests                                                                                                                                                                    #
###################################################################################################################################################################################

COMPILE | -DAPP=S2SW -DCCPP_SUITES=FV3_GFS_v16_coupled_nsstNoahmpUGWPv1,FV3_GFS_v16_nsstNoahmpUGWPv1                              | - wcoss_cray                            | fv3 |
# Waves off
RUN     | cpld_control_p7                                                                                                         | - wcoss_cray                            | fv3 |

###################################################################################################################################################################################
# PROD tests                                                                                                                                                                      #
###################################################################################################################################################################################

#COMPILE | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16,FV3_GFS_v15_thompson_mynn,FV3_GFS_v15_thompson_mynn_RRTMGP,FV3_GSD_v0,FV3_RAP,FV3_HRRR,FV3_RRFS_v1beta,FV3_RRFS_v1alpha,FV3_GFS_v16_nsstNoahmpUGWPv1 -D32BIT=ON |  | fv3 |
COMPILE | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16,FV3_GFS_v15_thompson_mynn,FV3_GFS_v15_thompson_mynn_RRTMGP,FV3_GSD_v0,FV3_RAP,FV3_HRRR,FV3_RRFS_v1beta,FV3_RRFS_v1alpha,FV3_GFS_v16_nsstNoahmpUGWPv1 |  | fv3 |

RUN     | control_p7                                                                                                              |                                         | fv3 |

After the test completed, the coupled executable was copied into the control_p7 run directory and the test was re-run using the job_card. The forecast and restart files from this second run of control_p7 were B4B with the run using the original standalone executable.
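The B4B (bit-for-bit) check described above can be sketched roughly as follows. This is only an illustration, not the comparison the regression test scripts actually perform: the function name `compare_b4b`, the `*.nc` pattern, and the idea of keeping the two runs in separate directories are all assumptions made for the example.

```python
import filecmp
from pathlib import Path


def compare_b4b(baseline_dir, candidate_dir, pattern="*.nc"):
    """Compare files byte-for-byte between two run directories.

    Hypothetical helper: the "*.nc" pattern is an assumption about how
    the forecast and restart files are named.
    Returns three sorted lists of file names: (matching, differing, missing).
    """
    baseline_dir = Path(baseline_dir)
    candidate_dir = Path(candidate_dir)
    matching, differing, missing = [], [], []

    for ref in sorted(baseline_dir.glob(pattern)):
        new = candidate_dir / ref.name
        if not new.is_file():
            # The second run did not produce this file at all.
            missing.append(ref.name)
        elif filecmp.cmp(ref, new, shallow=False):
            # shallow=False forces a content comparison rather than
            # trusting size and modification time alone.
            matching.append(ref.name)
        else:
            differing.append(ref.name)

    return matching, differing, missing
```

Calling `compare_b4b` with the run directory of the standalone executable as `baseline_dir` and the run directory of the coupled executable as `candidate_dir` would count as B4B only if the `differing` and `missing` lists both come back empty.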

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3
  • CI 7188628

Dependencies

If testing this branch requires non-default branches in other repositories, list them. Those branches should have matching names (ideally).

DeniseWorthen and others added 30 commits March 27, 2021 12:30
This reverts commit 7b826d4.
@BrianCurtis-NOAA
Collaborator

Automated RT Failure Notification
Machine: jet
Compiler: intel
Job: RT
Repo location: /lfs4/HFIP/h-nems/emc.nemspara/autort/pr/759069598/20211102164516/ufs-weather-model
Please manually delete: /lfs4/HFIP/h-nems/emc.nemspara/RT_RUNDIRS/emc.nemspara/FV3_RT/rt_233401
Test hafs_regional_datm_cdeps 080 failed in run_test failed
Test compile_012 failed in run_compile failed
Please make changes and add the following label back:
jet-intel-RT

@DeniseWorthen
Collaborator Author

The Jet failure for the cdeps test was:

TEST 080 hafs_regional_datm_cdeps is waiting to enter the queue
TEST 080 hafs_regional_datm_cdeps is submitted
Slurm unknown status -. Check sacct ...
sacct: error: slurm_persist_conn_open_without_init: failed to open persistent connection to host:jetbqs2:6819: Connection refused
sacct: error: Sending PersistInit msg: Connection refused
sacct: error: Problem talking to the database: Connection refused

The compile_012 error was a long string of messages like:

/lfs4/HFIP/h-nems/emc.nemspara/autort/pr/759069598/20211102164516/ufs-weather-model/stochastic_physics/random_numbers.F90(11): remark #15009: random_numbers_mp_random_01_cb_ has been targeted for automatic cpu dispatch

I'll re-run the affected tests.

@DeniseWorthen
Collaborator Author

I got the same message on jet again with the slurm/sacct error.

@junwang-noaa
Collaborator

Let me try it on jet. I don't think there is any change related to the stochastic physics though.

@DeniseWorthen
Collaborator Author

I think it must be some system issue. Didn't Bin have trouble yesterday?

@junwang-noaa
Collaborator

junwang-noaa commented Nov 2, 2021 via email

@DeniseWorthen
Collaborator Author

DeniseWorthen commented Nov 2, 2021

I submitted the three tests serially (no -e) and the compile 012 job (COMPILE | -DAPP=ATMW -DCCPP_SUITES=FV3_GFS_v16 -D32BIT=ON) at least completed and the first test is now pending.

@junwang-noaa
Collaborator

I reran the failed case on jet and it finished successfully.

@DeniseWorthen
Collaborator Author

@aliabdolali @JessicaMeixner-NOAA Ready to merge to ufs-weather-model once WW3 has been updated.

@JessicaMeixner-NOAA
Collaborator

@DeniseWorthen, @aliabdolali just approved and merged the WW3 PR

@DeniseWorthen DeniseWorthen merged commit eb42fb8 into ufs-community:develop Nov 3, 2021
@DeniseWorthen DeniseWorthen deleted the feature/attrclnup branch January 3, 2022 14:19
Labels
No Baseline Change
Waiting for Reviews: The PR is waiting for reviews from associated component PRs.
Development

Successfully merging this pull request may close these issues.

Update WW3 to uses CPP instead of switches
5 participants