-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[develop] Transition the var_defns bash file to YAML. #1098
[develop] Transition the var_defns bash file to YAML. #1098
Conversation
# USHdir | ||
# | ||
# workflow: | ||
# EXPTDIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pattern that shows up regularly. I did my best to add a doc block to the top of each j-job and ex-script to list the variables that are used in the script and the sections they come from.
The run-time variables may be from Rocoto jobs or from the NCO preamble.
source_config_for_task "" ${GLOBAL_VAR_DEFNS_FP} | ||
for sect in user nco workflow ; do | ||
source_yaml ${GLOBAL_VAR_DEFNS_FP} ${sect} | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the pattern I've added to explicitly state which sections of the var_defns file should be "sourced". The bash util helps us run a UW tool that translates YAML to bash and then exports it line-by-line.
This sort of thing will go away with a full integration of uwtools drivers (planned work in upcoming EPIC PI 13).
@@ -16,7 +31,9 @@ | |||
#----------------------------------------------------------------------- | |||
# | |||
. $USHdir/source_util_funcs.sh | |||
source_config_for_task "" ${GLOBAL_VAR_DEFNS_FP} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This command sources everything in the bash file that does not start with "task_", so the environment is now narrowed to the specific script with the new implementation.
@@ -3,97 +3,25 @@ | |||
# | |||
#----------------------------------------------------------------------- | |||
# | |||
# This script generates grid and orography files in NetCDF format that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This text block was not consistent with what this script actually does, nor what its corresponding ex-script does.
set +u | ||
conda activate ${SRW_GRAPHICS_ENV} | ||
set -u | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This task uses a different environment to run graphics, so loads it here instead of in the load_modules_run_task.sh
file.
line=$(echo "$line" | sed -E "s/='\[(.*)\]'/=(\1)/") | ||
line=${line//,/} | ||
line=${line//\"/} | ||
line=${line/None/} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines handle the transition from YAML lists to bash lists, and "None" values being empty variables in bash.
@@ -468,7 +468,7 @@ workflow: | |||
# | |||
#----------------------------------------------------------------------- | |||
# | |||
WORKFLOW_ID: !nowtimestamp '' | |||
WORKFLOW_ID: "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing isn't supported in uwtools, and isn't actually needed for SRW.
@@ -639,7 +639,7 @@ def generate_FV3LAM_wflow( | |||
input_format="nml", | |||
output_file=FV3_NML_STOCH_FP, | |||
output_format="nml", | |||
supplemental_configs=[settings], | |||
update_config=get_nml_config(settings), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change to the API in the new uwtools version.
@@ -1,5 +1,7 @@ | |||
#!/bin/bash | |||
|
|||
set +u |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some how this script wasn't subjected to this requirement.
else | ||
exec "${jjob_fp}" | ||
fi | ||
source "${jjob_fp}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running the script in the same shell provides access to modify the conda env.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The modification that you made to ush/setup.py
to correct the behavior with the DATE_FIRST_CYCL
variable allows both the AQM cold start WE2E test:
----------------------------------------------------------------------------------------------------
Experiment name | Status | Core hours used
----------------------------------------------------------------------------------------------------
aqm_grid_AQM_NA13km_suite_GFS_v16_20240718123320 COMPLETE 6325.57
----------------------------------------------------------------------------------------------------
Total COMPLETE 6325.57
and the AQM warm start configuration:
CYCLE TASK JOBID STATE EXIT STATUS TRIES DURATION
================================================================================================================================
202311100000 make_grid 1989729 SUCCEEDED 0 1 30.0
202311100000 make_orog 1989833 SUCCEEDED 0 1 185.0
202311100000 make_sfc_climo 1989939 SUCCEEDED 0 1 97.0
202311100000 nexus_gfs_sfc 1989730 SUCCEEDED 0 1 30.0
202311100000 nexus_emission_00 1989834 SUCCEEDED 0 1 658.0
202311100000 nexus_emission_01 1989835 SUCCEEDED 0 1 640.0
202311100000 nexus_emission_02 1989836 SUCCEEDED 0 1 727.0
202311100000 nexus_post_split 1990128 SUCCEEDED 0 1 103.0
202311100000 fire_emission 1989731 SUCCEEDED 0 1 92.0
202311100000 point_source 1989837 SUCCEEDED 0 1 284.0
202311100000 aqm_ics_ext 1990040 SUCCEEDED 0 1 117.0
202311100000 aqm_lbcs 1990129 SUCCEEDED 0 1 83.0
202311100000 get_extrn_ics 1989732 SUCCEEDED 0 1 23.0
202311100000 get_extrn_lbcs 1989733 SUCCEEDED 0 1 23.0
202311100000 make_ics_mem000 1990013 SUCCEEDED 0 1 116.0
202311100000 make_lbcs_mem000 1990014 SUCCEEDED 0 1 278.0
202311100000 run_fcst_mem000 1990154 SUCCEEDED 0 1 2659.0
202311100000 run_post_mem000_f000 1990315 SUCCEEDED 0 1 47.0
202311100000 run_post_mem000_f001 1990328 SUCCEEDED 0 1 27.0
202311100000 run_post_mem000_f002 1990329 SUCCEEDED 0 1 43.0
202311100000 run_post_mem000_f003 1990371 SUCCEEDED 0 1 29.0
202311100000 run_post_mem000_f004 1990372 SUCCEEDED 0 1 44.0
202311100000 run_post_mem000_f005 1990409 SUCCEEDED 0 1 92.0
202311100000 run_post_mem000_f006 1990410 SUCCEEDED 0 1 92.0
202311100000 run_post_mem000_f007 1990449 SUCCEEDED 0 1 28.0
202311100000 run_post_mem000_f008 1990450 SUCCEEDED 0 1 26.0
202311100000 run_post_mem000_f009 1990458 SUCCEEDED 0 1 28.0
202311100000 run_post_mem000_f010 1990459 SUCCEEDED 0 1 25.0
202311100000 run_post_mem000_f011 1990521 SUCCEEDED 0 1 25.0
202311100000 run_post_mem000_f012 1990532 SUCCEEDED 0 1 27.0
202311100000 run_post_mem000_f013 1990533 SUCCEEDED 0 1 26.0
202311100000 run_post_mem000_f014 1990606 SUCCEEDED 0 1 44.0
202311100000 run_post_mem000_f015 1990607 SUCCEEDED 0 1 28.0
202311100000 run_post_mem000_f016 1990625 SUCCEEDED 0 1 27.0
202311100000 run_post_mem000_f017 1990626 SUCCEEDED 0 1 42.0
202311100000 run_post_mem000_f018 1990666 SUCCEEDED 0 1 33.0
202311100000 run_post_mem000_f019 1990667 SUCCEEDED 0 1 38.0
202311100000 run_post_mem000_f020 1990730 SUCCEEDED 0 1 27.0
202311100000 run_post_mem000_f021 1990760 SUCCEEDED 0 1 43.0
202311100000 run_post_mem000_f022 1990761 SUCCEEDED 0 1 43.0
202311100000 run_post_mem000_f023 1990762 SUCCEEDED 0 1 43.0
202311100000 run_post_mem000_f024 1990763 SUCCEEDED 0 1 38.0
================================================================================================================================
202311110000 nexus_gfs_sfc 1989734 SUCCEEDED 0 1 30.0
202311110000 nexus_emission_00 1989838 SUCCEEDED 0 1 663.0
202311110000 nexus_emission_01 1989839 SUCCEEDED 0 1 676.0
202311110000 nexus_emission_02 1989840 SUCCEEDED 0 1 707.0
202311110000 nexus_post_split 1990041 SUCCEEDED 0 1 120.0
202311110000 fire_emission 1989735 SUCCEEDED 0 1 92.0
202311110000 point_source 1989841 SUCCEEDED 0 1 289.0
202311110000 aqm_ics 1990764 SUCCEEDED 0 1 107.0
202311110000 aqm_lbcs 1990130 SUCCEEDED 0 1 94.0
202311110000 get_extrn_ics 1989736 SUCCEEDED 0 1 23.0
202311110000 get_extrn_lbcs 1989737 SUCCEEDED 0 1 26.0
202311110000 make_ics_mem000 1990015 SUCCEEDED 0 1 116.0
202311110000 make_lbcs_mem000 1990016 SUCCEEDED 0 1 261.0
202311110000 run_fcst_mem000 1990814 SUCCEEDED 0 1 2789.0
202311110000 run_post_mem000_f000 1990982 SUCCEEDED 0 1 47.0
202311110000 run_post_mem000_f001 1990984 SUCCEEDED 0 1 47.0
202311110000 run_post_mem000_f002 1991030 SUCCEEDED 0 1 25.0
202311110000 run_post_mem000_f003 1991031 SUCCEEDED 0 1 25.0
202311110000 run_post_mem000_f004 1991061 SUCCEEDED 0 1 26.0
202311110000 run_post_mem000_f005 1991062 SUCCEEDED 0 1 26.0
202311110000 run_post_mem000_f006 1991093 SUCCEEDED 0 1 26.0
202311110000 run_post_mem000_f007 1991113 SUCCEEDED 0 1 26.0
202311110000 run_post_mem000_f008 1991114 SUCCEEDED 0 1 46.0
202311110000 run_post_mem000_f009 1991146 SUCCEEDED 0 1 28.0
202311110000 run_post_mem000_f010 1991147 SUCCEEDED 0 1 31.0
202311110000 run_post_mem000_f011 1991179 SUCCEEDED 0 1 36.0
202311110000 run_post_mem000_f012 1991180 SUCCEEDED 0 1 31.0
202311110000 run_post_mem000_f013 1991219 SUCCEEDED 0 1 27.0
202311110000 run_post_mem000_f014 1991220 SUCCEEDED 0 1 50.0
202311110000 run_post_mem000_f015 1991235 SUCCEEDED 0 1 47.0
202311110000 run_post_mem000_f016 1991261 SUCCEEDED 0 1 26.0
202311110000 run_post_mem000_f017 1991262 SUCCEEDED 0 1 51.0
202311110000 run_post_mem000_f018 1991277 SUCCEEDED 0 1 25.0
202311110000 run_post_mem000_f019 1991305 SUCCEEDED 0 1 27.0
202311110000 run_post_mem000_f020 1991306 SUCCEEDED 0 1 30.0
202311110000 run_post_mem000_f021 1991321 SUCCEEDED 0 1 24.0
202311110000 run_post_mem000_f022 1991322 SUCCEEDED 0 1 43.0
202311110000 run_post_mem000_f023 1991340 SUCCEEDED 0 1 46.0
202311110000 run_post_mem000_f024 1991341 SUCCEEDED 0 1 46.0
to successfully run through to completion.
I'm still confused as to why the transition from bash to YAML is causing a 10-digit string to be converted to a date/time format. Is this supposed to happen, or is this showing an issue with the implementation?
@MichaelLueken I was also confused and it took me a while to find it. I found that that some extra magic was happening behind the scenes when On line 1505 in
I think this example shows how much SRW is reliant on assumptions made explicit by SRW related to data types. If we can move away from that behavior and adopt well-documented, widely used type conversions offered by YAML and ensure these conversions are very explicit when they happen, we'll be in better shape overall (as opposed to burying them deep in our SRW-specific code). I think all of our type conversions will rely on the YAML rules, so even a bash output file with a datetime object will look the same. Here I used the uwtools API to dump out the same dict as bash into
I hope that helps, but if it's still foggy, let me know. Just a forward-looking comment on how dates will be handled with a full uwtools integration: We give users the ability to modify dates however they like from the datetime object instead of assuming. An example taken from the MPAS App defines this explicitly so we know it's always treated as a datetime object both in memory, and in config files:
And then drivers will always expect a format like that. Where we need to use the NWP-favored YYYYMMDDHH formats, we are giving the user direct access via Jinja to format the datestring however you like. The run directory might be formulated for the forecast like this to get our standard 10-digit date string:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the detailed explanation! Since all tests are successfully running now, and the date and time behavior was expected (and the fix is working), I will go ahead and approve these changes now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my concerns!
While running the Jenkins tests, the
These will need to be updated to point to Additionally, the When you have the chance, please look through these issue and correct them if possible. |
….yaml through source_yaml.
I opened PR #4 in your fork to add in the necessary changes to the wrapper scripts to work with transitioning the As for the failures of the
These tests were added when RRFS features were being added to the SRW. Is it possible that the old RRFS data on HPSS has been removed now that the RRFSv1 implementation has been suspended? We will need to remove these unit tests moving forward if the data is no longer available to be pulled from HPSS. |
[to_yaml] Update wrapper scripts to allow them to run using var_defns.yaml
@MichaelLueken Can you point me to the output of the failed unit tests for RAP Obs? I'm running those on Hera now, but they do take quite some time. Thanks so much for your engagement on this whole PR! |
After looking at the code, I'm curious how this has been passing in PRs since I removed the RAP_obs section in the data_locations.yaml file months ago! I'll push a change that removes the tests. |
@christinaholtNOAA - Thanks for removing the RAP_obs tests from Yeah, looking at the changes in PR #893, I do see the removal of RAP_obs from the |
@MichaelLueken Just checking in...are there any requests for changes I missed here? |
@christinaholtNOAA - Everything is fine. Both Hercules and Orion were down for maintenance yesterday, so I didn't relaunch the Jenkins tests. Now that maintenance is complete, relaunching now. |
The updates to the wrapper scripts and |
…ment to call to load_modules_run_task.sh
The final fix for the Functional WorkflowTaskTests to run successfully on Gaea has been completed and PR #5 has been opened in your fork. Once this PR has been merged, I will kick off the Gaea tests one last time, then we should be able to merge this work. |
[to_yaml] Update .cicd/scripts/wrapper_srw_ftest.sh to add third argument to call to load_modules_run_task.sh
@MichaelLueken Thanks! I merged your PR to my branch. |
The Jenkins tests successfully passed on Derecho, Gaea, Hera Intel, Hercules, and Orion. The NOMADS WE2E test is currently failing in The Jet tests were aborted due to running longer than 8 hours. The coverage WE2E tests were manually launched yesterday, and the Once the Hera GNU and Jet tests are complete, I will merge this PR. |
The Jet coverage WE2E tests have successfully passed:
|
The Hera GNU tests have successfully passed through the Jenkins pipeline. Merging PR now. |
…1098) Use YAML for the configuration language at run time. --------- Co-authored-by: Michael Lueken <63728921+MichaelLueken@users.noreply.github.com> Co-authored-by: Michael Kavulich <kavulich@ucar.edu> Co-authored-by: michael.lueken <Michael.Lueken@noaa.gov>
DESCRIPTION OF CHANGES:
Use YAML for the configuration language at run time.
Type of change
I could use some help on updating documentation. Just pointers to the most important pieces. Let's also talk about whether that could be separated into a follow-on PR.
TESTS CONDUCTED:
DEPENDENCIES:
n/a
DOCUMENTATION:
ISSUE:
CHECKLIST