UFS Weather Model Configuration File Types
The files described here are present in the version of the ufs-weather-model as of Aug 14, 2022 under the tests/parm directory.
The NOAA Air Quality Model is a subcomponent of UFS and requires the AQM Resource File, aqm.rc.
- This file can currently be parsed with pyyaml
- There are sections of the file that use a double colon (::) syntax to start a section that contains CSV data. These sections all seem to contain the same data type/structure.
- The double colon sections will require special handling to parse.
- Truthy values are pyyaml compliant, but may need to be converted back to lowercase strings when writing to file (need to check the Fortran code).
Recommendation for unified management: Use YAML as a dictable and let a tool do the formatting. Use "base" from RT suite.
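One way to give the double-colon sections of aqm.rc their special handling is to pull them out before handing the rest of the file to a YAML parser. The sketch below assumes each table starts with a line ending in `::` and ends with a line containing only `::` (the ESMF Config table convention); the function name, labels, and sample values are hypothetical and not taken from a real aqm.rc.

```python
import csv


def split_double_colon_sections(text):
    """Separate `label::` ... `::` table blocks from the remaining lines.

    Returns the remaining text and a dict mapping each table label to its
    rows, where every row is a list of stripped cell strings.
    """
    plain_lines, raw_tables = [], {}
    current = None  # label of the table currently being read, if any
    for line in text.splitlines():
        stripped = line.strip()
        if current is None and stripped.endswith("::") and stripped != "::":
            current = stripped[:-2].strip()
            raw_tables[current] = []
        elif current is not None and stripped == "::":
            current = None  # assumed closing delimiter
        elif current is not None:
            if stripped:
                raw_tables[current].append(stripped)
        else:
            plain_lines.append(line)
    tables = {
        label: [[cell.strip() for cell in row] for row in csv.reader(rows)]
        for label, rows in raw_tables.items()
    }
    return "\n".join(plain_lines), tables


# Hypothetical content, for illustration only.
sample = """\
example_key: example_value
example_table::
  NO2, 1.0, gas
  O3, 1.0, gas
::
another_key: true
"""
rest, tables = split_double_colon_sections(sample)
```

The remaining text in `rest` can then go through pyyaml as before, and a formatter would need to write the tables back in the same delimited form.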
The diag_table is described here. It uses the FMS-managed format, which is most like CSV. Its purpose is to control the output written by the diag_manager tool. Here are the described requirements (source) of the format:
- The diag_table is a plain text file that lists files and fields, and the frequency of output
- The diag_table contains the "base date" used by the model
- No entry in the diag_table can span multiple lines
The files contain a 2-line header:
- Line 1: An arbitrary string that describes the experiment. It cannot contain spaces.
- Line 2: The base date of the experiment
- Must be the same as or earlier than the model start time
- 6 integers separated by spaces
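The header rules above are simple enough to check in code before a template is filled. The following is a minimal sketch, not an existing tool: the function name is hypothetical, and the ordering of the six integers (year, month, day, hour, minute, second) is an assumption based on the FMS convention.

```python
from datetime import datetime


def diag_table_header(experiment: str, base_date: datetime, start_time: datetime) -> str:
    """Build the 2-line diag_table header, enforcing the rules listed above."""
    if not experiment or any(ch.isspace() for ch in experiment):
        raise ValueError("experiment name must be non-empty and contain no spaces")
    if base_date > start_time:
        raise ValueError("base date must be the same as or earlier than the start time")
    d = base_date
    # Assumed order: year month day hour minute second
    date_line = f"{d.year} {d.month} {d.day} {d.hour} {d.minute} {d.second}"
    return f"{experiment}\n{date_line}\n"


header = diag_table_header(
    "my_experiment", datetime(2022, 8, 14), datetime(2022, 8, 14, 6)
)
```

The same values could equally be passed to a Jinja2 template; the point of the sketch is only that validation can live in one shared place.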
Examples of existing management:
- SRW maintains one file for each CCPP suite and fills in the header at run time using Jinja2 templates
- global_workflow maintains several use-case-specific diag_tables, then uses the cat utility in bash to add the header at run time.
- HAFS has two templates specific to their use cases (global nest or regional), and uses atparse to template the date and the output interval.
- weather-model RT maintains several files for the specific tests in the suite.
Recommendation for unified management: Jinja2 Templates housed in RT, filled by same tool in Apps.
The field_table file is described here. It manages the advected tracers and their associated options. The file is most like CSV, but specifically formatted by item and row of entry, making it possible to configure it with YAML-like files and develop a formatter for writing the files.
The format from the link above:
The first line of an entry should consist of three quoted strings:
- The first quoted string will tell the field manager what type of field it is. The string “TRACER” is used to declare a field entry.
- The second quoted string will tell the field manager which model the field is being applied to. The supported type at present is “atmos_mod” for the atmosphere model.
- The third quoted string should be a unique tracer name that the model will recognize.
The second and following lines are called methods. These lines can consist of two or three quoted strings. The first string will be an identifier that the querying module will ask for. The second string will be a name that the querying module can use to set up values for the module. The third string, if present, can supply parameters to the calling module that can be parsed and used to further modify values.
An entry is ended with a forward slash (/) as the final character in a row. Comments can be inserted in the field table by having a hash symbol (#) as the first character in the line.
For example, an entry in a field_table looks like:
"TRACER", "atmos_mod", "sphum"
"longname", "specific humidity"
"units", "kg/kg"
"profile_type", "fixed", "surface_value=3.e-6" /
This could be managed with YAML like so:
sphum:
  longname: specific humidity
  units: kg/kg
  profile_type:
    fixed:
      surface_value: 3.e-6
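A formatter for this layout could look roughly like the sketch below. It works on a plain dictionary shaped like the YAML above; the function name is hypothetical, and the value 3.e-6 is kept as a string on the assumption that a YAML loader may otherwise turn it into a float and change how it is written back.

```python
def format_field_table(tracers: dict, model: str = "atmos_mod") -> str:
    """Render a {tracer: {method: value}} dictionary as field_table text."""
    entries = []
    for name, methods in tracers.items():
        lines = [f'"TRACER", "{model}", "{name}"']
        for method, value in methods.items():
            if isinstance(value, dict):
                # {"fixed": {"surface_value": "3.e-6"}} becomes a 3-string method line
                for method_name, params in value.items():
                    fields = [method, method_name]
                    if params:
                        fields.append(", ".join(f"{k}={v}" for k, v in params.items()))
                    lines.append(", ".join(f'"{field}"' for field in fields))
            else:
                lines.append(f'"{method}", "{value}"')
        lines[-1] += " /"  # a forward slash ends the entry
        entries.append("\n".join(lines))
    return "\n".join(entries) + "\n"


tracers = {
    "sphum": {
        "longname": "specific humidity",
        "units": "kg/kg",
        "profile_type": {"fixed": {"surface_value": "3.e-6"}},
    }
}
field_table_text = format_field_table(tracers)
```

For the dictionary shown, the result matches the four-line sphum entry in the example above.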
Examples of existing management:
- SRW maintains one file for each CCPP suite. No modifications are needed for these files.
- global_workflow maintains several field_tables specific to physics suites. No modifications are needed for these files.
- HAFS has two templates specific to their use cases (global nest or regional). No modifications are needed for these files.
- weather-model RT maintains several files for the specific tests in the suite.
Recommendation for unified management: Use YAML as a dictable and let a tool do the formatting.
- Same general format as AQM files. Some formatting observations:
- Can be parsed with pyyaml
- The double-colon-specified CSV structures are not the same as those in AQM. Use is virtually the same, though.
- Lists are space-separated
- Truthy values are in Fortran syntax (i.e., ".false.")
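The last two observations suggest a small pair of conversion helpers applied after parsing and before writing. This is a sketch under assumptions: the function names are hypothetical, only the spellings `.true.` and `.false.` are recognized, and any value containing whitespace is treated as a list, which would be wrong for free-text strings.

```python
def from_fortran(value: str):
    """Convert '.true.'/'.false.' to bool and space-separated text to a list."""
    lowered = value.strip().lower()
    if lowered in (".true.", ".false."):
        return lowered == ".true."
    parts = value.split()
    if len(parts) > 1:
        return [from_fortran(part) for part in parts]
    return value.strip()


def to_fortran(value) -> str:
    """Inverse of from_fortran: bools become Fortran logicals, lists are space-joined."""
    if isinstance(value, bool):
        return ".true." if value else ".false."
    if isinstance(value, (list, tuple)):
        return " ".join(to_fortran(item) for item in value)
    return str(value)
```

Numeric strings are deliberately left as strings here so that writing a value back does not change its formatting.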
Examples of existing management:
- weather-model RT maintains several files for the specific tests in the suite.
- global_workflow keeps copies of the files from the RT (maybe a subset?). All are linked at run time.
- Not used in SRW and HAFS
Recommendation for unified management: Use YAML as a dictable and let a tool do the formatting. Use "base" from RT suite.
MOM6 is the Modular Ocean Model. Its run-time parameters are set via Fortran namelist and MOM6 Parameter Files. These files should be parseable with Python's configparser, as they appear to be compatible with INI format apart from lacking a section header, which configparser requires (none of those in the UFS repos need to use the override syntax below). Here is the relevant format information from the MOM6 RTD page.
The general syntax for an entry in a MOM6 parameter file is
[!][#override] PARAMETER_NAME = value[,value][...][!comments]
Parameter names must be constructed from the characters [A-Za-z0-9_] and by soft convention are upper case. The ! character is a remark or comment indicator; all subsequent text on that line is ignored.
Parameters that are not specified in the parameter files may assume a default value. It is not an error to specify a parameter more than once with the same value. It is an error to specify different values.
The keyword #override indicates that this parameter specification takes precedence over other specifications. It is an error to have two #override specifications for a single parameter with the same values. It is an error to have two #override statements with different values.
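As a rough check of the INI-compatibility idea, the sketch below reads parameter-file text with the standard-library configparser. It makes several assumptions: a dummy section header (here named MOM6, an arbitrary choice) is prepended because configparser requires one, `!` comments are preceded by whitespace when they follow a value, and `#override` lines are not present. The sample text is illustrative, not copied from a UFS file.

```python
import configparser


def parse_mom6_params(text: str) -> dict:
    """Parse MOM6-style `NAME = value ! comment` lines into a dict of strings."""
    parser = configparser.ConfigParser(
        delimiters=("=",),
        comment_prefixes=("!",),
        inline_comment_prefixes=("!",),
        interpolation=None,
        strict=False,  # repeated parameters are tolerated; the last one wins
    )
    parser.optionxform = str  # keep the upper-case parameter names
    parser.read_string("[MOM6]\n" + text)
    return dict(parser["MOM6"])


sample = """\
! === module MOM ===
DT = 1800.0                     ! [s]
                                ! The (baroclinic) dynamics time step.
USE_REGRIDDING = True           !   [Boolean] default = False
"""
params = parse_mom6_params(sample)
```

Note that `strict=False` silently keeps the last value of a repeated parameter, whereas MOM6 itself treats conflicting values as an error, so a real tool would likely add its own duplicate check.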
Examples of existing management:
- weather-model RT manages 4 versions as templates, each filled in with atparse. It also has an "override" file that allows the specification of 2 additional #override parameters. (Could be managed without that)
- global_workflow manages 3 versions as templates, each filled in with atparse
- Not used in SRW and HAFS
Recommendation for unified management: Use YAML as a dictable and let a tool do the formatting. Use "base" from RT suite. Possibly just stick with templating, but via Jinja if that's sufficient.
For UFS enabled with inline UPP, two namelists are managed for the model: input.nml for model specifics, and post_itag for the UPP settings. Fortran namelists are dictable with the Python 3rd party library f90nml.
Examples of existing management:
- SRW maintains a base namelist and a YAML configuration file that overrides that base namelist based on CCPP suite settings. The namelist is generated at configuration time.
- global_workflow holds the template in a bash file and fills it in with bash variables at run time.
- HAFS has templates specific to their use cases (global nest or regional, nested or not), and uses atparse to template fields to be changed.
- weather-model RT maintains several files for the specific tests in the suite and uses atparse to fill in specific variables.
Recommendation for unified management: Use base namelists from the regression test suites and YAML files to update settings in the Apps.
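The update step in this recommendation amounts to a recursive dictionary merge once the base namelist and the YAML overrides are both loaded as nested dictionaries (for example via f90nml and a YAML loader). The sketch below shows only that merge on plain dictionaries; the function name is hypothetical and the namelist values are illustrative.

```python
import copy


def update_nested(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied group by group."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = update_nested(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only.
base_namelist = {
    "fv_core_nml": {"npx": 97, "hydrostatic": False},
    "gfs_physics_nml": {"imp_physics": 11},
}
app_overrides = {
    "fv_core_nml": {"npx": 193},
    "gfs_physics_nml": {"imp_physics": 8},
}
namelist = update_nested(base_namelist, app_overrides)
```

Returning a new dictionary rather than updating in place keeps the base settings from the regression test suite untouched, so the same base can be reused for several App configurations.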
The model_configure file is described here. It is mostly YAML compliant. Its truthy values are in Fortran syntax (e.g., .false.). It also supports space-separated lists, which need some pre-processing (see example below).
Examples of existing management:
- SRW maintains a Jinja2 template filled in at run time.
- global_workflow holds the templates in bash files (separate for DATM and FV3) and fills it in with bash variables at run time.
- HAFS has templates specific to their use cases (global nest or regional), and uses atparse to template fields to be changed.
- weather-model RT maintains several files for the specific tests in the suite and uses atparse to fill in specific variables.
Recommendation for unified management: Use base YAML from the regression test suites and YAML files to update settings in the Apps. Use Jinja2 templates for the management in RT.
It seems that pyyaml has a function, add_implicit_resolver, that adds an implicit tag resolver for plain scalars. With enough effort and trial and error, this is one option we could potentially use for parsing the space-separated lists.
A brute force, clunky workaround involves pre- and post-processing these types of files to allow parsing the list. Here's a very basic example of how we might accomplish such a task that we might wrap into the Config subclasses.
import re
import yaml

# Matches the text between single quotes, e.g. 'netcdf' -> netcdf
pattern = re.compile(r"(?<=')[^']+(?=')")
mc = "model_configure"
with open(mc, "r") as config_file:
    yaml_lines = config_file.read().splitlines()
for i, line in enumerate(yaml_lines):
    value = line.split(":", maxsplit=1)
    if len(value) > 1:
        # Drop the whitespace "matches" found between adjacent quoted items
        items = [s for s in pattern.findall(value[1]) if s.strip()]
        if len(items) > 1:
            # Rewrite 'a' 'b' as a YAML flow sequence so it loads as a list
            quoted = ", ".join(f"'{s}'" for s in items)
            yaml_lines[i] = f"{value[0]}: [{quoted}]"
            print(yaml_lines[i])
cfg = yaml.safe_load("\n".join(yaml_lines))
HAFS model_configure files are not currently YAML compliant. They include sections that specify various output grids in a format like <output_grid_02>. Fortunately, this section format is easily modified in the model Fortran code. It is specified with ESMF Config open and close labels like this in FV3/io/module_wrt_grid_comp.F90:
write(cf_open,'("<output_grid_",I2.2,">")') n
write(cf_close,'("</output_grid_",I2.2,">")') n
Those two lines could likely become YAML compliant by specifying the open label as output_grid_02: and indenting the section below it; the close label could then be an additional end_output_grid_02: entry without content. The HAFS model_configure sections would change from this:
<output_grid_02>
output_grid: @[OUTPUT_GRID_2]
imo: @[IMO_2]
jmo: @[JMO_2]
....
dy: @[DY_2]
</output_grid_02>
<output_grid_03>
....
</output_grid_03>
to this YAML-compliant version:
output_grid_02:
  output_grid: @[OUTPUT_GRID_2]
  imo: @[IMO_2]
  jmo: @[JMO_2]
  ....
  dy: @[DY_2]
end_output_grid_02:
output_grid_03:
  ....
end_output_grid_03:
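Existing HAFS files could be migrated to this layout mechanically. The following is a minimal sketch of such a conversion, assuming the tags always sit on their own line and follow the two-digit output_grid_NN pattern; the function name and the sample values are hypothetical.

```python
import re

OPEN_TAG = re.compile(r"^<(output_grid_\d{2})>$")
CLOSE_TAG = re.compile(r"^</(output_grid_\d{2})>$")


def tags_to_yaml_labels(text: str, indent: str = "  ") -> str:
    """Rewrite <output_grid_NN> ... </output_grid_NN> blocks as labelled sections."""
    converted = []
    inside = False  # True while between an open and a close tag
    for line in text.splitlines():
        stripped = line.strip()
        open_match = OPEN_TAG.match(stripped)
        close_match = CLOSE_TAG.match(stripped)
        if open_match:
            converted.append(f"{open_match.group(1)}:")
            inside = True
        elif close_match:
            converted.append(f"end_{close_match.group(1)}:")
            inside = False
        elif inside and stripped:
            converted.append(indent + stripped)
        else:
            converted.append(line)
    return "\n".join(converted)


# Hypothetical, already-filled values for illustration.
old_style = """\
quilting: .true.
<output_grid_02>
output_grid: regional_latlon
imo: 100
jmo: 80
</output_grid_02>"""
new_style = tags_to_yaml_labels(old_style)
```

Lines outside a tagged block pass through unchanged, so the conversion can be run on a whole model_configure file.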
This has now been tested with the regression test suite, and would require the following changes in the model:
- write(cf_open,'("<output_grid_",I2.2,">")') n
- write(cf_close,'("</output_grid_",I2.2,">")') n
+ write(cf_open,'("output_grid_",I2.2,":")') n
+ write(cf_close,'("end_output_grid_",I2.2,":")') n
Some information on using NEMS Configure files to specify coupling and information exchange is here. These files are not similar to any other formatting language discussed here, or other standard formats. In the workflows that use them, they are typically templates filled in at run time and several of them are needed depending on the coupling configuration an experiment wishes to run.
Examples of existing management:
- SRW does not use the NEMS coupling mechanism and links in a simple file at run time.
- HAFS maintains 3 templates for regional and 1 for global nest that are updated at run time using sed
- global_workflow maintains many templates updated at run time using atparse.
- weather-model RT maintains many templates for various test cases and uses atparse to update them all (including HAFS-specific ones)
Recommendation for unified management: Use Jinja2 as the common templating language.
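Moving the existing atparse templates to Jinja2 could start with a mechanical placeholder rewrite. The sketch below converts simple `@[NAME]` placeholders, as seen in the HAFS model_configure excerpt, into Jinja2 `{{ NAME }}` expressions. It is an assumption that plain variable names cover the templates in question; any other atparse construct is left untouched, and the function name and sample line are hypothetical.

```python
import re

# Simple atparse placeholder: @[NAME], where NAME is a shell-style identifier
ATPARSE_VAR = re.compile(r"@\[([A-Za-z_][A-Za-z0-9_]*)\]")


def atparse_to_jinja2(text: str) -> str:
    """Replace @[NAME] placeholders with Jinja2 {{ NAME }} expressions."""
    return ATPARSE_VAR.sub(r"{{ \1 }}", text)


converted = atparse_to_jinja2("imo: @[IMO_2]\njmo: @[JMO_2]")
```

After conversion the templates can be rendered by the same Jinja2-based tool in both the RT suite and the Apps.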
A CSV file that specifies the parameter information for grib2 fields in UPP. Used with inline UPP. It is unclear how this file is managed in the Apps. It is kept as a single data file in the weather model RT.
A unique format specifically designed for WW3 that seems to be a mix of key/value pairs and CSV formatting.
Examples of existing management:
- weather-model RT maintains a few templates for various test cases and uses atparse to update them
Recommendation for unified management: Use Jinja2 as the common templating language.