FMS2: Replace time_interp_external IDs with a MOM-defined type #376
Since this touches the NUOPC and MCT couplers, I think it's important to get feedback (if not approval) from NCAR. @gustavo-marques @alperaltuntas @mnlevy1981 Are any of you able to review this? Also feedback from @MJHarrison-GFDL @andrew-c-ross @kshedstrom would be appreciated. Having said that, there will probably be a more intensive PR coming later which replaces the actual |
Codecov Report
| Coverage Diff | dev/gfdl | #376 | +/- |
|---|---|---|---|
| Coverage | 38.38% | 38.39% | +0.01% |
| Files | 268 | 268 | |
| Lines | 76010 | 76021 | +11 |
| Branches | 13987 | 13987 | |
| Hits | 29174 | 29186 | +12 |
| Misses | 41597 | 41598 | +1 |
| Partials | 5239 | 5237 | -2 |

... and 2 files with indirect coverage changes
@marshallward since there are changes in NUOPC_cap, I am going to test it in UFS.
Thanks @jiandewang I didn't realize you were using NUOPC.
2fde2f0 to 4621217
In repro mode, my Supercritical test is hanging after printing:
It runs fine in debug mode, where the next thing printed is:
It also runs fine with one core in repro mode. The ones that hang are using 2-4 cores.
It runs on gaea with ifort22.
Never mind, it's an FMS problem, not a problem with this patch.
I migrated a lot of the CFC_cap code out of the NUOPC driver and into MOM_tracer_flow_control.F90 and MOM_CFC_cap.F90 with NCAR#242. So if dev/gfdl gets to mom-ocean/main before dev/ncar, we'll need to merge these mods into dev/ncar, or you'll get to do it if dev/ncar gets to mom-ocean/main before dev/gfdl.
@klindsay28 Enough problems are starting to bubble up that I would not expect this in the next PR to main, so most likely we'll have to wrangle with the merge conflicts. Thanks for the heads up though.
@marshallward it failed to read in UFS restart file. I think it is because the time dimension is 1 instead of unlimited in that file. See sample file at /lustre/f2/dev/ncep/Jiande.Wang/For-Marshall.
Thanks @jiandewang I had a look at your error:
This may be an error already in I think you are right that |
All instances of an FMS ID to the internal interpolation content are replaced with a derived type containing additional metadata recording the field's origin filename and fieldname. This additional information is required in order to replicate the axis data from the field, which is no longer provided by FMS2. The abstraction of this type also allows us to either extend it or redefine it in other frameworks as needed in the future.

This primarily affects the usage of the following functions:
- init_external_field
- time_interp_external
- horiz_interp_and_extrap_tracer

The following solvers are updated:
- MOM_open_boundary
- MOM_ice_shelf
- MOM_oda_driver
- MOM_MEKE
- MOM_ALE_sponge
- MOM_diabatic_aux

Of these, OBC was the most significant. The integer handle (fid) was previously used to determine if each segment field was constant or (if negative) read from a file. After being replaced by the derived type, a new flag was added to make this determination.

All of the coupled drivers have been modified, since they support time interpolation of T and S fields.
- FMS
- MCT
- NUOPC

The NUOPC driver also includes modifications to its CFC11 and CFC12 fields.

Changes to the MOM CFC modules replace an `id == -1`-like test, which is not used by the derived type. This check has been removed, and we now rely solely on the `present(cfc_handle)` test. While this could change behavior, there does not seem to be any scenario where init_external_field would return -1 but would be passed to the function. (But I may eat these words.)
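The handle change described in this commit message can be sketched roughly as follows. This is a minimal illustration written in Python rather than the Fortran used by MOM6, and every name in it (`ExternalField`, `SegmentField`, `use_io`, the file name `obc.nc`) is hypothetical and not taken from the patch. It only shows the idea of carrying the origin metadata alongside the ID and of using an explicit flag instead of inspecting the integer handle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalField:
    """Hypothetical stand-in for the MOM-defined handle type."""
    fms_id: int      # integer ID returned by the framework, now kept internal
    filename: str    # file the field originates from
    fieldname: str   # name of the field inside that file


@dataclass
class SegmentField:
    """Hypothetical OBC segment field."""
    name: str
    value: float = 0.0                      # used when the field is constant
    handle: Optional[ExternalField] = None  # set when the field comes from a file
    use_io: bool = False                    # explicit flag; no test on the integer handle


def init_external_field(filename: str, fieldname: str, next_id: int = 1) -> ExternalField:
    """Toy initializer that bundles an ID with its origin metadata."""
    return ExternalField(fms_id=next_id, filename=filename, fieldname=fieldname)


def describe(field: SegmentField) -> str:
    """Branch on the explicit flag rather than on the value of an integer ID."""
    if field.use_io and field.handle is not None:
        return f"{field.name}: read {field.handle.fieldname} from {field.handle.filename}"
    return f"{field.name}: constant {field.value}"
```

Because the filename and fieldname travel with the handle, code that later needs axis data can find the source file without asking the framework for it, which is the motivation given in the commit message.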
With removal of axis-based operations in FMS2 I/O, this patch removes references to these calls and replaces them with MOM `axis_info` types.

References to FMS1 read into an `axistype`, but the contents are transferred to an `axis_info`. FMS2 directly populates the `axis_info` content.

The `get_external_field_info` calls are modified to return `axis_info` rather than `axistype`.

The redundant `get_axis_data` function is also removed from `MOM_interp_infra`, since `get_axis_info` provides an equivalent operation.

Generally speaking, this is not an improvement of the codebase. The FMS1 layer does a redundant copy of data from `axistype` to `axis_info`. The FMS2 layer is significantly worse, and re-opens the file to read the axis data for each field! But if the intention is to leverage the existing API, then I don't think we have any choice at the moment.

Assuming this is a relatively infrequent operation, this should not cause any measurable issues, but it needs to be watched carefully.
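The two code paths compared in this commit message can be illustrated with a small sketch. It is written in Python for brevity, not in the Fortran of the actual patch, and all names here (`AxisInfo`, `Fms1Axis`, `FakeFile`, and both functions) are hypothetical stand-ins. The in-memory `FakeFile` simply counts how often it is opened so that the per-field re-open cost is visible.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AxisInfo:
    """Hypothetical stand-in for MOM's axis_info."""
    name: str
    values: List[float]


@dataclass
class Fms1Axis:
    """Hypothetical stand-in for the FMS1 axistype."""
    name: str
    data: List[float]


class FakeFile:
    """In-memory stand-in for an input file that counts open calls."""

    def __init__(self, axes: Dict[str, List[float]]):
        self.axes = axes
        self.open_count = 0

    def open(self) -> Dict[str, List[float]]:
        self.open_count += 1
        return self.axes


def axis_info_from_fms1(axis: Fms1Axis) -> AxisInfo:
    """FMS1-style path: the axis already exists, so its contents are copied."""
    return AxisInfo(name=axis.name, values=list(axis.data))


def axis_info_from_file(f: FakeFile, axis_names: List[str]) -> List[AxisInfo]:
    """FMS2-style path: the file is opened again for every field queried."""
    axes = f.open()
    return [AxisInfo(name=n, values=list(axes[n])) for n in axis_names]
```

In this toy model, querying the axes of two fields opens the file twice, which is harmless if the query happens only at initialization but would add up if it were done repeatedly.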
This patch shifts all remaining time_interp_external functions from time_interp_external to equivalent ones in time_interp_external2. Internally, time-interpolated fields are initialized with `ongrid` set to `.true.`, and such fields are assumed to be on-grid. This seems to hold for all existing instances of `time_interp_external`, but needs to be monitored in the future somehow.
4621217 to 35e3642
I updated this PR to replace the This appears to be a safe assumption, but I could be wrong about it. But if we assume it, then it resolves the few outstanding issues with adopting |
Another issue has emerged: It seems that the FMS1 |
I added a commit to support case-insensitive field access in FMS2, which appears to work in my local testing. As it is, I believe this particular PR is ready to merge. However, it might be helpful to resolve the (presumably unrelated) issue detected by @jiandewang before merging, so that it can be tested in UFS.
The current version of this runs the dumbbell default tests, the dumbbell scaled tests, the dumbbell rotated test, but not the dumbbell scaled and rotated test. Stack trace:
Something isn't right with the rotation of data.in. |
@kshedstrom Are you saying this fails in |
@marshallward Sorry, answers don't change from dev/gfdl. It's just that I was in debugger mode to look at something else and it tripped over the invalid numbers. But I do need to figure out what's wrong with my rotated cases now that I have the debug executable.
Running on chinook04 (gcc 11.3), answers match between the two MOM6 versions. Going back to chinook01 (gcc 8.3) with dev/gfdl, the answers still mostly match in repro mode, also in debug mode without this flag: -ffpe-trap=invalid,zero,overflow
Although I agree with the broad direction of this PR, it has introduced (for the first time) the use of code from the `src/framework` directory by code in the `config_src/infra` directories. I think that this is potentially dangerous, in that it could lead to predecessor cycles, and that it could inhibit the compilation of parts of MOM6 as libraries. I would like to see this PR revised to avoid this backward dependency, or else I would like to have a broader discussion about the relative roles of the code in the various `src/` and `config_src/` directories before we collectively adopt this substantial revision to the effective code structure of the model.
IMHO, I see no problem with |
The FMS1 implementation of init_external_field is case-insensitive, but the FMS2 implementation is case-sensitive, which can cause errors in older established input files. This patch sweeps through the fields of the input files and checks for a case-insensitive match (using lowercase()). This requires an additional open/close of the file.
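The case-insensitive lookup described in this commit message amounts to something like the sketch below. It is written in Python rather than Fortran, and the function name `find_field_name` and the sample field names are hypothetical. It assumes the list of field names has already been read from the file, which is the step that requires the additional open/close.

```python
from typing import Iterable, Optional


def find_field_name(requested: str, names_in_file: Iterable[str]) -> Optional[str]:
    """Return the name as stored in the file that matches `requested`
    when both are compared in lowercase, or None if nothing matches."""
    target = requested.lower()
    for name in names_in_file:
        if name.lower() == target:
            return name
    return None
```

For example, `find_field_name("temp", ["Time", "TEMP", "salt"])` returns `"TEMP"`, so the name actually stored in the file can then be passed on to the case-sensitive call.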
d5ce336 to c010f0e
After further consideration and discussion, there is not a problem with config_src/infra code using modules from the src/framework directory, so long as there is no predecessor cycle created between modules. This PR does not create any problematic predecessor cycles, so no changes are needed to address this potential concern.
These changes are moving the code in the right direction and addressing important problems that have arisen between SIS2 and some versions of FMS. It should be merged into dev/gfdl now that the pipeline regression testing has passed at gitlab.gfdl.noaa.gov/ogrp/MOM6/-/pipelines/19606 .
All instances of an FMS ID to the internal interpolation content are replaced with a derived type containing additional metadata recording the field's origin filename and fieldname.

This additional information is required in order to replicate the axis data from the field, which is no longer provided by FMS2.

The abstraction of this type also allows us to either extend it or redefine it in other frameworks as needed in the future.

This primarily affects the usage of the following functions:
- init_external_field
- time_interp_external
- horiz_interp_and_extrap_tracer

References to FMS1 move data into an `axistype`, but the contents are transferred to an `axis_info` when leaving the FMS1 layer. FMS2 directly populates the `axis_info` content.

The `get_external_field_info` calls are modified to return `axis_info` rather than `axistype`.

The redundant `get_axis_data` function is also removed from `MOM_interp_infra`, since `get_axis_info` provides an equivalent operation.

The following solvers are updated:
- MOM_open_boundary
- MOM_ice_shelf
- MOM_oda_driver
- MOM_MEKE
- MOM_ALE_sponge
- MOM_diabatic_aux

Of these, OBC was the most significant. The integer handle (fid) was previously used to determine if each segment field was constant or (if negative) read from a file. After being replaced by the derived type, a new flag was added to make this determination.

All of the coupled drivers have been modified, since they support time interpolation of T and S fields.
- FMS
- MCT
- NUOPC

The NUOPC driver also includes modifications to its CFC11 and CFC12 fields. Changes to the MOM CFC modules replace an `id == -1`-like test, which is not used by the derived type. This check has been removed, and we now rely solely on the `present(cfc_handle)` test.

This does not resolve all of our dependencies on FMS1, since content from `time_interp_external` must be replaced with content from `time_interp_external2`. But this PR allows us to begin the work of using the replacement module.