Depth list repro #914
This patch appends a checksum for the dependencies of the depth and area lists stored in the Depth_list.nc file, which are used to compute diagnostics based on APE.

The data in Depth_list.nc depends on the grid fields, and may not be reproducible when such grids are constructed internally using compiled code within the executable. This issue was observed in the 'double_gyre' experiment when a PGI-compiled executable was tested using a Depth_list.nc file generated by a GNU-compiled executable.

By appending a checksum for the grid fields used to compute Depth_list.nc, we can ensure that the data is consistent with the experiment grid data. Grid data which is read from external files, such as mosaic or topography fields, is unaffected by this issue.

This patch improves the reproducibility of standard diagnostics, such as total energy, but has no impact on the reproducibility of the internal model dynamics, which do not depend on Depth_list.nc.

Checksums are computed for the G%bathyT and masked G%areaT grid fields using the FMS mpp_checksum subroutine, which requires collective operations, and are stored as hex strings in global attributes of the netCDF file. Strings are used to remain consistent with FMS restart checksums, and to avoid an observed re-casting of 8-byte integers to 4 bytes by the netCDF library. Attribute names are based on the grid variable names.

Two flags have been introduced to control this behavior:

REQUIRE_DEPTH_LIST_CHECKSUMS (default: True)

This flag will abort the run if the Depth_list.nc file is present and checksums are absent from the file. Although this could impose greater restrictions on existing runs, few runs are configured to save the depth list file (READ_DEPTH_LIST) and the default behavior is to reconstruct these lists on every run.

UPDATE_DEPTH_LIST_CHECKSUMS (default: False)

When REQUIRE_DEPTH_LIST_CHECKSUMS is set to false, this flag will automatically update the checksums of the Depth_list.nc file. While this can affect the reproducibility of APE diagnostics, it will ensure the reproducibility of such diagnostics in subsequent runs.
Additional documentation of the parameters used to store Depth_list.nc attribute names was added.
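The choice to store checksums as hex strings can be illustrated with a short Python sketch. The checksum function here is a stand-in (a wrapped sum of IEEE-754 bit patterns), not the FMS routine, and every function name and sample value is hypothetical. The only point is that a 16-digit hex string round-trips a full 64-bit value, whereas a value narrowed to 4 bytes loses its high bits.

```python
import struct

MASK64 = (1 << 64) - 1  # keep the running sum within 8 bytes


def bitsum_checksum(values):
    """Stand-in checksum (NOT the FMS routine): wrapped sum of float64 bit patterns."""
    total = 0
    for v in values:
        # Reinterpret the 8-byte float as an unsigned 64-bit integer.
        (bits,) = struct.unpack("<Q", struct.pack("<d", v))
        total = (total + bits) & MASK64
    return total


def to_hex_attr(chksum):
    """Encode a 64-bit checksum as a fixed-width, 16-digit hex string."""
    return format(chksum, "016X")


def from_hex_attr(text):
    """Decode the hex string back into an integer."""
    return int(text, 16)


depth = [0.0, 100.0, 2000.0, 4000.0]  # hypothetical bathymetry values
chk = bitsum_checksum(depth)
attr = to_hex_attr(chk)

# The string form preserves all 64 bits.
roundtrip_ok = from_hex_attr(attr) == chk

# Narrowing to 4 bytes discards the high bits of the checksum.
truncated = chk & 0xFFFFFFFF
print(attr, roundtrip_ok, truncated == chk)
```

A string attribute also sidesteps any dependence on how a particular netCDF build handles 8-byte integer attributes, at the cost of a parse step on read.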
This PR is being tested with https://gitlab.gfdl.noaa.gov/ogrp/MOM6/pipelines/7818
Thanks for this contribution @marshallward. I like the improved error checking with this new capability. Looking over the code, I have a couple of suggestions:
The depth checksum is now computed from the masked depth, mask2dT * bathyT, and the calculation of the depth list has also been updated to use the masked depth. Various style conformance changes, such as contraction of do and if terminations (enddo, endif) and reduction of whitespace in multiline function calls, have also been applied. Finally, the attribute name docstrings were updated for clarity.
PR has been updated, including the use of masked depth. Style has also been updated to better match the codebase. As discussed offline, we will leave the default parameter values as they are, and update the double_gyre test to use non-default values.
This updated PR is being tested with https://gitlab.gfdl.noaa.gov/ogrp/MOM6/pipelines/7824.
Confirmed that changing
to
in
and reverting the change restores the old results.
Using the masked depth (mask2dT * bathyT) was observed to change energy values within floating-point precision, so those changes have been reverted. This may be revisited at a later time, when we are prepared to update the energy stats to the new values in the regression tests. The depth checksum attribute has also been renamed to reflect this change. This will allow us to redefine the variable as masked at some later date, and to distinguish between the masked and unmasked checksums during testing.
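To see why separate attribute names for masked and unmasked depth are useful, consider the following Python sketch. The grid values, the attribute names, and the CRC32-based checksum are all illustrative assumptions; none of them are taken from the patch. It shows only that a land point with a nonzero depth makes the two fields, and therefore their checksums, differ.

```python
import struct
import zlib


def apply_mask(field, mask):
    """Elementwise product mask * field, mirroring mask2dT * bathyT."""
    return [[m * f for m, f in zip(mrow, frow)] for mrow, frow in zip(mask, field)]


def stand_in_checksum(field):
    """Illustrative checksum (CRC32 of the packed doubles), not the FMS routine."""
    flat = [v for row in field for v in row]
    return format(zlib.crc32(struct.pack(f"<{len(flat)}d", *flat)), "08X")


# Hypothetical 2x2 grid: the first cell is land (mask 0) but carries a nonzero depth.
bathyT = [[10.0, 500.0], [1500.0, 3000.0]]
mask2dT = [[0.0, 1.0], [1.0, 1.0]]

masked_depth = apply_mask(bathyT, mask2dT)

# Hypothetical attribute names; distinct names keep the two definitions apart.
attrs = {
    "unmasked_depth_checksum": stand_in_checksum(bathyT),
    "masked_depth_checksum": stand_in_checksum(masked_depth),
}
print(attrs)
```

With distinct names, a file written under one definition cannot be silently validated against a checksum computed under the other.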
Last commit passed on pipeline https://gitlab.gfdl.noaa.gov/ogrp/MOM6/pipelines/7840
Pull request for the `Depth_list.nc` code changes. Primary commit log attached below.

Checksum support for `Depth_list.nc`

This patch appends a checksum for the dependencies of the depth and area lists stored in the `Depth_list.nc` file, which are used to compute diagnostics based on APE.

The data in `Depth_list.nc` depends on the grid fields, and may not be reproducible when such grids are constructed internally using compiled code within the executable. This issue was observed in the `double_gyre` experiment when a PGI-compiled executable was tested using a `Depth_list.nc` file generated by a GNU-compiled executable.

By appending a checksum for the grid fields used to compute Depth_list.nc, we can ensure that the data is consistent with the experiment grid data. Grid data which is read from external files, such as mosaic or topography fields, is unaffected by this issue.

This patch improves the reproducibility of standard diagnostics, such as total energy, but has no impact on the reproducibility of the internal model dynamics, which do not depend on Depth_list.nc.

Checksums are computed for the `G%bathyT` and masked `G%areaT` grid fields using the FMS `mpp_checksum` subroutine, which requires collective operations, and are stored as hex strings in global attributes of the netCDF file. Attribute names are based on the grid variable names.

Two flags have been introduced to control this behavior:

`REQUIRE_DEPTH_LIST_CHECKSUM` (default: True)

This flag will abort the run if the `Depth_list.nc` file is present and checksums are absent from the file. Although this could impose greater restrictions on existing runs, few runs are configured to save the depth list file (`READ_DEPTH_LIST`) and the default behavior is to reconstruct these lists on every run.

`UPDATE_DEPTH_LIST_CHECKSUM` (default: False)

When `REQUIRE_DEPTH_LIST_CHECKSUM` is set to false, this flag will automatically update the checksums of the `Depth_list.nc` file. While this can affect the reproducibility of APE diagnostics, it will ensure the reproducibility of such diagnostics in subsequent runs.

This variable defaults to false, since we do not want to deliberately modify an existing file without user consent. As newer files are constructed with the checksums, this will presumably be less of an issue.
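The interaction of the two flags can be summarized as a small decision function. This Python sketch follows the description above for the missing-checksum cases. Two branches are assumptions not stated in the PR text: that a checksum mismatch leads to recomputing the depth list, and that a file without checksums is used as-is when both flags are false. The function name, the `Action` enum, and the sample checksum strings are hypothetical.

```python
from enum import Enum


class Action(Enum):
    USE_FILE = "use the depth list file as-is"
    ABORT = "abort the run"
    UPDATE_CHECKSUMS = "write current checksums into the file, then use it"
    RECOMPUTE = "recompute the depth list"


def depth_list_action(file_chksums, grid_chksums, require=True, update=False):
    """Decide what to do with an existing depth list file.

    file_chksums: checksums read from the file, or None if the attributes are absent.
    grid_chksums: checksums computed from the current grid fields.
    """
    if file_chksums is None:
        if require:
            return Action.ABORT  # described in the PR text
        if update:
            return Action.UPDATE_CHECKSUMS  # described in the PR text
        return Action.USE_FILE  # assumption: legacy file accepted unchanged
    if file_chksums == grid_chksums:
        return Action.USE_FILE
    return Action.RECOMPUTE  # assumption: a mismatch invalidates the stored lists


grid = ("0123456789ABCDEF", "FEDCBA9876543210")  # hypothetical hex checksums
print(depth_list_action(None, grid))                 # defaults: require=True
print(depth_list_action(None, grid, require=False, update=True))
print(depth_list_action(grid, grid))
```

Keeping the update flag off by default matches the stated intent of never modifying an existing file without the user opting in.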