Refactors the EOS modules to use classes #522
Conversation
Codecov Report
Additional details and impacted files:
@@            Coverage Diff             @@
##           dev/gfdl     #522      +/-   ##
============================================
+ Coverage     37.46%   37.49%   +0.03%
============================================
  Files           270      271       +1
  Lines         79763    79533     -230
  Branches      14830    14816      -14
============================================
- Hits          29880    29818      -62
+ Misses        44349    44175     -174
- Partials       5534     5540       +6
☔ View full report in Codecov by Sentry.
I have examined all of these changes, and they do seem to make sense to me. Apart from noting that there are some unit descriptions in repeated comments (highlighted in a separate comment attached to one specific example), I think that this is on the right track, assuming that the performance improvements are robust. Those unit descriptions need to be fixed, but otherwise this could be close to ready to go.
The one other comment that I have is that the variable name "this" that is used repeatedly for a variable described as "This EOS" strikes me as odd. In this context, "this" is effectively acting as an adjective on the noun "EOS", so wouldn't it make more sense to use "EOS" - the noun - as the variable name instead?
We've speculated that branching (select case constructs or deeply nested if trees) is contributing to poor performance. The equation of state (EOS) is a part of MOM6 where the module functions are called many times from all over the code, and we have multiple implementations to choose between. I did try out procedure pointers, but the code got messy quickly. Writing each EOS as an extension of a base class allows the EOS to be selected when the instance is allocated/constructed. This delivers branch-free code and seems to deliver a performance improvement, with the caveat that I have only evaluated performance using a new micro test (see the section below). I suggest some proper performance analysis by a professional before we commit to this approach. A minimal sketch of the pattern is shown below the list of changes.
Changes
- Added a base class in MOM_EOS_base_type.F90.
- All EOS modules now extend this base class.
- This reduces replicated code between the EOS modules.
- All existing APIs in MOM_EOS now avoid branching on the type of EOS and ultimately pass through to a low-level elemental implementation of the actual EOS.
- Added a new elemental function exposed by MOM_EOS (currently not used in the main model).
- There is a speed-up over the previous form of EOS due to the reduced branching.
- For some functions, a local implementation of the base-class member is needed to gain performance. I deliberately did not implement this optimization for UNESCO or Jackett06, so that the generic implementation in the base class is exercised and we have code coverage.
- Added rules to .testing/Makefile to invoke build.timing and run.timing for the "target" code checked out for regression tests.
- Appended to the existing GH "perfmon" workflow.
Gaea regression: https://gitlab.gfdl.noaa.gov/ogrp/MOM6/-/pipelines/21436 ✔️
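To illustrate the pattern, here is a minimal Fortran sketch of an abstract EOS base type with a deferred elemental member and one concrete extension. The type, procedure and argument names are hypothetical and do not reproduce the actual MOM_EOS_base_type interfaces.

```fortran
! Hypothetical sketch only: names are illustrative, not the MOM6 API.
module EOS_demo

implicit none ; private
public :: EOS_base, EOS_linear

!> Abstract base type that every specific EOS extends
type, abstract :: EOS_base
contains
  !> Deferred elemental member implemented by each specific EOS
  procedure(i_density_elem), deferred :: density_elem
end type EOS_base

abstract interface
  !> In-situ density [kg m-3] from temperature, salinity and pressure
  real elemental function i_density_elem(this, T, S, pressure)
    import :: EOS_base
    class(EOS_base), intent(in) :: this
    real, intent(in) :: T        !< Potential temperature [degC]
    real, intent(in) :: S        !< Salinity [ppt]
    real, intent(in) :: pressure !< Pressure [Pa]
  end function i_density_elem
end interface

!> A trivial linear EOS, used here purely for illustration
type, extends(EOS_base) :: EOS_linear
  real :: rho_ref = 1000., dRho_dT = -0.2, dRho_dS = 0.8
contains
  procedure :: density_elem => linear_density_elem
end type EOS_linear

contains

!> Linear in-situ density [kg m-3]
real elemental function linear_density_elem(this, T, S, pressure) result(rho)
  class(EOS_linear), intent(in) :: this
  real, intent(in) :: T, S, pressure
  rho = this%rho_ref + this%dRho_dT*T + this%dRho_dS*S
end function linear_density_elem

end module EOS_demo
```

With this arrangement the EOS is chosen once, when the polymorphic object is allocated (e.g. allocate(EOS_linear :: eos)); every later call such as eos%density_elem(T, S, pressure) dispatches through the type-bound binding, so no select case is needed at the call sites.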
Micro test results
The micro test in config_src/drivers/timing_tests/time_MOM_EOS.F90 was added previously. The perfmon workflow runs the tests using both the new code and the PR-target code and presents the results as a percentage change. We are somewhat skeptical that timings can be relied upon in a noisy, shared environment such as the cloud (GitHub Actions), but the following screenshot of results is surprisingly representative of what I obtained in a quiet environment on Gaea. There is undoubtedly some variability, so we should not rely heavily on GitHub Actions performance numbers, but it looks like we could use the GH workflows as a soft indicator.
Screenshot of GitHub Actions perfmon (using the gnu compiler) (https://github.com/adcroft/MOM6/actions/runs/6881376496/job/18717587399?pr=2)
Screenshot of timing comparison on Gaea c5 (AMD, using the intel compiler)
In these tests I am using the base-class functions for Jackett06 and UNESCO in order to exercise those generic functions (i.e. to get code coverage), but this also has the benefit of showing the advantage of compiler inlining when the elemental functions are in the same module as the wrapper function, as in the sketch below.
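As a rough illustration of that point (continuing the hypothetical names from the sketch earlier, so again not the real MOM6 interfaces), a generic array wrapper inherited from the base type has to dispatch through the deferred binding on every element, whereas a specific EOS module can override it with a loop over its own module-local elemental function, which the compiler can then inline:

```fortran
! Hypothetical continuation of the earlier EOS_demo sketch; not the actual MOM6 code.

! Generic wrapper in the base type's module: inherited by every EOS, but each
! call dispatches through the deferred binding, which is hard to inline.
subroutine density_array_generic(this, T, S, pressure, rho)
  class(EOS_base), intent(in) :: this
  real, dimension(:), intent(in)    :: T, S, pressure
  real, dimension(:), intent(inout) :: rho
  integer :: j
  do j = 1, size(T)
    rho(j) = this%density_elem(T(j), S(j), pressure(j))
  enddo
end subroutine density_array_generic

! Local override in a specific EOS module: the loop calls the module-local
! elemental function directly, so the compiler can inline the EOS arithmetic.
subroutine density_array_linear(this, T, S, pressure, rho)
  class(EOS_linear), intent(in) :: this
  real, dimension(:), intent(in)    :: T, S, pressure
  real, dimension(:), intent(inout) :: rho
  integer :: j
  do j = 1, size(T)
    rho(j) = linear_density_elem(this, T(j), S(j), pressure(j))
  enddo
end subroutine density_array_linear
```

In this PR, Jackett06 and UNESCO deliberately rely on the inherited generic form so that it stays covered by the tests, while the other EOS modules use local overrides for speed.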