From 17f1c40dfa881a4a834cbca61dc2ee9a41f6d6aa Mon Sep 17 00:00:00 2001
From: Marshall Ward <marshall.ward@gmail.com>
Date: Fri, 26 Jan 2024 10:39:54 -0500
Subject: [PATCH] Cuberoot: Apply first iteration explicitly (#11)

Applying the first iteration explicitly appears to speed up the cuberoot
function by roughly 22%:

 Before:
 Halley Final:  0.14174999999999999

 After:
 Halley Final:  0.11080000000000001

This assumes that compilers will precompute constant expressions such as
`0.707106**3`, and that all of them will do so in the same manner.
---
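
Note: the following is a minimal standalone sketch, not part of the patch,
of why the explicit expressions should match the first pass of the original
loop. It assumes the `den` update follows the Halley formula quoted in the
code comment (that line is not visible in this hunk), uses a hypothetical
sample value for `asx`, and introduces the names `r0`, `num_x` and `den_x`
purely for illustration.

```fortran
program halley_first_step
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: r0 = 0.707106_dp   ! first estimate, as in the patch
  real(dp) :: asx                           ! hypothetical sample input
  real(dp) :: num, den, num_prev, den_prev  ! loop form of the first step
  real(dp) :: num_x, den_x                  ! explicit form of the first step

  asx = 0.3_dp

  ! One pass of the loop body, starting from num = r0 and den = 1.
  ! The den update is an assumption derived from the Halley formula.
  num_prev = r0 ; den_prev = 1.0_dp
  num = num_prev * (num_prev**3 + 2.0_dp * asx * (den_prev**3))
  den = den_prev * (2.0_dp * num_prev**3 + asx * (den_prev**3))

  ! The same step with den_prev = 1 substituted, as written in the patch.
  num_x = r0 * (r0**3 + 2.0_dp * asx)
  den_x = 2.0_dp * (r0**3) + asx

  print *, "loop estimate:     ", num / den
  print *, "explicit estimate: ", num_x / den_x
  print *, "difference:        ", abs(num / den - num_x / den_x)
end program halley_first_step
```

With `den_prev = 1.0`, the factors `den_prev**3` and the leading `den_prev`
drop out, leaving only `r0**3` as a constant that a compiler may fold at
compile time. Whether the folded value matches the run-time one bit for bit
is the assumption stated in the commit message.
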
 src/framework/MOM_intrinsic_functions.F90 | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/framework/MOM_intrinsic_functions.F90 b/src/framework/MOM_intrinsic_functions.F90
index 8d8cdde39f..4327cfa5a6 100644
--- a/src/framework/MOM_intrinsic_functions.F90
+++ b/src/framework/MOM_intrinsic_functions.F90
@@ -64,8 +64,11 @@ elemental function cuberoot(x) result(root)
     ! and it is therefore more computationally efficient.
 
     ! This first estimate gives the same magnitude of errors for 0.125 and 1.0 after two iterations.
-    num = 0.707106 ; den = 1.0
-    do itt=1,3
+    ! The first iteration is applied explicitly.
+    num = 0.707106 * (0.707106**3 + 2.0 * asx)
+    den = 2.0 * (0.707106**3) + asx
+
+    do itt=1,2
       ! Halley's method iterates estimates as Root = Root * (Root**3 + 2.*asx) / (2.*Root**3 + asx).
       num_prev = num ; den_prev = den
       num = num_prev * (num_prev**3 + 2.0 * asx * (den_prev**3))