Performance FMT
se_nsplit = 1
se_rsplit = 6
se_qsplit = 1
se_hypervis_subcycle = 10
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 10
se_sponge_del4_nu_fac = 5
se_sponge_del4_lev = 3
se_hypervis_subcycle_sponge = 3
Result with the settings above: stable for 180 days.
se_nsplit = 2
se_rsplit = 3
se_qsplit = 1
se_hypervis_subcycle = 3
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 2.5
se_sponge_del4_nu_fac = 1
se_sponge_del4_lev = 45
se_hypervis_subcycle_sponge = 3
Result with the settings above: unstable.
se_nsplit = 2
se_rsplit = 3
se_qsplit = 1
se_hypervis_subcycle = 15
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 15
se_sponge_del4_nu_fac = 5
se_sponge_del4_lev = 10
se_hypervis_subcycle_sponge = 3
Result with the settings above: unstable (after 40 days).
se_nsplit = 2
se_rsplit = 3
se_qsplit = 1
se_hypervis_subcycle = 3
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 3
se_sponge_del4_nu_fac = 3
se_sponge_del4_lev = 10
se_hypervis_subcycle_sponge = 3
Result with the settings above: unstable.
se_nsplit = 2
se_rsplit = 3
se_qsplit = 1
se_hypervis_subcycle = 7
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 7.5
se_sponge_del4_nu_fac = 5
se_sponge_del4_lev = 10
se_hypervis_subcycle_sponge = 3
Result with the settings above: unstable.
Current default namelist settings for the ~80 km top model (FMT):
se_hypervis_subcycle = 3
se_hypervis_subcycle_q = 1
se_hypervis_subcycle_sponge = 1
se_nsplit = 2
se_nu = 1E15
se_nu_div = 2.5E15
se_nu_p = 1E15
se_nu_top = 1.0e6
se_qsplit = 1
se_rsplit = 6
se_sponge_del4_lev = 3
se_sponge_del4_nu_div_fac = 5
se_sponge_del4_nu_fac = 7.5
These settings are scaled for a maximum wind speed of 400 m/s (also used for WACCM).
With HB diffusion (when CLUBB is not active), the maximum winds are 200 m/s and there is no need for increased del4 viscosity in the sponge. Hence we can optimize the time-stepping significantly:
se_nsplit = 2
se_rsplit = 5
se_qsplit = 1
se_hypervis_subcycle = 1
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 1
se_sponge_del4_nu_fac = 1
se_sponge_del4_lev = 1
Unstable:
se_nsplit = 2
se_rsplit = 3
se_qsplit = 1
se_hypervis_subcycle = 3
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 3
se_sponge_del4_nu_fac = 3
se_sponge_del4_lev = 3
se_hypervis_subcycle_sponge = 3
Out-of-the-box:

se_hypervis_subcycle = 2
se_hypervis_subcycle_q = 1
se_hypervis_subcycle_sponge = 1
se_nsplit = 2
se_nu = 1E15
se_nu_div = 1E15
se_nu_top = 1.0e6
se_qsplit = 1
se_rsplit = 9
se_sponge_del4_lev = 3
se_sponge_del4_nu_div_fac = 5
se_sponge_del4_nu_fac = 7.5

Performance difference (~23% faster) with the following settings:

se_nsplit = 2
se_rsplit = 5
se_qsplit = 1
se_hypervis_subcycle = 1
se_nu_div = 1E15
se_nu = 1E15
se_sponge_del4_nu_div_fac = 1
se_sponge_del4_nu_fac = 1
se_sponge_del4_lev = 1
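
The configurations above differ mainly in how often the dynamics and the hyperviscosity operator are sub-cycled. The sketch below compares three of them under two assumptions that are not stated in these notes: that the number of dynamics steps per physics step is se_nsplit * se_rsplit * se_qsplit, and that the del4 operator is applied se_hypervis_subcycle times per dynamics step (sponge and tracer sub-cycling are ignored). The class `SESplit`, its methods, the configuration labels, and the 1800 s physics time step are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SESplit:
    """Hypothetical container for the SE sub-cycling namelist values."""
    nsplit: int
    rsplit: int
    qsplit: int
    hypervis_subcycle: int

    def dyn_steps(self) -> int:
        # Assumption: dynamics steps per physics step = nsplit * rsplit * qsplit.
        return self.nsplit * self.rsplit * self.qsplit

    def hypervis_steps(self) -> int:
        # Assumption: del4 is sub-cycled hypervis_subcycle times per dynamics step.
        return self.dyn_steps() * self.hypervis_subcycle

    def dt_dyn(self, dtime: float) -> float:
        # Dynamics time step in seconds for a physics time step of dtime seconds.
        return dtime / self.dyn_steps()


# Values copied from the namelist blocks in these notes.
CONFIGS = {
    "default FMT": SESplit(nsplit=2, rsplit=6, qsplit=1, hypervis_subcycle=3),
    "out-of-the-box": SESplit(nsplit=2, rsplit=9, qsplit=1, hypervis_subcycle=2),
    "HB, no extra sponge del4": SESplit(nsplit=2, rsplit=5, qsplit=1, hypervis_subcycle=1),
}

DTIME = 1800.0  # assumed physics time step in seconds (not taken from the notes)

for name, cfg in CONFIGS.items():
    print(
        f"{name:26s} dyn steps = {cfg.dyn_steps():2d}  "
        f"del4 applications = {cfg.hypervis_steps():2d}  "
        f"dt_dyn = {cfg.dt_dyn(DTIME):6.1f} s"
    )
```

Under these assumptions the HB configuration applies the del4 operator 10 times per physics step, versus 36 for each of the other two configurations.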

Opt2 is ~19% faster than opt1.

Opt2 is ~43% faster than ref.

Conclusion: the hyperviscosity operator is very expensive.
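
These notes do not say whether "X% faster" means a reduction in wall-clock time or an increase in throughput. The sketch below shows how the two figures quoted for opt2 (~19% faster than opt1, ~43% faster than ref) combine into an implied opt1-to-ref run-time ratio under each reading. The function names are hypothetical, and the output is derived arithmetic, not a measured timing.

```python
def ratio_if_time_reduction(pct_faster: float) -> float:
    """t_new / t_old when 'p% faster' means the run time dropped by p%."""
    return 1.0 - pct_faster / 100.0


def ratio_if_throughput_gain(pct_faster: float) -> float:
    """t_new / t_old when 'p% faster' means throughput rose by p%."""
    return 1.0 / (1.0 + pct_faster / 100.0)


def implied_opt1_vs_ref(ratio_fn, pct_vs_ref: float = 43.0, pct_vs_opt1: float = 19.0) -> float:
    # (t_opt2 / t_ref) / (t_opt2 / t_opt1) = t_opt1 / t_ref
    return ratio_fn(pct_vs_ref) / ratio_fn(pct_vs_opt1)


for label, fn in (
    ("time-reduction reading", ratio_if_time_reduction),
    ("throughput-gain reading", ratio_if_throughput_gain),
):
    print(f"{label}: t_opt1 / t_ref = {implied_opt1_vs_ref(fn):.3f}")
```

The two readings give run-time ratios of roughly 0.70 and 0.83 for opt1 relative to ref, so it is worth stating the convention whenever such percentages are compared.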

Computational performance: ~10% faster than opt3 and 2x faster than ref.


Note: p_d_coupling takes 160 s. All of CSLAM advection takes 539 s. This seems far too long.
Note: spectral-element advection, even with one tracer, is quite expensive; try not advecting any spectral-element tracers at all (run a moist baroclinic wave to see how well balance is maintained).