UFS weather model with mixed mode FMS library #1036

Closed
MinsukJi-NOAA opened this issue Feb 3, 2022 · 28 comments

@MinsukJi-NOAA
Contributor

MinsukJi-NOAA commented Feb 3, 2022

Description

This issue was created to track issues, test statuses, etc., and to keep everyone involved up to date as the mixed-mode FMS library is readied for release.

Solution

Relevant branches, PRs, issues, and test statuses will be listed here.

Alternatives

Communicate via email.

Related to

FMS
NOAA-GFDL/FMS#857

MOM6
NOAA-EMC/MOM6#88

GFDL_atmos_cubed_sphere
NOAA-GFDL/GFDL_atmos_cubed_sphere#163

UFS-weather-model:
https://github.com/MinsukJi-NOAA/ufs-weather-model/tree/fms_mixedmode_20220330

Testing:
Hera/Intel
Hera/Gnu
WCOSS Dell/Intel
Gaea/Intel

@junwang-noaa @binli2337 @jiandewang @SMoorthi-emc

Some more information on mixed mode FMS can be found here

MinsukJi-NOAA added the enhancement label on Feb 3, 2022
@MinsukJi-NOAA
Contributor Author

MinsukJi-NOAA commented Feb 3, 2022

The mixed-mode FMS library was built with the Intel and GNU compilers.

To test a specific platform, compiler, and debug combination, please uncomment the matching setenv line in modulefiles/ufs_common or modulefiles/ufs_common_debug:

ufs_common

#module load fms/2021.03
# For mixed Mode FMS tests, please uncomment the appropriate line below
# according to the machine (Hera or WCOSS Dell) and compiler (Intel or GNU)
# Hera Intel
#setenv FMS_ROOT /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1
# Hera GNU
#setenv FMS_ROOT /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_GNU
# WCOSS Dell Intel
setenv FMS_ROOT /gpfs/dell2/emc/modeling/noscrub/Minsuk.Ji/FMSLIB_1

ufs_common_debug

#module load fms/2021.03
# For mixed Mode FMS tests, please uncomment the appropriate line below
# according to the machine (Hera or WCOSS Dell) and compiler (Intel or GNU)
# Hera Intel
#setenv FMS_ROOT /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_DEBUG
# Hera GNU
#setenv FMS_ROOT /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_GNU_DEBUG
# WCOSS Dell Intel
setenv FMS_ROOT /gpfs/dell2/emc/modeling/noscrub/Minsuk.Ji/FMSLIB_1
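
As an alternative to editing the modulefiles by hand, the selection could be scripted. The following is only a sketch: the function name `select_fms_root` and the argument spellings (`hera`, `wcoss_dell`, `intel`, `gnu`, `debug`) are hypothetical, and it uses a shell `export` in place of the modulefile `setenv`. The paths are copied from the two snippets above.

```bash
#!/usr/bin/env bash
# Sketch: pick the mixed-mode FMS install path for a machine/compiler/build type.
# select_fms_root and its argument names are hypothetical helpers, not part of
# ufs-weather-model. Paths are the ones listed in the modulefile snippets above.
select_fms_root() {
  local machine="$1" compiler="$2" build="${3:-release}"
  case "${machine}/${compiler}/${build}" in
    hera/intel/release) echo /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1 ;;
    hera/intel/debug)   echo /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_DEBUG ;;
    hera/gnu/release)   echo /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_GNU ;;
    hera/gnu/debug)     echo /scratch1/NCEPDEV/stmp4/Minsuk.Ji/FMSLIB_1_GNU_DEBUG ;;
    # Both modulefile snippets list the same path for WCOSS Dell Intel.
    wcoss_dell/intel/release|wcoss_dell/intel/debug)
      echo /gpfs/dell2/emc/modeling/noscrub/Minsuk.Ji/FMSLIB_1 ;;
    *)
      echo "unsupported combination: ${machine} ${compiler} ${build}" >&2
      return 1 ;;
  esac
}

# Example use in a build script:
# export FMS_ROOT="$(select_fms_root hera gnu debug)"
```

Unknown combinations return a non-zero status, so a build script can stop early instead of picking up a wrong library.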

@MinsukJi-NOAA
Contributor Author

MinsukJi-NOAA commented Feb 3, 2022

All regression tests on Hera and WCOSS Dell for S2SW, S2S, NG-GODAS applications passed, including GNU compiler and debug cases.

@MinsukJi-NOAA
Contributor Author

Preliminary tests by @SMoorthi-emc of a 10-day run of the C1152L127 coupled model showed approximately 30% time savings.

@junwang-noaa
Collaborator

junwang-noaa commented Feb 10, 2022

@binli2337 Would you please check why the datm_cdeps_debug_cfsr test does not reproduce the baseline? Thanks.

@junwang-noaa @jiandewang When the mixed-mode FMS is compiled without the "-march=core-avx2" option, the baseline results can be reproduced on Hera.
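
For anyone repeating this experiment, dropping the option from a flag string could be sketched as below. This thread does not show how the FMS build passes its compiler flags, so the helper name `strip_flag`, the use of an `FFLAGS` variable, and every flag other than `-march=core-avx2` are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Sketch: remove one option from a whitespace-separated compiler flag string.
# strip_flag is a hypothetical helper; it relies on plain word splitting, so
# flags containing spaces or glob characters are not handled.
strip_flag() {
  local unwanted="$1" out="" flag
  for flag in $2; do            # intentional word splitting on whitespace
    if [ "$flag" = "$unwanted" ]; then
      continue
    fi
    out="${out:+$out }$flag"
  done
  printf '%s\n' "$out"
}

# Example (the flags besides -march=core-avx2 are made up):
# FFLAGS="$(strip_flag -march=core-avx2 "-O2 -march=core-avx2 -g")"
```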

@binli2337
Contributor

The mixed_mode FMS has been built on Gaea and tests show that the following 4 regression tests can reproduce the baseline using the mixed_mode FMS:

  1. datm_cdeps_control_cfsr
  2. datm_cdeps_debug_cfsr
  3. cpld_control_p8
  4. cpld_debug_p8

The ufs-weather-model code used: https://github.com/binli2337/ufs-weather-model/tree/feature/mixed_mode
The test log file is https://github.com/binli2337/ufs-weather-model/blob/feature/mixed_mode/tests/RegressionTests_gaea.intel.log.

To reproduce the results on Gaea, please use the following commands:

git clone https://github.com/binli2337/ufs-weather-model
cd ufs-weather-model
git checkout feature/mixed_mode
git submodule update --init --recursive
cd tests
./rt.sh -e -k -l rt.conf1
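
The same steps can be wrapped in a function that stops at the first failing command. This is a sketch only: the `run` helper, the `reproduce_gaea` name, and the `DRY_RUN` switch are hypothetical additions, while the commands themselves are the ones listed above. With `DRY_RUN` left at its default of 1 the function only prints what it would do.

```bash
#!/usr/bin/env bash
# Sketch: the Gaea reproduction steps chained so that a failure stops the sequence.
# run, reproduce_gaea, and DRY_RUN are hypothetical names for this illustration.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"                 # dry run: show the command only
  else
    "$@"                        # real run: execute in the current shell
  fi
}

reproduce_gaea() {
  run git clone https://github.com/binli2337/ufs-weather-model &&
  run cd ufs-weather-model &&
  run git checkout feature/mixed_mode &&
  run git submodule update --init --recursive &&
  run cd tests &&
  run ./rt.sh -e -k -l rt.conf1
}

# reproduce_gaea                # prints the six commands
# DRY_RUN=0 reproduce_gaea      # actually runs them
```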

@binli2337
Contributor

Results on Hera using gnu compiler:
Test "cpld_control_c96_p8" passed RT test.
Test "cpld_debug_p8" failed to run due to segmentation fault.

@MinsukJi-NOAA
Contributor Author

MinsukJi-NOAA commented Mar 3, 2022

> Results on Hera using gnu compiler:
> Test "cpld_control_c96_p8" passed RT test.
> Test "cpld_debug_p8" failed to run due to segmentation fault.

Hera gnu debug test passed:
https://github.com/MinsukJi-NOAA/ufs-weather-model/blob/fms_mixedmode_20220302/tests/RegressionTests_hera.gnu.log
Note that FMS is compiled without core-avx2 for gnu debug.

@jiandewang
Collaborator

@MinsukJi-NOAA @binli2337: sorry for the late reply; I was out of town last week. I just noticed that Niki made a further minor code change in MOM_diag_manager_infra.F90, so I updated my MOM6 branch. Note that this update also changes the infra/FMS2 code, so we need to test that in UFS too.

@MinsukJi-NOAA
Contributor Author

As discussed with @jiandewang , two changes were made:

  1. Use Jiande's latest commit to his MOM6 branch: jiandewang/MOM6@486a4ee
  2. Use FMS2 instead of FMS1 in mom6_files.cmake

All Intel and GNU tests (including debug) passed on Hera.

@jiandewang
Collaborator

@MinsukJi-NOAA can you also try Gaea and WCOSS Intel?

@MinsukJi-NOAA
Contributor Author

> @MinsukJi-NOAA can you also try Gaea and WCOSS Intel?

All RTs (coupled and cdeps) passed on WCOSS Intel.
All RTs passed on Gaea, as reported by @binli2337.

@jiandewang
Collaborator

Based on today's MOM6 meeting discussion, we will proceed in this order:
(1) Niki creates a PR to dev/emc
(2) EMC fully tests it and commits it to dev/emc
(3) EMC creates a PR to push the changes back to the MOM6 main branch

@MinsukJi-NOAA
Contributor Author

Latest test results are reported here

@junwang-noaa
Collaborator

junwang-noaa commented Apr 25, 2022

Waiting for mom-ocean/MOM6 PR #1566 to be committed

@binli2337
Contributor

binli2337 commented Jun 20, 2022

@jiandewang The following 4 tests have been run on Gaea using the updated MOM6 code with mixed-mode FMS. The MOM6 code is from https://github.com/jiandewang/MOM6/tree/test/NCAR-20220603. Three of the four tests reproduce the baseline.

  1. datm_cdeps_control_cfsr (PASS)
  2. datm_cdeps_debug_cfsr (PASS)
  3. cpld_debug_p8 (PASS)
  4. cpld_control_p8 (Results are different from the baseline.)

@jiandewang
Collaborator

@binli2337 this is what I expected. The MOM6 code you used is from the latest NCAR candidate (mom-ocean/MOM6#1571), which contains one wave-related bug fix that changes answers for S2SW jobs. Thanks for the testing.

@binli2337
Contributor

binli2337 commented Jul 6, 2022

The FMS 2022.03-alpha1 tag has been tested on Gaea with the full set of UFS regression tests.

The FMS 2022.03-alpha1 tag is located at https://github.com/NOAA-GFDL/FMS/releases/tag/2022.03-alpha1.
The FV3ATM branch is located at https://github.com/binli2337/fv3atm/tree/update_0629_mixedmode.
The ufs-weather-model branch is at https://github.com/binli2337/ufs-weather-model/tree/update_0629_mixedmode.

On Gaea, 104 tests use the r4 FMS library and 32 tests use the r8 FMS library.

Both r4 and r8 libraries of FMS 2022.03-alpha1 have been installed on Gaea to do the testing. Tests show that the baseline of current UFS regression tests can be completely reproduced when the FMS 2022.03-alpha1 tag is used.

The mixed-mode UFS has also been tested. It includes the FV3 component compiled in single precision (32-bit) and the MOM6 component compiled in double precision (64-bit). The atmosphere physics component is compiled in double precision.

The following three coupled tests with the mixed-mode UFS model have been run on Gaea, and the results are as expected:

cpld_control_p8 (wall time reduction: 10% in a 6-hr run)
cpld_control_c192_p8 (wall time reduction: 13% in a 30-hr run)
cpld_bmark_p8 (wall time reduction: 6.5% in a 6-hr run)
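
A minimal sketch of how such a percentage could be computed, assuming it is taken relative to the wall time of the run without mixed mode. The raw wall times are not listed in this thread, so the numbers in the example are hypothetical, as is the helper name `percent_reduction`.

```bash
#!/usr/bin/env bash
# Sketch: wall-time reduction as a percentage of the baseline wall time.
#   reduction = 100 * (baseline - new) / baseline
# percent_reduction is a hypothetical helper; inputs are times in the same unit.
percent_reduction() {
  awk -v base="$1" -v new="$2" 'BEGIN {
    if (base + 0 <= 0) exit 1   # reject a missing or non-positive baseline
    printf "%.1f\n", 100 * (base - new) / base
  }'
}

# Example with made-up timings (seconds):
# percent_reduction 1000 900    # prints 10.0
```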

@binli2337
Contributor

The previous tests with the FMS 2022.03-alpha1 tag used FMS1 in MOM6-interface/mom6_files.cmake.

The following four tests have also been run using FMS2 in MOM6-interface/mom6_files.cmake and the current UFS baseline can be reproduced.

datm_cdeps_control_cfsr
datm_cdeps_debug_cfsr
cpld_control_p8
cpld_debug_p8

@junwang-noaa
Collaborator

@binli2337 May I ask what is "FMS1"? Do you mean fms_io?

@jiandewang
Collaborator

> @binli2337 May I ask what is "FMS1"? Do you mean fms_io?

In the current MOM6 code there are both infra/FMS1 and infra/FMS2 code, and both work. This is why I ask Bin to test both whenever there is a new update. Eventually infra/FMS1 will be phased out and we will switch to the infra/FMS2 code.

@binli2337
Contributor

@junwang-noaa MOM6/config_src/infra/FMS1 uses fms_io and MOM6/config_src/infra/FMS2 uses fms2_io.
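
Switching between the two in MOM6-interface/mom6_files.cmake could be sketched as below. This assumes the cmake file references MOM6 sources by the config_src/infra/FMS1 or config_src/infra/FMS2 directory path; the file's contents are not shown in this thread, and `switch_mom6_infra` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: point every config_src/infra/FMS1|FMS2 path in a file at one infra.
# switch_mom6_infra is a hypothetical helper. It assumes the file lists sources
# by directory path, which is not confirmed in this thread.
switch_mom6_infra() {
  local file="$1" target="$2"
  case "$target" in
    FMS1|FMS2) ;;
    *) echo "target must be FMS1 or FMS2" >&2; return 1 ;;
  esac
  # Write to a temporary file first so a failed sed leaves the original intact.
  sed "s#config_src/infra/FMS[12]#config_src/infra/${target}#g" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Example:
# switch_mom6_infra MOM6-interface/mom6_files.cmake FMS2
```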

@junwang-noaa
Collaborator

@jiandewang Thanks for the explanation.

@binli2337 Can you confirm all the coupled tests that use MOM6 with fms2_io run with the FMS2022.03-alpha1 tag? Thanks

@jiandewang
Collaborator

@binli2337 can you make sure that the datm_cdeps_iau_gefs and datm_cdeps_stochy_gefs jobs are fine? These two jobs read in extra fixed files, and I had trouble in a previous MOM6 code update because their netCDF header format was incompatible with fms2_io (that problem has been fixed, and the current dev/emc and MOM6 main code are safe to use with fms2_io). I just want to make sure mixed mode does not introduce any new issue.

@binli2337
Contributor

To test the updated MOM6 code (https://github.com/jiandewang/MOM6/tree/test/GFDL_candidate-20220721),
the following 8 tests have been run using mixed-mode FMS tag 2022.03-alpha1.

a) Using FMS1 in MOM6-interface/mom6_files.cmake
datm_cdeps_control_cfsr
datm_cdeps_debug_cfsr
cpld_control_p8
cpld_debug_p8

b) Using FMS2 in MOM6-interface/mom6_files.cmake
datm_cdeps_control_cfsr
datm_cdeps_debug_cfsr
cpld_control_p8
cpld_debug_p8

Results show that the current UFS baseline can be reproduced.

The ufs-weather-model code is located at https://github.com/binli2337/ufs-weather-model/tree/update_0727_mixed_mode.

@DeniseWorthen
Collaborator

@binli2337 please create an hpc-stack issue for installing the new FMS on all RDHPCS platforms.

@binli2337
Contributor

hpc-stack issue NOAA-EMC/hpc-stack#480

@binli2337
Contributor

binli2337 commented Aug 3, 2022

The fms_mixedmode branch of NOAA-EMC/GFDL_atmos_cubed_sphere was tested with FMS 2022.03 in the ufs-weather-model. All tests passed. The current baseline of ufs-weather-model regression tests can be reproduced.

The ufs-weather-model code used: https://github.com/binli2337/ufs-weather-model/tree/fms_mixed_mode.

@DeniseWorthen
Collaborator

Closing. Mixed mode is now added and tested in UFS.
