Update for better vectorization with gcc #1793

juntangc
The first line should be a single-line "purpose" for this change

TYPE: choose one of [bug fix, enhancement, new feature, feature removed, no impact, text only]

KEYWORDS: 5 to 10 words related to commit, separated by commas

SOURCE: Either "developer's name (affiliation)" .XOR. "internal" for a WRF Dev committee member

DESCRIPTION OF CHANGES:
Problem:
Generally or specifically, what was wrong and needed to be addressed?

Solution:
What was done algorithmically and in the source code to address the problem?

ISSUE: For use when this PR closes an issue.
Fixes #123

LIST OF MODIFIED FILES: list of changed files (use git diff --name-status master to get formatted list)

TESTS CONDUCTED:

  1. Do mods fix problem? How can that be demonstrated, and was that test conducted?
  2. Are the Jenkins tests all passing?

RELEASE NOTE: Include a stand-alone message suitable for the inclusion in the minor and annual releases. A publication citation is appropriate.

kkeene44 and others added 4 commits September 6, 2022 13:32
TYPE: new feature

KEYWORDS: WSM6, TL/AD,  4DVAR

Description of Change:
This PR adds a regularized version (i.e., discontinuous functions are replaced with continuous ones) of the WSM6 microphysics scheme (MP option 106, named 'WSM6R') and its tangent linear and adjoint (TL/AD), which enables WRF-4DVar to run with ice-phase hydrometeor analysis variables. Note that the non-linear version of WSM6R is kept only as a code reference for deriving the TL/AD; it is NOT recommended for WRF model forecasts. In addition, handling of the file unit related to the background error covariance is improved.
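
As a generic illustration of what "regularized" means here, the sketch below replaces a hard switch with a smooth stand-in so that a derivative exists everywhere. The function names, the `eps` parameter, and the particular smoothing formula are assumptions chosen for illustration; they are not taken from the WSM6R source, which may regularize differently.

```python
import math

def positive_part(x):
    # Hard switch: the slope jumps from 0 to 1 at x = 0,
    # so the derivative is undefined there.
    return max(x, 0.0)

def smooth_positive_part(x, eps=1e-3):
    # Hypothetical smooth stand-in: close to max(x, 0) away from 0,
    # differentiable everywhere. eps controls the width of the transition.
    return 0.5 * (x + math.sqrt(x * x + eps * eps))

def smooth_positive_part_slope(x, eps=1e-3):
    # Analytic derivative of the smooth version, i.e. the kind of factor
    # a tangent-linear code would multiply a perturbation by.
    return 0.5 * (1.0 + x / math.sqrt(x * x + eps * eps))

for x in (-1.0, 0.0, 1.0):
    print(x, positive_part(x), smooth_positive_part(x), smooth_positive_part_slope(x))
```

The slope changes continuously through zero instead of jumping, which is what makes a tangent-linear and adjoint derivation well defined at the switch point.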

SOURCE:  Sen YANG, Deqin LI, Liqiang CHEN (Institute of Atmospheric Environment, China Meteorological Administration, Shenyang)

LIST OF MODIFIED FILES: 
M       Registry/Registry.EM_COMMON
M       Registry/registry.var
M       Registry/registry.wrfplus
M       main/depend.common
M       phys/Makefile
M       phys/module_microphysics_driver.F
A       phys/module_mp_wsm6s.F
M       var/da/da_setup_structures/da_scale_background_errors.inc
M       var/da/da_setup_structures/da_setup_be_regional.inc
M       var/da/da_transfer_model/da_transfer_wrftltoxa.inc
M       var/da/da_transfer_model/da_transfer_wrftltoxa_adj.inc
M       var/da/da_transfer_model/da_transfer_xatowrftl.inc
M       var/da/da_transfer_model/da_transfer_xatowrftl_adj.inc
M       wrftladj/Makefile
M       wrftladj/depend.wrftladj
M       wrftladj/module_microphysics_driver_ad.F
M       wrftladj/module_microphysics_driver_tl.F
A       wrftladj/module_mp_wsm6s_ad.F
A       wrftladj/module_mp_wsm6s_tl.F
M       wrftladj/solve_em_ad.F
M       wrftladj/solve_em_tl.F

TESTS CONDUCTED: 
1. Jenkins tests all passed;
2. WRFDA regression test passed on Cheyenne;
3. wrfplus and 4dvar tests succeeded using mp_physics & mp_physics_ad=106.

RELEASE NOTE: Add a regularized version of WSM6 and its TL/AD for 4DVar with ice-phase hydrometeor analysis variables.
Yang, S., D. Q. Li, L. Q. Chen, Z. Liu, X.-Y. Huang, and X. Pan, 2022: The regularized WSM6 microphysical scheme and its validation in WRF 4D-Var. Adv. Atmos. Sci., in press.
TYPE: Enhancement 

KEYWORDS: WRFDA, AHI, Himawari-8

SOURCE: Craig Schwartz (NCAR/MMM)

DESCRIPTION OF CHANGES:
This PR makes several enhancements for assimilating Himawari-8 radiance data, including
(1) Introduction of an all-sky obs error model (Harnisch et al., 2016) for all-sky AHI DA;
(2) Optional read and use of AHI level-2 product (e.g., cloud mask); 
(3) More efficient read of a sub-area of the full disk data;
(4) Allow the use of offline statistics of constant bias correction values;
(5) More diagnostic output (peak of weighting functions, cloud ice water path, cloud flag) in omb/oma files.

LIST OF MODIFIED FILES:
Registry/registry.var
var/da/da_define_structures/da_define_structures.f90
var/da/da_monitor/da_rad_diags.f90
var/da/da_radiance/da_allocate_rad_iv.inc
var/da/da_radiance/da_deallocate_radiance.inc
var/da/da_radiance/da_get_innov_vector_crtm.inc
var/da/da_radiance/da_initialize_rad_iv.inc
var/da/da_radiance/da_qc_ahi.inc
var/da/da_radiance/da_radiance.f90
var/da/da_radiance/da_radiance1.f90
var/da/da_radiance/da_radiance_init.inc
var/da/da_radiance/da_read_obs_netcdf4ahi_jaxa.inc
var/da/da_radiance/da_write_iv_rad_ascii.inc
var/da/da_radiance/module_radiance.f90
var/run/ahi_info
var/run/radiance_info/himawari-8-ahi.info

TESTING:
(1) WRFDA regression test passed;
(2) AHI all-sky DA runs also completed successfully.

Release Note: Enhancements for AHI radiance DA, including all-sky observation error model, Level-2 AHI product read, and more diagnostic output.
Xu, D. M., Z. Q. Liu, S. Y. Fan, M. Chen, and F. F. Shen, 2021: Assimilating all-sky infrared radiances from Himawari-8 using the 3DVar method for the prediction of a severe storm over North China. Adv. Atmos. Sci., 38(4), 661-676.
This update resolves differing simulation results on aarch64 and x86 for the same dataset (caused by rounding differences in the last bit of some transcendental functions).

TYPE: bug fix

KEYWORDS: rounding error, transcendental, compiler optimization

SOURCE: Jun Tang, Amazon

DESCRIPTION OF CHANGES:
Problem:
Investigation showed that the differing simulation results on aarch64 and x86 for the same dataset with gcc 10.2 have two causes: rounding differences in the last bit of some transcendental functions, and several optimization flags (FMA, inverse square root, ...).
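
To illustrate why FMA alone can change results, here is a minimal Python sketch (not WRF code). The helper names are hypothetical, and the fused operation is emulated with exact rational arithmetic followed by a single rounding, which is how a hardware fused multiply-add behaves.

```python
from fractions import Fraction

def unfused(a, b, c):
    # Two roundings: the product is rounded to double, then the sum is.
    return a * b + c

def fused(a, b, c):
    # One rounding: compute a*b + c exactly, then convert to double once.
    # This emulates a fused multiply-add instruction.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Inputs chosen so the exact product 1 - 2**-54 is not representable
# as a double and rounds to 1.0 when the multiply is done separately.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(unfused(a, b, c))  # product rounds to 1.0, so the sum is 0.0
print(fused(a, b, c))    # keeps the -2**-54 that the separate multiply lost
```

Whether the compiler contracts `a*b + c` into one instruction therefore decides which of the two answers a given build produces.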

Solution:
Computing the transcendental functions in double precision at selected locations ensures that exactly the same output is produced on aarch64 and x86 (this patch covers only a few PBL and cumulus schemes). Disabling some risky optimizations also guarantees the same output as at a lower optimization level.
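
The reasoning behind the double-precision approach can be sketched in Python (an illustration only, not the patched Fortran; requires Python 3.9+ for `math.nextafter`). A last-bit error in a double result almost never survives rounding to single precision, whereas a last-bit error in a single-precision result is directly visible. The helper names below are hypothetical.

```python
import math
import struct

def to_f32(x):
    # Round a Python float (double) to the nearest single-precision value.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def next_f32(x):
    # Next single-precision value above a positive float32 x (bit pattern + 1).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]

def count_flips(xs):
    # Count how often a 1-ulp error in the double result of exp(x)
    # changes the value obtained after rounding to single precision.
    flips = 0
    for x in xs:
        y = math.exp(x)
        y_off = math.nextafter(y, math.inf)  # simulate a last-bit library difference
        if to_f32(y) != to_f32(y_off):
            flips += 1
    return flips

samples = [i * 0.01 for i in range(1, 1001)]
print("flips after rounding double -> single:", count_flips(samples))

# In contrast, a 1-ulp difference in single precision is a different stored value.
y32 = to_f32(math.exp(1.0))
print(y32, next_f32(y32))
```

A flip occurs only when the double result lies within one double ulp of a midpoint between two single-precision values, which is rare, so two math libraries that disagree in the last double bit will nearly always agree after the result is stored in single precision.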

LIST OF MODIFIED FILES:
M       arch/configure.defaults
M       phys/module_cu_tiedtke.F
M       phys/module_sf_myjsfc.F

TESTS CONDUCTED: 
1. The mods fix the correctness problem between aarch64 and x86 for two WRF models (conus2.5km and conus12km). With the patch, the output matches bit for bit on the two platforms.

2. The regression tests have passed; as indicated, the change should not affect their results.

RELEASE NOTE: Fix numerical divergence between x86 and arm64. For the best WRF performance on arm64, please use armclang. (https://github.com/juntangc/notes/blob/main/release-note.pdf).
@juntangc juntangc requested review from a team as code owners December 15, 2022 22:08
@juntangc juntangc closed this Dec 15, 2022