Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Break out module_dm external subroutines into separate files #2069

Merged
merged 10 commits into from
Sep 26, 2024

Conversation

islas
Copy link
Collaborator

@islas islas commented Jun 14, 2024

TYPE: enhancement

KEYWORDS: intel, compilation, llvm, memory

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
The Intel oneAPI compilers (and others like nvhpc) struggle with some of the larger (15k+ lines of code) files within WRF. This causes intense memory usage that is not often available to the average user not in a resource-rich environment. This often limits compilation to single threaded if even possible or to a dedicated environment with enough memory if available. If neither of those is available to a user, they will be unable to use these configurations entirely.

Solution:
This PR focuses on the module_dm sections of code to reduce its individual file size to manageable levels. This and its helper subroutines are instead broken out into many smaller files.

TESTS CONDUCTED:
Attached to this PR are plots of the respective effects of theses changes. Changes were tested with intel and gcc compilers, but only intel memory usage is shown as it exacerbates the memory usage issue.

@islas islas requested review from a team as code owners June 14, 2024 22:40
@islas
Copy link
Collaborator Author

islas commented Jun 14, 2024

Highlighted is the region during compilation which memory usage spikes that this PR addresses (module_dm) before these changes take place :
pre_module_dm

This usage is then dropped when using this PR's edits, splitting across multiple other files :
post_module_dm_split

Zooming in we can now see the external subroutines originally in module_dm comprise a number of smaller compilation units :
post_module_dm_split_zoom

@weiwangncar
Copy link
Collaborator

The regression test results:

Test Type              | Expected  | Received |  Failed
= = = = = = = = = = = = = = = = = = = = = = = =  = = = =
Number of Tests        : 23           24
Number of Builds       : 60           57
Number of Simulations  : 158           150        0
Number of Comparisons  : 95           86        0

Failed Simulations are: 
None
Which comparisons are not bit-for-bit: 
None

@weiwangncar
Copy link
Collaborator

I tested code before and after this PR, and model produces identical results in my test. Also with 4 processors, the compile time is about 12 minutes!

Copy link
Collaborator

@mgduda mgduda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet done any nested runs to test these changes. The only potential issue that I can see so far is related to the use of .F90 suffixes for files that are compiled based on rules in configure.wrf. Specifically, configure.wrf has a rule for turning .F90 files into .o files: https://github.com/wrf-model/WRF/blob/v4.6.0/arch/postamble#L206-L212 . For case-insensitive filesystems, the intermediate .f90 files could be problematic, and so it may be better to use .F as the suffix for the new feedback_domain_em_part1.F90, etc. files.
Note that other .F90 files in RSL_LITE (e.g., f_xpose.F90) are compiled with RSL_LITE's rules that don't generate intermediate .f90 files.

@mgduda mgduda self-requested a review September 19, 2024 22:16
mgduda
mgduda previously approved these changes Sep 19, 2024
@mgduda mgduda self-requested a review September 19, 2024 23:44
@mgduda
Copy link
Collaborator

mgduda commented Sep 19, 2024

After cleaning the build, it looks like the following processed source files aren't being removed:

	external/RSL_LITE/feedback_domain_em_part1.f90
	external/RSL_LITE/feedback_domain_em_part2.f90
	external/RSL_LITE/force_domain_em_part2.f90
	external/RSL_LITE/interp_domain_em_part1.f90
	external/RSL_LITE/interp_domain_em_part2.f90
	external/RSL_LITE/interp_domain_em_part3.f90
	external/RSL_LITE/interp_domain_em_small.f90

@islas
Copy link
Collaborator Author

islas commented Sep 20, 2024

I removed the *.f90 clean addition in the external/RSL_LITE/makefile, especially as I don't know if on case insensitive systems it would remove actual source files like external/RSL_LITE/f_pack.F90. I think adding back something like *domain_em*.f90 would suffice but look odd inside the clean command

@islas islas merged commit d810865 into wrf-model:release-v4.6.1 Sep 26, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants