Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compilation load by splitting stdlib_stats_moment submodule #283

Merged
merged 2 commits into from
Dec 28, 2020

Conversation

awvwgk
Copy link
Member

@awvwgk awvwgk commented Dec 28, 2020

This is an attempt to reduce the memory usage when compiling stdlib with default max rank of 15. Tested with ninja generator (-G Ninja) and GCC 10.2 on a laptop with two physical cores and 8GB RAM.

The first commit on this branch allows compilation with one ninja job and peak memory usage of ~7GB. The second commit on this branch allows compilation with two ninja jobs and peak memory usage of ~7GB.

- max rank 15 and one ninja job result in ~7 GB peak memory usage
- with default max rank and two ninja jobs ~7 GB peak memory usage
@LKedward
Copy link
Member

This is a significant improvement for quite a straightforward restructuring, great stuff @awvwgk! I get similar results with the "Makefiles" generator on my smaller laptop which previously would not have handled it.

@awvwgk
Copy link
Member Author

awvwgk commented Dec 28, 2020

The peak memory usage should be around ~3.5GB for one build job. We also tried building on a bigger machine (8 cores, 64GB RAM, thanks @rivwgk) and got a peak memory usage of ~10GB. Maybe we could add a minimum system requirement for building stdlib in the README as well based on this data.

@jvdp1
Copy link
Member

jvdp1 commented Dec 28, 2020

Nice job @awvwgk. I much prefer this solution than limiting the number of ranks.
I tested it on my computer with gfortran 9.3.1. The peak memory usage was <3.5GB with 1 job.

Maybe we could add a minimum system requirement for building stdlib in the README as well based on this data.

Good idea. #282 is then not needed. Or it could be in complement to #282 (in case somenone would like stdlib on a computer that cannot handle the compilation with 15 ranks)?

@LKedward
Copy link
Member

I much prefer this solution than limiting the number of ranks.

I agree, and perhaps recommended practice for future contributions can be to limit the number of generated procedures per submodule?

@awvwgk
Copy link
Member Author

awvwgk commented Dec 28, 2020

Or it could be in complement to #282 (in case somenone would like stdlib on a computer that cannot handle the compilation with 15 ranks)?

There should be some documentation regarding the option for the maximum rank. Reducing the maximum supported rank is still increasing the compilation speed significantly and most projects might never go beyond rank 4 arrays.

I agree, and perhaps recommended practice for future contributions can be to limit the number of generated procedures per submodule?

This is hard to measure and enforce unless we find something automatic, e.g. conda-build can report peak resource usage after a successful build.

@LKedward
Copy link
Member

This is hard to measure and enforce unless we find something automatic

There's no need to enforce or automate this; I suggest merely a recommended practice for contributors to limit the number of generated procedures per submodule (e.g. <200). This is easily determined from the template files.

@awvwgk
Copy link
Member Author

awvwgk commented Dec 28, 2020

Lines of code is significant as well here, there is quite a difference in the resource usage between moment, moment_scalar and moment_all in this submodule, the moment routines are compile intense enough to require splitting the mask version from the normal version.

Copy link
Member

@milancurcic milancurcic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvement, thank you!

@milancurcic
Copy link
Member

This improves the build process and doesn't change the library itself, so I'll go ahead and merge it given at least 2 approvals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to build stdlib on low memory computer Very slow building of stdlib, and high memory usage
4 participants