Use popcnt intrinsic for bitcount #1011
Profiling of the test suite showed that a large share of the runtime (nearly 1/3) was spent computing the bitcounts used as checksums for diagnostics. This patch replaces the bit loop with the popcnt intrinsic, which produces the same result and uses the hardware instruction when one is available (e.g. popcnt on x86). This change appears to have reduced the runtime of the test suite from 4.5 minutes to under 3 minutes.
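To illustrate why the two approaches give the same checksum, here is a minimal sketch in Python rather than Fortran. It is not the MOM6 code: the function names are hypothetical, and it assumes the value being checksummed is a 64-bit real whose bit pattern is reinterpreted as an integer (roughly what Fortran's `transfer` does). The loop version tests each bit in turn, as a `btest`-style loop would; the second version counts the set bits in a single call, standing in for `popcnt`.

```python
import struct


def real_to_bits(x: float) -> int:
    """Reinterpret a 64-bit float as an unsigned 64-bit integer.

    Assumption: this mirrors reinterpreting a real's storage as an
    integer before counting bits (e.g. via Fortran's transfer).
    """
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bitcount_loop(x: float) -> int:
    """Count set bits by testing each of the 64 bits one at a time."""
    bits = real_to_bits(x)
    count = 0
    for bit in range(64):
        if (bits >> bit) & 1:
            count += 1
    return count


def bitcount_popcnt(x: float) -> int:
    """Count set bits in one call; a stand-in for the popcnt intrinsic."""
    return bin(real_to_bits(x)).count("1")
```

Both functions return the same count for any input, for example 10 for `1.0` (bit pattern `0x3FF0000000000000`). The Python one-liner does not map to a hardware instruction; the speedup described above comes from the Fortran compiler lowering `popcnt` to a single instruction where the target supports it.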
Codecov Report
@@            Coverage Diff             @@
##           dev/gfdl    #1011    +/-   ##
============================================
+ Coverage     40.49%   43.16%   +2.66%
============================================
  Files           213      213
  Lines         62355    62352       -3
============================================
+ Hits          25253    26915    +1662
+ Misses        37102    35437    -1665
It looks like total test time dropped from 14.5 to 12 minutes. It is still dominated by other steps (VM boot, building (especially FMS), CodeCov uploads, etc.), but this seems like an improvement. Gaea regression testing: https://gitlab.gfdl.noaa.gov/ogrp/MOM6/pipelines/9072
Assuming Gaea tests pass.
Gaea tests have passed.
Profiling of the test suite showed a large amount of time (nearly 1/3)
devoted to computing the bitcounts used as checksums for diagnostics.
This patch replaces the bit loop with the popcnt intrinsic, which
produces the same result and uses the hardware assembly instruction when
available (e.g. popcnt in x86).
Note that popcnt is a Fortran 2008 feature.
This change appears to have reduced the runtime of the test suite from
4.5 minutes to under 3 minutes.