Use popcnt intrinsic for bitcount #1011

Merged on Oct 1, 2019 (2 commits)

marshallward
Collaborator

Profiling of the test suite showed that a large share of the runtime
(nearly 1/3) was spent computing the bitcounts used as checksums for
diagnostics.

This patch replaces the bit loop with the popcnt intrinsic, which
produces the same result and uses the hardware instruction when
available (e.g. popcnt on x86).

Note that popcnt is a Fortran 2008 feature.

This change appears to have reduced the runtime of the test suite from
4.5 minutes to under 3 minutes.
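
For readers unfamiliar with the two approaches, here is a minimal sketch of a bit loop next to the popcnt intrinsic. It is not the code from MOM_checksums.F90: the module name `bitcount_sketch` and the function names `bitcount_loop` and `bitcount_popcnt` are hypothetical, and the sketch assumes the checksum counts the set bits of a 64-bit real reinterpreted as a 64-bit integer.

```fortran
module bitcount_sketch
  ! Hypothetical illustration only; names are not taken from MOM6.
  use, intrinsic :: iso_fortran_env, only : int64, real64
  implicit none
  private
  public :: bitcount_loop, bitcount_popcnt

contains

  !> Count the set bits of x by testing each bit position in turn.
  pure function bitcount_loop(x) result(n)
    real(real64), intent(in) :: x
    integer :: n
    integer(int64) :: y
    integer :: bit

    y = transfer(x, y)  ! Reinterpret the 64-bit real as a 64-bit integer
    n = 0
    do bit = 0, bit_size(y) - 1
      if (btest(y, bit)) n = n + 1
    enddo
  end function bitcount_loop

  !> Count the set bits of x with the Fortran 2008 popcnt intrinsic.
  pure function bitcount_popcnt(x) result(n)
    real(real64), intent(in) :: x
    integer :: n
    integer(int64) :: y

    y = transfer(x, y)  ! Same reinterpretation as in the loop version
    n = popcnt(y)
  end function bitcount_popcnt

end module bitcount_sketch
```

Both functions should return the same count for any input. For example, assuming IEEE 754 double precision, 1.0 is stored as 0x3FF0000000000000, so each returns 10. Whether popcnt compiles down to a single hardware instruction depends on the compiler and the target architecture flags.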

codecov-io commented Oct 1, 2019

Codecov Report

Merging #1011 into dev/gfdl will increase coverage by 2.66%.
The diff coverage is 100%.

@@             Coverage Diff              @@
##           dev/gfdl    #1011      +/-   ##
============================================
+ Coverage     40.49%   43.16%   +2.66%     
============================================
  Files           213      213              
  Lines         62355    62352       -3     
============================================
+ Hits          25253    26915    +1662     
+ Misses        37102    35437    -1665
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/framework/MOM_checksums.F90 | 77.32% <100%> (-0.08%) | ⬇️ |
| src/core/MOM.F90 | 69.2% <0%> (+0.08%) | ⬆️ |
| ...parameterizations/vertical/MOM_diabatic_driver.F90 | 58.26% <0%> (+0.17%) | ⬆️ |
| src/framework/MOM_file_parser.F90 | 67.08% <0%> (+0.25%) | ⬆️ |
| src/diagnostics/MOM_debugging.F90 | 12.97% <0%> (+0.31%) | ⬆️ |
| src/framework/MOM_io.F90 | 42.37% <0%> (+0.48%) | ⬆️ |
| src/initialization/MOM_grid_initialize.F90 | 66.26% <0%> (+0.52%) | ⬆️ |
| ...parameterizations/vertical/MOM_set_diffusivity.F90 | 63.16% <0%> (+0.56%) | ⬆️ |
| src/parameterizations/vertical/MOM_kappa_shear.F90 | 68.23% <0%> (+0.63%) | ⬆️ |
| src/framework/MOM_string_functions.F90 | 93.46% <0%> (+1%) | ⬆️ |
| ... and 23 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3418c1d...e62524b.

@marshallward
Collaborator Author

marshallward commented Oct 1, 2019

It looks like total test time dropped from 14.5 to 12 minutes.

The total is still dominated by other steps (VM boot, building (especially FMS), Codecov uploads, etc.), but this seems like an improvement.

Gaea regression testing: https://gitlab.gfdl.noaa.gov/ogrp/MOM6/pipelines/9072

Collaborator

@adcroft adcroft left a comment

Assuming the Gaea tests pass.

@marshallward
Collaborator Author

Gaea tests have passed.

@adcroft adcroft merged commit 9ca5fb9 into mom-ocean:dev/gfdl Oct 1, 2019
@marshallward marshallward deleted the popcnt branch February 13, 2020 16:39