Compare local and weekly benchmarks using Hatchet #1317

Merged: chapman39 merged 66 commits into develop from feature/chapman39/hatchet on Feb 12, 2025

Conversation

@chapman39 (Collaborator) commented Jan 23, 2025

  • create script to compare benchmarks of a local build and weekly shared benchmarks on LC
  • improve handling of cmake build type in config-build.py
  • create optional, manual CI pipeline (ruby-gcc, ruby-clang, lassen-clang) to test current PR
  • documentation on how to use this script and run the manual CI pipeline

tmp todo:

How the script works

The script matches two Caliper files (one from the weekly shared location /usr/workspace/smithdev/califiles/serac, one from a specified build location) and creates a Hatchet "graph frame" from the difference between them. If the maximum difference in any section of the graph is greater than X seconds, that benchmark "fails." The script does this for each benchmark.
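The pass/fail rule described above can be sketched roughly as follows. This is not the code from compare_benchmarks.py: plain dicts stand in for the Caliper/Hatchet data, `compare_benchmark` and `max_allowed_diff` are hypothetical names, and treating the difference as current minus baseline (so only slowdowns count) is an assumption.

```python
# Sketch only: plain dicts stand in for the Caliper/Hatchet data the real script uses.
# Keys are region names; values are times in seconds.

def compare_benchmark(baseline, current, max_allowed_diff):
    """Return (passed, diffs), where diffs maps region -> current - baseline."""
    # Only regions present in both profiles can be compared.
    shared = baseline.keys() & current.keys()
    diffs = {region: current[region] - baseline[region] for region in shared}
    # Assumption: the benchmark "fails" when its largest slowdown exceeds the threshold.
    worst = max(diffs.values(), default=0.0)
    return worst <= max_allowed_diff, diffs


# Made-up numbers, for illustration only.
baseline = {"main": 10.0, "main/solve": 7.0, "main/assemble": 2.5}
current = {"main": 12.0, "main/solve": 8.75, "main/assemble": 2.5}

passed, diffs = compare_benchmark(baseline, current, max_allowed_diff=1.0)
print("PASS" if passed else "FAIL", diffs)
```

In this made-up case the largest difference is 2.0 seconds in `main`, which is above the 1.0 second threshold, so the benchmark is reported as failing.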

Example

../scripts/llnl/compare_benchmarks.py --current-cali-dir . --verbose --depth 2 --metric-columns "Max time/rank (inc)"

(not all graphs are shown)
[Screenshot: example output of compare_benchmarks.py]

You can now see the baseline and current benchmark times, as well as the difference between the two. You can also choose which "metric column" to display (defaults to average time per rank) and set the depth of the tree to view. At the moment, it only displays the difference trees.
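To illustrate what limiting the tree depth means, here is a small sketch that prints baseline, current, and difference for each node down to a chosen depth. It does not use Hatchet: the nested dict, `format_tree`, and the convention that depth 0 is the root are all assumptions made for this example.

```python
# Sketch only: a nested dict stands in for a Hatchet difference tree.

def format_tree(node, max_depth, depth=0):
    """Return one line per node, descending at most max_depth levels below the root."""
    diff = node["current"] - node["baseline"]
    indent = "  " * depth
    lines = [
        f"{indent}{node['name']}: baseline={node['baseline']:.2f} "
        f"current={node['current']:.2f} diff={diff:+.2f}"
    ]
    if depth < max_depth:
        for child in node.get("children", []):
            lines.extend(format_tree(child, max_depth, depth + 1))
    return lines


# Made-up timings, for illustration only.
tree = {
    "name": "main", "baseline": 10.0, "current": 12.0,
    "children": [
        {"name": "solve", "baseline": 7.0, "current": 8.75,
         "children": [
             {"name": "precondition", "baseline": 3.0, "current": 3.5},
         ]},
    ],
}

lines = format_tree(tree, max_depth=1)
print("\n".join(lines))
```

With `max_depth=1` only `main` and `solve` are printed; the `precondition` node sits one level deeper and is omitted.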

Some problems

LC system performance is inconsistent. You can run the same benchmark multiple times and get wildly different results. My understanding is that this is due to which node(s) you get allocated and how busy the machine is, among other things. So while this is a nice feature to have, I'm hesitant to make it a required CI check at this time.

Improving config-build.py

This PR fixes args.buildtype so that it is set from CMAKE_BUILD_TYPE whenever that CMake variable is set.

Before, if you set -DCMAKE_BUILD_TYPE=Release when configuring Serac, the build directory would incorrectly have debug in its name, since args.buildtype remained Debug.
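A minimal sketch of the idea behind this fix, assuming the extra CMake options arrive as a list of strings. `resolve_build_type`, `build_dir_name`, and the `build-<compiler>-<buildtype>` naming are hypothetical and are not taken from config-build.py.

```python
import re


def resolve_build_type(default_build_type, extra_cmake_args):
    """Return the build type named by -DCMAKE_BUILD_TYPE, else the default."""
    build_type = default_build_type
    for arg in extra_cmake_args:
        # Accept both -DCMAKE_BUILD_TYPE=Release and -DCMAKE_BUILD_TYPE:STRING=Release.
        match = re.fullmatch(r"-DCMAKE_BUILD_TYPE(?::\w+)?=(\w+)", arg)
        if match:
            build_type = match.group(1)  # if given more than once, the last one wins
    return build_type


def build_dir_name(compiler, build_type):
    """Hypothetical naming scheme: the build type appears lowercased in the directory name."""
    return f"build-{compiler}-{build_type.lower()}"


build_type = resolve_build_type("Debug", ["-DENABLE_DOCS=OFF", "-DCMAKE_BUILD_TYPE=Release"])
print(build_dir_name("gcc", build_type))
```

Resolving the build type before deriving the directory name keeps the two consistent, which is the mismatch this PR describes.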

Links


@chapman39 chapman39 added CI Continuous Integration testing Related to testing labels Jan 23, 2025
@chapman39 chapman39 self-assigned this Jan 23, 2025
@chapman39 chapman39 mentioned this pull request Jan 27, 2025
10 tasks
@chapman39 chapman39 marked this pull request as ready for review January 28, 2025 01:31

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

chapman39 and others added 5 commits February 10, 2025 16:25

Co-authored-by: Chris White <white238@llnl.gov>
@btalamini (Member) left a comment

Nice work!

@chapman39 chapman39 merged commit bccd6c0 into develop Feb 12, 2025
13 checks passed
@chapman39 chapman39 deleted the feature/chapman39/hatchet branch February 12, 2025 17:33
Labels
CI Continuous Integration testing Related to testing
4 participants