Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional scatterplot to benchcomp output #3077

Merged
merged 18 commits into from
Mar 20, 2024

Conversation

tautschnig
Copy link
Member

Scatterplots should make it easier to immediately spot performance trends (and indeed any differences) rather than having to process a (large) number of table rows.

Uses mermaid-js to produce markdown-embedded plots that will display on the job summary page. Scatterplots are not directly supported by mermaid-js at this point (xycharts only do line or bar charts), so quadrant plots are employed with various diagram items drawn in white to make them disappear.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@tautschnig tautschnig self-assigned this Mar 14, 2024
@tautschnig tautschnig added the Z-BenchCI Tag a PR to run benchmark CI label Mar 14, 2024
Scatterplots should make it easier to immediately spot performance trends (and
indeed any differences) rather than having to process a (large) number
of table rows.

Uses mermaid-js to produce markdown-embedded plots that will display on
the job summary page. Scatterplots are not directly supported by
mermaid-js at this point (xycharts only do line or bar charts), so
quadrant plots are employed with various diagram items drawn in white to
make them disappear.
@tautschnig tautschnig added Z-BenchCI Tag a PR to run benchmark CI and removed Z-BenchCI Tag a PR to run benchmark CI labels Mar 15, 2024
@tautschnig tautschnig removed their assignment Mar 15, 2024
@tautschnig tautschnig marked this pull request as ready for review March 15, 2024 15:24
@tautschnig tautschnig requested a review from a team as a code owner March 15, 2024 15:24
@tautschnig
Copy link
Member Author

See https://github.com/model-checking/kani/actions/runs/8296849261?pr=3077 for what those diagrams now look like.

Copy link
Contributor

@jaisnan jaisnan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the images shown, can you leave a comment here or in documentation somewhere on how to interpret those values? What do the values on the top-right of the plot mean, what do the values on the bottom-left mean etc?

tools/benchcomp/test/test_regression.py Show resolved Hide resolved
@tautschnig
Copy link
Member Author

From the images shown, can you leave a comment here or in documentation somewhere on how to interpret those values? What do the values on the top-right of the plot mean, what do the values on the bottom-left mean etc?

I am now adding text output to spell out the axis ranges, thank you for calling this out!

Copy link
Contributor

@karkhaz karkhaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, just a few comments.

To be honest the latest version is still a bit hard to read. It's less the axis labels and more that I'm not confident that I would be able to spot a 20% increase or decrease from the y=x line, for example. Also if a particular point has moved far away from the y=x line, it's not that easy to determine which proof or benchmark that actually corresponds to.

As a totally different approach to quickly identifying outlying benchmarks---would it be easier if I actually implemented the benchcomp filter command, which allows you to filter benchmarks based on certain criteria before they get emitted to the Markdown table?

That way, we could print a "filtered" set of Markdown tables (containing only regressions) followed by the complete results. That might be superior to the current approach where regressions are highlighted in bold in the complete table.

So the action would look something like this

      - name: Perf Regression Table
        run: |
          echo "# Regressions" >> "$GITHUB_STEP_SUMMARY"
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-regression.yaml \
            visualize --only dump_markdown_results_table >> "$GITHUB_STEP_SUMMARY"

      - name: Perf All Results Table
        run: |
          echo "# All Results" >> "$GITHUB_STEP_SUMMARY"
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-all-results.yaml \
            visualize --only dump_markdown_results_table >> "$GITHUB_STEP_SUMMARY"

      - name: Run other visualizations
        run: |
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-regression.yaml \
            visualize --except dump_markdown_results_table

tools/benchcomp/benchcomp/visualizers/__init__.py Outdated Show resolved Hide resolved
tools/benchcomp/benchcomp/visualizers/__init__.py Outdated Show resolved Hide resolved
tools/benchcomp/benchcomp/visualizers/__init__.py Outdated Show resolved Hide resolved
tautschnig and others added 3 commits March 18, 2024 15:33
Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>
Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>
Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>
@tautschnig
Copy link
Member Author

To be honest the latest version is still a bit hard to read. It's less the axis labels and more that I'm not confident that I would be able to spot a 20% increase or decrease from the y=x line, for example. Also if a particular point has moved far away from the y=x line, it's not that easy to determine which proof or benchmark that actually corresponds to.

Yes, I'd indeed love to be able to draw that diagonal line into the diagram, but I haven't been able to figure out a way to do this in Mermaid. Perhaps it's just about time to create bunch of feature requests against Mermaid.

As a totally different approach to quickly identifying outlying benchmarks---would it be easier if I actually implemented the benchcomp filter command, which allows you to filter benchmarks based on certain criteria before they get emitted to the Markdown table?

Perhaps we should have all of these tools available and then see what helps us the most?

Copy link
Contributor

@karkhaz karkhaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looking forward to seeing this in CBMC also!

@tautschnig tautschnig merged commit ec2b8b3 into model-checking:main Mar 20, 2024
20 checks passed
@tautschnig tautschnig deleted the benchcomp-scatterplot branch March 20, 2024 06:21
tautschnig added a commit that referenced this pull request Apr 5, 2024
Updated version in all `Cargo.toml` files (via
`find . -name Cargo.toml -exec sed -i 's/version = "0.48.0"/version =
"0.49.0"/' {} \;`) and ran `cargo build-dev` to have `Cargo.lock` files
updated.

GitHub generated release notes:

## What's Changed
* Upgrade Rust toolchain to 2024-03-14 by @zhassan-aws in
#3081
* Disable removal of storage markers by @zhassan-aws in
#3083
* Automatic cargo update to 2024-03-18 by @github-actions in
#3086
* Bump tests/perf/s2n-quic from `1a7faa8` to `9e39ca0` by @dependabot in
#3087
* Upgrade toolchain to nightly-2024-03-15 by @celinval in
#3084
* Add optional scatterplot to benchcomp output by @tautschnig in
#3077
* Benchcomp scatterplots: quote axis labels by @tautschnig in
#3097
* Expand ${var} in benchcomp variant `env` by @karkhaz in
#3090
* Add test for #3099 by @zhassan-aws in
#3100
* Automatic cargo update to 2024-03-25 by @github-actions in
#3103
* Bump tests/perf/s2n-quic from `1a7faa8` to `0a60ec1` by @dependabot in
#3104
* Implement validity checks by @celinval in
#3085
* Add `benchcomp filter` command by @karkhaz in
#3105
* Add CI test for --use-local-toolchain by @jaisnan in
#3074
* Upgrade Rust toolchain to `nightly-2024-03-21` by @adpaco-aws in
#3102
* Use `intrinsic_name` to get the intrinsic name by @adpaco-aws in
#3114
* Bump tests/perf/s2n-quic from `0a60ec1` to `2d5e891` by @dependabot in
#3118
* Allow modifies clause for verification only by @feliperodri in
#3098
* Automatic cargo update to 2024-04-01 by @github-actions in
#3117
* Automatic cargo update to 2024-04-04 by @github-actions in
#3122
* Remove bookrunner by @tautschnig in
#3123
* Upgrade Rust toolchain to nightly-2024-03-29 by @feliperodri in
#3116
* Remove unnecessary build step for some workflows by @zhassan-aws in
#3124
* Ensure storage markers are kept in std code by @zhassan-aws in
#3080


**Full Changelog**:
kani-0.48.0...kani-0.49.0
zpzigi754 pushed a commit to zpzigi754/kani that referenced this pull request May 8, 2024
Updated version in all `Cargo.toml` files (via
`find . -name Cargo.toml -exec sed -i 's/version = "0.48.0"/version =
"0.49.0"/' {} \;`) and ran `cargo build-dev` to have `Cargo.lock` files
updated.

GitHub generated release notes:

## What's Changed
* Upgrade Rust toolchain to 2024-03-14 by @zhassan-aws in
model-checking#3081
* Disable removal of storage markers by @zhassan-aws in
model-checking#3083
* Automatic cargo update to 2024-03-18 by @github-actions in
model-checking#3086
* Bump tests/perf/s2n-quic from `1a7faa8` to `9e39ca0` by @dependabot in
model-checking#3087
* Upgrade toolchain to nightly-2024-03-15 by @celinval in
model-checking#3084
* Add optional scatterplot to benchcomp output by @tautschnig in
model-checking#3077
* Benchcomp scatterplots: quote axis labels by @tautschnig in
model-checking#3097
* Expand ${var} in benchcomp variant `env` by @karkhaz in
model-checking#3090
* Add test for model-checking#3099 by @zhassan-aws in
model-checking#3100
* Automatic cargo update to 2024-03-25 by @github-actions in
model-checking#3103
* Bump tests/perf/s2n-quic from `1a7faa8` to `0a60ec1` by @dependabot in
model-checking#3104
* Implement validity checks by @celinval in
model-checking#3085
* Add `benchcomp filter` command by @karkhaz in
model-checking#3105
* Add CI test for --use-local-toolchain by @jaisnan in
model-checking#3074
* Upgrade Rust toolchain to `nightly-2024-03-21` by @adpaco-aws in
model-checking#3102
* Use `intrinsic_name` to get the intrinsic name by @adpaco-aws in
model-checking#3114
* Bump tests/perf/s2n-quic from `0a60ec1` to `2d5e891` by @dependabot in
model-checking#3118
* Allow modifies clause for verification only by @feliperodri in
model-checking#3098
* Automatic cargo update to 2024-04-01 by @github-actions in
model-checking#3117
* Automatic cargo update to 2024-04-04 by @github-actions in
model-checking#3122
* Remove bookrunner by @tautschnig in
model-checking#3123
* Upgrade Rust toolchain to nightly-2024-03-29 by @feliperodri in
model-checking#3116
* Remove unnecessary build step for some workflows by @zhassan-aws in
model-checking#3124
* Ensure storage markers are kept in std code by @zhassan-aws in
model-checking#3080


**Full Changelog**:
model-checking/kani@kani-0.48.0...kani-0.49.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Z-BenchCI Tag a PR to run benchmark CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants