Add benchmarks #2153

Merged: 3 commits merged into mthom:master on Nov 12, 2023

Conversation

@infogulch (Contributor) commented Nov 8, 2023

This PR is an idea for how we can add benchmarks to scryer-prolog. The ultimate goal would be to have a suite of benchmarks and publish a report that displays the measurements over time. But first things first...

Benchmarks in CI?

The challenge of running benchmarks on public runners like GitHub Actions is that wall-clock time can vary by as much as 20% due to environmental factors outside your control, such as noisy VM neighbors. This obliterates the utility of automated benchmark results for judging whether a proposed change actually helps performance.

So this PR pursues a strategy of measuring the number of instructions executed during the benchmark. Unlike wall time, this metric is very stable and sometimes even deterministic. Admittedly, instruction counts are only correlated with wall time, but that is generally a good tradeoff for being able to run benchmarks in an otherwise noisy automation environment.

This is accomplished by integrating with Valgrind, specifically the Callgrind API, which uses architecture-specific features to take precise measurements of code execution. See iai's Comparison with Criterion-rs for a general overview of this strategy. This PR uses the iai-callgrind library, an actively maintained fork of the original iai.
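
For readers unfamiliar with iai-callgrind, here is a minimal sketch of what a harness built on it can look like, using its #[library_benchmark] attribute API. The workload below is a stand-in rather than the scryer-prolog benchmark from this PR, and the exact macros differ between iai-callgrind versions:

    // Minimal iai-callgrind harness sketch; requires Valgrind to be installed.
    use iai_callgrind::{library_benchmark, library_benchmark_group, main};
    use std::hint::black_box;

    // Placeholder computation standing in for the real benchmark body.
    fn workload(n: u64) -> u64 {
        (0..n).fold(0, |acc, x| acc.wrapping_add(x * x))
    }

    #[library_benchmark]
    #[bench::small(1_000)]
    #[bench::large(1_000_000)]
    fn bench_workload(n: u64) -> u64 {
        // black_box keeps the compiler from optimizing the measured call away.
        black_box(workload(n))
    }

    library_benchmark_group!(
        name = workload_group;
        benchmarks = bench_workload
    );

    main!(library_benchmark_groups = workload_group);

Each benchmark function is executed once under Callgrind, which is what makes the reported instruction counts stable across runs.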

Benchmark design

This uses the new library API (#1880) to execute Prolog from Rust, running the edges.pl benchmark. There are two benchmark suites: benches/run.rs, which uses Rust's built-in benchmarking tool so wall-time measurements can be taken locally, and run_iai.rs, which runs the same workload under Callgrind to track metrics and is significantly slower (roughly 2s vs 6m). They share some benchmark setup for consistency (a rough sketch of that idea follows below). I'm still undecided whether I like the way this is laid out; any feedback here is welcome.
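
As an illustration of the shared-setup idea only (the helper names here are hypothetical stand-ins, not the actual scryer-prolog library API or the code in this PR), the wall-time suite could be shaped roughly like this with Rust's built-in nightly bench harness:

    // Sketch in the spirit of benches/run.rs using the built-in (nightly-only)
    // bench harness. `Machine`, `setup_machine`, and `run_edges_benchmark` are
    // hypothetical stand-ins for the shared setup both suites would reuse.
    #![feature(test)]
    extern crate test;

    use test::{black_box, Bencher};

    struct Machine; // stand-in for the real machine type

    fn setup_machine() -> Machine {
        // Hypothetical: load edges.pl and any shared fixtures here.
        Machine
    }

    fn run_edges_benchmark(machine: &mut Machine) -> u64 {
        // Hypothetical: run the benchmark query and return something so the
        // optimizer cannot discard the work.
        let _ = machine;
        42
    }

    #[bench]
    fn bench_edges(b: &mut Bencher) {
        let mut machine = setup_machine();
        b.iter(|| black_box(run_edges_benchmark(&mut machine)));
    }

Assuming the two files are registered as Cargo bench targets, they could then be run locally with something like cargo +nightly bench --bench run and cargo bench --bench run_iai (the latter needs Valgrind installed).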

lib_machine.rs is incomplete

Unfortunately, running this benchmark fails (see the report job), and I'm not sure how to fix it. I don't know whether this is a bug in the benchmark or in the library API. Help here would be greatly appreciated.

I've worked around the error for now by just not showing the problematic variable. In any case, it seems that lib_machine.rs is unable to parse all possible values output by the top level.

This PR includes the commit from #2152.

Future?

  • Add more benchmarks, see Add benchmarking suite to CI #1782
  • Collect more metrics: average/max memory; inferences (which could be another extremely useful metric); user time (noisy, but could be useful for comparisons over a long time frame)
  • Aggregate benchmark results into a separate repo, and publish a report showing changes over time
  • Benchmarking with valgrind also produces profiles, so maybe enable PGO using the previous build's profile?

@infogulch infogulch marked this pull request as draft November 8, 2023 20:31
@triska (Contributor) commented Nov 8, 2023

Awesome, thank you a lot for working on this!

One small detail I noticed: pairs_keys_values/3 has since become available in library(pairs), which can be used to shorten the code a bit.

@infogulch (Contributor, Author) commented Nov 9, 2023

I fixed the issue, and changed it to use library(pairs).

Here's a sample of the instruction counting benchmark output:

 run_iai::bench_scryer::bench_edges
  Instructions:         23835883675
  L1 Hits:              33799880719
  L2 Hits:                391485553
  RAM Hits:                 3449801
  Total read+write:     34194816073
  Estimated Cycles:     35878051519

https://github.com/mthom/scryer-prolog/actions/runs/6806340622/job/18507436925?pr=2153#step:8:272
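
(For reference, the Estimated Cycles figure is derived from the cache metrics above using the cost model that iai-style benchmarks document: Estimated Cycles = L1 Hits + 5 × L2 Hits + 35 × RAM Hits. Plugging in the numbers: 23835883675 aside, 33799880719 + 5 × 391485553 + 35 × 3449801 = 35878051519, which matches the reported value.)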

@infogulch infogulch marked this pull request as ready for review November 9, 2023 03:36
@infogulch infogulch force-pushed the benchmark branch 2 times, most recently from b0a7c08 to 4bb6ebe on November 11, 2023 04:07
@infogulch infogulch changed the title Add one benchmark Add benchmarks Nov 11, 2023
@infogulch infogulch marked this pull request as draft November 11, 2023 15:41
@infogulch infogulch force-pushed the benchmark branch 3 times, most recently from dcbd273 to ba01a5d on November 11, 2023 18:51
@infogulch (Contributor, Author) commented Nov 11, 2023

I think this latest version is ready.

  • Added benches/README.md that explains the design and how to add new benchmarks
  • Removed unnecessary changes to parsed_results.rs
  • Reorganized the benchmark definitions to make it clearer where they are defined and what the harnesses are

Any comments welcome.

Edit: Rebased and fixed the formatting issues.

@infogulch infogulch marked this pull request as ready for review November 11, 2023 19:04
@mthom mthom merged commit 3b7d4a7 into mthom:master Nov 12, 2023
13 checks passed
@infogulch infogulch deleted the benchmark branch November 12, 2023 06:56