Adding coalton-based benchmark system #1286

Izaakwltn · 2024-10-03T07:10:19Z

Here's a candidate benchmarking system, written almost entirely in Coalton.

To try it out:

CL-USER> (asdf:load-system :coalton/benchmarks)
CL-USER> (in-package :coalton-benchmarks)
COALTON-BENCHMARKS> (run-coalton-benchmarks)
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                         Package 'coalton-benchmarks/fibonacci'                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                   Coalton development mode without heuristic inlining                   │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│  Benchmark   │  Time (ms)   │Avg Time (ms) │ Time std dev │  Space (B)   │  # Samples   │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB0     │            2 │            0 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB1     │           21 │            0 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB2     │          220 │            0 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB3     │         2434 │            2 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-GENE │            6 │            0 │ 0.0013149470 │            0 │          502 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-GENE │           44 │            0 │ 0.0055944519 │            0 │          502 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-GENE │          885 │            1 │ 0.0782520336 │            0 │          954 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-GENE │         5515 │           10 │ 0.4263708338 │            0 │          545 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-LISP │           97 │            0 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ REC-FIB-MONO │          277 │            0 │          n/a │            0 │         1000 │
└───────────────────────┴─────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                         Package 'coalton-benchmarks/big-float'                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                   Coalton development mode without heuristic inlining                   │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│  Benchmark   │  Time (ms)   │Avg Time (ms) │ Time std dev │  Space (B)   │  # Samples   │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BIG-TRIG     │          466 │            0 │          n/a │       872848 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BIG-INV-TRIG │            2 │            0 │          n/a │       876080 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BIG-LN-EXP   │          481 │            0 │          n/a │       458640 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BIG-SQRT     │            1 │            0 │          n/a │       327664 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BIG-MULT-CON │            0 │            0 │          n/a │       196528 │         1000 │
└───────────────────────┴─────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                        Package 'coalton-benchmarks/gabriel/tak'                         │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                   Coalton development mode without heuristic inlining                   │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│  Benchmark   │  Time (ms)   │Avg Time (ms) │ Time std dev │  Space (B)   │  # Samples   │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ TAK          │         1022 │            1 │          n/a │            0 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ LISP-TAK     │           82 │            0 │          n/a │            0 │         1000 │
└───────────────────────┴─────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                        Package 'coalton-benchmarks/gabriel/takr'                        │

Izaakwltn · 2024-10-04T02:39:02Z

~~Next step is adding tabularized printouts, either on this PR or as a subsequent PR~~

Izaakwltn · 2024-10-04T08:55:45Z

Current interface:

Defining benchmarks:

;; Defining a Coalton benchmark
(define-benchmark stak 1000 ; iterations
  (fn ()
    (stak 18 12 6)
    Unit))

;; Defining a Lisp Benchmark
(define-benchmark lisp-stak 1000 ; iterations
  (fn ()
    (lisp Unit ()
      (lisp-stak 18 12 6)
      Unit)))

Running package benchmarks:

This returns a PackageBenchmarkResults object and, if *verbose-benchmarking* is True, also prints each benchmark to the REPL as it completes.

COALTON-BENCHMARKS> (coalton (run-package-benchmarks "coalton-benchmarks/gabriel/tak"))
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                        Package 'coalton-benchmarks/gabriel/tak'                         │
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                   Coalton development mode without heuristic inlining                   │
├─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┤
│    Benchmark    │    Run time     │    Real time    │  Bytes consed   │  # Iterations   │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│       TAK       │   1.043406 s    │   1.044723 s    │      65520      │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│    LISP-TAK     │   0.082777 s    │   0.082867 s    │      65520      │      1000       │

└─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┘

#.(PACKAGEBENCHMARKRESULTS "coalton-benchmarks/gabriel/tak"
   #.(COALTON-BENCHMARKING/BENCHMARKING::BENCHMARKSYSTEM "ARM64" "OS-MACOSX" "SBCL" "2.2.4-WIP" COMMON-LISP:NIL COMMON-LISP:NIL)
   #(#.(BENCHMARKRESULTS "TAK" 1000 1040557 1041583 95888)
     #.(BENCHMARKRESULTS "LISP-TAK" 1000 83104 83040 65520)))

Run all Coalton benchmarks:

COALTON-BENCHMARKS> (run-coalton-benchmarks)
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                              Package 'coalton-benchmarks'                               │
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                   Coalton development mode without heuristic inlining                   │
├─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┤
│    Benchmark    │    Run time     │    Real time    │  Bytes consed   │  # Iterations   │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│     REC-FIB     │   0.235165 s    │   0.235361 s    │     273808      │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ REC-FIB-GENERIC │   0.892972 s    │   0.893901 s    │      65520      │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│  REC-FIB-LISP   │   0.096640 s    │   0.096748 s    │      65520      │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│  REC-FIB-MONO   │   0.290469 s    │   0.290814 s    │      65536      │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│    BIG-TRIG     │   0.465693 s    │   0.466405 s    │     1125376     │      1000       │

├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│  BIG-INV-TRIG   │   0.001911 s    │   0.001917 s    │     1786896     │      1000       │

.
.
.

Reexporting benchmarks in different packages:

(reexport-benchmarks
   "coalton-benchmarks/fibonacci"
   "coalton-benchmarks/big-float"
   "coalton-benchmarks/gabriel")

stylewarning · 2024-10-07T08:57:31Z

I think we should also have bytes consed.

jbouwman

Approved, with a few suggestions:

major:

- update benchmarking/README.md ('run-benchmark' no longer exists)
- update the make bench target to complete successfully (I had to drop 'without-gcing' to get to the underlying error; the gc macro shouldn't have to appear in the makefile anyway)

minor:

- add bytes consed
- add (micro?) seconds per iteration
- possibly print all timings in scientific notation, since there is a large range of possible values
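
The two timing suggestions above could look roughly like this, sketched in plain Common Lisp rather than Coalton. The helper names `microseconds-per-iteration` and `format-scientific` are hypothetical and are not part of this PR; the only library call used is the standard `format` with its `~E` directive.

```lisp
;; Hypothetical helpers for illustration only; not the PR's API.

(defun microseconds-per-iteration (total-microseconds iterations)
  "Average cost of one iteration in microseconds, as a double-float."
  (/ (float total-microseconds 1d0) iterations))

(defun format-scientific (seconds)
  "Render SECONDS in scientific notation: three digits after the
decimal point, a two-digit exponent, and 'e' as the exponent marker."
  ;; ~w,d,e,k,overflowchar,padchar,exptcharE: only d, e and exptchar are set.
  (format nil "~,3,2,,,,'eE" seconds))

;; Usage with the TAK row from the table above (1.043406 s over 1000 iterations):
;; (microseconds-per-iteration 1043406 1000)
;; (format-scientific 1.043406d0)
```

Forcing the exponent marker avoids the `d` marker that some implementations print for double-floats, which keeps the columns uniform.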


Izaakwltn · 2024-10-09T20:40:44Z

~~This will now depend on #1298 and #1299~~ no PR dependencies

Izaakwltn · 2024-10-18T18:21:00Z

Added Brainfold benchmarks:

CL-USER> (asdf:load-system "small-coalton-programs")
CL-USER> (in-package #:brainfold)
#<COMMON-LISP:PACKAGE "BRAINFOLD">
BRAINFOLD> (run-brainfold-benchmarks)
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                                   Package 'brainfold'                                   │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                          System: ARM64 OS-MACOSX SBCL2.2.4-WIP                          │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                   Coalton development mode without heuristic inlining                   │
├──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┤
│  Benchmark   │  Time (ms)   │Avg Time (ms) │ Time std dev │  Space (B)   │  # Samples   │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BF-HELLO     │          611 │            1 │ 0.0496747498 │    976165568 │          930 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BF-HELLO1000 │          653 │            1 │          n/a │   1079498464 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BF-GNARLY    │           34 │            1 │ 0.0118789756 │     54060176 │           55 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ BF-GNARLY1000│          579 │            1 │          n/a │    992671728 │         1000 │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ SQUARES      │      1343953 │          500 │ 18.513726506 │2733021904224 │         2687 │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┘

Izaakwltn · 2024-10-19T00:43:19Z

~~Use hand-rolled print/trace that doesn't add a new line during table generation~~

Izaakwltn · 2024-10-25T21:36:35Z

~~Right aligned cells (especially with numbers), maybe left aligned too~~
~~"just an idea"~~
~~-RS~~

Izaakwltn · 2024-10-31T21:31:54Z

Added parameterized benchmarks, fixed alignment, added timing standard deviation and dynamic sample scaling based on standard deviation convergence.
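
For readers unfamiliar with the idea, here is a minimal sketch of sample scaling driven by standard deviation convergence, written in plain Common Lisp. It is not the code in this PR: the function names, the keyword defaults, and the stopping rule (stop once the standard deviation changes by less than a relative tolerance between consecutive samples) are all assumptions made for illustration.

```lisp
;; Illustrative sketch; names, defaults and stopping rule are assumptions.

(defun mean (samples)
  (/ (reduce #'+ samples) (length samples)))

(defun sample-std-dev (samples)
  "Sample standard deviation (n - 1 denominator). Needs at least two samples."
  (let ((m (mean samples)))
    (sqrt (/ (reduce #'+ (mapcar (lambda (x) (expt (- x m) 2)) samples))
             (1- (length samples))))))

(defun time-once (thunk)
  "Elapsed real time of one call to THUNK, in milliseconds."
  (let ((start (get-internal-real-time)))
    (funcall thunk)
    (/ (* 1000d0 (- (get-internal-real-time) start))
       internal-time-units-per-second)))

(defun sample-until-stable (thunk &key (min-samples 30) (max-samples 1000)
                                       (tolerance 0.01d0))
  "Time THUNK repeatedly until the standard deviation settles or
MAX-SAMPLES is reached. MIN-SAMPLES and MAX-SAMPLES must be at least 2.
Returns (values mean std-dev sample-count)."
  (let ((samples '())
        (previous nil))
    (loop for count from 1 to max-samples
          do (push (time-once thunk) samples)
             (when (>= count min-samples)
               ;; Recomputing from scratch is quadratic overall; fine for a sketch.
               (let ((current (sample-std-dev samples)))
                 (when (and previous
                            (<= (abs (- current previous))
                                (* tolerance (max current previous))))
                   (return))
                 (setf previous current))))
    (values (mean samples) (sample-std-dev samples) (length samples))))
```

A stopping rule of this kind would explain why some rows in the tables report fewer than 1000 samples alongside a standard deviation, while fixed-count rows show n/a.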

Izaakwltn · 2024-11-07T19:26:29Z

I think the define-benchmark macro should perhaps have a more robust set of optional keywords: time limit, sample limit, etc.
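
As a rough idea of what such keywords might look like, here is a sketch in plain Common Lisp. Everything in it is hypothetical: `define-benchmark*`, `run-benchmark*`, the `*benchmarks*` registry, and the `:max-samples` and `:time-limit-seconds` options are invented for illustration and do not reflect the actual `define-benchmark` implementation, which takes a Coalton function.

```lisp
;; Hypothetical sketch; none of these names exist in the PR.

(defvar *benchmarks* (make-hash-table :test #'equal)
  "Registry mapping benchmark names (strings) to property lists.")

(defmacro define-benchmark* (name (&key (max-samples 1000) time-limit-seconds)
                             &body body)
  "Register BODY as a benchmark under NAME with optional limits."
  `(setf (gethash ,(string name) *benchmarks*)
         (list :thunk (lambda () ,@body)
               :max-samples ,max-samples
               :time-limit-seconds ,time-limit-seconds)))

(defun run-benchmark* (name)
  "Call the registered benchmark until a limit is hit.
Returns the number of samples actually taken."
  (destructuring-bind (&key thunk max-samples time-limit-seconds)
      (gethash (string name) *benchmarks*)
    (let ((deadline (and time-limit-seconds
                         (+ (get-internal-real-time)
                            (* time-limit-seconds
                               internal-time-units-per-second)))))
      (loop with samples = 0
            while (and (< samples max-samples)
                       (or (null deadline)
                           (< (get-internal-real-time) deadline)))
            do (funcall thunk)
               (incf samples)
            finally (return samples)))))

;; Usage: stop after 200 samples or 5 seconds, whichever comes first.
;; (define-benchmark* demo (:max-samples 200 :time-limit-seconds 5)
;;   (reduce #'+ '(1 2 3)))
;; (run-benchmark* 'demo)
```

Putting the options in their own list keeps the body free-form and leaves room for further keywords later without changing existing definitions.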

Izaakwltn marked this pull request as draft October 3, 2024 07:10

Izaakwltn mentioned this pull request Oct 3, 2024: Split benchmark packages #1276 (Open)

Izaakwltn changed the title ~~Adding coalton-based benchmark system WIP~~ Adding coalton-based benchmark system Oct 3, 2024

Izaakwltn marked this pull request as ready for review October 4, 2024 02:36

Izaakwltn force-pushed the benchmark-system branch 4 times, most recently from 5e7fb7c to 08aba17 October 4, 2024 08:52

Izaakwltn force-pushed the benchmark-system branch from 08aba17 to ec5b34b October 4, 2024 18:07

jbouwman approved these changes Oct 8, 2024


Izaakwltn force-pushed the benchmark-system branch 5 times, most recently from 83951d5 to 775c382 October 9, 2024 18:59

Izaakwltn force-pushed the benchmark-system branch 9 times, most recently from 2e19abf to d69335d October 10, 2024 20:29

Izaakwltn requested a review from macrologist October 10, 2024 21:20

Izaakwltn force-pushed the benchmark-system branch from d69335d to 62e0c2b October 10, 2024 22:12

Izaakwltn requested a review from stylewarning October 10, 2024 22:19

Izaakwltn force-pushed the benchmark-system branch 7 times, most recently from 5f669d8 to c9fc076 October 18, 2024 00:25

Izaakwltn requested a review from jbouwman October 18, 2024 00:49

Izaakwltn force-pushed the benchmark-system branch 2 times, most recently from f16309e to a3e5b46 October 18, 2024 18:14

Izaakwltn requested a review from YarinHeffes October 18, 2024 23:30

Izaakwltn force-pushed the benchmark-system branch 3 times, most recently from be59946 to a40508c October 25, 2024 17:21

Izaakwltn force-pushed the benchmark-system branch 5 times, most recently from 12ef323 to 4eb3080 October 31, 2024 21:29

Izaakwltn force-pushed the benchmark-system branch from 4eb3080 to 79e7472 November 7, 2024 18:29

Izaakwltn mentioned this pull request Nov 7, 2024: Benchmark sanitization and minor additions #1207 (Closed)

Izaakwltn force-pushed the benchmark-system branch from 79e7472 to 32cc544 November 7, 2024 18:42

Adding coalton-based benchmark system

4de376c

Izaakwltn force-pushed the benchmark-system branch from 32cc544 to 4de376c November 7, 2024 19:27
