Add levenshtein for Clojure and Babashka (and Java for good measure) #234

Merged
merged 3 commits into from
Dec 11, 2024


PEZ
Contributor

@PEZ PEZ commented Dec 11, 2024

The Clojure and Java solutions are not as close to C's as with the other benchmarks, but still decent.

Java:

Evaluation count : 66 in 6 samples of 11 calls.
             Execution time mean : 10.133643 ms
    Execution time std-deviation : 721.632889 µs
   Execution time lower quantile : 9.165525 ms ( 2.5%)
   Execution time upper quantile : 10.955189 ms (97.5%)
                   Overhead used : 1.295061 ns

Clojure:

Evaluation count : 60 in 6 samples of 10 calls.
             Execution time mean : 10.776057 ms
    Execution time std-deviation : 670.658990 µs
   Execution time lower quantile : 10.254974 ms ( 2.5%)
   Execution time upper quantile : 11.682582 ms (97.5%)
                   Overhead used : 1.295061 ns

It's too few runs to draw any conclusions about the difference between these two, but we can assume that Clojure has a small overhead of roughly the amount indicated.

Start times again

Of course, with this benchmark it gets even more obvious that we need to do something to mitigate the start times of the languages, because running this with hyperfine gives:

Benchmarking C
Benchmark 1:  ./c/code aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbb
  Time (mean ± σ):       5.6 ms ±   0.1 ms    [User: 4.2 ms, System: 1.0 ms]
  Range (min … max):     5.5 ms …   5.7 ms    3 runs
 

Benchmarking Java
Benchmark 1: java jvm.code aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbb
  Time (mean ± σ):      72.5 ms ±   0.7 ms    [User: 45.5 ms, System: 22.3 ms]
  Range (min … max):    71.8 ms …  73.2 ms    3 runs
 

Benchmarking Clojure
Benchmark 1: java -cp clojure/classes:src:/Users/pez/.m2/repository/org/clojure/clojure/1.12.0/cloju
  Time (mean ± σ):     302.8 ms ±   5.8 ms    [User: 464.0 ms, System: 47.2 ms]
  Range (min … max):   298.5 ms … 309.4 ms    3 runs
 

Benchmarking Babashka
Benchmark 1: bb bb/code.clj aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbb
  Time (mean ± σ):      2.018 s ±  0.010 s    [User: 1.991 s, System: 0.024 s]
  Range (min … max):    2.011 s …  2.029 s    3 runs 

So while Java and Clojure may be some 2-3 times slower than C on this algorithm, the benchmark makes it look more like 10X and 50X, respectively... Related:
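The gap between the in-process and wall-clock numbers can be checked with a bit of arithmetic. This is only a sketch using the means quoted above: the `ratio` helper is a hypothetical name, and comparing the in-process means against C's hyperfine mean assumes both measure the same workload.

```shell
# ratio A B: print A/B rounded to one decimal (hypothetical helper, not part of run.sh)
ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Wall-clock means from hyperfine, in ms (these include process start)
echo "Java vs C (wall clock):    $(ratio 72.5 5.6)x"
echo "Clojure vs C (wall clock): $(ratio 302.8 5.6)x"

# In-process means, in ms, against C's wall-clock mean
echo "Java vs C (in-process):    $(ratio 10.133643 5.6)x"
echo "Clojure vs C (in-process): $(ratio 10.776057 5.6)x"
```

Since C's 5.6 ms also contains its own process start, the in-process ratios here are probably a lower bound on the real algorithmic difference.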

The new run.sh needs attention

While adding this I noticed that the new run.sh fails to check file existence for Java and Clojure (and probably a lot of other languages too). It can't be assumed that the argument to the command for running the program is a file. To work around it, so that I could run the benchmark for the languages I added, I used this:

function run {
  # ${2} is the file that must exist for the benchmark to run
  if [ -f "${2}" ]; then
    echo ""
    echo "Benchmarking $1"
    input=$(cat input.txt)
    hyperfine -i --shell=none --runs 3 --warmup 2 "${3} ${4} ${input}" | cut -c1-100
  fi
}

run "C" "./c/code" "" "./c/code"
run "Java" "./jvm/code.class" "java" "jvm.code"
run "Clojure" "./clojure/code.clj" "java -cp clojure/classes:$(clojure -Spath)" "code"
run "Babashka" "bb/code.clj" "bb" "bb/code.clj"

So this introduces the file to check for as the second parameter to run.
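To see which benchmarks the existence check would skip, without invoking hyperfine, the function can be exercised in a dry-run form. This is a sketch only: `plan` is a hypothetical function that mirrors the parameter order of `run` (label, file to check, command, argument) and just prints what would happen, and the temporary directory stands in for a real checkout.

```shell
# plan LABEL FILE CMD ARG: like run, but prints instead of benchmarking (hypothetical)
plan() {
  if [ -f "$2" ]; then
    # run would pass "${3} ${4} ${input}" to hyperfine at this point
    echo "$1: $3 $4 <input>"
  else
    echo "$1: skipped ($2 not found)"
  fi
}

demo=$(mktemp -d)
mkdir -p "$demo/jvm"
touch "$demo/jvm/code.class"   # pretend only the Java class has been compiled

plan "Java" "$demo/jvm/code.class" "java" "jvm.code"
plan "Babashka" "$demo/bb/code.clj" "bb" "bb/code.clj"
rm -rf "$demo"
```

With only the Java class present, the first call prints the command line and the second reports the benchmark as skipped.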

Separate Babashka and Clojure code

Note that this has separate code for Clojure and Babashka, so this PR is related:

(inc (get-in matrix [i (dec j)])) ;; Insertion
(+ (get-in matrix [(dec i) (dec j)]) cost))))) ;; Substitution
matrix)))
(get-in matrix [m n])))))))))
Owner

@bddicken bddicken Dec 11, 2024


)))))))))

🫡

Contributor Author


Haha. In Clojure everything is an expression, so you can think of each ) as closing off an expression. Anyway, the length of that paren trail is a bit telling that we could consider refactoring a bit to make the code clearer. Threading macros to the rescue! 😄

@bddicken
Owner

It can't be assumed that the argument to the command for running the program is a file

Why is this not safe to assume? I did notice that the line for Octave is wrong:

run "Octave" "octave ./octave/code.m 40"

It should be changed to

run "Octave" "octave" "./octave/code.m 40"

But I designed the script so that ${3} is just the name of the file to be executed / interpreted. I'll go ahead and merge this, but feel free to continue the discussion here.

@bddicken bddicken merged commit 1789db4 into bddicken:main Dec 11, 2024
@PEZ
Contributor Author

PEZ commented Dec 12, 2024

It can't be assumed that the argument to the command for running the program is a file

Why is this not safe to assume?

For Java, for example, the argument to the executable is jvm.code, but there is no jvm.code file to be found. There is a file ./jvm/code.class, though. So the Java benchmarks are never run. The same goes for Clojure, and probably a lot of other languages.
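The mismatch comes from java taking a class name rather than a path. A sketch of the mapping, assuming a hypothetical `class_to_path` helper and a classpath rooted at the current directory:

```shell
# class_to_path NAME: the .class file that corresponds to a fully qualified
# class name, relative to a classpath root (hypothetical helper)
class_to_path() {
  printf './%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

class_to_path jvm.code   # prints ./jvm/code.class

# A check on the raw argument looks for a file literally named "jvm.code";
# a check on the mapped path looks for the compiled class instead.
[ -f "jvm.code" ] || echo "no file named jvm.code"
[ -f "$(class_to_path jvm.code)" ] || echo "class file not built here"
```

This is why the existence check needs its own parameter: the file to look for and the argument handed to the launcher are different strings.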

@PEZ
Contributor Author

PEZ commented Dec 13, 2024

I notice that in CI there actually is a jvm.code file. But it is built by the Graal native-image benchmark, which I think makes the Java benchmark moot.
