Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarks: are memowise results reliable? #349

Open
tagliala opened this issue Sep 18, 2024 · 21 comments
Open

Benchmarks: are memowise results reliable? #349

tagliala opened this issue Sep 18, 2024 · 21 comments

Comments

@tagliala
Copy link
Contributor

Hello,

I've checked the main branch ac1231e and changed the version number.

diff --git a/lib/memo_wise/version.rb b/lib/memo_wise/version.rb
index 428f25a..a4fb14d 100644
--- a/lib/memo_wise/version.rb
+++ b/lib/memo_wise/version.rb
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module MemoWise
-  VERSION = "1.10.0"
+  VERSION = "1.10.0.dev"
 end

then I've tried to run the benchmarks

~/dev/memo_wise/benchmarks on ac1231e [!$]
$ bundle exec ruby benchmarks.rb
Skipping memoist (not loaded on 3.3.3)

Will BENCHMARK_GEMS:
	dry-core (1.0.1)
	memoist3 (1.0.0)
	memo_wise-local (1.10.0)
	alt_memery (2.1.0)
	memery (1.6.0)
	memo_wise-github-main (1.10.0)
ruby 3.3.3 (2024-06-12 revision f1c7b6f435) [arm64-darwin23]

I was expecting to see 1.10.0.dev in memo_wise-local and most important I was expecting to. see 1.00 (or something like that) in the comparison between memowise local and memowise github main, but I'm getting results like:

Method arguments memo_wise-github-main (1.10.0)
() (none) 0.99x
(a) 1.02x
(a, b) 1.00x
(a:) 1.01x
(a:, b:) 1.00x
(a, b:) 1.01x
(a, *args) 0.97x
(a:, **kwargs) 0.86x
(a, *args, b:, **kwargs) 1.00x

I'm asking because I'm getting different results for (a, *args) and (a:, **kwargs) almost all times

My config:

  • different ruby versions, from 2.7 to 3.3 (both arm and x64)
  • macOS 15.0 (24A335)
  • rvm 1.29.12-next (master)
@JacobEvelyn
Copy link
Member

Hmm good catch; I too am seeing strange behavior. @pboling I expect puts MemoWise::VERSION to not succeed if I put it right after the doff_and_don call, but it prints 1.10.0. I similarly can't get it to print 1.10.0.dev if I modify that locally.

cc @ms-ati

@tagliala
Copy link
Contributor Author

Thanks for the feedback. I would like to add that I was having this issue (or both issues, with non-expected benchmark results) even with the old approach with Dir.mktmpdir

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

Interesting... I think this is the result of some confusion.

The local copy is always the 1.00x, and is thus never displayed, and thus its version is never displayed.

The result you are seeing is from the git checkout of github's main branch of the code, (see benchmarks/Gemfile) and that one doesn't have the modified version.

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

Effectively you can compare local development (at whatever version you have checked out locally, which is never displayed) against github's main branch.

@tagliala
Copy link
Contributor Author

Effectively you can compare local development (at whatever version you have checked out locally, which is never displayed) against github's main branch.

If I do this change

$ git diff
diff --git a/lib/memo_wise/internal_api.rb b/lib/memo_wise/internal_api.rb
index 3a80a8b..504b643 100644
--- a/lib/memo_wise/internal_api.rb
+++ b/lib/memo_wise/internal_api.rb
@@ -1,5 +1,7 @@
 # frozen_string_literal: true
 
+raise 'Required'
+
 module MemoWise
   class InternalAPI
     # Create initial mutable state to store memoized values if it doesn't

Should the benchmark raise an error?

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

I expect puts MemoWise::VERSION to not succeed if I put it right after the doff_and_don call, but it prints 1.10.0

I can't explain that. Perhaps bundler is autoloading it somehow... I can't find any code that would load the gem prior to that line unless the require: false in the Gemfile isn't working.

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

Oh, I know the issue. Regarding the Local code & GitHub checkout...

The GitHub checkout of the gem is being copied, and re-namespaced, but the original checkout, with its file paths are still there, because it is in the Gemfile. Normally the point of doing require: false in the Gemfile is so that you can manually require the gem later when you need it, and for that it has to be in the load path. But that's not our use case...

So, we have a memo_wise library in the load path already...

Now when we do this in benchmarks.rb:

require_relative "../lib/memo_wise"

It does load the local entry-point of the gem.

But, and this is where it gets very interesting, the gem does not use require_relative to load its own internal files... (as is standard "good" practice for modern Ruby libraries; require is reserved for external files, never internal files).

So when the local gem does this:

require "memo_wise/internal_api"
require "memo_wise/version"

It picks up the github checkout of the gem, which is still there, known to Bundler.

The fix is, and this should be applied generally to every library and app in Ruby, to use require_relative always when loading an internal file, to do this:

require_relative "memo_wise/internal_api"
require_relative "memo_wise/version"

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

You have found a very niche bug in memo_wise, but it is easy to fix! 🌟

@pboling
Copy link
Contributor

pboling commented Sep 18, 2024

Which reminds me that I need to make the same fix in gem_bench (and many others of my gems), which I originally wrote 11 years ago... before require_relative existed!

require_relative is Ruby >= 1.9.2.

@tagliala
Copy link
Contributor Author

tagliala commented Sep 19, 2024

The fix is, and this should be applied generally to every library and app in Ruby, to use require_relative always when loading an internal file, to do this:

Thanks,

but I guess that in order to use GemBench, PR to hundreds of libraries should be done, given the fact that require_relative is not so used in this context

I've checked previously bundled gems like logger and it appears to have switch to require_relative for performance

ruby/logger@1e2aab4

There is an interesting case in webrick:

https://github.com/ruby/webrick/blame/master/lib/webrick.rb

with this commit: https://github.com/ruby/webrick/blame/f5faca9222541591e1a7c3c97552ebb0c92733c7/lib/webrick.rb#L214-L232

But Rails, SimpleForm (just to give some examples) are not using require_relative

@JacobEvelyn are you going to change to require_relative here?


Another interesting PR:

ruby/psych#522


activeadmin/arbre#622

tagliala added a commit to tagliala/arbre that referenced this issue Sep 19, 2024
`require_relative` is preferred over `require` for files within the same
project  because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed/memo_wise#349
- rubocop/rubocop#8748
tagliala added a commit to tagliala/arbre that referenced this issue Sep 19, 2024
`require_relative` is preferred over `require` for files within the same
project because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed/memo_wise#349
- rubocop/rubocop#8748
tagliala added a commit to activeadmin/arbre that referenced this issue Sep 19, 2024
`require_relative` is preferred over `require` for files within the same
project because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed/memo_wise#349
- rubocop/rubocop#8748
tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 19, 2024
`require_relative` is preferred over `require` for files within the same
project because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed#349
- rubocop/rubocop#8748
tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 19, 2024
`require_relative` is preferred over `require` for files within the same
project because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed#349
- rubocop/rubocop#8748
@pboling
Copy link
Contributor

pboling commented Sep 19, 2024

I looked at what gem_bench does to copy gems in the Jersey class, and the required file path it uses is an absolute path starting with /, which is a different use case, and ignores the $LOAD_PATH of Ruby.

When it loads gems that themselves use require instead of require_relative (which is a lot of them) it can have problems like this, but that is unavoidable. The ruby community has a lot of work to do.

And that includes gem_bench itself loading itself... I will have a new release with that fix soon!

@pboling pboling mentioned this issue Sep 20, 2024
2 tasks
tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 22, 2024
According to the order of run, we have 14% difference among executions
for `(a:, **kwargs)`

### GitHub main tested before local

```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-github-main (1.10.0)
	memo_wise-local (1.10.0.dev)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       125.193k i/100ms
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       143.649k i/100ms
Calculating -------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.146M (±23.0%) i/s  (872.35 ns/i) -      5.508M in   5.067507s
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.309M (±18.0%) i/s  (764.21 ns/i) -      6.464M in   5.169169s

Comparison:
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1308539.2 i/s
memo_wise-github-main (1.10.0): (a:, **kwargs):  1146322.9 i/s - same-ish: difference falls within error

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|1.14x|

|Method arguments||
|--|
|`(a:, **kwargs)`||
```

### local tested before GitHub main
```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-local (1.10.0.dev)
	memo_wise-github-main (1.10.0)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       130.824k i/100ms
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       150.842k i/100ms
Calculating -------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.184M (±23.6%) i/s  (844.93 ns/i) -      5.756M in   5.139460s
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.381M (±17.8%) i/s  (723.94 ns/i) -      6.637M in   5.026159s

Comparison:
memo_wise-github-main (1.10.0): (a:, **kwargs):  1381332.2 i/s
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1183526.0 i/s - same-ish: difference falls within error

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|0.86x|

|Method arguments||
|--|
|`(a:, **kwargs)`||
```

Ref: panorama-ed#349
tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 22, 2024
According to the order of run, we have 14% difference among executions
for `(a:, **kwargs)`

### GitHub main tested before local

```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-github-main (1.10.0)
	memo_wise-local (1.10.0.dev)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       125.193k i/100ms
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       143.649k i/100ms
Calculating -------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.146M (±23.0%) i/s  (872.35 ns/i) -      5.508M in   5.067507s
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.309M (±18.0%) i/s  (764.21 ns/i) -      6.464M in   5.169169s

Comparison:
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1308539.2 i/s
memo_wise-github-main (1.10.0): (a:, **kwargs):  1146322.9 i/s - same-ish: difference falls within error
```

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|1.14x|

|Method arguments||
|--|
|`(a:, **kwargs)`||

### local tested before GitHub main
```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-local (1.10.0.dev)
	memo_wise-github-main (1.10.0)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       130.824k i/100ms
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       150.842k i/100ms
Calculating -------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.184M (±23.6%) i/s  (844.93 ns/i) -      5.756M in   5.139460s
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.381M (±17.8%) i/s  (723.94 ns/i) -      6.637M in   5.026159s

Comparison:
memo_wise-github-main (1.10.0): (a:, **kwargs):  1381332.2 i/s
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1183526.0 i/s - same-ish: difference falls within error
```

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|0.86x|

|Method arguments||
|--|
|`(a:, **kwargs)`||

Ref: panorama-ed#349
tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 22, 2024
According to the order of run, we have 14% difference among executions
for `(a:, **kwargs)`

### GitHub main tested before local

```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-github-main (1.10.0)
	memo_wise-local (1.10.0.dev)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       125.193k i/100ms
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       143.649k i/100ms
Calculating -------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.146M (±23.0%) i/s  (872.35 ns/i) -      5.508M in   5.067507s
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.309M (±18.0%) i/s  (764.21 ns/i) -      6.464M in   5.169169s

Comparison:
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1308539.2 i/s
memo_wise-github-main (1.10.0): (a:, **kwargs):  1146322.9 i/s - same-ish: difference falls within error
```

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|1.14x|

### local tested before GitHub main
```
$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-local (1.10.0.dev)
	memo_wise-github-main (1.10.0)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       130.824k i/100ms
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       150.842k i/100ms
Calculating -------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.184M (±23.6%) i/s  (844.93 ns/i) -      5.756M in   5.139460s
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.381M (±17.8%) i/s  (723.94 ns/i) -      6.637M in   5.026159s

Comparison:
memo_wise-github-main (1.10.0): (a:, **kwargs):  1381332.2 i/s
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1183526.0 i/s - same-ish: difference falls within error
```

|Method arguments|`memo_wise-github-main` (1.10.0)|
|--|--|
|`(a:, **kwargs)`|0.86x|

Ref: panorama-ed#349
@tagliala
Copy link
Contributor Author

tagliala commented Sep 22, 2024

@pboling beside the require_relative issue, there is another one related to significant variation of results according to the orders of execution

I've tried to summarize it here: tagliala/memo_wise@b849956

Reporting the commit message here in markdown for simplicity:

According to the order of run, we have 14% difference among executions
for (a:, **kwargs)

GitHub main tested before local

$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-github-main (1.10.0)
	memo_wise-local (1.10.0.dev)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       125.193k i/100ms
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       143.649k i/100ms
Calculating -------------------------------------
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.146M (±23.0%) i/s  (872.35 ns/i) -      5.508M in   5.067507s
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.309M (±18.0%) i/s  (764.21 ns/i) -      6.464M in   5.169169s

Comparison:
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1308539.2 i/s
memo_wise-github-main (1.10.0): (a:, **kwargs):  1146322.9 i/s - same-ish: difference falls within error
Method arguments memo_wise-github-main (1.10.0)
(a:, **kwargs) 1.14x

local tested before GitHub main

$ bundle exec ruby benchmarks.rb 

Will BENCHMARK_GEMS:
	memo_wise-local (1.10.0.dev)
	memo_wise-github-main (1.10.0)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-darwin23]
Warming up --------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                       130.824k i/100ms
memo_wise-github-main (1.10.0): (a:, **kwargs)
                       150.842k i/100ms
Calculating -------------------------------------
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.184M (±23.6%) i/s  (844.93 ns/i) -      5.756M in   5.139460s
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.381M (±17.8%) i/s  (723.94 ns/i) -      6.637M in   5.026159s

Comparison:
memo_wise-github-main (1.10.0): (a:, **kwargs):  1381332.2 i/s
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1183526.0 i/s - same-ish: difference falls within error
Method arguments memo_wise-github-main (1.10.0)
(a:, **kwargs) 0.86x

@pboling
Copy link
Contributor

pboling commented Sep 22, 2024

Interesting, so whichever is tested first has a significant (0.14x) advantage. I'm not sure what could be the cause of this...

@tagliala
Copy link
Contributor Author

Interesting, so whichever is tested first has a significant (0.14x) advantage. I'm not sure what could be the cause of this...

as you can notice, for some reasons, even if the latter is 14% faster, ips considers them "same-ish", like that 14% does not matter, which is surprising, but at the same time the correct behavior:

memo_wise-github-main (1.10.0): (a:, **kwargs):  1381332.2 i/s
memo_wise-local (1.10.0.dev): (a:, **kwargs):  1183526.0 i/s - same-ish: difference falls within error

@pboling
Copy link
Contributor

pboling commented Sep 24, 2024

Oh! I fully read those results wrong. It is the one that is run last that performs best each time.

I would guess that this is due to Ruby's internal optimizations, and that since the two versions of memo_wise tested are nearly identical, the one running second will benefit from the previous run due to various low-level Ruby optimizations.

There is supporting evidence of this as well, in that the first run benefits from itself, as in both cases whichever runs first has a much greater variance in iterations/second.

memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.146M (±23.0%) i/s  (872.35 ns/i) -      5.508M in   5.067507s
memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.309M (±18.0%) i/s  (764.21 ns/i) -      6.464M in   5.169169s

and

memo_wise-local (1.10.0.dev): (a:, **kwargs)
                          1.184M (±23.6%) i/s  (844.93 ns/i) -      5.756M in   5.139460s
memo_wise-github-main (1.10.0): (a:, **kwargs)
                          1.381M (±17.8%) i/s  (723.94 ns/i) -      6.637M in   5.026159s

Around 23% variance for whichever ran first, vs 18% variance for whichever ran second. That's a delta of about a quarter in the variance between first and second runs, which tracks almost exactly in both cases.

I think what you are seeing is just Ruby being Ruby. Maybe there is a way to clear/reset Ruby in between runs, but then I'm not sure of the point of the warmup stage.

@pboling
Copy link
Contributor

pboling commented Sep 24, 2024

I think we can safely chalk this up to "all benchmarks are bullshit (to some degree)", and differences of 14% aren't anything to worry about, because they aren't that precise (in this kind of scenario).

@tagliala
Copy link
Contributor Author

but then I'm not sure of the point of the warmup stage.

This is surprising me too, too little warmup?

@pboling
Copy link
Contributor

pboling commented Sep 24, 2024

Perhaps bumping up the warmup could help... Ideally I'd expect to see similar variance for first and last runs.

Technically, with gem_bench you could even benchmark a gem against itself at the same exact version, and then the only difference possible would be how Ruby is optimized internally when it sees the same code (strings, symbols, etc).

@JacobEvelyn
Copy link
Member

Thanks for digging in! I'm open to trying a longer warmup period to try to stabilize the results a bit more. I can make a PR for that.

tagliala added a commit to tagliala/memo_wise that referenced this issue Sep 25, 2024
`require_relative` is preferred over `require` for files within the same
project because it uses paths relative to the current file, making code
more portable and less dependent on the load path.

This change updates internal requires to use `require_relative` for
consistency, performance, and improved portability.

Ref:
- ruby/psych#522
- ruby/logger#20
- ruby/rdoc#658
- panorama-ed#349
- rubocop/rubocop#8748
@tagliala
Copy link
Contributor Author

I can make a PR for that.

Does that actually help?

Answering myself: it does

    x.config(suite: suite, warmup: warmup)

x64 Ruby:

2 seconds: ~13%
3 seconds: ~8%
4 seconds: ~4%
5 seconds: ~2%
6 seconds: ~2%
7 seconds: ~2%

arm64 Ruby:

2 seconds: ~19%
3 seconds: ~11%
4 seconds: ~4%
5 seconds: ~0%
6 seconds: ~0%
7 seconds: ~0%

@pboling
Copy link
Contributor

pboling commented Sep 26, 2024

That is so awesome. Well done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants