-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Does not suppress allocation & GC time as much as BenchmarkTools #78
Comments
You're benchmarking the runtime of a noisy function that takes >100ms with the default benchmarking runtime budget of 100ms. Consequently, all Charimarks can do while respecting your runtime budget is a single noisy evaluation. When you see strange results, it can often be helpful to run x@fedora:~$ julia
o | Version 1.11.0-alpha1 (2024-03-01)
o o | Official https://julialang.org/ release
julia> using Chairmarks
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
88.744 ms (3 allocs: 762.940 MiB, 4.49% gc time)
90.265 ms (3 allocs: 762.940 MiB, 6.25% gc time)
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
89.377 ms (3 allocs: 762.940 MiB, 4.55% gc time)
89.461 ms (3 allocs: 762.940 MiB, 5.54% gc time)
julia> using BenchmarkTools
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
91.414 ms (3 allocs: 762.940 MiB, 4.48% gc time)
92.910 ms (3 allocs: 762.940 MiB, 6.91% gc time)
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
90.399 ms (3 allocs: 762.940 MiB, 4.45% gc time)
90.649 ms (3 allocs: 762.940 MiB, 4.60% gc time)
julia> @btime rand(100, 10000, 100);
89.504 ms (3 allocations: 762.94 MiB)
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
91.491 ms (3 allocs: 762.940 MiB, 4.79% gc time)
95.220 ms (3 allocs: 762.940 MiB, 5.61% gc time)
julia> @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
91.431 ms (3 allocs: 762.940 MiB, 4.82% gc time)
91.563 ms (3 allocs: 762.940 MiB, 4.73% gc time) But if you see an unacceptable level of variability for your usecase and only a single sample, you should increase the runtime budget.
The current syntax for triggering garbage collection before benchmarking is julia> using Chairmarks
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
90.845 ms (2 allocs: 762.940 MiB, 5.37% gc time)
98.337 ms (2 allocs: 762.940 MiB, 10.28% gc time)
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
90.597 ms (2 allocs: 762.940 MiB, 4.96% gc time)
91.979 ms (2 allocs: 762.940 MiB, 5.56% gc time)
julia> using BenchmarkTools
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
91.146 ms (2 allocs: 762.940 MiB, 5.64% gc time)
98.766 ms (2 allocs: 762.940 MiB, 10.12% gc time)
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
91.029 ms (2 allocs: 762.940 MiB, 5.59% gc time)
91.210 ms (2 allocs: 762.940 MiB, 4.92% gc time)
julia> @btime rand(100, 10000, 100);
86.141 ms (2 allocations: 762.94 MiB)
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
86.430 ms (2 allocs: 762.940 MiB)
91.915 ms (2 allocs: 762.940 MiB, 5.50% gc time)
julia> GC.gc(); @be rand(100, 10000, 100)
Benchmark: 2 samples with 1 evaluation
92.999 ms (2 allocs: 762.940 MiB, 5.36% gc time)
101.280 ms (2 allocs: 762.940 MiB, 10.08% gc time) Do you see an improvement when running |
No, none of them can make any different.
And then I tried 1.11.0-a1:
Sadly 1.11.0-a1 can't help, too. It even make this trick lose efficacy. |
You're benchmarking a function with a variable runtime that depends on wheather or not GC runs. BenchmarkTools manages to exclude GC time from that benchmark result on your machine. Reliably suppressing GC is not within the scope of Chairmarks at this time. I had hoped However, getting reliable measurements in the presence of GC is within scope, provided Chairmarks has an adequate runtime budget. Does setting
|
same...
(the min time cost changed permanently after execute
Quite a strange thing, in a magic way. |
I can reproduce strange and unpredictable GC behavior on 1.10.0 and 1.10.2 once I increase the memory pressure on my system. Pardon the hard to read logs. I still don't know exactly how to reproduce this so I'm copying exact logs when I succeed. julia> using Chairmarks
julia> @be rand(100, 10000, 100) seconds=1
Benchmark: 1 sample with 1 evaluation
663.473 ms (2 allocs: 762.940 MiB, 72.74% gc time)
julia> @be rand(100, 10000, 100) seconds=1
Benchmark: 2 samples with 1 evaluation
750.043 ms (2 allocs: 762.940 MiB, 78.03% gc time)
1.275 s (2 allocs: 762.940 MiB, 88.27% gc time)
julia> for _ in 1:30
rand(100, 10000, 100)
end
julia> @be rand(100, 10000, 100) seconds=1
Benchmark: 4 samples with 1 evaluation
97.464 ms (2 allocs: 762.940 MiB, 4.30% gc time)
100.342 ms (2 allocs: 762.940 MiB, 9.47% gc time)
322.246 ms (2 allocs: 762.940 MiB)
357.622 ms (2 allocs: 762.940 MiB, 70.14% gc time)
julia> @be rand(100, 10000, 100) seconds=1
Benchmark: 10 samples with 1 evaluation
min 89.993 ms (2 allocs: 762.940 MiB, 3.97% gc time)
median 95.832 ms (2 allocs: 762.940 MiB, 5.44% gc time)
mean 99.304 ms (2 allocs: 762.940 MiB, 6.43% gc time)
max 124.403 ms (2 allocs: 762.940 MiB, 15.31% gc time)
julia> using BenchmarkTools
julia> @btime rand(100, 10000, 100);
120.989 ms (2 allocations: 762.94 MiB)
julia> @be rand(100, 10000, 100) seconds=1
Benchmark: 2 samples with 1 evaluation
558.540 ms (2 allocs: 762.940 MiB)
1.197 s (2 allocs: 762.940 MiB, 86.95% gc time)
julia> @time rand(100, 10000, 100);
0.169106 seconds (2 allocations: 762.940 MiB)
julia> @time rand(100, 10000, 100);
0.521368 seconds (2 allocations: 762.940 MiB, 61.98% gc time)
julia> @time rand(100, 10000, 100);
0.112622 seconds (2 allocations: 762.940 MiB)
julia> @time rand(100, 10000, 100);
0.531731 seconds (2 allocations: 762.940 MiB, 61.01% gc time)
julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (aarch64-linux-gnu)
CPU: 8 × unknown
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, generic)
Threads: 1 on 8 virtual cores
----------------------
x@fedora:~$ julia +release --startup-file=no
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _' | |
| | |_| | | | (_| | | Version 1.10.2 (2024-03-01)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> using Chairmarks
julia> for size in [100, 300, 1000, 3000, 10000]
# (@b rand(100, size, 100)).gc_fraction == 0 && println(size)
display(@b rand(100, size, 100))
end
577.586 μs (2 allocs: 7.629 MiB)
1.787 ms (2 allocs: 22.888 MiB)
8.989 ms (2 allocs: 76.294 MiB, 4.03% gc time)
29.419 ms (2 allocs: 228.882 MiB, 4.76% gc time)
94.930 ms (2 allocs: 762.940 MiB, 4.78% gc time)
julia> for size in [100, 300, 1000, 3000, 10000]
# (@b rand(100, size, 100)).gc_fraction == 0 && println(size)
display(@b rand(100, size, 100))
end
848.213 μs (2 allocs: 7.629 MiB)
1.849 ms (2 allocs: 22.888 MiB)
9.638 ms (2 allocs: 76.294 MiB, 4.31% gc time)
30.085 ms (2 allocs: 228.882 MiB, 5.20% gc time)
99.414 ms (2 allocs: 762.940 MiB, 5.75% gc time)
julia> using BenchmarkTools
julia> @b rand(100, 10000, 100)
95.285 ms (2 allocs: 762.940 MiB, 8.78% gc time)
julia> @b rand(100, 10000, 100)
92.532 ms (2 allocs: 762.940 MiB, 6.28% gc time)
julia> @btime rand(100, 10000, 100);
85.244 ms (2 allocations: 762.94 MiB)
julia> @b rand(100, 10000, 100)
90.619 ms (2 allocs: 762.940 MiB)
<increase memory pressure>
julia> @b rand(100, 10000, 100)
649.771 ms (2 allocs: 762.940 MiB, 82.25% gc time)
julia> @b rand(100, 10000, 100)
244.912 ms (2 allocs: 762.940 MiB, without a warmup)
julia> @b rand(100, 10000, 100)
226.078 ms (2 allocs: 762.940 MiB, 58.83% gc time, without a warmup)
julia> @b rand(100, 10000, 100)
201.505 ms (2 allocs: 762.940 MiB, 52.92% gc time)
julia> @b rand(100, 10000, 100)
295.420 ms (2 allocs: 762.940 MiB, without a warmup)
julia> @b rand(100, 10000, 100)
224.846 ms (2 allocs: 762.940 MiB, 58.04% gc time, without a warmup)
julia> @b rand(100, 10000, 100)
612.017 ms (2 allocs: 762.940 MiB, 79.08% gc time)
julia> @b rand(100, 10000, 100) seconds=2
259.160 ms (2 allocs: 762.940 MiB)
julia> @b rand(100, 10000, 100) seconds=2
134.677 ms (2 allocs: 762.940 MiB)
julia> @b rand(100, 10000, 100) seconds=2
259.993 ms (2 allocs: 762.940 MiB)
julia> @btime rand(100, 10000, 100)
90.647 ms (2 allocations: 762.94 MiB)
100×10000×100 Array{Float64, 3}:
[output]
julia> @btime rand(100, 10000, 100);
88.128 ms (2 allocations: 762.94 MiB)
julia> @b rand(100, 10000, 100) seconds=10
86.466 ms (2 allocs: 762.940 MiB)
julia> @b rand(100, 10000, 100) seconds=10
89.588 ms (2 allocs: 762.940 MiB)
julia> @macroexpand @btime rand(100, 10000, 100);
julia> @macroexpand @btime rand(100, 10000, 100)
quote
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:651 =#
local var"##251" = begin
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:482 =#
BenchmarkTools.generate_benchmark_definition(Main, Symbol[], Any[], Symbol[], (), $(Expr(:copyast, :($(QuoteNode(:(rand(100, 10000, 100))))))), $(Expr(:copyast, :($(QuoteNode(nothing))))), $(Expr(:copyast, :($(QuoteNode(nothing))))), BenchmarkTools.Parameters())
end
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:652 =#
(BenchmarkTools).warmup(var"##251")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:653 =#
(BenchmarkTools).tune!(var"##251")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:654 =#
local (var"##252", var"##253") = (BenchmarkTools).run_result(var"##251")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:655 =#
local var"##254" = (BenchmarkTools).minimum(var"##252")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:656 =#
local var"##255" = (BenchmarkTools).allocs(var"##254")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:657 =#
println(" ", (BenchmarkTools).prettytime((BenchmarkTools).time(var"##254")), " (", var"##255", " allocation", if var"##255" == 1
""
else
"s"
end, ": ", (BenchmarkTools).prettymemory((BenchmarkTools).memory(var"##254")), ")")
#= /home/x/.julia/packages/BenchmarkTools/QAv7m/src/execution.jl:668 =#
var"##253"
end
julia> @b rand(100, 10000, 100) seconds=10
85.512 ms (2 allocs: 762.940 MiB)
julia> @b rand(100, 10000, 100) seconds=2
134.735 ms (2 allocs: 762.940 MiB)
julia> @b rand(100, 10000, 100) seconds=2
145.472 ms (2 allocs: 762.940 MiB)
----------------
julia> @be rand(100, 10000, 100) seconds=10
Benchmark: 14 samples with 1 evaluation
min 217.497 ms (2 allocs: 762.940 MiB)
median 523.573 ms (2 allocs: 762.940 MiB, 24.84% gc time)
mean 680.660 ms (2 allocs: 762.940 MiB, 30.91% gc time)
max 2.090 s (2 allocs: 762.940 MiB, 72.89% gc time)
julia> @benchmark rand(100, 10000, 100)
BenchmarkTools.Trial: 47 samples with 1 evaluation.
Range (min … max): 84.599 ms … 168.411 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 96.469 ms ┊ GC (median): 7.84%
Time (mean ± σ): 100.978 ms ± 16.851 ms ┊ GC (mean ± σ): 8.95% ± 9.43%
▂█ ▂▂
███▄▆█▁▄██▆▁▁▁▄▄▆▄▄▄▁▁▁▄▄▁▁▁▄▄▆▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
84.6 ms Histogram: frequency by time 168 ms <
Memory estimate: 762.94 MiB, allocs estimate: 2.
julia> @be rand(100, 10000, 100) seconds=10
Benchmark: 60 samples with 1 evaluation
min 87.863 ms (2 allocs: 762.940 MiB)
median 147.470 ms (2 allocs: 762.940 MiB, 7.08% gc time)
mean 149.320 ms (2 allocs: 762.940 MiB, 18.94% gc time)
max 340.094 ms (2 allocs: 762.940 MiB, 72.89% gc time)
There's a lot going on here, some of which is stuff I'd like to fix, and some of which is stuff I don't want to or don't think it's plausible to fix
I think that variable allocation and GC time is what is causing all this. In a system with high memory pressure, allocation and GC time depend on how "available" memory is to the process and to the running Julia code. This availability may be at the OS or the Julia GC level, I'm not sure. In any event, something in the My remaining questions, then, are
|
The printed time cost is not a serious concept, but many people require them, and usually prefer a shorter one.
Yes, and more on this: I think improving the result stability (not enough as much as BenchmarkTools) on high-resource-required task benchmark have higher priority than figure out why version difference effect on these question. The reason that I choosed BenchmarkTools as the standard for testing Chairmarks is because the results given by BenchmarkTools is typically more stable than Chairmarks on special type task. julia> using Chairmarks, BenchmarkTools
julia> @be findSth()
4314 milliseconds
Benchmark: 1 sample with 1 evaluation
4.315 s (19 allocs: 632 bytes, without a warmup)
julia> @btime findSth()
4322 milliseconds
4316 milliseconds
4315 milliseconds
4315 milliseconds
4324 milliseconds
5091 milliseconds
4.325 s (12 allocations: 376 bytes) findSth() is a function including rand()/randn(), and print it's time cost. I also tried the test below from two side, amount of samples, and computer-resource-requirement: julia> @btime rand(100, 10000, 100);
124.924 ms (3 allocations: 762.94 MiB)
#we assume this is the standard temporaly
julia> @be rand(100, 10000, 100) seconds=15
Benchmark: 90 samples with 1 evaluation
min 163.741 ms (3 allocs: 762.940 MiB, 7.98% gc time)
median 166.733 ms (3 allocs: 762.940 MiB, 8.32% gc time)
mean 166.977 ms (3 allocs: 762.940 MiB, 8.31% gc time)
max 173.272 ms (3 allocs: 762.940 MiB, 8.70% gc time)
julia> @be rand(100, 10000, 100) seconds=150
Benchmark: 849 samples with 1 evaluation
min 162.024 ms (3 allocs: 762.940 MiB, 6.90% gc time)
median 166.415 ms (3 allocs: 762.940 MiB, 8.20% gc time)
mean 167.998 ms (3 allocs: 762.940 MiB, 8.24% gc time)
max 201.978 ms (3 allocs: 762.940 MiB, 18.03% gc time)
#124.924 and 162.024, obvious gap
#90 samples' result is not that different from 849 samples' result, especially on min-time-cost
julia> @btime rand(100, 1000, 100);
12.425 ms (3 allocations: 76.29 MiB)
julia> @be rand(100, 1000, 100) seconds=15
Benchmark: 847 samples with 1 evaluation
min 13.127 ms (3 allocs: 76.294 MiB, 4.52% gc time)
median 16.685 ms (3 allocs: 76.294 MiB, 9.60% gc time)
mean 16.730 ms (3 allocs: 76.294 MiB, 9.79% gc time)
max 35.706 ms (3 allocs: 76.294 MiB, 15.44% gc time)
#12.425 and 13.127, get closer
julia> @btime rand(100, 100, 100);
1.154 ms (3 allocations: 7.63 MiB)
julia> @be rand(100, 100, 100) seconds=1.5
Benchmark: 897 samples with 1 evaluation
min 1.160 ms (3 allocs: 7.629 MiB)
median 1.669 ms (3 allocs: 7.629 MiB, 15.98% gc time)
mean 1.631 ms (3 allocs: 7.629 MiB, 12.08% gc time)
max 3.863 ms (3 allocs: 7.629 MiB, 68.38% gc time)
#1.154 and 1.160 , almost same, and for this scale, we can think they're just same Hope them helps. |
Would you please try to reproduce the OP with Chairmarks 1.2 & the gc=false and seconds=5 keywords? This is what I get, but I've never been able to reproduce to OP reliably anyway, so IDK if it will also be fixed for you:
|
thx, it's fixed. |
Chairmarks.jl can't get the best performance before excuting the methods from BenchmarkTools.
See my test:
Most strangly, It seems like happenning randomly. Sometime this trick won't work, too.
Also, the performance fluctuating randomly even not doing anything, sometime can reach 200% times comparing with base time given by best-performanced
@b
and@btime
.A manual way for gc configuration may help?
The text was updated successfully, but these errors were encountered: