Skip to content

Commit 54358ad

Browse files
committed
math: replace exported variables with runtime.KeepAlive in benchmarks
Use runtime.KeepAlive for preventing benchmarked code from being eliminated as "dead" code (by marking its return values as live), instead of relying on "package level exported sink variable" hacks that are far from obvious to those not familiar with the internals of the Go compilers and also do not seem to be robust in the long term (I mean, it seems like an update to one of the Go implementations could easily break benchmarks by doing dead code elimination of benchmarked code regardless of those sink variables). The discussion that led to this patch happening: #33325 Also changed the BenchmarkDim benchmark a bit since it seemed very possible it would be optimized out otherwise (because subtracting zero from any number is a non-operation). Some other benchmarks other than BenchmarkDim also experienced speedups or slowdowns, which is quite mysterious to me in some cases (although I have not checked the assembly) because KeepAlive is supposed to be intrinsified? Benchstat output, benchmarked with tip on Linux AMD64 on Intel i5-8300H: name old time/op new time/op delta Acos-8 13.6ns ± 0% 13.8ns ± 0% +1.72% (p=0.000 n=13+15) Acosh-8 18.7ns ± 1% 19.1ns ± 0% +2.37% (p=0.000 n=14+13) Asin-8 10.7ns ± 1% 11.0ns ± 1% +3.33% (p=0.000 n=14+15) Asinh-8 24.6ns ± 1% 25.2ns ± 0% +2.44% (p=0.000 n=15+15) Atan-8 6.68ns ± 0% 6.96ns ± 0% +4.24% (p=0.000 n=14+15) Atanh-8 20.9ns ± 0% 21.1ns ± 0% +0.89% (p=0.000 n=15+15) Atan2-8 12.4ns ± 1% 12.7ns ± 0% +2.36% (p=0.000 n=15+15) Cbrt-8 10.9ns ± 0% 11.9ns ±11% +9.54% (p=0.000 n=13+15) Ceil-8 2.01ns ± 0% 2.01ns ± 0% ~ (all equal) Copysign-8 0.75ns ± 1% 0.75ns ± 0% ~ (p=0.902 n=14+16) Cos-8 9.76ns ± 0% 9.88ns ± 0% +1.23% (p=0.000 n=15+13) Cosh-8 13.7ns ± 2% 13.8ns ± 0% +0.83% (p=0.001 n=15+13) Erf-8 6.86ns ± 0% 7.19ns ± 0% +4.84% (p=0.000 n=14+14) Erfc-8 8.10ns ± 1% 8.29ns ± 1% +2.31% (p=0.000 n=15+16) Erfinv-8 8.78ns ± 0% 9.08ns ± 0% +3.33% (p=0.000 n=14+14) Erfcinv-8 8.95ns ± 0% 9.07ns ± 0% +1.39% (p=0.000 n=14+15) Exp-8 8.56ns ± 0% 8.60ns ± 0% +0.48% (p=0.000 n=14+15) ExpGo-8 22.5ns ± 1% 22.3ns ± 0% -0.54% (p=0.000 n=16+15) Expm1-8 11.8ns ± 0% 12.1ns ± 0% +2.54% (p=0.000 n=16+14) Exp2-8 20.8ns ± 0% 21.3ns ± 1% +2.23% (p=0.000 n=14+14) Exp2Go-8 20.9ns ± 0% 21.7ns ± 1% +3.88% (p=0.000 n=14+15) Abs-8 0.38ns ± 0% 0.38ns ± 0% ~ (p=0.082 n=15+13) Dim-8 0.63ns ± 0% 1.21ns ±47% +92.18% (p=0.000 n=16+16) Floor-8 2.01ns ± 1% 2.01ns ± 0% -0.14% (p=0.000 n=14+16) Max-8 2.44ns ±12% 2.55ns ±31% ~ (p=0.187 n=16+14) Min-8 3.02ns ±49% 3.41ns ±44% ~ (p=0.115 n=16+16) Mod-8 41.9ns ±32% 37.4ns ±34% ~ (p=0.115 n=16+16) Frexp-8 5.17ns ±39% 4.31ns ±26% ~ (p=0.116 n=16+14) Gamma-8 11.5ns ±54% 10.8ns ±49% ~ (p=0.291 n=16+16) Hypot-8 3.93ns ±23% 3.22ns ± 3% ~ (p=0.087 n=16+13) HypotGo-8 5.92ns ±27% 6.25ns ±45% ~ (p=0.830 n=16+16) Ilogb-8 3.59ns ±49% 3.73ns ±46% +4.01% (p=0.016 n=16+16) J0-8 50.0ns ± 0% 59.9ns ±47% +19.74% (p=0.000 n=13+16) J1-8 50.1ns ± 0% 50.8ns ± 0% +1.40% (p=0.000 n=13+13) Jn-8 108ns ± 0% 108ns ± 0% ~ (all equal) Ldexp-8 5.67ns ±15% 5.56ns ± 0% ~ (p=0.831 n=13+15) Lgamma-8 14.3ns ±52% 13.5ns ±49% ~ (p=0.316 n=16+16) Log-8 8.95ns ± 1% 10.24ns ±46% +14.49% (p=0.000 n=13+15) Logb-8 4.00ns ±48% 3.92ns ±50% ~ (p=0.387 n=16+15) Log1p-8 16.7ns ±43% 13.6ns ± 1% ~ (p=0.241 n=16+13) Log10-8 11.2ns ± 2% 11.6ns ± 0% +3.43% (p=0.000 n=13+14) Log2-8 5.90ns ±48% 6.32ns ±45% ~ (p=0.130 n=15+16) Modf-8 4.73ns ±31% 4.70ns ±28% ~ (p=0.993 n=16+16) Nextafter32-8 4.66ns ±37% 4.55ns ±31% ~ (p=0.299 n=16+16) Nextafter64-8 3.37ns ±15% 3.95ns ±44% ~ (p=0.133 n=14+16) PowInt-8 32.4ns ±34% 31.9ns ±37% ~ (p=0.830 n=16+16) PowFrac-8 66.7ns ± 0% 77.3ns ±50% ~ (p=0.160 n=13+16) Pow10Pos-8 1.01ns ± 1% 1.26ns ±21% +25.36% (p=0.000 n=16+15) Pow10Neg-8 1.54ns ±41% 1.74ns ±28% ~ (p=0.069 n=16+16) Round-8 2.53ns ±39% 2.47ns ±41% ~ (p=0.229 n=16+16) RoundToEven-8 0.57ns ± 1% 0.63ns ± 2% +11.42% (p=0.000 n=13+13) Remainder-8 27.4ns ± 1% 26.9ns ± 0% -1.80% (p=0.000 n=13+13) Signbit-8 0.50ns ± 1% 0.75ns ± 0% +49.28% (p=0.000 n=13+13) Sin-8 9.09ns ± 0% 9.49ns ±19% +4.45% (p=0.000 n=15+13) Sincos-8 11.5ns ± 1% 11.7ns ± 0% +1.73% (p=0.000 n=15+13) Sinh-8 13.7ns ± 3% 13.9ns ± 0% ~ (p=0.628 n=16+15) SqrtIndirect-8 2.26ns ± 1% 2.45ns ±32% ~ (p=0.075 n=14+16) SqrtLatency-8 3.27ns ± 1% 3.51ns ±46% ~ (p=0.490 n=14+14) SqrtIndirectLatency-8 5.78ns ± 0% 6.03ns ± 0% +4.33% (p=0.000 n=16+16) SqrtGoLatency-8 34.7ns ± 0% 35.0ns ± 0% +0.88% (p=0.000 n=16+16) SqrtPrime-8 2.53µs ± 0% 2.56µs ± 0% +1.34% (p=0.000 n=16+16) Tan-8 9.59ns ± 0% 9.88ns ± 0% +3.04% (p=0.000 n=16+15) Tanh-8 14.3ns ± 3% 14.5ns ± 0% +1.58% (p=0.001 n=16+13) Trunc-8 2.01ns ± 0% 2.01ns ± 0% ~ (all equal) Y0-8 49.0ns ± 1% 49.7ns ± 0% +1.56% (p=0.000 n=16+15) Y1-8 49.0ns ± 1% 49.7ns ± 0% +1.28% (p=0.000 n=16+15) Yn-8 106ns ± 0% 106ns ± 0% ~ (all equal) Float64bits-8 0.50ns ± 0% 0.30ns ± 1% -40.17% (p=0.000 n=16+16) Float64frombits-8 0.50ns ± 0% 0.28ns ± 0% -43.82% (p=0.000 n=15+16) Float32bits-8 0.25ns ± 0% 0.28ns ± 0% +12.25% (p=0.000 n=16+12) Float32frombits-8 0.25ns ± 0% 0.25ns ± 0% ~ (p=0.704 n=16+16)
1 parent fe8a866 commit 54358ad

File tree

1 file changed

+82
-226
lines changed

1 file changed

+82
-226
lines changed

0 commit comments

Comments
 (0)