Commit 54358ad
math: replace exported variables with runtime.KeepAlive in benchmarks
Use runtime.KeepAlive to prevent benchmarked code from being
eliminated as dead code (by marking its return values as live),
instead of relying on package-level exported sink variables. That
hack is far from obvious to anyone unfamiliar with the internals of
the Go compilers, and it does not seem robust in the long term: an
update to one of the Go implementations could easily break these
benchmarks by eliminating the benchmarked code as dead regardless of
the sink variables.
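To illustrate the two idioms contrasted above, here is a minimal sketch. It is not the code from this patch: the names sinkF, benchAcosSink, benchAcosKeepAlive and runKeepAlive are hypothetical, and the benchmarks are driven through testing.Benchmark only so that the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"testing"
)

// sinkF is a hypothetical package-level sink variable (the older idiom).
var sinkF float64

// benchAcosSink stores the result in a package-level variable so the
// compiler cannot treat the computation as unused.
func benchAcosSink(b *testing.B) {
	x := 0.0
	for i := 0; i < b.N; i++ {
		x = math.Acos(.5)
	}
	sinkF = x
}

// benchAcosKeepAlive marks the result as live with runtime.KeepAlive,
// so no global state is needed.
func benchAcosKeepAlive(b *testing.B) {
	x := 0.0
	for i := 0; i < b.N; i++ {
		x = math.Acos(.5)
	}
	runtime.KeepAlive(x)
}

// runKeepAlive runs the KeepAlive variant once and reports how many
// iterations the testing package chose.
func runKeepAlive() int {
	return testing.Benchmark(benchAcosKeepAlive).N
}

func main() {
	fmt.Println("sink:     ", testing.Benchmark(benchAcosSink))
	fmt.Println("KeepAlive:", testing.Benchmark(benchAcosKeepAlive))
}
```

In a real test file these would be ordinary BenchmarkXxx functions run with go test -bench; the timings printed depend entirely on the machine and toolchain.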
The discussion that led to this patch:
#33325
Also changed the BenchmarkDim benchmark slightly, since it seemed quite
possible that it would be optimized out otherwise (subtracting zero
from any number is a no-op).
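The concern about subtracting zero can be checked directly. This is a small sketch with a hypothetical helper, dimIsIdentity; it does not reproduce the actual change to BenchmarkDim. It relies only on the documented behavior of math.Dim, which returns the maximum of x-y and 0.

```go
package main

import (
	"fmt"
	"math"
)

// dimIsIdentity reports whether math.Dim(x, 0) returns x unchanged,
// i.e. whether the call does no observable work for this input.
func dimIsIdentity(x float64) bool {
	return math.Dim(x, 0) == x
}

func main() {
	// For non-negative inputs, subtracting zero leaves the value as is.
	for _, x := range []float64{0, 1.5, 10, math.Inf(1)} {
		fmt.Printf("Dim(%v, 0) == %v: %v\n", x, x, dimIsIdentity(x))
	}
	// Negative inputs are clamped to 0, so the identity does not hold there.
	fmt.Printf("Dim(-1, 0) == -1: %v\n", dimIsIdentity(-1))
}
```

A benchmark whose second operand is always zero therefore gives an optimizer little real work to preserve, which is presumably why the commit uses different operands.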
Some benchmarks other than BenchmarkDim also sped up or slowed down.
In some cases this is quite mysterious to me (although I have not
checked the assembly), because KeepAlive is supposed to be
intrinsified.
Benchstat output, benchmarked with tip on Linux AMD64 on Intel i5-8300H:
name old time/op new time/op delta
Acos-8 13.6ns ± 0% 13.8ns ± 0% +1.72% (p=0.000 n=13+15)
Acosh-8 18.7ns ± 1% 19.1ns ± 0% +2.37% (p=0.000 n=14+13)
Asin-8 10.7ns ± 1% 11.0ns ± 1% +3.33% (p=0.000 n=14+15)
Asinh-8 24.6ns ± 1% 25.2ns ± 0% +2.44% (p=0.000 n=15+15)
Atan-8 6.68ns ± 0% 6.96ns ± 0% +4.24% (p=0.000 n=14+15)
Atanh-8 20.9ns ± 0% 21.1ns ± 0% +0.89% (p=0.000 n=15+15)
Atan2-8 12.4ns ± 1% 12.7ns ± 0% +2.36% (p=0.000 n=15+15)
Cbrt-8 10.9ns ± 0% 11.9ns ±11% +9.54% (p=0.000 n=13+15)
Ceil-8 2.01ns ± 0% 2.01ns ± 0% ~ (all equal)
Copysign-8 0.75ns ± 1% 0.75ns ± 0% ~ (p=0.902 n=14+16)
Cos-8 9.76ns ± 0% 9.88ns ± 0% +1.23% (p=0.000 n=15+13)
Cosh-8 13.7ns ± 2% 13.8ns ± 0% +0.83% (p=0.001 n=15+13)
Erf-8 6.86ns ± 0% 7.19ns ± 0% +4.84% (p=0.000 n=14+14)
Erfc-8 8.10ns ± 1% 8.29ns ± 1% +2.31% (p=0.000 n=15+16)
Erfinv-8 8.78ns ± 0% 9.08ns ± 0% +3.33% (p=0.000 n=14+14)
Erfcinv-8 8.95ns ± 0% 9.07ns ± 0% +1.39% (p=0.000 n=14+15)
Exp-8 8.56ns ± 0% 8.60ns ± 0% +0.48% (p=0.000 n=14+15)
ExpGo-8 22.5ns ± 1% 22.3ns ± 0% -0.54% (p=0.000 n=16+15)
Expm1-8 11.8ns ± 0% 12.1ns ± 0% +2.54% (p=0.000 n=16+14)
Exp2-8 20.8ns ± 0% 21.3ns ± 1% +2.23% (p=0.000 n=14+14)
Exp2Go-8 20.9ns ± 0% 21.7ns ± 1% +3.88% (p=0.000 n=14+15)
Abs-8 0.38ns ± 0% 0.38ns ± 0% ~ (p=0.082 n=15+13)
Dim-8 0.63ns ± 0% 1.21ns ±47% +92.18% (p=0.000 n=16+16)
Floor-8 2.01ns ± 1% 2.01ns ± 0% -0.14% (p=0.000 n=14+16)
Max-8 2.44ns ±12% 2.55ns ±31% ~ (p=0.187 n=16+14)
Min-8 3.02ns ±49% 3.41ns ±44% ~ (p=0.115 n=16+16)
Mod-8 41.9ns ±32% 37.4ns ±34% ~ (p=0.115 n=16+16)
Frexp-8 5.17ns ±39% 4.31ns ±26% ~ (p=0.116 n=16+14)
Gamma-8 11.5ns ±54% 10.8ns ±49% ~ (p=0.291 n=16+16)
Hypot-8 3.93ns ±23% 3.22ns ± 3% ~ (p=0.087 n=16+13)
HypotGo-8 5.92ns ±27% 6.25ns ±45% ~ (p=0.830 n=16+16)
Ilogb-8 3.59ns ±49% 3.73ns ±46% +4.01% (p=0.016 n=16+16)
J0-8 50.0ns ± 0% 59.9ns ±47% +19.74% (p=0.000 n=13+16)
J1-8 50.1ns ± 0% 50.8ns ± 0% +1.40% (p=0.000 n=13+13)
Jn-8 108ns ± 0% 108ns ± 0% ~ (all equal)
Ldexp-8 5.67ns ±15% 5.56ns ± 0% ~ (p=0.831 n=13+15)
Lgamma-8 14.3ns ±52% 13.5ns ±49% ~ (p=0.316 n=16+16)
Log-8 8.95ns ± 1% 10.24ns ±46% +14.49% (p=0.000 n=13+15)
Logb-8 4.00ns ±48% 3.92ns ±50% ~ (p=0.387 n=16+15)
Log1p-8 16.7ns ±43% 13.6ns ± 1% ~ (p=0.241 n=16+13)
Log10-8 11.2ns ± 2% 11.6ns ± 0% +3.43% (p=0.000 n=13+14)
Log2-8 5.90ns ±48% 6.32ns ±45% ~ (p=0.130 n=15+16)
Modf-8 4.73ns ±31% 4.70ns ±28% ~ (p=0.993 n=16+16)
Nextafter32-8 4.66ns ±37% 4.55ns ±31% ~ (p=0.299 n=16+16)
Nextafter64-8 3.37ns ±15% 3.95ns ±44% ~ (p=0.133 n=14+16)
PowInt-8 32.4ns ±34% 31.9ns ±37% ~ (p=0.830 n=16+16)
PowFrac-8 66.7ns ± 0% 77.3ns ±50% ~ (p=0.160 n=13+16)
Pow10Pos-8 1.01ns ± 1% 1.26ns ±21% +25.36% (p=0.000 n=16+15)
Pow10Neg-8 1.54ns ±41% 1.74ns ±28% ~ (p=0.069 n=16+16)
Round-8 2.53ns ±39% 2.47ns ±41% ~ (p=0.229 n=16+16)
RoundToEven-8 0.57ns ± 1% 0.63ns ± 2% +11.42% (p=0.000 n=13+13)
Remainder-8 27.4ns ± 1% 26.9ns ± 0% -1.80% (p=0.000 n=13+13)
Signbit-8 0.50ns ± 1% 0.75ns ± 0% +49.28% (p=0.000 n=13+13)
Sin-8 9.09ns ± 0% 9.49ns ±19% +4.45% (p=0.000 n=15+13)
Sincos-8 11.5ns ± 1% 11.7ns ± 0% +1.73% (p=0.000 n=15+13)
Sinh-8 13.7ns ± 3% 13.9ns ± 0% ~ (p=0.628 n=16+15)
SqrtIndirect-8 2.26ns ± 1% 2.45ns ±32% ~ (p=0.075 n=14+16)
SqrtLatency-8 3.27ns ± 1% 3.51ns ±46% ~ (p=0.490 n=14+14)
SqrtIndirectLatency-8 5.78ns ± 0% 6.03ns ± 0% +4.33% (p=0.000 n=16+16)
SqrtGoLatency-8 34.7ns ± 0% 35.0ns ± 0% +0.88% (p=0.000 n=16+16)
SqrtPrime-8 2.53µs ± 0% 2.56µs ± 0% +1.34% (p=0.000 n=16+16)
Tan-8 9.59ns ± 0% 9.88ns ± 0% +3.04% (p=0.000 n=16+15)
Tanh-8 14.3ns ± 3% 14.5ns ± 0% +1.58% (p=0.001 n=16+13)
Trunc-8 2.01ns ± 0% 2.01ns ± 0% ~ (all equal)
Y0-8 49.0ns ± 1% 49.7ns ± 0% +1.56% (p=0.000 n=16+15)
Y1-8 49.0ns ± 1% 49.7ns ± 0% +1.28% (p=0.000 n=16+15)
Yn-8 106ns ± 0% 106ns ± 0% ~ (all equal)
Float64bits-8 0.50ns ± 0% 0.30ns ± 1% -40.17% (p=0.000 n=16+16)
Float64frombits-8 0.50ns ± 0% 0.28ns ± 0% -43.82% (p=0.000 n=15+16)
Float32bits-8 0.25ns ± 0% 0.28ns ± 0% +12.25% (p=0.000 n=16+12)
Float32frombits-8 0.25ns ± 0% 0.25ns ± 0% ~ (p=0.704 n=16+16)
1 parent fe8a866 commit 54358ad
1 file changed under src/math: +82 -226 lines changed