Improve benchmark.cc, adding -ryu and -invert options. #72
Commits on Aug 10, 2018
-
benchmark.cc: Drop `if (throwaway == 12345)`.
This code was present in the float benchmarking, but not double. I suspect that it was a relic of an earlier era when throwaway wasn't returned to main() for inspection.
Commit 0c5e2d3
-
benchmark.cc: constexpr BUFFER_SIZE, avoid calloc().
BUFFER_SIZE should be a compile-time constant, not a modifiable global variable. Instead of dynamically allocating memory (and never freeing it), it's simpler to make buffer a global array and bufferown a local array on the stack.
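The change described above might look roughly like the following sketch. The value of `BUFFER_SIZE` and the helper `fill_buffers()` are assumptions made for illustration; only the names `BUFFER_SIZE`, `buffer`, and `bufferown` come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Compile-time constant: usable as an array bound and not modifiable at runtime.
// The value 40 is an assumption for this sketch, not taken from benchmark.cc.
constexpr std::size_t BUFFER_SIZE = 40;

// Global array with static storage duration: zero-initialized, nothing to free.
static char buffer[BUFFER_SIZE];

// Hypothetical helper: stages text in a stack-local array, then copies it
// into the global buffer. Returns the length of the stored string.
std::size_t fill_buffers(const char* text) {
  assert(text != nullptr);
  char bufferown[BUFFER_SIZE] = {};  // local array on the stack, zero-filled
  std::strncpy(bufferown, text, BUFFER_SIZE - 1);  // last byte stays '\0'
  std::memcpy(buffer, bufferown, BUFFER_SIZE);
  return std::strlen(buffer);
}
```

Neither array needs a matching `free()`, which is the simplification the commit is after.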
Commit 9d24a45
-
benchmark.cc: Simplify fcv() and dcv().
Part of the code already assumes that fcv() and dcv() update buffer. Instead of calling them again, we can simply compare bufferown and buffer to verify that Ryu and Grisu3 produce identical output.
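As a rough illustration of the comparison, here is a minimal sketch. The two converters are hypothetical stand-ins built on `snprintf` (they are not Ryu or Grisu3), and `BUFFER_SIZE = 40` is an assumed value; the point is only that one converter fills `bufferown`, the other fills the global `buffer` as a side effect, and a single `strcmp` checks that they agree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

constexpr std::size_t BUFFER_SIZE = 40;  // assumed value for this sketch
static char buffer[BUFFER_SIZE];         // written by the second converter

// Hypothetical stand-in for the first converter: writes into the caller's array.
void convert_a(float f, char* out) {
  const int n = std::snprintf(out, BUFFER_SIZE, "%.9g", f);
  assert(n > 0 && static_cast<std::size_t>(n) < BUFFER_SIZE);
  (void)n;
}

// Hypothetical stand-in for the second converter: updates the global buffer.
void convert_b(float f) {
  const int n = std::snprintf(buffer, BUFFER_SIZE, "%.9g", f);
  assert(n > 0 && static_cast<std::size_t>(n) < BUFFER_SIZE);
  (void)n;
}

// Runs each converter once and compares the two buffers directly,
// instead of calling either converter a second time.
bool same_output(float f) {
  char bufferown[BUFFER_SIZE];
  convert_a(f, bufferown);
  convert_b(f);
  return std::strcmp(bufferown, buffer) == 0;
}
```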
Commit 269871b
-
benchmark.cc: Improve C++ style.
Directly name `struct mean_and_variance`. Prefer preincrement. Use auto for steady_clock::time_point (which is verbose, and type-safe, so there's little risk of getting confused about what it is). Use static_cast.
Commit 7bbc986
-
benchmark.cc: Add "-ryu" for Ryu-only mode.
This mode still prints Grisu3 stats, as near-zero garbage.
Commit 2a99749
-
benchmark.cc: mean_and_variance member functions.
This replaces init() with C++11 default member initializers (less verbose than a constructor), changes update() and variance() into member functions, and adds stddev() so sqrt() doesn't need to be repeated later. One subtlety: Previously, update() said `int64_t n = ++mv.n;` and later read this local variable. Because this preincremented the data member before storing the local variable, and nothing else is happening, we can drop the local variable and simply read the data member.
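A struct of this shape could be sketched as follows. The member names `mean` and `m2` and the use of Welford's online algorithm for the running statistics are assumptions; the commit message only confirms the member `n`, the functions `update()`, `variance()`, and `stddev()`, and the use of default member initializers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct mean_and_variance {
  // C++11 default member initializers replace a separate init() function.
  std::int64_t n = 0;
  double mean = 0.0;
  double m2 = 0.0;  // assumed name: running sum of squared deviations

  // Welford-style update (assumed). The data member n is preincremented
  // and then read directly, so no local copy is needed.
  void update(const double x) {
    ++n;
    const double d = x - mean;
    mean += d / static_cast<double>(n);
    const double d2 = x - mean;
    m2 += d * d2;
  }

  // Sample variance; requires at least two observations.
  double variance() const {
    assert(n > 1);
    return m2 / static_cast<double>(n - 1);
  }

  // Defined once here so callers do not repeat sqrt().
  double stddev() const { return std::sqrt(variance()); }
};
```

A caller would create `mean_and_variance mv;`, call `mv.update(delta)` once per measurement, and print `mv.mean` and `mv.stddev()` at the end.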
Commit 0e41d28
Commits on Aug 11, 2018
-
benchmark.cc: Add "-invert" to switch loops.
Normally, the benchmark generates samples in an outer loop, and then it tests each sample in an inner loop for iterations. This allows mean and variance statistics to be collected for each sample, which is useful because some samples exercise different codepaths. However, repeatedly testing each sample causes Ryu's branches to be highly predictable, which is unrealistic for the real world.

The new "-invert" option generates samples into a std::vector (to keep the Mersenne Twister out of the profiling). Then it has an outer loop for iterations, and its inner loop tests each sample once. This has realistic branch (mis)prediction characteristics.

Note that when inverting, we divide the t2 - t1 duration (which is the total time taken to convert all samples in the vector) by samples, so the delta is the average time taken to convert a sample. This means that the printed Average values are comparable between normal mode and inverted mode (and the difference shows how costly branch misprediction is). The printed Stddev values aren't comparable, though - in normal mode, they measure the variation between samples (which is nonzero, even for an ideal machine, because Ryu does different amounts of work for different inputs), while in inverted mode, they measure the variation between each vector loop (which would ideally be zero for a perfectly deterministic machine).

Example output:

```
C:\Temp\TESTING_X64>benchmark_clang
    Average & Stddev Ryu  Average & Stddev Grisu3
32:   19.561    1.811       91.689   52.230
64:   29.868    1.882      106.720   89.150

C:\Temp\TESTING_X64>benchmark_clang -invert
    Average & Stddev Ryu  Average & Stddev Grisu3
32:   31.694    1.187      117.464    4.087
64:   42.507    0.963      131.933    1.514
```
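The two loop orders can be sketched as below. This is an illustration under stated assumptions, not the code in benchmark.cc: `work()` is a hypothetical stand-in for the conversion under test, the function names and parameters are invented for the sketch, and dividing by `iterations` in normal mode is inferred from the statement that the Average values are comparable.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <vector>

using clock_type = std::chrono::steady_clock;

// Hypothetical stand-in for the conversion being benchmarked.
inline std::uint32_t work(std::uint32_t x) { return x * 2654435761u + 1u; }

// Nanoseconds between two time points, as a double.
inline double elapsed_ns(clock_type::time_point t1, clock_type::time_point t2) {
  return std::chrono::duration<double, std::nano>(t2 - t1).count();
}

// Normal mode: outer loop over samples, inner loop repeats one sample.
// Yields one delta per sample (variation between samples).
std::vector<double> bench_normal(int samples, int iterations,
                                 std::uint32_t seed, std::uint32_t& throwaway) {
  assert(samples > 0 && iterations > 0);
  std::mt19937 rng(seed);
  std::vector<double> deltas;
  for (int s = 0; s < samples; ++s) {
    const std::uint32_t x = static_cast<std::uint32_t>(rng());
    const auto t1 = clock_type::now();
    for (int i = 0; i < iterations; ++i) throwaway += work(x);
    const auto t2 = clock_type::now();
    deltas.push_back(elapsed_ns(t1, t2) / iterations);
  }
  return deltas;
}

// Inverted mode: samples are generated up front so the Mersenne Twister
// stays out of the timed region; each outer iteration tests every sample once.
// Yields one delta per pass over the vector (variation between passes).
std::vector<double> bench_inverted(int samples, int iterations,
                                   std::uint32_t seed, std::uint32_t& throwaway) {
  assert(samples > 0 && iterations > 0);
  std::mt19937 rng(seed);
  std::vector<std::uint32_t> inputs(static_cast<std::size_t>(samples));
  for (auto& x : inputs) x = static_cast<std::uint32_t>(rng());
  std::vector<double> deltas;
  for (int i = 0; i < iterations; ++i) {
    const auto t1 = clock_type::now();
    for (const std::uint32_t x : inputs) throwaway += work(x);
    const auto t2 = clock_type::now();
    deltas.push_back(elapsed_ns(t1, t2) / samples);  // average time per sample
  }
  return deltas;
}
```

Both functions perform the same conversions in total, so the accumulated `throwaway` value is identical; only the order of the work, and therefore what each delta measures, differs.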
Commit cde1cd3
-
benchmark.cc: Improve "-ryu" and "-invert" output.
This avoids printing unnecessary fields.
Commit 3b2fc51