Improve benchmark.cc, adding -ryu and -invert options. #72

Merged 8 commits on Aug 11, 2018

Commits on Aug 10, 2018

  1. benchmark.cc: Drop if (throwaway == 12345).

    This code was present in the float benchmarking, but not double.
    I suspect that it was a relic of an earlier era when throwaway
    wasn't returned to main() for inspection.
    StephanTLavavej committed Aug 10, 2018
    0c5e2d3
  2. benchmark.cc: constexpr BUFFER_SIZE, avoid calloc().

    BUFFER_SIZE should be a compile-time constant,
    not a modifiable global variable.
    
    Instead of dynamically allocating memory (and never freeing it),
    it's simpler to make buffer a global array and bufferown a local array
    on the stack.
    StephanTLavavej committed Aug 10, 2018
    9d24a45
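
A minimal sketch of the layout this commit describes, not the actual benchmark.cc code: a `constexpr` size, a global array in place of a `calloc()`'d pointer, and a stack-local `bufferown`. The value 40, the function name `format_into_buffers`, and the use of `snprintf` as a stand-in conversion are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical size; the value used in benchmark.cc is not shown here.
constexpr std::size_t BUFFER_SIZE = 40;

// Global array with static storage duration: zero-initialized at startup,
// and there is nothing to free.
static char buffer[BUFFER_SIZE];

// Formats `value` into a stack-local array, copies the result into the
// global array, and returns the number of characters written.
// snprintf is only a stand-in for the real conversion routine.
int format_into_buffers(double value) {
  char bufferown[BUFFER_SIZE] = {};  // local array on the stack
  const int written = std::snprintf(bufferown, BUFFER_SIZE, "%.17g", value);
  std::memcpy(buffer, bufferown, BUFFER_SIZE);
  return written;
}
```

Because `BUFFER_SIZE` is a compile-time constant, it can be used directly as an array bound, which is what makes both arrays possible without dynamic allocation.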
  3. benchmark.cc: Simplify fcv() and dcv().

    Part of the code already assumes that fcv() and dcv() update buffer.
    Instead of calling them again, we can simply compare bufferown and
    buffer to verify that Ryu and Grisu3 produce identical output.
    StephanTLavavej committed Aug 10, 2018
    269871b
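
The buffer comparison described in this commit might look roughly like the sketch below. The `formatter` alias, the two `snprintf`-based stand-ins, and `outputs_match` are hypothetical names; the real fcv() and dcv() call Ryu and Grisu3, whose signatures are not shown here.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

constexpr std::size_t BUFFER_SIZE = 40;  // hypothetical size

// Hypothetical signature for a conversion routine that writes a
// null-terminated string into `out`.
using formatter = void (*)(double value, char* out);

// Stand-ins for the two algorithms being compared.
void format_general_17(double value, char* out) {
  std::snprintf(out, BUFFER_SIZE, "%.17g", value);
}
void format_fixed_2(double value, char* out) {
  std::snprintf(out, BUFFER_SIZE, "%.2f", value);
}

// Runs each formatter once into its own buffer and compares the results,
// rather than calling either formatter a second time.
bool outputs_match(double value, formatter first, formatter second) {
  char bufferown[BUFFER_SIZE] = {};
  char buffer[BUFFER_SIZE] = {};
  first(value, bufferown);
  second(value, buffer);
  return std::strcmp(bufferown, buffer) == 0;
}
```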
  4. benchmark.cc: Improve C++ style.

    Directly name `struct mean_and_variance`.
    
    Prefer preincrement.
    
    Use auto for steady_clock::time_point (which is verbose, and type-safe,
    so there's little risk of getting confused about what it is).
    
    Use static_cast.
    StephanTLavavej committed Aug 10, 2018
    7bbc986
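
As an illustration of the style points listed in this commit (preincrement, `auto` for `steady_clock::time_point`, and `static_cast`), here is a small timing helper. `average_ns` is a hypothetical name and is not taken from benchmark.cc.

```cpp
#include <chrono>

// Times `iterations` calls of `work` and returns the average number of
// nanoseconds per call.
template <typename F>
double average_ns(F work, int iterations) {
  // auto deduces std::chrono::steady_clock::time_point.
  const auto t1 = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {  // preincrement
    work();
  }
  const auto t2 = std::chrono::steady_clock::now();
  const auto ns =
      std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count();
  // static_cast instead of C-style casts.
  return static_cast<double>(ns) / static_cast<double>(iterations);
}
```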
  5. benchmark.cc: Add "-ryu" for Ryu-only mode.

    This mode still prints Grisu3 stats, as near-zero garbage.
    StephanTLavavej committed Aug 10, 2018
    2a99749
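
One plausible way to recognize the flags named in this PR's title is a simple `strcmp` loop over `argv`. The `options` struct and `parse_options` function below are assumptions for illustration; the PR page does not show how benchmark.cc actually parses its arguments.

```cpp
#include <cstring>

// Hypothetical container for the two switches named in the PR title.
struct options {
  bool ryu_only = false;  // set by "-ryu"
  bool invert = false;    // set by "-invert"
};

// Scans argv[1..argc) for known flags; unknown arguments are ignored here.
options parse_options(int argc, const char* const* argv) {
  options opts;
  for (int i = 1; i < argc; ++i) {
    if (std::strcmp(argv[i], "-ryu") == 0) {
      opts.ryu_only = true;
    } else if (std::strcmp(argv[i], "-invert") == 0) {
      opts.invert = true;
    }
  }
  return opts;
}
```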
  6. benchmark.cc: mean_and_variance member functions.

    This replaces init() with C++11 default member initializers (less
    verbose than a constructor), changes update() and variance() into
    member functions, and adds stddev() so sqrt() doesn't need to be
    repeated later.
    
    One subtlety: Previously, update() said `int64_t n = ++mv.n;` and later
    read this local variable. Because this preincremented the data member
    before storing the local variable, and nothing else is happening, we
    can drop the local variable and simply read the data member.
    StephanTLavavej committed Aug 10, 2018
    0e41d28
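
The shape described in this commit can be sketched as follows. This assumes a Welford-style running update with members named `n`, `mean`, and `m2`, and a sample variance that divides by `n - 1`; only `n`, `update()`, `variance()`, and `stddev()` are named in the commit message, so the rest is a guess rather than the code from benchmark.cc.

```cpp
#include <cmath>
#include <cstdint>

struct mean_and_variance {
  // Default member initializers replace a separate init() function.
  std::int64_t n = 0;
  double mean = 0.0;
  double m2 = 0.0;  // assumed name: running sum of squared deviations

  // Welford's online update. The data member n is preincremented and then
  // read directly, so no local copy is needed.
  void update(double x) {
    ++n;
    const double d = x - mean;
    mean += d / static_cast<double>(n);
    const double d2 = x - mean;
    m2 += d * d2;
  }

  // Sample variance; returns 0 until at least two values have been seen.
  double variance() const {
    return n > 1 ? m2 / static_cast<double>(n - 1) : 0.0;
  }

  // Keeps the sqrt() in one place.
  double stddev() const { return std::sqrt(variance()); }
};
```

With default member initializers, `mean_and_variance mv;` starts in the zeroed state without an explicit constructor or init() call.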

Commits on Aug 11, 2018

  1. benchmark.cc: Add "-invert" to switch loops.

    Normally, the benchmark generates samples in an outer loop and tests
    each sample in an inner loop that repeats for a number of iterations.
    This allows mean and variance statistics to be collected for each
    sample, which is useful because some samples exercise different
    codepaths.
    
    However, repeatedly testing each sample causes Ryu's branches to be
    highly predictable, which is unrealistic for the real world.
    
    The new "-invert" option generates samples into a std::vector (to keep
    the Mersenne Twister out of the profiling). Then it has an outer loop
    for iterations, and its inner loop tests each sample once. This has
    realistic branch (mis)prediction characteristics.
    
    Note that when inverting, we divide the t2 - t1 duration (which is the
    total time taken to convert all samples in the vector) by samples, so
    the delta is the average time taken to convert a sample. This means
    that the printed Average values are comparable between normal mode
    and inverted mode (and the difference shows how costly branch
    misprediction is). The printed Stddev values aren't comparable,
    though - in normal mode, they measure the variation between samples
    (which is nonzero, even for an ideal machine, because Ryu does
    different amounts of work for different inputs), while in inverted
    mode, they measure the variation between passes over the vector (which
    would ideally be zero for a perfectly deterministic machine).
    
    Example output:
    
    ```
    C:\Temp\TESTING_X64>benchmark_clang
        Average & Stddev Ryu  Average & Stddev Grisu3
    32:   19.561    1.811       91.689   52.230
    64:   29.868    1.882      106.720   89.150
    
    C:\Temp\TESTING_X64>benchmark_clang -invert
        Average & Stddev Ryu  Average & Stddev Grisu3
    32:   31.694    1.187      117.464    4.087
    64:   42.507    0.963      131.933    1.514
    ```
    StephanTLavavej committed Aug 11, 2018
    cde1cd3
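
The two loop orders described in this commit can be sketched as below. `convert`, `run_normal`, and `run_inverted` are hypothetical names, `snprintf` stands in for the conversion under test, and dividing the normal-mode delta by `iterations` is an assumption; only the division by `samples` in inverted mode is stated in the commit message.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <random>
#include <vector>

using clock_type = std::chrono::steady_clock;

// Stand-in for the conversion being benchmarked; returns the output length
// so the caller can accumulate it and keep the work observable.
unsigned convert(std::uint64_t bits) {
  char buf[32];
  return static_cast<unsigned>(std::snprintf(
      buf, sizeof buf, "%llu", static_cast<unsigned long long>(bits)));
}

double elapsed_ns(clock_type::time_point t1, clock_type::time_point t2) {
  return static_cast<double>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count());
}

// Normal mode: the outer loop generates samples and the inner loop repeats
// one sample. Yields one delta per sample.
std::vector<double> run_normal(std::mt19937_64& rng, int samples,
                               int iterations, unsigned& throwaway) {
  std::vector<double> deltas;
  for (int s = 0; s < samples; ++s) {
    const std::uint64_t bits = rng();
    const auto t1 = clock_type::now();
    for (int i = 0; i < iterations; ++i) {
      throwaway += convert(bits);
    }
    const auto t2 = clock_type::now();
    deltas.push_back(elapsed_ns(t1, t2) / iterations);
  }
  return deltas;
}

// Inverted mode: samples are generated up front (keeping the RNG out of the
// timed region), the outer loop counts iterations, and the inner loop visits
// each sample once. Yields one delta per pass, averaged over the samples.
std::vector<double> run_inverted(std::mt19937_64& rng, int samples,
                                 int iterations, unsigned& throwaway) {
  std::vector<std::uint64_t> inputs(static_cast<std::size_t>(samples));
  for (auto& bits : inputs) {
    bits = rng();
  }
  std::vector<double> deltas;
  for (int i = 0; i < iterations; ++i) {
    const auto t1 = clock_type::now();
    for (const std::uint64_t bits : inputs) {
      throwaway += convert(bits);
    }
    const auto t2 = clock_type::now();
    deltas.push_back(elapsed_ns(t1, t2) / samples);
  }
  return deltas;
}
```

In this sketch each delta is a per-conversion time in both modes, so their averages are comparable, while the spread of the deltas means something different in each mode, as the commit message explains.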
  2. benchmark.cc: Improve "-ryu" and "-invert" output.

    This avoids printing unnecessary fields.
    StephanTLavavej committed Aug 11, 2018
    3b2fc51