
Benchmark times should not display all significant figures. #10953

Closed
@huonw

Description

#[feature(macro_rules, asm)];

extern mod extra;
use BH = extra::test::BenchHarness;
macro_rules! mk_benches(
    ( $($name:ident, $n:expr;)* ) => {
        $(
            #[bench]
            fn $name(bh: &mut BH) {
                bh.iter(|| for _ in range(0, $n) { unsafe {asm!("")}; } );
            }
        )*
    })

mk_benches! {
    a,           1;
    b,          40; // to get a meaningful time
    c,         100;
    d,       1_000;
    e,      10_000;
    f,     100_000;
    g,   1_000_000;
    h,  10_000_000;
    i, 100_000_000;
}

Compiling this with rustc -O --test (using a rustc with #10952) and running the benchmarks shows:

running 9 tests
test a ... bench:         2 ns/iter (+/- 1)
test b ... bench:        34 ns/iter (+/- 3)
test c ... bench:        62 ns/iter (+/- 8)
test d ... bench:       383 ns/iter (+/- 60)
test e ... bench:      3608 ns/iter (+/- 367)
test f ... bench:     35984 ns/iter (+/- 4093)
test g ... bench:    361734 ns/iter (+/- 98714)
test h ... bench:   3381412 ns/iter (+/- 281941)
test i ... bench:  34584793 ns/iter (+/- 2991062)

The long numbers are hard to read, and most of the digits are unnecessary. It would be clearer if we only showed a fixed precision, e.g., using engineering notation:

running 9 tests
test a ... bench:         2 ns/iter (+/-     1)
test b ... bench:        34 ns/iter (+/-     3)
test c ... bench:        62 ns/iter (+/-     8)
test d ... bench:       383 ns/iter (+/-    60)
test e ... bench:     3.6e3 ns/iter (+/-   367)
test f ... bench:    36.0e3 ns/iter (+/- 4.1e3)
test g ... bench:   361.7e3 ns/iter (+/-  99e3)
test h ... bench:     3.4e6 ns/iter (+/- 280e3)
test i ... bench:    34.6e6 ns/iter (+/- 3.0e6)

This specific notation has the advantage of easy local comparison: tests that take approximately the same amount of time can be compared at a glance, without accidentally misreading a pair like 8.0e3 vs 1.0e4 in plain scientific notation (which becomes 8e3 vs 10e3 under the proposed scheme).
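To make the proposal concrete, here is a minimal sketch of such a formatter in present-day Rust (not the 2013 dialect of the snippet above, and not the test harness's actual code). The function name `eng` and the choice of one decimal place in the mantissa are assumptions made for illustration; the sketch covers only the main time, not the +/- column.

```rust
/// Hypothetical helper: format a nanosecond count in engineering notation,
/// i.e. with an exponent that is always a multiple of 3.
/// Values below 1000 are printed unchanged.
fn eng(ns: u64) -> String {
    if ns < 1_000 {
        return ns.to_string();
    }
    let mut mantissa = ns as f64;
    let mut exp = 0;
    // Shift by three decimal orders at a time until the mantissa is below 1000.
    while mantissa >= 1_000.0 {
        mantissa /= 1_000.0;
        exp += 3;
    }
    // Rounding to one decimal place could turn e.g. 999.999 into "1000.0";
    // if so, move up one more step so the mantissa stays below 1000.
    if (mantissa * 10.0).round() >= 10_000.0 {
        mantissa /= 1_000.0;
        exp += 3;
    }
    format!("{:.1}e{}", mantissa, exp)
}
```

Under these assumptions, `eng(3608)` gives `3.6e3` and `eng(34584793)` gives `34.6e6`, matching the proposed output for tests e and i.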

This could also be achieved by changing the units (i.e. ns -> us -> ms). I personally think that a change of unit is a more subtle difference than necessary, but I don't have a particularly strong opinion on it:

running 9 tests
test a ... bench:         2 ns/iter (+/-    1)
test b ... bench:        34 ns/iter (+/-    3)
test c ... bench:        62 ns/iter (+/-    8)
test d ... bench:       383 ns/iter (+/-   60)
test e ... bench:       3.6 us/iter (+/- 0.37)
test f ... bench:      36.0 us/iter (+/-  4.1)
test g ... bench:     361.7 us/iter (+/-   99)
test h ... bench:       3.4 ms/iter (+/- 0.28)
test i ... bench:      34.6 ms/iter (+/-  3.0)
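The unit-scaling alternative can be sketched the same way. Again this is only an illustration in present-day Rust, not the harness's code: the name `scaled`, the one-decimal precision, and the extra `s` unit beyond the ns/us/ms mentioned above are all assumptions.

```rust
/// Hypothetical helper: format a nanosecond count with a scaled unit
/// (ns, us, ms, s), keeping one decimal place once the value is scaled.
fn scaled(ns: u64) -> String {
    // "s" is an assumed upper unit; larger values simply stay in seconds.
    const UNITS: [&str; 4] = ["ns", "us", "ms", "s"];
    if ns < 1_000 {
        return format!("{} ns", ns);
    }
    let mut value = ns as f64;
    let mut unit = 0;
    // Move to the next unit while the value has four or more integer digits.
    while value >= 1_000.0 && unit + 1 < UNITS.len() {
        value /= 1_000.0;
        unit += 1;
    }
    // Rounding to one decimal place could print "1000.0"; bump the unit if so.
    if (value * 10.0).round() >= 10_000.0 && unit + 1 < UNITS.len() {
        value /= 1_000.0;
        unit += 1;
    }
    format!("{:.1} {}", value, UNITS[unit])
}
```

With this sketch, `scaled(361734)` gives `361.7 us` and `scaled(3381412)` gives `3.4 ms`; a caller would append `/iter` itself.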
