benchmarks: add Taylor series for pi #482

axic · 2020-08-13T11:48:09Z

Depends on #474.

This benchmark uses only these f64 instructions: const, add, div, mul, lt, ge, plus two conversions: f64.convert_s/i32 and i64.trunc_u/f64.

axic · 2020-08-13T11:48:58Z

test/benchmarks/taylor.c

+    sum = 4.0 * sum;
+
+    // Display all 16 digits of double precision as a 64-bit integer
+    return sum * 10000000000000000ULL;


At first I returned f64. Supporting that in wasm_engine/fizzy_engine is trivial, but changing the fizzy-bench parser seemed like a larger task, so I decided on this "workaround".
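The workaround amounts to scaling the floating-point result by 10^16 and returning it as a 64-bit integer. A minimal sketch, assuming a hypothetical helper name `scale_to_u64` that is not part of the PR:

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the PR: scale a small non-negative
   double by 1e16 so that its leading digits fit in an unsigned 64-bit
   integer. 1e16 is exactly representable as a double. */
uint64_t scale_to_u64(double value)
{
    /* For a value near pi the product is about 3.14e16, well below 2^64. */
    return (uint64_t)(value * 1e16);
}
```

With this, a result of 0.5 would be reported as 5000000000000000, so the harness only has to compare integers.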

axic · 2020-08-13T11:49:41Z

test/benchmarks/taylor.c

+WASM_EXPORT unsigned long long taylor(unsigned n)
+{
+    double sum = 1.0;
+    int sign = -1;


This could be a double too, which would put a floating-point sign-changing op in the loop, but as an int it gives an int-to-float conversion instead.

Alternatively, sign could be a float, so that there's a promote instruction in the loop.
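The trade-off can be sketched as two loop variants. This is a reconstruction under assumptions (loop bounds and function names are guessed from the WAT dump later in this thread, not copied from taylor.c):

```c
/* Variant with an integer sign: each iteration converts the int to double
   (f64.convert_s/i32 in the compiled module). */
double taylor_int_sign(unsigned n)
{
    double sum = 1.0;
    int sign = -1;
    for (unsigned i = 1; i < n; ++i)
    {
        sum += sign / (2.0 * i + 1.0);
        sign = -sign; /* integer negation */
    }
    return 4.0 * sum;
}

/* Variant with a double sign: no conversion in the loop, the sign flip is a
   floating-point operation instead. A float sign would add a float-to-double
   promotion in the same place. */
double taylor_double_sign(unsigned n)
{
    double sum = 1.0;
    double sign = -1.0;
    for (unsigned i = 1; i < n; ++i)
    {
        sum += sign / (2.0 * i + 1.0);
        sign = -sign; /* floating-point negation */
    }
    return 4.0 * sum;
}
```

Both variants compute the same partial sum; they differ only in which instructions end up inside the hot loop.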

axic · 2020-08-13T15:27:58Z

test/benchmarks/taylor_pi.c

+    return sum;
+}
+
+WASM_EXPORT unsigned long long taylor_pi(unsigned n)


Doing this to avoid needing f32/f64 support in fizzy-bench.

codecov · 2020-08-14T15:22:54Z

Codecov Report

Merging #482 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #482   +/-   ##
=======================================
  Coverage   99.67%   99.67%           
=======================================
  Files          54       54           
  Lines       17180    17180           
=======================================
  Hits        17125    17125           
  Misses         55       55

chfast · 2020-08-14T16:06:42Z

test/benchmarks/taylor_pi.inputs

+
+31415916535897744
+
+pi_3000000_runs


Please keep a single execution case for a start, something that runs in the milliseconds range.

This is on my machine:

fizzy/execute/taylor_pi/pi_1000000_runs      52600 us    51521 us    14
wabt/execute/taylor_pi/pi_1000000_runs       80834 us    79290 us     9
wasm3/execute/taylor_pi/pi_1000000_runs      12373 us    12234 us    58
fizzy/execute/taylor_pi/pi_3000000_runs     152631 us   150737 us     4
wabt/execute/taylor_pi/pi_3000000_runs      235717 us   233095 us     3
wasm3/execute/taylor_pi/pi_3000000_runs      36830 us    36516 us    19

What is your upper bound?

Leave the first one. The second runs over 2 seconds with sanitizers on CI.

axic · 2020-08-14T16:50:00Z

(module
  (type $t0 (func))
  (type $t1 (func (param i32) (result i64)))
  (func $__wasm_call_ctors (type $t0))
  (func $taylor_pi (export "taylor_pi") (type $t1) (param $p0 i32) (result i64)
    (local $l0 i64) (local $l1 i32) (local $l2 f64) (local $l3 f64)
    i64.const 40000000000000000
    set_local $l0
    block $B0
      block $B1
        get_local $p0
        i32.const 2
        i32.lt_u
        br_if $B1
        get_local $p0
        i32.const -1
        i32.add
        set_local $l1
        f64.const 0x1p+0 (;=1;)
        set_local $l2
        i32.const -1
        set_local $p0
        f64.const 0x1p+0 (;=1;)
        set_local $l3
        loop $L2
          get_local $l3
          get_local $p0
          f64.convert_s/i32
          get_local $l2
          get_local $l2
          f64.add
          f64.const 0x1p+0 (;=1;)
          f64.add
          f64.div
          f64.add
          set_local $l3
          get_local $l2
          f64.const 0x1p+0 (;=1;)
          f64.add
          set_local $l2
          i32.const 0
          get_local $p0
          i32.sub
          set_local $p0
          get_local $l1
          i32.const -1
          i32.add
          tee_local $l1
          br_if $L2
        end
        get_local $l3
        f64.const 0x1p+2 (;=4;)
        f64.mul
        f64.const 0x1.1c37937e08p+53 (;=1e+16;)
        f64.mul
        tee_local $l2
        f64.const 0x1p+64 (;=1.84467e+19;)
        f64.lt
        get_local $l2
        f64.const 0x0p+0 (;=0;)
        f64.ge
        i32.and
        br_if $B0
        i64.const 0
        set_local $l0
      end
      get_local $l0
      return
    end
    get_local $l2
    i64.trunc_u/f64)
  (table $T0 1 1 anyfunc)
  (memory $memory (export "memory") 2)
  (global $g0 (mut i32) (i32.const 66560))
  (global $__heap_base (export "__heap_base") i32 (i32.const 66560))
  (global $__data_end (export "__data_end") i32 (i32.const 1024)))
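The f64.lt / f64.ge / i32.and sequence before i64.trunc_u/f64 is a range guard: the truncation traps on out-of-range input, so only values in [0, 2^64) reach it and everything else returns 0. A rough C equivalent, with the hypothetical name `guarded_trunc_u64`:

```c
#include <stdint.h>

/* Sketch mirroring the guard in the WAT above; not code from the PR.
   NaN fails both comparisons and therefore also yields 0. */
uint64_t guarded_trunc_u64(double x)
{
    if (x >= 0.0 && x < 18446744073709551616.0) /* 0x1p+64 */
        return (uint64_t)x;
    return 0;
}
```

For the benchmark the scaled result is about 3.14e16, so the guard always passes and the truncation is what actually runs.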

axic · 2020-08-17T08:55:57Z

Agreed to make this use single precision.

axic · 2020-08-23T10:58:41Z

Here's the single-precision version:

(module
  (type $t0 (func))
  (type $t1 (func (param i32) (result i64)))
  (func $__wasm_call_ctors (type $t0))
  (func $taylor_pi (export "taylor_pi") (type $t1) (param $p0 i32) (result i64)
    (local $l0 i64) (local $l1 i32) (local $l2 i32) (local $l3 f32) (local $l4 f32)
    i64.const 40000001090256896
    set_local $l0
    block $B0
      block $B1
        get_local $p0
        i32.const 2
        i32.lt_u
        br_if $B1
        i32.const -1
        set_local $l1
        i32.const 1
        set_local $l2
        f32.const 0x1p+0 (;=1;)
        set_local $l3
        loop $L2
          get_local $l3
          get_local $l1
          f32.convert_s/i32
          get_local $l2
          f32.convert_u/i32
          tee_local $l4
          get_local $l4
          f32.add
          f32.const 0x1p+0 (;=1;)
          f32.add
          f32.div
          f32.add
          set_local $l3
          i32.const 0
          get_local $l1
          i32.sub
          set_local $l1
          get_local $p0
          get_local $l2
          i32.const 1
          i32.add
          tee_local $l2
          i32.ne
          br_if $L2
        end
        get_local $l3
        f32.const 0x1p+2 (;=4;)
        f32.mul
        f32.const 0x1.1c3794p+53 (;=1e+16;)
        f32.mul
        tee_local $l3
        f32.const 0x1p+64 (;=1.84467e+19;)
        f32.lt
        get_local $l3
        f32.const 0x0p+0 (;=0;)
        f32.ge
        i32.and
        br_if $B0
        i64.const 0
        set_local $l0
      end
      get_local $l0
      return
    end
    get_local $l3
    i64.trunc_u/f32)
  (table $T0 1 1 anyfunc)
  (memory $memory (export "memory") 2)
  (global $g0 (mut i32) (i32.const 66560))
  (global $__heap_base (export "__heap_base") i32 (i32.const 66560))
  (global $__data_end (export "__data_end") i32 (i32.const 1024)))

Done.
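The constant for the n < 2 case changes from 40000000000000000 to 40000001090256896 because 1e16 is not exactly representable in single precision (the module uses 0x1.1c3794p+53). A small sketch that reproduces both constants, using hypothetical function names:

```c
#include <stdint.h>

/* Single precision: 1e16f rounds to the nearest float, and multiplying
   by 4 (a power of two) keeps that rounding error. */
uint64_t scaled_four_f32(void)
{
    float scaled = 4.0f * 1e16f;
    return (uint64_t)scaled;
}

/* Double precision: 1e16 is exact, so the product is exactly 4e16. */
uint64_t scaled_four_f64(void)
{
    double scaled = 4.0 * 1e16;
    return (uint64_t)scaled;
}
```

The same rounding applies to every result of the single-precision benchmark, so its expected outputs presumably differ from the double-precision ones.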

axic force-pushed the bench-taylor branch from bd66371 to 6c6335e Compare August 13, 2020 15:21

axic force-pushed the bench-taylor branch 2 times, most recently from a468409 to 020bb75 Compare August 14, 2020 15:14

axic marked this pull request as ready for review August 14, 2020 15:23

axic changed the title ~~benchmarks: add taylor series for pi~~ benchmarks: add Taylor series for pi Aug 14, 2020

axic requested review from chfast and gumb0 August 14, 2020 15:23

chfast previously requested changes Aug 14, 2020

axic force-pushed the bench-taylor branch from 020bb75 to a42dce6 Compare August 23, 2020 10:57

axic requested a review from chfast August 23, 2020 10:58

chfast approved these changes Aug 23, 2020

benchmarks: add Taylor series for pi

e17e743

axic force-pushed the bench-taylor branch from a42dce6 to e17e743 Compare August 23, 2020 12:55

axic merged commit 744a469 into master Aug 23, 2020

axic deleted the bench-taylor branch August 23, 2020 13:05
