
Performance regression using new interpreter API #1550

Open
gumb0 opened this issue Sep 23, 2020 · 4 comments

@gumb0
Contributor

gumb0 commented Sep 23, 2020

We have been using WABT 1.0.12 interpreter as a baseline for our benchmarking while developing Fizzy.

Recently we tried comparing WABT 1.0.12 against 1.0.19 on our benchmarks (mostly arithmetic-heavy code, crypto algorithms), and the latter showed a significant slowdown: execution is up to 2-3 times slower in many benchmarks, and instantiation is also slower, though only by up to ~15%.

We don't rule out the possibility that we are using the new interpreter API in some suboptimal way, so we would appreciate it if any obvious issues with our usage were pointed out.

Our code is in https://github.com/wasmx/fizzy/blob/ce3c446e62acce5db245a1287b7a524956277b87/test/utils/wabt_engine.cpp (note: we create one Thread object that is reused across all execute iterations; with the old API we similarly created a single Executor).

Detailed benchmarking results are in wasmx/fizzy#443.

Thank you.

@aardappel
Contributor

Note that the interpreter has been written for correctness and clarity, not performance. It is likely that as new features get added, it will keep getting slower. The only way to avoid that slowdown would be to emit specialized opcodes for all the conditionals being introduced, which would very much go against those primary goals.

I know, for example, that my recent addition of Memory64 support added extra conditionals to every memory access, plus lookups to see which memory type we're dealing with. I certainly hope that alone wasn't a 2x slowdown :)

Maybe you could run your benchmark with various recent commits to isolate which one added the most slowdown?

@chfast
Contributor

chfast commented Sep 25, 2020

I profiled wasm-interp and the issue is in "New interpreter (#1330)" introduced in 1.0.14.

I could not use our benchmarking tool and a bigger set of inputs because the API changed a lot between 1.0.12 and 1.0.14 due to the implementation of the Wasm C API.

I ended up using a single input, memset.wasm (memset.wat).

Time before: 0.003623
Time after: 0.009478 (2.61x)

I did not look into the reasons for the slowdown, but I noticed that the BinOp function is not inlined in the new interpreter.

Perf stats:

```
> perf stat -e cycles,instructions ./wasm-interp memset.wasm --run-all-exports

memset_bench() => i32:5100000

 Performance counter stats for './wasm-interp memset.wasm --run-all-exports':

        15 041 811      cycles
        39 211 115      instructions              #    2,61  insn per cycle

       0,003593214 seconds time elapsed

       0,003623000 seconds user
       0,000000000 seconds sys


> perf stat -e cycles,instructions ./wasm-interp-new memset.wasm --run-all-exports

memset_bench() => i32:5100000

 Performance counter stats for './wasm-interp-new memset.wasm --run-all-exports':

        39 301 171      cycles
       105 862 465      instructions              #    2,69  insn per cycle

       0,009431269 seconds time elapsed

       0,009478000 seconds user
       0,000000000 seconds sys
```
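For context on why a non-inlined binary-op helper hurts an interpreter loop (a hypothetical sketch; `DoBinOp` and `AddOpcode` are invented names, not WABT's API): when the operation is passed as a functor visible at the call site, the compiler can fold the arithmetic straight into the dispatch switch, whereas an opaque out-of-line call adds call/return overhead to every executed instruction.

```cpp
#include <cstdint>

// Hypothetical sketch of an inlinable binop helper. With F a lambda known
// at the call site, the compiler typically inlines op() and DoBinOp()
// entirely, leaving just the add instruction in the interpreter loop.
template <typename F>
inline uint32_t DoBinOp(uint32_t lhs, uint32_t rhs, F op) {
  return op(lhs, rhs);
}

// One specialized opcode handler built from the helper.
uint32_t AddOpcode(uint32_t a, uint32_t b) {
  return DoBinOp(a, b, [](uint32_t x, uint32_t y) { return x + y; });
}
```

Whether the real BinOp fails to inline for this reason or another (size heuristics, taking state by reference through the Thread object) would need checking in the disassembly.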

@aardappel
Contributor

@binji any idea what part of that could be causing the biggest slowdown?

It would be good to run it under a profiler to see if there's any low-hanging fruit.

@binji
Member

binji commented Oct 21, 2020

IIRC there were some hacks in the old interpreter to try to keep more values in local variables while running the interpreter loop. The new interpreter is refactored to have a "Step" function instead, perhaps that's not being inlined/optimized. I'd guess there are a lot of opportunities to speed it up, but I'd worry a little about adding complexity.
