Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Boost hit count for outer back branch targets; improve conditional execution detection #83630

Merged
merged 4 commits into from
Mar 29, 2023

Conversation

kg
Copy link
Member

@kg kg commented Mar 18, 2023

When we jit compile a trace and it tries to back branch outside of itself, this will cause a bailout. In cases with nested loops the outer target's hit count may take a long time to reach the compile threshold, but ideally we would compile a trace starting at that outer target so that we can spend more time inside of traces. This could happen for a less-frequently-called method that has millions of internal loop iterations, for example - like Benchstone.BenchF.FFT. In local test runs this appears to improve that benchmark's performance a bit and the bailout counts indicate that it goes down from millions of back-branch bailouts to tens of thousands, so it seems to work there.

This may increase startup time and memory usage a bit because we compile traces sooner, but it should balance out since the new traces will usually perform better. Adding the monitoring phase will help cull any cases where the new traces are slower.

Also, the jiterpreter's original logic only cared whether we were in a branch block when deciding whether a ret or call would execute, but any instruction after a branch instruction is conditionally executed. Improving this heuristic should fix some spurious compile aborts in certain functions (this affected a couple of the FFT benchmarks, for example)

EDIT: No merge because I want to delay this until after P3.

@kg kg added arch-wasm WebAssembly architecture area-Codegen-Jiterpreter-mono labels Mar 18, 2023
@ghost ghost assigned kg Mar 18, 2023
@ghost
Copy link

ghost commented Mar 18, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

When we jit compile a trace and it tries to back branch outside of itself, this will cause a bailout. In cases with nested loops the outer target's hit count may take a long time to reach the compile threshold, but ideally we would compile a trace starting at that outer target so that we can spend more time inside of traces. This could happen for a less-frequently-called method that has millions of internal loop iterations, for example - like Benchstone.BenchF.FFT. In local test runs this appears to improve that benchmark's performance a bit and the bailout counts indicate that it goes down from millions of back-branch bailouts to tens of thousands, so it seems to work there.

This may increase startup time and memory usage a bit because we compile traces sooner, but it should balance out since the new traces will usually perform better. Adding the monitoring phase will help cull any cases where the new traces are slower.

Also, the jiterpreter's original logic only cared whether we were in a branch block when deciding whether a ret or call would execute, but any instruction after a branch instruction is conditionally executed. Improving this heuristic should fix some spurious compile aborts in certain functions (this affected a couple of the FFT benchmarks, for example)

Author: kg
Assignees: -
Labels:

arch-wasm, area-Codegen-Jiterpreter-mono

Milestone: -

@kg kg added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 18, 2023
kg added 2 commits March 24, 2023 18:45
…e and it is a trace prepare point, boost its hit count

This will cause it to get jitted sooner
@kg kg removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 25, 2023
@kg kg force-pushed the wasm-jiterpreter-conditional-exec-flag branch from 0c491f0 to aa55a35 Compare March 25, 2023 02:15
@kg
Copy link
Member Author

kg commented Mar 25, 2023

Timings for browser-bench:

scenario PR main % of base
Json, non-ASCII text serialize 2.0845ms 2.1645ms 96%
Json, non-ASCII text deserialize 3.6811ms 3.6340ms 101%
Json, small serialize 0.0960ms 0.0905ms 106%
Json, small deserialize 0.1768ms 0.1767ms 100%
Json, large serialize 25.8281ms 24.3029ms 106%
Json, large deserialize 47.7798ms 48.1759ms 99%
Span, Reverse bytes 0.0660ms 0.0660ms 100%
Span, Reverse chars 0.1050ms 0.1052ms 100%
Span, IndexOf bytes 0.5279us 0.5397us 98%
Span, IndexOf chars 0.0423ms 0.0423ms 100%
Span, SequenceEqual bytes 0.0294ms 0.0294ms 100%
Span, SequenceEqual chars 0.0587ms 0.0590ms 99%
String, Normalize 0.6263ms 0.6293ms 100%
String, Normalize ASCII 0.1854ms 0.0448ms 414%
Vector, Create Vector128 0.0798us 0.0852us 94%
Vector, Add 2 Vector128's 0.0954us 0.1017us 94%
Vector, Multiply 2 Vector128's 0.0965us 0.1026us 94%
Vector, Dot product int 0.0895us 0.0947us 95%
Vector, Dot product ulong 0.0758us 0.0831us 91%
Vector, Dot product float 0.1006us 0.1059us 95%
Vector, Dot product double 0.0814us 0.0870us 94%
Vector, Sum sbyte 0.1322us 0.1383us 96%
Vector, Sum short 0.1016us 0.1069us 95%
Vector, Sum uint 0.0786us 0.0852us 92%
Vector, Sum double 0.0805us 0.0851us 95%
Vector, Min float 0.1284us 0.1333us 96%
Vector, Max float 0.1299us 0.1349us 96%
Vector, Min double 0.1019us 0.1082us 94%
Vector, Max double 0.1038us 0.1095us 95%
Vector, Normalize float 0.4104us 0.4117us 100%

Looking into Normalize ASCII

@kg kg added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 25, 2023
@kg kg removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Mar 25, 2023
@kg kg merged commit 043d3e7 into dotnet:main Mar 29, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Apr 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly architecture
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants