-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Enable physical promotion by default #88090
JIT: Enable physical promotion by default #88090
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
The failure is a test bug. #88097 has the fix. |
/azp run runtime, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-coreclr outerloop, Fuzzlyn |
Azure Pipelines successfully started running 5 pipeline(s). |
cc @dotnet/jit-contrib PTAL @AndyAyersMS. The failure is #87934. Diffs. TP impact ranges from 0.4% to 1.6%. I have analyzed the actual benchmark regressions using the perf lab reportings and created a report of them (both regressions and improvements) that includes diffs. The report is viewable at https://github.com/jakobbotsch/perf-diff-finder; it is too big to be rendered by GitHub's markdown renderer, but you can use the online codespaces to view it in VSCode. To do that press The regressions were identified by a Kusto query (thanks Andy) over the perf lab data. The query computes the median execution time of each benchmark over the past 7 days with and without physical promotion enabled. I then limited these to benchmarks taking more than 1 nanosecond that regressed by 3% or more. That returned a list of about 200 benchmarks. For each benchmark I used https://github.com/AndyAyersMS/InstructionsRetiredExplorer to find all hot functions (> 1% fraction of samples). This set was then further limited to the benchmarks that actually had physical promotions in a hot function. This reduced the set to the 26 that can be viewed in the report, for which I went through and analyzed the causes and left notes and the perf lab graphs in the report. Many of these I still classified as noisy, but there are definitely a few actual regressions in there. I also ran my tool for all improvements (
|
I assume you have also dug into some of the bigger diffs, eg x64 win asp.net's:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! I am really excited to see this enabled.
Seems like some of the analysis tooling you have built up could be very useful elsewhere too.
Generally the size costs come from there being more to do in the prolog/colder blocks getting things into the right registers. For partially promoted structs block copies are also larger in size because they usually involve the full block copy that was there before, plus also writing/reading all the fields, and this can frequently be more costly in terms of code size than the improvements. Of course we also create many new locals with live ranges that has significant impact on LSRA, so in lots of cases there are different spill choices made too. In this particular context physical promotion unlocks loop cloning, so we end up cloning a large loop, so code size is much worse while perf score is a bit better: - Total bytes of code 1988, prolog size 110, PerfScore 85670.97, instruction count 411, allocated bytes for code 1988 (MethodHash=945c7d3a) for method System.Reflection.MethodBase:CheckArguments(System.Span`1[System.Object],ulong,System.Span`1[ubyte],System.ReadOnlySpan`1[System.Object],System.RuntimeType[],System.Reflection.Binder,System.Globalization.CultureInfo,int):this (Tier1-OSR)
+ Total bytes of code 3504, prolog size 110, PerfScore 76167.30, instruction count 742, allocated bytes for code 3504 (MethodHash=945c7d3a) for method System.Reflection.MethodBase:CheckArguments(System.Span`1[System.Object],ulong,System.Span`1[ubyte],System.ReadOnlySpan`1[System.Object],System.RuntimeType[],System.Reflection.Binder,System.Globalization.CultureInfo,int):this (Tier1-OSR) |
/azp run runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr gcstress-extra |
Azure Pipelines successfully started running 2 pipeline(s). |
The source is available in that repo, but it is of course quite tailored to the analysis I was doing. |
See #76928.
Fix #6534
Fix #6707
Fix #7576
Fix #32415
Fix #58522
Fix #68797
Fix #71510
Fix #71565
Fix #76928