JIT: Enable physical promotion by default #88090

jakobbotsch · 2023-06-27T12:48:12Z

Fix #6534
Fix #6707
Fix #7576
Fix #32415
Fix #58522
Fix #68797
Fix #71510
Fix #71565
Fix #76928

ghost · 2023-06-27T12:48:35Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author:	jakobbotsch
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

jakobbotsch · 2023-06-27T16:50:41Z

The failure is a test bug. #88097 has the fix.

jakobbotsch · 2023-06-27T22:36:10Z

/azp run runtime, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-coreclr outerloop, Fuzzlyn

azure-pipelines · 2023-06-27T22:36:57Z

Azure Pipelines successfully started running 5 pipeline(s).

jakobbotsch · 2023-06-28T20:02:12Z

cc @dotnet/jit-contrib PTAL @AndyAyersMS. The failure is #87934.

Diffs. Throughput (TP) impact ranges from 0.4% to 1.6%.

I analyzed the actual benchmark regressions using the perf lab reports and compiled a report of both regressions and improvements, including diffs. The report is viewable at https://github.com/jakobbotsch/perf-diff-finder. It is too big to be rendered by GitHub's markdown renderer, but you can use the online codespaces to view it in VS Code: press `.`, open physicalpromotion/regressions.md, and then run the Markdown: Open Preview command (hotkey Ctrl+Shift+V).

The regressions were identified by a Kusto query (thanks Andy) over the perf lab data. The query computes the median execution time of each benchmark over the past 7 days with and without physical promotion enabled. I then limited these to benchmarks taking more than 1 nanosecond that regressed by 3% or more. That returned a list of about 200 benchmarks. For each benchmark I used https://github.com/AndyAyersMS/InstructionsRetiredExplorer to find all hot functions (> 1% fraction of samples). This set was then further limited to the benchmarks that actually had physical promotions in a hot function.

This reduced the set to the 26 that can be viewed in the report, for which I went through and analyzed the causes and left notes and the perf lab graphs in the report. Many of these I still classified as noisy, but there are definitely a few actual regressions in there.
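The filtering steps described above can be sketched roughly as follows. This is only an illustration of the selection logic, not the actual Kusto query or the perf-diff-finder tool: the `Benchmark` and `HotFunction` types, the field names, and the choice to apply the 1 ns cutoff to the baseline median are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import median


@dataclass
class HotFunction:
    # Hypothetical record for one function seen in a benchmark's profile.
    name: str
    sample_fraction: float        # share of profile samples, 0.0 to 1.0
    has_physical_promotion: bool  # whether physical promotion kicked in


@dataclass
class Benchmark:
    # Hypothetical record for one benchmark's measurements over the past 7 days.
    name: str
    base_ns: list[float]          # physical promotion disabled
    promoted_ns: list[float]      # physical promotion enabled
    functions: list[HotFunction] = field(default_factory=list)


def regression_candidates(benchmarks, min_ns=1.0, min_regression=0.03,
                          hot_fraction=0.01):
    """Return names of benchmarks that pass all three filters."""
    selected = []
    for b in benchmarks:
        base = median(b.base_ns)
        promoted = median(b.promoted_ns)
        # Filter 1: skip benchmarks that take 1 ns or less (assumed: baseline).
        if base <= min_ns:
            continue
        # Filter 2: keep only regressions of 3% or more.
        if promoted / base - 1.0 < min_regression:
            continue
        # Filter 3: require a physical promotion in a hot function
        # (more than 1% of samples).
        if any(f.sample_fraction > hot_fraction and f.has_physical_promotion
               for f in b.functions):
            selected.append(b.name)
    return selected
```

With thresholds of `min_regression=0.10` the same function would model the stricter cutoff used for the improvements report, with the sign of the comparison flipped.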

I also ran my tool for all improvements (physicalpromotion/improvements.md), with two key differences:

- The threshold was raised to benchmarks that improve by 10% or more so that my tool could finish generating the report overnight; this yields about 400 benchmarks from the query. For benchmarks that improve by 3% or more the query returns around 1500 rows (presumably with a large number of false positives, but that is still around 8x the corresponding number on the regression side). The set was further reduced to the 121 included in the report in the same way as above.
- I did not go through and analyze these individually or attach the perf lab graphs to them. If you'd like to look at the perf lab graphs you can do so here (internal only; ping me if you don't have access) or here (public, but with no direct comparisons to standard perf lab runs).

AndyAyersMS · 2023-06-28T20:51:23Z

I assume you have also dug into some of the bigger diffs, eg x64 win asp.net's:

        1516 (76.26 % of base) : 96184.dasm - System.Reflection.MethodBase:CheckArguments(System.Span`1[System.Object],ulong,System.Span`1[ubyte],System.ReadOnlySpan`1[System.Object],System.RuntimeType[],System.Reflection.Binder,System.Globalization.CultureInfo,int):this (Tier1-OSR)

AndyAyersMS

Awesome! I am really excited to see this enabled.

Seems like some of the analysis tooling you have built up could be very useful elsewhere too.

jakobbotsch · 2023-06-28T21:05:32Z

> I assume you have also dug into some of the bigger diffs, eg x64 win asp.net's:

Generally the size costs come from the extra work in the prolog and in colder blocks to get things into the right registers. For partially promoted structs, block copies are also larger because they usually involve the full block copy that was there before plus writing/reading all the fields, and this frequently costs more in code size than the improvements save. We also create many new locals with live ranges that have a significant impact on LSRA, so in lots of cases different spill choices are made too.

In this particular case physical promotion unlocks loop cloning, so we end up cloning a large loop. Code size is much worse while perf score is a bit better:

- Total bytes of code 1988, prolog size 110, PerfScore 85670.97, instruction count 411, allocated bytes for code 1988 (MethodHash=945c7d3a) for method System.Reflection.MethodBase:CheckArguments(System.Span`1[System.Object],ulong,System.Span`1[ubyte],System.ReadOnlySpan`1[System.Object],System.RuntimeType[],System.Reflection.Binder,System.Globalization.CultureInfo,int):this (Tier1-OSR)
+ Total bytes of code 3504, prolog size 110, PerfScore 76167.30, instruction count 742, allocated bytes for code 3504 (MethodHash=945c7d3a) for method System.Reflection.MethodBase:CheckArguments(System.Span`1[System.Object],ulong,System.Span`1[ubyte],System.ReadOnlySpan`1[System.Object],System.RuntimeType[],System.Reflection.Binder,System.Globalization.CultureInfo,int):this (Tier1-OSR)
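As a quick check of how these two lines relate to the "1516 (76.26 % of base)" figure quoted earlier, the relative changes can be computed directly. The snippet below is just arithmetic on the numbers in the diff; the `relative_change` helper and the dictionary keys are names made up for the example, and it assumes a lower PerfScore means a lower estimated cost.

```python
def relative_change(base, diff):
    """Change from base to diff as a fraction of base (positive = larger)."""
    return (diff - base) / base


# Values copied from the two diff lines above.
base = {"code_bytes": 1988, "perf_score": 85670.97, "instructions": 411}
diff = {"code_bytes": 3504, "perf_score": 76167.30, "instructions": 742}

changes = {key: relative_change(base[key], diff[key]) for key in base}

for key, value in changes.items():
    print(f"{key}: {value:+.2%}")
# code_bytes: +76.26%   (3504 - 1988 = 1516 bytes)
# perf_score: -11.09%
# instructions: +80.54%
```

So the method grows by roughly three quarters in size while its estimated cost drops by about 11%, which matches the description of cloning a large loop.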

jakobbotsch · 2023-06-28T21:34:05Z

/azp run runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr gcstress-extra

azure-pipelines · 2023-06-28T21:34:24Z

Azure Pipelines successfully started running 2 pipeline(s).

jakobbotsch · 2023-06-29T09:12:59Z

> Seems like some of the analysis tooling you have built up could be very useful elsewhere too.

The source is available in that repo, but it is of course quite tailored to the analysis I was doing.

cincuranet · 2023-07-04T11:58:11Z

Possible improvements:

[Perf] Windows/x64: 48 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19475
[Perf] Windows/x64: 39 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19528
[Perf] Linux/x64: 36 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19437

Arm64:

[Perf] Windows/arm64: 51 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19583
[Perf] Linux/arm64: 49 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19587
[Perf] Windows/arm64: 18 Improvements on 6/29/2023 8:27:03 AM perf-autofiling-issues#19579
[Perf] Windows/arm64: 4 Improvements on 6/27/2023 11:39:34 PM perf-autofiling-issues#19577

JIT: Enable physical promotion by default

108b2ed

dotnet-issue-labeler bot added the `area-CodeGen-coreclr` label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 27, 2023

ghost assigned jakobbotsch Jun 27, 2023

jakobbotsch mentioned this pull request Jun 27, 2023

JIT: Generalized struct promotion #76928

Closed

40 tasks

build-analysis bot mentioned this pull request Jun 28, 2023

Test failure: Microsoft.Extensions.Logging.Generators.Tests.LoggerMessageGeneratorParserTests.NeedlessExceptionInMessage #87934

Closed

jakobbotsch marked this pull request as ready for review June 28, 2023 20:02

jakobbotsch requested a review from AndyAyersMS June 28, 2023 20:02

AndyAyersMS approved these changes Jun 28, 2023

BruceForstall approved these changes Jun 28, 2023

jakobbotsch merged commit 9dcc7b1 into dotnet:main Jun 29, 2023

jakobbotsch deleted the enable-physical-promotion branch June 29, 2023 05:25

jakobbotsch mentioned this pull request Jun 29, 2023

Use wider look-up table while searching chars on Arm64 #88183

Closed

cincuranet mentioned this pull request Jul 4, 2023

[Perf] Linux/x64: 11 Improvements on 6/30/2023 4:39:46 PM dotnet/perf-autofiling-issues#19440

Closed

This was referenced Jul 31, 2023

[Perf] Regressions in System.Collections.Tests.Perf_PriorityQueue<Int32, Int32> #76887

Closed

Regression in System.Text.Json.Tests.Perf_Get.GetDateTime #84228

Closed

jakobbotsch mentioned this pull request Aug 3, 2023

What's new in .NET 8 Preview 7 [WIP] dotnet/core#8438

Closed

3 tasks

ghost locked as resolved and limited conversation to collaborators Aug 5, 2023
