Conversation

amanasifkhalid
Contributor

Part of #107749. The first profile synthesis run happens before importation, so we don't model flow in and out of finally regions with flow edges yet. As a workaround, synthesis gives each finally region the same weight as its corresponding try region. When call-finally pairs are created, the tail inherits the weight of its head under the (faulty) assumption that all flow into a call-finally will return to the same pair. Once we have flow edges, we can compute the flow out of finally regions the same way as we compute flow elsewhere. It's important that synthesis models flow through finally regions via flow edges once we have them, or else flow through a loop that executes a finally might be lost, messing up the cyclic probability computation and flattening the loop's weight.

I noticed this issue after discovering that profile synthesis can disable profile consistency checking if it messes up the profile, under the assumption that incorrect IL can have nonsensical flow. In such cases, synthesis disables profile checks until the importer has run, after which the checks are re-enabled. This quirk makes sense only for the pre-importation run of synthesis; for later runs, the importer will never run again to re-enable the checks, so they can be quietly disabled indefinitely, hiding bugs in synthesis (such as its inability to handle finally regions).

I want to disable this quirk for post-importation runs of synthesis, but there's one more issue with synthesis I have to resolve first: I'm seeing instances where synthesis computes cyclic probabilities close to the cap, but not quite exceeding it. Thus, synthesis doesn't flag the profile as approximate, but consistency checks find that the flow exiting a loop exceeds the flow entering it. I'm not sure if lowering the likelihood cap is a sustainable or desirable fix for this -- perhaps some more sophisticated detection of approximate consistency would be better.

@Copilot Copilot AI review requested due to automatic review settings March 28, 2025 16:28
Contributor

@Copilot Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.

Files not reviewed (3)
  • src/coreclr/jit/fgprofile.cpp: Language not supported
  • src/coreclr/jit/fgprofilesynthesis.cpp: Language not supported
  • src/coreclr/jit/importer.cpp: Language not supported

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 28, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Member

What do we do for the likelihoods coming out of a finally? Assume each is equally likely?

It might be better to weight the likelihoods based on the weights of the associated callfinallies. E.g., if there are two callfinallies, A with weight 9 and B with weight 1, the edge from the retfinally to the continuation of A should have 0.9 likelihood.

@amanasifkhalid
Contributor Author

What do we do for the likelihoods coming out of a finally? Assume each is equally likely?

That seems to be the case:

newEdge->setLikelihood(1.0 / predCount);

Let me try your suggestion...

@amanasifkhalid
Contributor Author

/azp run runtime-coreclr libraries-pgo


Azure Pipelines successfully started running 1 pipeline(s).

@amanasifkhalid
Contributor Author

Aside from timeouts, I'm not seeing any failures in libraries-pgo.

@AndyAyersMS PTAL. Aside from libraries_tests, diffs aren't all that big; they seem to be mostly diffs in layout/LSRA. Thanks!

// If the block has other successors, distribute the removed edge's likelihood among the remaining successor edges.
if (succCount > 1)
{
const weight_t likelihoodIncrease = succEdge->getLikelihood() / (succCount - 1);
Member


Shouldn't these proportionally scale up?

Say there are 3 successors with likelihoods A 0.1, B 0.1, C 0.8. We remove A. Then we should have B = 0.1111..., C = 0.8888...

With these changes we'd get B = 0.15, C = 0.85, so the relative likelihood of B would increase.

Generally $p_{i,new} = p_{i,old} / (1 - p_{removed})$, unless $p_{removed}$ is 1.0 or close to 1.0, in which case equal distribution seems ok.

Contributor Author


Good point; fixed

}

// If the block has other successors, distribute the removed edge's likelihood among the remaining successor edges.
if (succCount > 1)
Member


Ditto here

@amanasifkhalid
Contributor Author

/azp run runtime-coreclr libraries-pgo


Azure Pipelines successfully started running 1 pipeline(s).

@amanasifkhalid
Contributor Author

@AndyAyersMS re: the profile consistency issue I mentioned yesterday, it does seem to be a floating-point precision issue when computing cyclic probabilities. Here's the smallest example I could find:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight      IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       100 [000..01E)-> BB17(0.00595),BB14(0.994)   ( cond )                     i IBC hascall newarr
BB14 [0016]  2       BB01,BB34           192.52  19252 [01E..025)-> BB34(0.046),BB15(0.954) ( cond )                     i IBC bwd
BB15 [0017]  1       BB14                183.67  18367 [025..???)-> BB19(0.00334),BB27(0.997)   ( cond )                     IBC internal
BB27 [0029]  1       BB15                183.06  18306 [???..???)-> BB19(0.00334),BB28(0.997)   ( cond )                     IBC internal
BB28 [0030]  1       BB27                182.45  18245 [???..???)-> BB19(0.00334),BB29(0.997)   ( cond )                     IBC internal
BB29 [0031]  1       BB28                181.99  18199 [???..???)-> BB19(0.00251),BB31(0.997)   ( cond )                     IBC internal idxlen
BB31 [0033]  1       BB29                181.53  18153 [???..???)-> BB19(0.00251),BB32(0.997)   ( cond )                     IBC internal idxlen
BB32 [0034]  1       BB31                181.08  18108 [???..???)-> BB19(0.00251),BB18(0.997)   ( cond )                     IBC internal idxlen
BB18 [0020]  2       BB08,BB32            3931. 393088 [025..042)-> BB04(0.242),BB05(0.25),BB06(0.309),BB07(0.189),BB08(0.01)[def] (switch)                     i IBC idxlen bwd
BB04 [0004]  1       BB18                951.34  95134 [044..061)-> BB08(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB05 [0005]  1       BB18                983.50  98350 [061..07E)-> BB08(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB06 [0006]  1       BB18                 1213. 121339 [07E..09A)-> BB08(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB07 [0007]  1       BB18                743.35  74335 [09A..0B4)-> BB08(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB08 [0008]  5       BB04,BB05,BB06,BB07,BB18  3931. 393088 [0B4..0BF)-> BB18(0.954),BB34(0.046) ( cond )                     i IBC bwd
BB19 [0021]  7       BB15,BB25,BB27,BB28,BB29,BB31,BB32  69.71   6971 [025..042)-> BB24(0.242),BB23(0.25),BB22(0.309),BB21(0.189),BB25(0.01)[def] (switch)                     i IBC idxlen bwd
BB21 [0023]  1       BB19                 13.18   1318 [09A..0B4)-> BB25(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB22 [0024]  1       BB19                 21.52   2152 [07E..09A)-> BB25(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB23 [0025]  1       BB19                 17.44   1744 [061..07E)-> BB25(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB24 [0026]  1       BB19                 16.87   1687 [044..061)-> BB25(1)                 (always)                     i IBC hascall gcsafe idxlen bwd
BB25 [0027]  5       BB19,BB21,BB22,BB23,BB24  69.71   6971 [0B4..0BF)-> BB19(0.954),BB34(0.046) ( cond )                     i IBC bwd
BB34 [0036]  3       BB08,BB14,BB25      192.67  19267 [0BF..0CC)-> BB14(0.994),BB17(0.00595)   ( cond )                     i IBC bwd
BB17 [0019]  2       BB01,BB34             1.15    115 [0CC..0D3)                           (return)                     i IBC
BB35 [0037]  0                             0         0 [???..???)                           (throw )                     i IBC rare keep internal
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

At some point, we gained some extra weight in the loop that BB34 exits, because the method exit weight (115) doesn't match the entry weight (100). Here's the cyclic probability computation for the outer loop:

ccp: BB14 :: 1.0 (header)
ccp: BB15 :: 0.95405
ccp: BB27 :: 0.9508592
ccp: BB28 :: 0.947679
ccp: BB29 :: 0.9453009
ccp: BB31 :: 0.9429287
ccp: BB32 :: 0.9405625
ccp: BB19 :: 0.3621144 (nested header)
ccp: BB21 :: 0.0684773
ccp: BB22 :: 0.1117778
ccp: BB23 :: 0.09060025
ccp: BB24 :: 0.08763792
ccp: BB25 :: 0.3621144
ccp: BB18 :: 20.41789 (nested header)
ccp: BB07 :: 3.861107
ccp: BB06 :: 6.302613
ccp: BB05 :: 5.108514
ccp: BB04 :: 4.941482
ccp: BB08 :: 20.41789
ccp: BB34 :: 1.000791

All three loops have one test block each, and there isn't any EH flow to contend with, so I expect the exit block BB34's weight to match the weight of the header BB14, but instead the former slightly exceeds the latter. This issue hits for only a few contexts in benchmarks.run_pgo, and each case involves loop cloning introducing conditions with edge likelihoods with a lot of significant figures.

I tweaked the entry/exit residual check to detect failed convergence for synthesis runs after importation (enabling this check beforehand causes diffs, as we'd blend likelihoods and resynthesize more often). This feels like a quirk. Another option is to mimic the method entry/exit consistency check for each loop body: if the loop doesn't have EH flow, we'd expect its entry flow to match its exit flow. But there's nothing stopping this imprecision from cropping up for loop bodies with EH, so neither option seems robust.

@AndyAyersMS
Member

Attach the full log (at least the synthesis related part) if you get a chance.

I wonder if we have some block whose likelihoods don't sum to 1.0, and this leads to the problem (though perhaps not -- you would think this would lead us to underestimate the flow out of a loop)?

In fgDebugCheckOutgoingProfileData we allow some tolerance here; I wonder what happens if we insist the outgoing likelihoods exactly sum to 1.0.

@amanasifkhalid
Contributor Author

Here's the JitDump:
dump.txt

In fgDebugCheckOutgoingProfileData we allow some tolerance here; I wonder what happens if we insist the outgoing likelihood exactly sum to 1.0.

Tightening this invariant didn't reveal anything.

@AndyAyersMS
Member

Right, we should see BB34 be exactly 1.0 here, not 1.000791, since there is a single loop exit. But the backedge drops below 1.0 so we don't notice.

I can't tell where things go wrong from the dump, perhaps because of rounding issues in the displayed values. BB18 is only reached via a chain of jumps, and BB19 has many preds, so likely one or both of those weights are slightly too high.

Member

@AndyAyersMS AndyAyersMS left a comment


Looks good for the most part.


// Recompute the likelihoods of the block's other successor edges.
const weight_t removedLikelihood = succEdge->getLikelihood();
for (unsigned i = 0; (removedLikelihood != 1.0) && (i < (succCount - 1)); i++)
Member


We still need to handle the case where removed likelihood is 1.0, don't we (probably rare)?

In that case we should just spread the likelihood equally.

Contributor Author


Right, fixed


// Recompute the likelihoods of the block's other successor edges.
const weight_t removedLikelihood = succEdge->getLikelihood();
for (unsigned i = 0; (removedLikelihood != 1.0) && (i < (succCount - 1)); i++)
Member


Ditto here.

// If we removed all of the flow out of 'block', distribute flow among the remaining edges evenly.
const weight_t currLikelihood = succTab[i]->getLikelihood();
const weight_t newLikelihood = currLikelihood / (1.0 - removedLikelihood);
const weight_t newLikelihood =
Member


If we 're removing all the likelihood then currLikelihood for each survivor will be 0.

This needs to be 1 / succCount if removedLikelihood == 1.0.

Contributor Author


Sorry I blanked on this -- fixed

Member

@AndyAyersMS AndyAyersMS left a comment


Still think this needs a bit more revision.

@amanasifkhalid
Contributor Author

/ba-g blocked by timeouts

@amanasifkhalid amanasifkhalid merged commit a79768e into dotnet:main Apr 3, 2025
106 of 110 checks passed
@amanasifkhalid amanasifkhalid deleted the fix-profile-synthesis branch April 3, 2025 14:38
@github-actions github-actions bot locked and limited conversation to collaborators May 4, 2025