JIT: use synthesis to repair some reconstruction issues #84312

AndyAyersMS · 2023-04-04T16:43:40Z

In particular, run synthesis in repair mode for cases where there are profile counts within the method but zero counts in `fgFirstBB`.

Recall that sparse profiling effectively probes return blocks to determine the method entry count.

So the zero-entry but not zero-everywhere case can happen if we have a method with a very long-running loop plus sparse profiling plus OSR -- we will only get profile counts from the instrumented Tier0 method, and it will never return (instead it will always escape to an OSR version which will eventually return, but that version won't be instrumented).
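To make the failure mode concrete, here is a minimal sketch in Python rather than the JIT's C++. It assumes a simplified model in which the entry count is just the sum of the return-block counts. `Block`, `infer_entry_weight`, and `needs_entry_repair` are hypothetical names for illustration, not JIT APIs, and `blocks[0]` stands in for `fgFirstBB`.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    weight: float = 0.0      # reconstructed profile count for this block
    is_return: bool = False  # True if the block returns from the method


def infer_entry_weight(blocks):
    # Simplified model (assumption): every call that enters the method
    # leaves through a return block, so entry count = sum of return counts.
    return sum(b.weight for b in blocks if b.is_return)


def needs_entry_repair(blocks):
    # blocks[0] plays the role of fgFirstBB. Repair is wanted only when the
    # entry count is zero but some other block did record counts.
    entry, rest = blocks[0], blocks[1:]
    return entry.weight == 0 and any(b.weight > 0 for b in rest)


# Instrumented Tier0 instance: the loop runs many times, then control
# escapes to the (uninstrumented) OSR version, so the return never executes.
blocks = [
    Block("entry"),
    Block("loop_body", weight=30000.0),
    Block("exit", is_return=True),
]
blocks[0].weight = infer_entry_weight(blocks)
print(needs_entry_repair(blocks))
```

A method with zero counts everywhere does not trigger the check, which matches the narrow scope described above.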

I originally was a bit more ambitious and ran repair for a broader set of reconstruction issues, but that led to a large number of diffs, in part because repair doesn't cope well with irreducible loops.

Leaving the entry count zero can have a fairly disastrous impact on the quality of optimizations done in the method.

Addresses quite a few of the worst-performing benchmarks in #84264.


ghost · 2023-04-04T16:43:53Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	AndyAyersMS
Assignees:	AndyAyersMS
Labels:	`area-CodeGen-coreclr`
Milestone:	-

AndyAyersMS · 2023-04-04T16:44:31Z

Handful of diffs expected. If we had a PGO benchmarks collection we'd see quite a few more.

@EgorBo PTAL
cc @dotnet/jit-contrib

EgorBo

Two questions:

1. Is it possible today to detect stale static PGO data and should we still use it or discard completely?
2. Should we enable Dynamic PGO for coreclr_tests.run collection?

AndyAyersMS · 2023-04-04T22:28:50Z

> Two questions:
>
> Is it possible today to detect stale static PGO data and should we still use it or discard completely?

If the flowgraph for the method is quite different (that is, if we fail to find an edge in the flowgraph based on IL offsets from schema entries) we will hit the Mismatch cases and throw out all the data. We only expect this to happen with static PGO data but currently we don't assert that this must be so.
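As a rough illustration of that all-or-nothing behavior, here is a sketch in Python rather than the JIT's C++. It assumes edges and schema entries are both keyed by a (source IL offset, target IL offset) pair. `apply_edge_counts` is a hypothetical helper, not the JIT's actual routine, and the real matching involves more detail than this.

```python
def apply_edge_counts(flow_edges, schema_entries):
    """Return the usable edge counts, or None if the data must be discarded.

    flow_edges:     set of (src_il_offset, dst_il_offset) pairs in the flowgraph
    schema_entries: dict mapping the same kind of pair to a profile count
    (both shapes are assumptions made for this sketch)
    """
    for key in schema_entries:
        if key not in flow_edges:
            # Mismatch: a schema entry has no corresponding flowgraph edge,
            # so all of the data is thrown out, not just this entry.
            return None
    return dict(schema_entries)


edges = {(0, 10), (10, 10), (10, 40)}

# Data collected against the same IL: every entry finds its edge.
print(apply_edge_counts(edges, {(10, 10): 500, (10, 40): 1}))

# Stale data: an edit shifted the IL offsets, so the lookup fails.
print(apply_edge_counts(edges, {(12, 12): 500, (12, 42): 1}))
```

The second call shows how a harmless edit that only shifts offsets is enough to lose otherwise usable counts.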

This can mean that some trivial/harmless edits to methods will lead to us tossing usable data. It is possible (though not easy) to build approximate matching algorithms that try to recognize when the graphs have the same shape but not the same identifying marks and use that to propagate the stale data. And I suppose we could do something similar for class profiles if their IL offsets shift a bit. But I'm not sure it's worth the trouble.

> Should we enable Dynamic PGO for coreclr_tests.run collection?

Yes, we need to enable more PGO driven collections. Benchmarks would be good to have too.

dotnet-issue-labeler bot added the `area-CodeGen-coreclr` label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Apr 4, 2023

ghost assigned AndyAyersMS Apr 4, 2023

AndyAyersMS requested a review from EgorBo April 4, 2023 16:44

EgorBo approved these changes Apr 4, 2023

AndyAyersMS merged commit 4b5491e into dotnet:main Apr 4, 2023

AndyAyersMS mentioned this pull request Apr 6, 2023

Investigate microbenchmarks that regress with PGO enabled #84264

Closed

ghost locked as resolved and limited conversation to collaborators May 5, 2023
