Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Public PGO config switch #70410

Closed
EgorBo opened this issue Jun 7, 2022 · 10 comments
Closed

Proposal: Public PGO config switch #70410

EgorBo opened this issue Jun 7, 2022 · 10 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Jun 7, 2022

Background and motivation

We want users to be able to easily enable the Dynamic PGO feature in .NET 7.0. Currently it's not as convenient as it could be: users have to define a set of environment variables and make sure those are defined in the process of app's execution (it's especially complicated for asp.net scenarios) - I have a small write-up here. We define three modes for PGO:

  1. Default - Current Default
  2. Full - All R2R images are ignored, everything is compiled and profiled from scratch. Noticeable startup regression with the most performant steady state.
  3. Dynamic or e.g. Hybrid or Mixed - JIT only instruments functions without R2R data. (NOTE: it doesn't mean such functions won't benefit from PGO - e.g. lots of BCL's code contains a "generic" profile baked into R2R which is then used by the tiered JIT for tier1) - a balance between start up time and performance. Eventually, we hope that this mode will be the default one.

A good visualization of the difference between these modes is this TE benchmark:
image
Full significantly regresses "time to first request" but provides the best performance. Dynamic still slightly regresses "time to first request" but eventually we plan to reduce overhead from instrumentation in tier0 and minimize it.

Proposal & Usage

Add a runtime switch for .csproj:

<PgoMode>full</PgoMode>

and runtimeconfig.json:

"System.Runtime.PgoMode": dynamic

(in dotnet/sdk project)

Where full basically sets DOTNET_ReadyToRun=false and DOTNET_TieredPGO=true internally and dynamic only defines DOTNET_TieredPGO=true (OSR and QJFL are already enabled by default in .NET 7.0 and omitted here)

Alternative Designs

  1. <OptimizeFor>
    Alternatively, we can encapsulate all of that into a generic <OptimizeFor> property with "FastStartup", "MaxPerformance", "SmallWorkingSet" values and rely on them in JIT/VM/BCL in various places, not just for PGO

  2. <DynamicPgo>true</DynamicPgo> and <IgnoreR2R>true</IgnoreR2R>
    Define two simple bool properties, where <IgnoreR2R>true</IgnoreR2R> is added only when FullPgo mode is requested.

cc @richlander @terrajobst @AndyAyersMS @davidwrighton @dotnet/jit-contrib

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 7, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 7, 2022
@ghost
Copy link

ghost commented Jun 7, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

We want users to be able to easily turn on the Dynamic PGO feature in .NET 7.0. Currently it's not as convenient as it could be: users have to define a set of environment variables and make sure those are defined in the process of app's execution - I have a small write-up here. We define three modes for PGO:

  1. Default - Current Default
  2. Full - All R2R images are ignored, everything is compiled and profiled from scratch. Noticeable startup regression with the most performant steady state.
  3. Dynamic or e.g. Hybrid or Mixed - We only instrument functions without R2R data. (NOTE: it doesn't mean such functions won't benefit from PGO - e.g. lots of BCL's code contains a "generic" profile baked into R2R which is then used by the tiered JIT for tier1) - a balance between start up time and performance. Eventually, we hope that this mode will be the default one as it balances between start up time and the steady state's performance.

A good visualization of the difference between these modes is this TE benchmark:
image
Full significantly regresses "time to first request" but provides the best performance. Dynamic still slightly regresses "time to first request" but eventually we plan to reduce overhead from instrumentation in tier0 and minimize it.

Proposal & Usage

Add a runtime switch for .csproj:

<PgoMode>full</PgoMode>

and runtimeconfig.json:

"System.Runtime.PgoMode": dynamic

(in dotnet/sdk project)

Where full basically sets DOTNET_ReadyToRun=false and DOTNET_TieredPGO=true internally and dynamic only defines DOTNET_TieredPGO=true (OSR and QJFL are already enabled by default in .NET 7.0 and omitted here)

Alternative Designs

  1. <OptimizeFor>
    Alternatively, we can encapsulate all of that into a generic <OptimizeFor> property with "FastStartup", "MaxPerformance", "SmallWorkingSet" values and rely on them in JIT/VM/BCL in various places, not just for PGO

  2. <DynamicPgo>true</DynamicPgo> and <IgnoreR2R>true</IgnoreR2R>
    Define two simple bool properties, where <IgnoreR2R>true</IgnoreR2R> is added only when FullPgo mode is requested.

cc @richlander @terrajobst @AndyAyersMS @davidwrighton @dotnet/jit-contrib

Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@richlander
Copy link
Member

I sort of wish there were two dynamic PGO modes:

  • Conservative: only re-compile the methods that we know will be profitable, aiming to conserve as much CPU on jitting as possible.
  • Go wild: Get as close to "Full" as possible but don't actually reject R2R first.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 8, 2022

Go wild: Get as close to "Full" as possible but don't actually reject R2R first.

Technically it can be achieved if we instrument R2R'd code, but it's likely a huge binary size penalty (at least now)

@jkotas
Copy link
Member

jkotas commented Jun 8, 2022

Right, we would not want to instrument R2R code. You can instrument the hot methods once the process starts up, like:

  1. Use R2R code for process startup
  2. Once the process startups up, switch to instrumented code for hot methods
  3. Once you collect enough PGO data, create optimized JITed version

@grbell-ms
Copy link
Member

The other side of the coin is better visibility into what form of PGO is enabled:
#69820

@JulieLeeMSFT
Copy link
Member

@EgorBo are you proposing this change for .NET 7 or after .NET 7?

@EgorBo
Copy link
Member Author

EgorBo commented Jun 8, 2022

@EgorBo are you proposing this change for .NET 7 or after .NET 7?

For .NET 7.0. Ideally with Jan's idea eventually we won't need any switches and the default mode will be good as is (both "full" level of perf and R2R-level of startup) but we're not there and not sure it will make it to even 8.0. Currently some of our customers struggle enabling PGO properly to try it (even 1st parties) especially for aspnet so it'd be nice to provide them with a simple csproj/json key switch

@EgorBo EgorBo added this to the 7.0.0 milestone Jun 8, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jun 8, 2022
@AndyAyersMS
Copy link
Member

AndyAyersMS commented Jun 10, 2022

Seems likely that if we want to have R2R -> instrumented -> optimized we'll need to do a bunch of work to improve the efficiency of the instrumented code. Otherwise, apps will see perf start out reasonably good (R2R, especially composite), then dip -- perhaps substantially -- for a while (instrumented), then improve (optimized).

We've been trying to avoid modes where the performance can have dips like this.

For example, we will likely need to fully support code that is both instrumented and optimized. Some relevant issues:

@EgorBo
Copy link
Member Author

EgorBo commented Jun 29, 2022

I went ahead and filed PRs #71438 and dotnet/sdk#26350

In #70941 we realized that we can only expose a boolean property TieredPGO (it matches env.var name) which will provide FullPGO-level of perf and DynamicPGO-level of start up time so we won't be confusing customers on what to choose.

@richlander
Copy link
Member

Seems legit. Having a property (MSBuild and runtime config) seems like a good plan.

@EgorBo EgorBo closed this as completed Jul 18, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Aug 17, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

6 participants