Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: MethodImplOptions.Cold for marking cold methods for the JIT #84333

Open
MichalPetryka opened this issue Apr 5, 2023 · 13 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime.CompilerServices needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Milestone

Comments

@MichalPetryka
Copy link
Contributor

Background and motivation

Having methods that always perform expensive operations like for example growing methods in growable collections is common in high performance code like the BCL. It'd be useful to be able to mark those with a flag being the reverse of AggressiveInlining which would tell the JIT to always treat blocks that end up with a call to it as cold, heavily pessimize inlining it unless a method always ends up calling it and prefer hoisting branches with it to the end of the generated codegen, similar to what Throw Helpers currently do.

I've thought about this while working on #82146 and noticing the JIT placing the call to Grow which is expected to be rare and expensive above a singular mov which is expected to happen in 90% of cases.

I'm not really sure what the exact name of the flag should be, my ideas include: Cold, Unlikely, Rare, Expensive.

cc @tannergooding @EgorBo @AndyAyersMS

API Proposal

namespace System.Runtime.CompilerServices;

public enum MethodImplOptions
{
    Cold = 1024
}

API Usage

There are quite a few places in the BCL that have "always cold" methods that should be marked with it, for example List.Grow that always ends up allocating a new array and copying data to it:

/// <summary>
/// Increase the capacity of this list to at least the specified <paramref name="capacity"/>.
/// </summary>
/// <param name="capacity">The minimum capacity to ensure.</param>
internal void Grow(int capacity)
{
Debug.Assert(_items.Length < capacity);
int newCapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;

Alternative Designs

Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly, or using more flexible method intrinsics from #4966 that are however harder to implement and require the library user, not author to mark the code with those.

Risks

Partial overlap with Assume intrinsics and PGO, more work for the JIT to do.

@MichalPetryka MichalPetryka added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Apr 5, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 5, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Apr 5, 2023
@ghost
Copy link

ghost commented Apr 5, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Having methods that always perform expensive operations like for example growing methods in growable collections is common in high performance code like the BCL. It'd be useful to be able to mark those with a flag being the reverse of AggressiveInlining which would tell the JIT to always treat blocks that end up with a call to it as cold, heavily pessimize inlining it unless a method always ends up calling it and prefer hoisting branches with it to the end of the generated codegen, similar to what Throw Helpers currently do.

I've thought about this while working on #82146 and noticing the JIT placing the call to Grow which is expected to be rare and expensive above a singular mov which is expected to happen in 90% of cases.

I'm not really sure what the exact name of the flag should be, my ideas include: Cold, Unlikely, Rare, Expensive.

cc @tannergooding @EgorBo @AndyAyersMS

API Proposal

namespace System.Runtime.CompilerServices;

public enum MethodImplOptions
{
    Cold = 1024
}

API Usage

There are quite a few places in the BCL that have "always cold" methods that should be marked with it, for example List.Grow that always ends up allocating a new array and copying data to it:

/// <summary>
/// Increase the capacity of this list to at least the specified <paramref name="capacity"/>.
/// </summary>
/// <param name="capacity">The minimum capacity to ensure.</param>
internal void Grow(int capacity)
{
Debug.Assert(_items.Length < capacity);
int newCapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;

Alternative Designs

Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly, or using more flexible method intrinsics from #4966 that are however harder to implement and require the library user, not author to mark the code with those.

Risks

Partial overlap with Assume intrinsics and PGO, more work for the JIT to do.

Author: MichalPetryka
Assignees: -
Labels:

api-suggestion, area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo added area-System.Runtime.CompilerServices and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Apr 5, 2023
@ghost
Copy link

ghost commented Apr 5, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Having methods that always perform expensive operations like for example growing methods in growable collections is common in high performance code like the BCL. It'd be useful to be able to mark those with a flag being the reverse of AggressiveInlining which would tell the JIT to always treat blocks that end up with a call to it as cold, heavily pessimize inlining it unless a method always ends up calling it and prefer hoisting branches with it to the end of the generated codegen, similar to what Throw Helpers currently do.

I've thought about this while working on #82146 and noticing the JIT placing the call to Grow which is expected to be rare and expensive above a singular mov which is expected to happen in 90% of cases.

I'm not really sure what the exact name of the flag should be, my ideas include: Cold, Unlikely, Rare, Expensive.

cc @tannergooding @EgorBo @AndyAyersMS

API Proposal

namespace System.Runtime.CompilerServices;

public enum MethodImplOptions
{
    Cold = 1024
}

API Usage

There are quite a few places in the BCL that have "always cold" methods that should be marked with it, for example List.Grow that always ends up allocating a new array and copying data to it:

/// <summary>
/// Increase the capacity of this list to at least the specified <paramref name="capacity"/>.
/// </summary>
/// <param name="capacity">The minimum capacity to ensure.</param>
internal void Grow(int capacity)
{
Debug.Assert(_items.Length < capacity);
int newCapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;

Alternative Designs

Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly, or using more flexible method intrinsics from #4966 that are however harder to implement and require the library user, not author to mark the code with those.

Risks

Partial overlap with Assume intrinsics and PGO, more work for the JIT to do.

Author: MichalPetryka
Assignees: -
Labels:

api-suggestion, area-System.Runtime.CompilerServices, untriaged

Milestone: -

@jkotas
Copy link
Member

jkotas commented Apr 5, 2023

Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly

At scale, PGO is going to be more reliable and cheaper than annotating methods manually. Also, PGO is AOT friendly and does not need to slow down the startup when the data is collected statically ahead of time.

@MichalPetryka
Copy link
Contributor Author

PGO can also say that a method is profitable to be inlined, yet AggressiveInlining is still being used with it. even in runtime code.

Even if static PGO can be collected for AOT apps, I doubt most apps will do so, afaik most native projects do not rely on PGO.

@EgorBo
Copy link
Member

EgorBo commented Apr 5, 2023

PGO can also say that a method is profitable to be inlined, yet AggressiveInlining is still being used with it. even in runtime code.

Even if static PGO can be collected for AOT apps, I doubt most apps will do so, afaik most native projects do not rely on PGO.

From my understanding your attribute should just sligtly improve code layout by moving call method to a cold section while AggressiveInlining can be a perf game changer. In an ideal world we won't need it too (someday).

@AndyAyersMS
Copy link
Member

I've thought about this while working on #82146 and noticing the JIT placing the call to Grow which is expected to be rare and expensive above a singular mov which is expected to happen in 90% of cases.

As I mentioned offline, if you merge something like this into the BCL, wait a few days for our profiling process to update the data that feeds into StandardOptimization.mibc and then take a fresh look at the codegen, it's quite likely our PGO process will have noticed calls to this method are cold.

See eg #49520 (comment) where we had similar discussions.

This process should work equally well for the library code shared by Native AOT, though I am not sure if the profile data gets fed into that toolchain currently.

To see similar effects directly during BCL development you can run a suitable benchmark with TieredPGO enabled and inspect the Tier1 codegen.

@MichalPetryka
Copy link
Contributor Author

From my understanding your attribute should just sligtly improve code layout by moving call method to a cold section while AggressiveInlining can be a perf game changer. In an ideal world we won't need it too (someday).

You could say the same about throw helpers, reordering them manually is always an option, yet the JIT handles them as cold separately.

@tannergooding
Copy link
Member

At scale, PGO is going to be more reliable and cheaper than annotating methods manually. Also, PGO is AOT friendly and does not need to slow down the startup when the data is collected statically ahead of time.

While I agree PGO should be the main choice for most scenarios, there are always going to be scenarios where it is not applicable or where ensuring it gets the "right" data is significant effort. Static PGO is entirely dependent on the workloads you process as part of collection. Additionally, while we may someday achieve better parity with the extended optimizations native compilers may provide, I find it extremely unlikely that we will surpass them such that developer provided hints are truly unnecessary.


.NET has multiple scenarios to consider, including both JIT and AOT, as well as a range of platforms (Windows, MacOS, Linux, Android, iOS, etc). Not all of our features (such as Dynamic PGO) are available everywhere, nor do we ourselves use some of our functionality everywhere (e.g. Static PGO is still off for MacOS by default, as far as I recall).

Native compilers, despite having significantly more time to spend doing intraprocedural analysis and optimizations in general still provide multiple types of these "guided optimizations". That is, they provide both a type of static PGO and a type of code hints provided via attributes, custom pragma/keywords, or more recently official language features covering features that have long existed across multiple supporting compilers implementations.

While most native code bases do not use these features, perf sensitive code bases and hot paths still do. Some much more extensively than others, particularly where the perf is extremely sensitive. With the introduction of official language features around these attributes, code bases are more likely to adopt them in the future as well. This has been seen repeatedly over the years with other features that have moved from being compiler specific to official features.
-- Microsoft's own mimalloc is one example that uses such hints extensively and where turning them off has a huge impact on the end benchmarks. For reference, mimalloc is a native memory allocator, which is used by Bing, Azure, Unreal Engine, and several other extremely perf sensitive applications.

Allowing developers to provide basic hints around "hot" (likely) vs "cold" (unlikely) is a natural next step for .NET and meshes nicely with the overall directions we've been making. The two biggest issues would be:

  1. How do we ensure this is "pay for play", particularly as the number of free bits for the specially encoded MethodImpl attribute is reduced
  2. How does such "developer provided PGO" interact with "static PGO" and "dynamic PGO"

For 1, I think the solution is relatively simple as we have existing prior art. We should simply use one of the remaining free bits to indicate that an "extended attribute" exists and should be resolved. This works much like how DllImport was upgrade to work with UnmanagedCallConvAttribute.

For 2, I think we just need to make a decision. In general, I would presume that the intent is "developer hints" take precedence over "static PGO" but that they are less preferred to "dynamic PGO". That is DPGO > HINT > SPGO. The exception would be if the developer wants to disable Dynamic PGO which is another feature we don't have a flag for which is likely desirable long term.

@jkotas
Copy link
Member

jkotas commented Apr 5, 2023

How do we ensure this is "pay for play", particularly as the number of free bits for the specially encoded MethodImpl attribute is reduced

The manual likely/unlikely annotations via method calls are more powerful. If we were to do something here, should we start with those? We do not need multiple mechanisms to do the same thing.

@tannergooding
Copy link
Member

If we were to do something here, should we start with those?

I think that would be reasonable and cover the same general need.

@MichalPetryka
Copy link
Contributor Author

The manual likely/unlikely annotations via method calls are more powerful. If we were to do something here, should we start with those? We do not need multiple mechanisms to do the same thing.

The thing is that a flag like this is slightly different:

  • it's declared by the callee, not caller, which reduces code polution and allows usage in library code without interaction from the user
  • it has a slightly different intent, it doesn't fully say something is unlikely, it just says that optimizing for this case won't matter in practice and it's better to optimize other case
  • a flag like this is simpler to specify exactly and implement afaik
  • method intrinsics also leave a bit of uncertainty, for example while they say something is unlikely, they don't really say something is expensive enough for the jit to for example hoist to the end of the method or for example to separate the branch off via hot-cold splitting.

@jkotas
Copy link
Member

jkotas commented Apr 5, 2023

it's declared by the callee, not caller, which reduces code polution and allows usage in library code without interaction from the user

The annotation via a method call can be the very first thing in a method to achieve the same effect.

it has a slightly different intent, it doesn't fully say something is unlikely, it just says that optimizing for this case won't matter in practice and it's better to optimize other case

Yes, there are multiple different variants of cold as you have pointed out. Almost never executed (e.g. throw helper), executed typically once (e.g. static constructor or one time initialization), executed relatively less often (your CollectionsMarshal.SetCount example). The optimization strategies are different for each of these cases. We want to be able to differentiate between these.

We do not want to have flags like this with overloaded meaning. We have overloaded flag like that today (MethodImplOptions.AggressiveOptimization) and the overloaded meaning makes it unusable for most practical purposes.

a flag like this is simpler to specify exactly and implement afaik

It is not that simple once you address the problem with shortage of MethodImplOptions bits that @tannergooding pointed out.

@tannergooding
Copy link
Member

it's declared by the callee, not caller, which reduces code polution and allows usage in library code without interaction from the user

Yes, this is potentially a useful distinction. However, it can be problematic particularly for public API surface in that often whether something is "likely" or "unlikely" is per caller. It's often only truly declarable by the callee in a few cases such as throw helpers or internal methods with a single callsite. In both of these cases, Unsafe.Likely and Unsafe.Unlikely would fill the same general need without much additional maintenance overhead on the user.

it doesn't fully say something is unlikely

Saying something is cold is saying it is unlikely to be repeatedly executed and therefore typically won't show up in the flamegraph of a profile. There are indeed different kinds of cold here from "never expected" to "expected once" to "expected rarely" and that may be worth considering.

method intrinsics also leave a bit of uncertainty, for example while they say something is unlikely, they don't really say something is expensive enough for the jit to for example hoist to the end of the method or for example to separate the branch off via hot-cold splitting.

They aren't that uncertain. They come with whatever semantics/heuristics we want to give them, much as they do in C/C++. Marking a block as unlikely (cold) or likely (hot), allows it to participate in the exact same things that PGO already does. The method attr would do the same thing, the difference being that Unsafe.Unlikely marks the block entrance where-as the metrhod attr would mark the containing block.

There are other complications with the method attr as well, such as determining what happens if the block contains more than just one cold call or if we wanted to expand to also allow annotating hot calls, etc.

@tannergooding tannergooding added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Jun 24, 2024
@stephentoub stephentoub added this to the Future milestone Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime.CompilerServices needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Projects
None yet
Development

No branches or pull requests

6 participants