Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal API suggestion: bool JitHelpers.IsKnownConstant(uint) to let us optimize certain code paths #11484

Closed
GrabYourPitchforks opened this issue Nov 15, 2018 · 12 comments · Fixed by #63734
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI design-discussion Ongoing discussion about design without consensus optimization
Milestone

Comments

@GrabYourPitchforks
Copy link
Member

There are quite a few places in the Framework where we have specialized knowledge of data structures and algorithms - beyond what the JIT has - where we'd be able to make optimizations beyond what the JIT can provide if the JIT is able to hint to us information about locals.

For example, consider string.StartsWith(char) : bool, which is currently implemented as below.

public bool StartsWith(char value) => Length != 0 && _firstChar == value;

If the JIT were able to hint to us information about the local, we could write this as follows, which would allow us to elide the first branch entirely in almost all cases, taking advantage of the fact that empty strings have the null-terminator in the _firstChar position.

public bool StartsWith(char value) {
  if (JitHelpers.IsKnownConstant(value) && value != 0) {
    return _firstChar == value;
  } else {
    return Length != 0 && _firstChar == value;
  }
}

It would also allow us to further optimize the Span.Slice(int start, int length) logic we checked in earlier, which allows us to do something akin to the below.

if (JitHelpers.IsKnownConstant(start)) {
  if (start == 0) {
    // case Slice(0, <something>)
    ThrowIf((uint)length > (uint)_realLength);
    return <slice>;
  } else if (JitHelpers.IsKnownConstant(length) && start >= 0 && length >= 0) {
    // case Slice(10, 20)
    ThrowIf((uint)(start + length) > (uint)(_realLength));
    return <slice>;
  }
}

ThrowIf(normal_validation_checks);
return <slice>;

This could also have potential applications for methods which are implemented via a giant switch statement which normally doesn't get inlined (see String.Equals(..., StringComparison) overload). We could choose to implement the method as an inlineable method that does a very fast dispatch if the comparison parameter is known constant, falling back to a slower non-inlined method if the comparison parameter needs to be evaluated at runtime.

The behavior of IsKnownConstant would be such that the JIT would replace it with true if the value is known at JIT time to be constant (perhaps because it is itself a literal or has been calculated from some other const-folding). It would be replaced with false in all other cases, including if the JIT simply doesn't have enough information to make a determination, or if optimizations have been disabled. I'm also assuming branch pruning would kick in to completely elide code that is on the incorrect side of the IsKnownConstant check.

category:proposal
theme:optimization
skill-level:expert
cost:extra-large

@fiigii
Copy link
Contributor

fiigii commented Nov 16, 2018

Thanks for opening this issue. This is a really interesting idea, but I have some concerns about its usefulness under the current JIT design. First of all, when should JIT replace the call-site of JitHelpers.IsKnownConstant to a bool value?

  1. The current JIT intrinsic system tries to expand intrinsic call-site at very front-end, which even before inlining. So, that will cause problems for applications like above bool StartsWith(char value). In this example, value is a function argument that won't be a constant unless StartsWith gets inlined to the call-site. So, JitHelpers.IsKnownConstant has to be expanded after inlining to get the mentioned optimization.
  2. Actually, inlining is not the only problem. In hardware intrinsic, we are suffering the JIT constant problem from unexpected arguments https://github.com/dotnet/coreclr/issues/17108. For example, JitHelpers.IsKnownConstant(length), at the front-end of RyuJIT length may not be treated as a constant even if the value of length from a literal. Some operations over constant literal, like "cast", cannot be eliminated by Roslyn.
  3. If we want to find the JIT constants as many as possible, JIT would need to expand IsKnownConstant later. But how late? If behind most of the optimization phases, the bool constant would not be able to enable other optimizations that we actually want.

@GSPP
Copy link

GSPP commented Nov 16, 2018

I find this to be a very clever idea! I have never before seen the idea of communicating information from the compiler to the code. Normally, the code communicates assumptions to the compiler.

@AndyAyersMS
Copy link
Member

I think most of what you want here exists already in the jit, though the parts don't always work as well as they could. The importer does aggressive branch pruning and the inliner does forward substitution of constant arguments, so for the most part available constants will show up early, and if not, should show up later on during the optimization phases.

The one bit that is not as powerful as perhaps you'd like is the modelling of the impact of constant args on larger method bodies in the inline heuristics. There is some support, but the recognition is simplistic and the benefit model is pretty crude. I have experimented in the past with more thoughtful heuristics but was held back in part by the desire not to slow down the jit. With the advent of tiering one could perhaps argue that making the Tier1 jit a bit slower is acceptable.

Alternatively, we could look at offloading parameter impact analysis to an AOT tool. Various folks have been kicking around these ideas recently (cc @morganbr @sergiy-k). This would hint to the jit that aggressive inlining based on constant args would pay off; it would still be up to the jit to actually inline and simplify.

So I'm curious: what happens today when you use coding patterns like the above? Maybe we can collect examples here so they're all in one place and get a better picture of where things fall apart.

Addressing the related issues in the jit would provide a wider benefit than introducing a new specialized helper.

@AndyAyersMS
Copy link
Member

cc @dotnet/jit-contrib

@GrabYourPitchforks
Copy link
Member Author

I agree that addressing issues JIT-wide would probably have the greatest overall impact, since it could benefit almost all code that follows the conventions you decide are optimization candidates. The switch statement scenario is a prime candidate - developers shouldn't have to write extra code "please optimize me!" if the JIT can make things a bit nicer on their behalf automatically.

The scenarios where I think this doesn't quite work are scenarios where the JIT would need special knowledge of how a type behaves in order to provide the maximum possible optimizations. The String.StartsWith(char) example I think is a good one, as we presumably wouldn't want to build into the JIT knowledge of the fact that all strings (even empty strings!) are null-terminated. This will become more relevant with Utf8Strings.StartsWith(Rune), as we'll want to ensure that only the correct branch (of a larger method) gets emitted into the code gen.

@tannergooding
Copy link
Member

tannergooding commented Jan 9, 2019

Another case where this would be useful is in the following code:

public T GetElement(int index)
{
    ThrowIfUnsupportedType();

    if ((uint)(index) >= (uint)(Count))
    {
        throw new ArgumentOutOfRangeException(nameof(index));
    }

    ref T e0 = ref Unsafe.As<Vector128<T>, T>(ref Unsafe.AsRef(in this));
    return Unsafe.Add(ref e0, index);
}

If the index is a constant value, it is much more efficient for us to emit a shuffle or extract instruction. However, when it is non-constant, it is generally more efficient to do a direct read from memory (than to emit the jump table fallback which also has the overhead of a method call, rather than being directly inlineable).

Edit: This can, of course, be handled as an intrinsic directly in the runtime (and is probably what will get done), but an IsConstant method would allow such an optimization without having to add special runtime support for this method.

@AndyAyersMS
Copy link
Member

Still on the fence about the value of this -- labelling as "future"

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@BruceForstall BruceForstall added design-discussion Ongoing discussion about design without consensus and removed JitUntriaged CLR JIT issues needing additional triage labels Nov 25, 2020
@GrabYourPitchforks
Copy link
Member Author

Per Tanner's comment, I wonder if adding a bunch of [Intrinsic] methods would be the way to go here. It's not the cleanest long-term solution, but at least it could allow us to prove / disprove the value of this feature.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 13, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 19, 2022
@stephentoub
Copy link
Member

stephentoub commented Jan 19, 2022

Is there a reason we wouldn't want to propose this or a variant of it as being public?

And are there other places we should ourselves be using it internally now that we have it internally?

@EgorBo
Copy link
Member

EgorBo commented Jan 19, 2022

Is there a reason we wouldn't want to propose this or a variant of it as being public?

And are there other places we should ourselves be using it internally now that we have it internally

I personally don't have a strong opinion. If we decide to make it public we will need to address that inline-limit issue and properly implement it in Mono too.

@GSPP
Copy link

GSPP commented Feb 1, 2022

This API can be used to create behavioral differences depending on what the call site looks like, what surrounding code looks like, inlining decisions, tiering, the runtime version, bitness, the phase of the moon, ...

It's also very hard to test this for correctness. You have to hope that the JIT does what you expect it to do so that the test is valid.

This seems like a very dangerous thing to expose publicly. In my opinion, clearly not worth it. I don't want to see this show up in libraries causing nasty bugs.

At least, the decision should wait for a release or two to gain practical experience with this concept. @stephentoub

@tannergooding
Copy link
Member

This seems like a very dangerous thing to expose publicly. In my opinion, clearly not worth it. I don't want to see this show up in libraries causing nasty bugs.

This is no more dangerous than many of the APIs in System.Runtime.* ranging from *.CompilerServices.Unsafe to *.InteropServices.MemoryMarshal, *.InteropServices.CollectionsMarshal, or even some of the behaviors of the entire *.Intrinsics.* namespace.

Users can already do all sorts of hacks, specializations, and other tricks (including relying on inlining) to force a particularly behavior and this just extends that to one other interesting avenue that we ourselves find worthwhile in various scenarios.

It's not like every app would suddenly start using it and it being exposed somewhere like RuntimeHelpers would help ensure that the scenarios its used remain limited for most users. There is a lot of code out there that is far scarier than trying to determine if an input is constant after inlining happens.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI design-discussion Ongoing discussion about design without consensus optimization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants