-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] An API for providing compiler hints to the JIT/AOT #24593
Comments
🤔 More seriously, I've occasionally been thinking about something like this, but I have no idea if the scenarios I have in mind are even that much of a perf impact. |
I think its the type of optimization where a big impact depends on your API. For example, a method which is But, a thousand tiny improvements can still add up to a lot as well, so it can also be useful for small APIs. |
FYI. @fiigii, @4creators, @mikedn, @CarolEidt, @jkotas I've proposed this feature as a general way for developers to provide hints to the JIT/AOT about their code. The only API for the feature I've proposed at this time would give users a way to tell the JIT to emit Otherwise, even in the case where the user has pinned a managed array, and offset their ptr appropriately, it is still expensive for the JIT to determine that the logic will result in the code being always aligned and that it is safe to emit the better code. |
This feature is IMO long overdue. It can be helpful in a lot of scenarios not only for HW intrinsics. I would even expand this proposal with
For discussion see https://github.com/dotnet/coreclr/issues/15427#issuecomment-351794198 and https://github.com/dotnet/coreclr/issues/15522 |
I would not consider these two cases to fall into the category of compiler "hints". I would expect a hint to be something that the compiler could use to guide its choices (e.g. whether to inline), but which would have no correctness requirement. For the alignment case, at least the worst that could happen is that the program would fault if the alignment were not correct (and the developer would then know they'd got it wrong). For the "Pure" case, it could get completely wrong results, and be much more difficult to debug. That said, I think these are worth considering - we should just call them something other than "hints". |
Fair enough. Now to come up with a better name 😄 |
My proposals 😄 would be than:
|
There also needs to be a name for the static class which is used to provide information inline with code (such as for a local, where attributes aren't allowed). |
Similar proposal: https://github.com/dotnet/coreclr/issues/2725 |
👍. I would opt for named assumptions over a general I do think putting them in |
I've updated the OP to suggest using |
Yet, I would still allow general
|
The point would be to provide named assumptions (e.g. A general
This requires the JIT to support and itself assume that the above IL sequence (and any other similar IL sequences that may be emitted by other compiles) meant "I am not null". It also requires the compiler to not optimize such a check away (C# is adding non-nullable types, whose to say they don't optimize the check to However, an explicit
A high level compiler (such as C#) also has no way of optimizing this call away itself so there is no worry about the JIT not being able to determine what the user was wanting. |
Got it. It is OK than as simpler solution from perspective of Jit support. |
The current purpose of the Adding assumptions there doesn't look like a very good fit to me. It rather seems to be a violation of the single responsibility principle. I'd dedicate a new class for this, here are some naming ideas:
|
Thanks for the suggestions! I'll update the OP to include a list of suggested names sometime later tonight. Based on the discussions here, and a couple I've had offline, the general idea of the proposal looks to be sound (we want a way to suggest optimizations to the compiler and we want those to be named/verifiable). The actual name of the class that holds these methods will likely undergo some discussion during the API review session (which will happen sometime after we mark this as |
If they are to be trusted rather than guidance it should be |
@benaadams when you say |
Although I think that |
@CarolEidt, is your recommendation then that we do not perform any validation of the assumption when optimizations are disabled then? Based on your statement on the CoreCLR issue (https://github.com/dotnet/coreclr/issues/2725#issuecomment-173028054), I had taken it that we should validate the condition under debug (however, that was 2 years ago and things may have changed). |
@tannergooding - thanks for reminding me of that. I still think it would be useful and probably desirable to validate the condition under debug. However, even in that context, I don't feel that the verb "assert" is quite strong enough; it says "I claim this is true". To me, "assume" says "I claim this is true and you (the compiler/JIT/runtime) can take actions based on that claim". |
IMO
|
I'm going to tag @jaredpar and @MadsTorgersen explicitly here as well. This will primarily be intended for the lower compilers (JIT/AOT, etc), but it would be nice to see if the higher level compilers (C#, VB, F#, etc) have any requirements or scenarios that this might prevent/harm/etc, or if there are any scenarios for which this would be advantageous (i.e. |
Updated the OP with:
|
Tagging @joperezr, @AlexGhiondea as area owners. Is there any other feedback here before I mark as |
You should update the proposed API at the top post to what you want folks to review. |
👍, Updated it to be Rest of the OP looks to be as intended. |
Looks good to me. |
If this is just about FWIW, what I have been hearing is that the performance penalty of the instructions that handle misaligned access is very small on modern platforms and that it is not something to really worry about. |
There is basically no difference between the However, on non-AVX hardware (anything prior to Sandy Bridge, which came out Jan 2011) you cannot use the Any machine running on post Sandy Bridge (post Bulldozer for AMD, which was released Oct 2011) micro-architecture can freely use the VEX encoding ( However, even on modern hardware, there is a non-trivial performance hit if you read unaligned data that crosses a cache line or page boundary, so it is important to ensure reads/writes (outside the first) are aligned anyways. Additionally, the Benchmark Game performance tests (CC. @danmosemsft, @eerhardt) are run on legacy hardware (specifically a Q6600 from the Kentsfield micro-architecture) for which the CC. @fiigii as he may be able to provide more details or correct any mistakes I made with the timelines. |
@mikedn - although there's a lot we don't know about what the JIT could or couldn't do, especially if it were engineered as well as it could possibly be, I think there are a couple of things we can say:
That said, you make a very good point that we should be doing proof-of-concept work to validate the benefit of any such hints that we might propose for .NET. |
Yes, of course. But one of the problems with these assumptions is that the JIT may not be able to handle them very well for the same reasons it does not handle well other optimizations - lack of time, problematic design and implementation. I've ran into such issues already. For example, I added the And when people try to use "assume" in their code and find that the JIT doesn't perform the desired optimizations what do we do? Go back to the drawing board and introduce new ways to fill their needs? While not necessarily bad I'm not sure this is a good idea. |
I'm sympathetic to this kind of proposal but also have seen this kind of thing go wrong in many ways. So I'd prefer it if we had an actual end to end implementation that we could evaluate. |
@joperezr There is no point in sending this to API review until there is a proof of concept that demonstrates that this API is actually useful. |
I have a good use case for this. See dotnet/coreclr#16095 (comment) for better details. Basically, we cannot fold
|
@AndyAyersMS @CarolEidt This is an actual assumption that is proven in practice to help even on highly optimized C++ compilers: https://github.com/dotnet/coreclr/issues/6024 |
The jit's ability to represent branch probabilities is quite limited. Without actual profile data all we can describe is that a particular branch is almost never or almost always taken. The not taken side of the branch then gets treated as very cold, akin to how we handle throws. If that seems interesting, then it is probably not too difficult to prototype, especially with the advent of new style intrinsics. Something like of two static identity methods in some namespace yet to be determined, eg: class CompilerHint
{
[Intrinsic]
static bool Unlikely(bool x) { return x;} // hint: x is almost always true
[Intrinsic]
static bool Likely(bool x) { return x; } // hint: x is almost always false
} These would be legal anywhere but would initially impact jit codegen only when used in a branching context, and then maybe only sensibly when at top level of an expression: if (CompilerHint.Likely(a > b)) { A; } else { B; } // A is very hot, B is very cold |
I'm currently focusing my available time on helping ensure the HWIntrinsics are generally ready for 2.1. After that is done, I want to try prototyping one of these for the next release (assuming someone else doesn't first). |
@tannergooding is this required in order to ship 5.0? It seems milestone should be Future. |
Yes, updated. Not sure who the relevant area owner is. |
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation. |
So is this just abandoned or there's an alternative proposal? |
It's been abandoned for nearly 7 years. I was never able to get enough buy-off from the JIT team on the idea. This is simply representing that state and not keeping an issue that no one was looking at open indefinitely. If someone has the time/energy to build a proof of concept and show the real-world benefit, that might change some people's minds. Otherwise, those scenarios already have valid ways to get the desired codegen and have it all function as expected using existing |
Rationale
There are a number of optimizations which require complex or in-depth analysis. In many cases, these optimizations are not possible to make. This can be due to time constraints (JIT), due to requiring an assumption to be made (JIT and AOT), or due to some other reason.
However, the developer often knows what assumptions are safe to make. This can be due to the way they wrote their algorithm, some check they performed higher in the code, etc.
As such, I propose we provide a set attributes and APIs which can help the compiler (JIT or AOT) produce more optimal code.
Proposed APIs
The proposed API here is very simplistic and represents a core use-case for the new Hardware Intrinsics APIs.
There are multiple other additions that may be equally useful, but they should likely be reviewed on a case-by-case basis with their own supporting arguments.
Alternative Class Names
CompilerHints
was the initial suggested name of the type which would hold these APIs. However, there was some concern over whether some of the suggested APIs actually qualified asHints
, since they may be required to be correct and may result in failure or incorrect data at runtime.The general feedback (so far) is that these are assumptions not asserts. @CarolEidt gave a good explanation of the difference here: https://github.com/dotnet/corefx/issues/26188#issuecomment-356147405
Other names suggested include:
Additional Details
All assumptions should be named. That is, there should be no generic
Assume(bool cond)
method exposed now or in the future. These assumptions are meant to convey an explicit optimization to the compiler and should not be left to interpretation.Potentially as an implementation detail, preferably as part of the contract: Any assumption made by these APIs should be verified when compiling with optimizations disabled.
Example Case 1
In many high-performance algorithms the user will be reading/writing memory in 16 or 32-byte chunks by utilizing the SIMD registers available to the hardware.
It is also generally beneficial to such algorithms to ensure that their reads/writes are aligned as it ensures a read/write never crosses a cache or page boundary (which can incur a heavy performance penalty). Additionally, on older hardware (and some architectures) the unaligned read/write instruction (
movups
on x86) is not as fast as the aligned read/write instruction (movaps
on x86) or thereg, reg/mem
encoding of the various instruction may require thatmem
be aligned.However, the GC does not currently support arbitrary alignments for objects and it is impossible to guarantee alignment without pinning and then checking the alignment of the memory. As such, users generally write their algorithm such that it does a single unaligned read/write, adjusts the offset so that the new address is aligned, and then operates on the rest of the data using aligned reads/writes.
In some scenarios, it may be difficult or impossible to determine that the pinned address is actually aligned (such as if an address is pinned and then passed to some other method to be processed), so the compiler may not emit the most optimal code. As such, providing an
AssumeAlignment(address, alignment)
that tells the compiler to emit code as if the address was aligned would be beneficial.Example Case 2
NOTE: This is not currently proposed, but represents another annotation that having compiler support for may be useful.
In many cases, a user may write a method that is considered Pure (the method does not modify state and always returns the same input for a given output).
In simplistic cases, the compiler may perform constant folding or inlining and the actual computation may not be done. However, in more complex cases the compiler may need to do more in-depth analysis which may prevent it from caching the return value or inlining the method.
As such, providing a
PureAttribute
(similarly to theContract.PureAttribute
, but actually consumed by the compiler) would be useful as it would allow the compiler to cache the return value and optimize away subsequent calls for a given input.Existing IL Support for skipping fault checks
There are currently three concrete fault checks which the CLI specifies can be skipped using the
no.
prefx.Although these are supported by IL, high level compilers may not emit them. It is potentially useful to expose
Assume.IsType
,Assume.InRange
, andAssume.NotNull
checks so that all languages (which compile to IL) can benefit from skipping these fault checks (without having to undergo an IL rewriting step).Additional Notes
There are many other potential hints or attributes which can be useful and for which the class could be extended.
Some of these represent assumptions that a higher level compiler (i.e. C#) cannot make (due to not being able to do things like cross-assembly optimizations, or being required to emit certain IL).
The text was updated successfully, but these errors were encountered: