-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A mechanism to relax exception checks such that inlined method calls are at least as fast as manual inlining should be supported. #8948
Comments
FYI. @JosephTremoulet So after re-reading the sections a few times. I'm fairly positive that this is supported under the pretext of There are specific sections that not only call out that |
@tannergooding, I think you're combining "side-effects" and "checks"/"exceptions" into a single concept in a way that the spec does not, and that this may be the cause of your incorrect reading of it. Side-effects and checks are separate throughout the language quoted above. Relaxation allows reordering some checks across each other and across side-effects. This means that, in terms of observable differences, when some check fails, some side-effects may be suppressed, and other (prior) checks may appear to be suppressed. There is nothing in here about suppressing checks, except in that they are implicitly suppressed when reordered after a subsequent check that fails. |
Also, a "manual inlining" of your example: int Select(bool condition, int trueValue, int falseValue)
{
return condition ? trueValue : falseValue;
}
int MyMethod(bool condition, int[] a, int[] b)
{
return Select(condition, a[5], b[6]);
} would produce int Select(bool condition, int trueValue, int falseValue)
{
return condition ? trueValue : falseValue;
}
int MyMethod(bool condition, int[] a, int[] b)
{
int _condition = condition;
int _trueValue = a[5];
int _falseValue = b[6];
int _result = _condition ? _trueValue : _falseValue;
return _result;
} Further modifications that alter the semantics (by making the type safety checks conditional) are rewrites that produce semantically different programs, not "inlining". Why there's discussion of interleaving instructions and being "as fast as" manual inlining is e.g. if the call were in a loop, you'd want to be able to hoist operations from the inlinee out of the loop even though doing so moves them past instructions from the caller that remain in the loop. |
@CarolEidt could speak with more authority than I can as to the intent and details of the spec. The current JITs don't actually pay attention to the relaxed exception control bits, so this is somewhat academic without changing the JIT, and I don't know the history of why we have this complicated bit of spec that we don't actually take advantage of in codegen. |
@JosephTremoulet. I might be misunderstanding, but I don't think my proposal would be supressing the check, merely moving it so that it is only executed if the variable in question is actually accessed. From VI.F.1
It explicitly states that they want calls from callee to I don't know anyone who would inline it by hand to be (as this would be a literal inlining of the raw IL): int MyMethod(bool condition, int[] a, int[] b)
{
bool _condition = condition;
int _trueValue = a[5];
int _falseValue = b[6];
int _result = _condition ? _trueValue : _falseValue;
return _result;
} You would inline it by hand to be: int MyMethod(bool condition, int[] a, int[] b)
{
return condition ? a[5] : b[6];
} |
My understanding of this, is that the implicit check from an instruction can be reordered anywhere, provided that the observable behavior is only changed in the case that the check fails. So, unless a range check has an observable side effect in the case it succeeds, we should be free to move the check to execute at a later point in the program. That is, if |
Aha. You're interpreting "reorder" and "suppress" in terms of the static view of how the program is rewritten. I believe the spec is discussing the operational semantics, and that "reorder" and "suppress" are meant in terms of the dynamic stream of visible events that the executing program might trigger. So "suppress" means "remove from the dynamic stream", not "remove from the static view of the program", and "may be reordered" means "may occur in a different order in the dynamic event stream", not "may have one or the other suppressed from the dynamic stream so long as it is still present somewhere in the static view of the program". |
So today, we have the following streams
Today, under the normal requirements, it is transformed to:
My proposal is that, following the relaxations we transform it to:
So we are not "suppressing" anything (removing it from the dynamic stream". We are only "reordering" (may occur in a different order). Because the Null Check and the Range Check do not have a side-effect in the case they succeed. This is allowed. |
I do not think the reordered sequence must guarantee that the exception is still thrown in all possible code paths. If that was the case, then re-ordering the check elsewhere is effectively pointless. Users (especially users who care about performance) would much rather the method fail earlier rather than later. As such, I believe that this (in addition to all the mentions of the generated code being as optimal as hand-inlining the method in the original source) means my view is correct. The check can be reordered to be later in the stream and it is not required to be executed in all subsequent code-paths. Merely that, the check must still exist and must still provide the appropriate protection to the relevant operation (in this case, the null and range check must still exist before accessing an element in the array, but it does not have to exist in a path where you do not access said array). |
Talking past each other... maybe I should have said "trace" instead of "stream"? The reasoning goes something like this:
int Select(bool condition, int trueValue, int falseValue)
{
return condition ? trueValue : falseValue;
}
int MyMethod(bool condition, int[] a, int[] b)
{
return Select(condition, a[5], b[6]);
}
2.1. What dynamic trace of events does the unoptimized program execute?
In this case, if the input has
and these agree in terms of visible side-effects and exceptions, so hooray. But if the input has
and the fact that |
@JosephTremoulet, I still don't think that is quite right. That leaves this entire portion of the spec effectively useless. If a check is going to fail (that is cause an exception) then moving it to later in the program is effectively useless (it means additional code will execute, but overall the function will still fail, will delay program execution, etc). Based on all the surrounding context, I am almost certain that this implies that the check is not required to exist in every subsequent code path, but instead that it is only required to exist in the code paths where it provides the required validation. |
Not so. Consider this example: class SomeClass {
int someField;
static int addToEach(int[] a, SomeClass o) {
int sum = 0;
for (int i = start; i != stop; ++i) {
a[i] = a[i] + o.someField;
}
}
} That loop has a null check on |
Loop rewriting is explicitly called out in a separate portion, with such an example. All the sections about method inlining under relaxations are explicitly separate, explicitly call out optimizing in a way equivalent to had you inclined the code in the original source (not in IL), as to being allowed to optimize as if the called method were a C/C++ style macro, etc. They all indicate that my assumptions are correct and the check can be moved such that it only exists under the path where it would be required (for safety/verifiability). |
The spec says |
Yes, but the entirely depends on how you interpret sequence (whether it is a single code path, or all code up until the next non-trivial protected or handler region). There is at least one example in the spec (on phone, will link in a bit) where it defines a sequence to contain a branch. |
@CarolEidt, could you comment as to whether my understanding of the spec is correct? If it isn't, that is fine. I am just wanting to know whether I should update the proposal. If my understanding is correct and if the attributes are there for use; then even if the JIT doesn't use them today, other tools could (CoreRT could use them for better codegen as well, for example). If it isn't supported then I would want to update the proposal to indicate that such a thing should be possible (I should be able to opt-in towards having my code inlined, as if I had inlined it by hand, regardless of the normal runtime rules). |
I want to chime in a bit on the spec and the value of having access to E-relaxed sequences. For the purpose of this discussion, let's suppose that the C# compiler transforms the code "canonically", e.g., it does not convert OP's ExampleThe OP's initial example: int IIf(bool condition, int @true, int @false) { return condition ? @true : @false; }
var items = new int[] { 1 };
int result = IIf(items.Length > 1 ? items[1] : items[0]); If we manually inline the emitted CIL (so method call using CIL is converted to substitution of method CIL body): var items = new int[] { 1 };
// C# evaluates arguments from left to right.
bool condition = (items.Length > 1);
int @true = items[1];
int @false = items[0];
int result = (condition ? @true : @false); In this example, it does not matter whether the sequence is ArrayExceptions-relaxed. The result is always an The spec says:
In OP's example, the sequence of CIL causes array bounds check to fail at Teaser ExampleThe rest of this comment is based on my understanding of the spec. Let me first demonstrate one interesting possible consequence of E-relaxation, before going to a more practically relevant example. class S { public object field = null; }
try
{
var o = new object();
var a = new int[0];
// store 1
S.field = 1;
var foo = (string)o;
var bar = a[1];
// store 2
S.field = 2.0;
}
catch (Exception ex)
{
Console.WriteLine(S.field is int
? "field is int"
: S.field is double
? "field is double"
: field == null
? "field == null"
: "other");
Console.WriteLine(ex.GetType());
} Suppose the sequence is InvalidCastException- and ArrayExceptions-relaxed, then the VES is free to choose from the following behaviors:
This means that either
In cases 3 and 4, the side effect of setting However, it is not possible that
Optimization ExampleThe primary usage of E-relaxation is to hoist checks. void AccumulateFirst10(ref int sum, int howmany, params int[] summands)
{
for (int i = 0; i < howmany; ++i)
{
sum += summands[i];
}
}
int result = 0;
try
{
AccumulateFirst10(ref result, 100, 1, 2, 3);
}
catch (Exception ex)
{
Console.WriteLine(result);
Console.WriteLine(ex.GetType());
} If However, if the method is ArrayExceptions-relaxed, then it's possible that void AccumulateFirst10(ref int sum, int howmany, params int[] summands)
{
if (howmany < 0) return;
// hoisted check
if (howmany > summands.Length) throw new IndexOutOfRangeException();
for (int i = 0; i < howmany; ++i)
{
// summands[i] is loaded WITHOUT bound check
sum += summands[i];
}
} The reason is as follows. If Also, it is necessary to check for Rationale in the spec
I think this rationale is based on optimization opportunities.
For example, consider struct S { public int field; }
void For1d(ref object result, int count, object[] array)
{
for (int i = 0; i < count; ++i) result = (S)array[i];
}
void For2d(ref object result, int count, object[][] jagged)
{
for (int i = 0; i < count; ++i) For1d(ref result, count, jagged[i]);
} Call to
Similary, If relaxation is per method and not "inter-method", then in case If relaxation is inter-method, then upon entrance to SummaryHaving access to E-relaxation could enable optimizing opportunities interesting for library authors or people needing performance-critical code, by sacrificing exception guarantees. |
A mechanism to relax exception checks such that inlined method calls are at least as fast as manual inlining should be supported.
Rationale
Under the CIL specification, the runtime is normally only allowed to perform certain optimizations and must preserve side-effects and exceptions generated by a thread in the order that they appear (I.12.6.4).
This means that, when inlining methods, it is not as effecicient as manual inlining when you are passing an argument that may potentially have side effects (such as accessing the element of an array). This also means that instruction reordering may not be allowed.
However, it is often desirable that the JIT produce code that is more performant over code that is more "compliant"
As such, I propose a mechanism be exposed to relax various existing compilation requirements such that the desired code gen can be achieved.
Example
Under the normal restrictions, the following code:
cannot be transformed to be equivalent to:
This means that, even if the
Select
call is inlined, it will fail ifcondition
is true andb[6]
is invalid (eitherb
isnull
or 6 is out of range) or whencondition
is false anda[5]
is invalid. However, the manually inlined code does not have this problem.Proposal
I believe that this functionality is allowed by the existing runtime specification under the pretext of
E-relaxed
methods (see below).The specification gives examples, and even declares members in the BCL, that should allow this functionality today. However, we do not expose the declared members and therefore do not support the functionality.
As such, I propose we expose the missing
System.Runtime.CompilerServices.CompilationRelaxations
values declared in ECMA TR-84 and update the runtime to support them.This will be beneficial both to JIT'd code as well as to AOT'd code that attempts to remain "compliant"/"compatible".
The proposed members are:
Important Spec Sections
I.12.6.4 Optimization
The first part explains the normal limitations and basically says (to my understanding) that instructions with side-effects (including throwing exceptions) may not be reordered with regard to each other.
The second part is a bit more in-depth and even gives examples in Annex F (see below). The important bit is probably the section on
E-checks
and how the timing is allowed to be changed.VI.Annex F Imprecise faults
The key point in the first section is the fourth bullet-point.
VI.F.1 Instruction reordering
The second section is again important, it explicitly calls out the
E-relaxed
sequences are allowed to span across methods.VI.F.2 Inlining
This block is slightly less important, but it covers the allowed optimizations between a relaxed and non-relaxed method.
VI.F.4 Interlaved calls
This section is the most important. It describes that, for a relaxed method
M
and its caller, instructions from the caller are free to be interleaved with the instructions inM
(provided there is no other constraining data or control dependences).The second bullet point is also important in that it explicitly states that this is to allow both instruction reordering and to ensure that inlined method calls are at least as fast as manual inlining.
category:cq
theme:runtime
skill-level:expert
cost:medium
impact:small
The text was updated successfully, but these errors were encountered: