-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loop cloning for Span #113575
base: main
Are you sure you want to change the base?
Loop cloning for Span #113575
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
/azp list |
This comment was marked as resolved.
This comment was marked as resolved.
/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr pgo, runtime-coreclr pgostress, Fuzzlyn |
Azure Pipelines successfully started running 5 pipeline(s). |
Out of interest, for loop cloning, if (len > span.Length) would it more less code/more efficient to run the cloned loop up to span.Length and only then switch to the un-optimized version? Or because that path is likely to throw anyway, it's better to just do the simplest thing? |
Yep, the current impl is easier to implement and is more generic - we also need to check array instance for being null (for arrays) and there are other kinds of cloning conditions, e.g. if we have a virtual call inside the loop, we can add an additional cloning condition for the most popular type under that virtual call (PGO). |
Thanks for the info. So would all "likely" optimization go in the optimized verison of the loop, and none of them in the fallback or could you have some combinations? E.g. potentially multiple clone loops with different assumptions? I guess you can assume the fallback is always the fully-unoptimized version as you assume there will be a failure at some point |
It depends. Normally, yes, fallback is not expected to be hit in normal circumstances unless code is relying on OOB exception, but it is not the case for virtual calls, we clone loops with them but the fallback still may be invoked (when some other type arrives), we discussed this recently in #113579 (comment) |
/azp run runtime-coreclr jitstress, runtime-coreclr pgo, runtime-coreclr pgostress |
Azure Pipelines successfully started running 3 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
@AndyAyersMS @BruceForstall @dotnet/jit-contrib PTAL Surprisingly, it was not difficult, my changes are mostly cosmetic (with asserts). Basically, if we have a LCL_VAR length, we don't need to deref the array object (it was either already dereferenced when this local was created, or it's a local span that doesn't need any dereference). Diffs look sane to me, the TP impact is ~0.2% on average with a huge outlier in Outerloop failures are not related. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've overloaded the existing "jagged" array implementation with a case to support Span. Is this the cleanest way to express this? Would it be better to introduce a new LC_OPT(LcSpan)
"type" of optimization (and maybe a LC_Span
type that parallels LC_Array
, etc.)?
Can Span participate in "jagged" arrays? E.g., for a[x][y][z]
, can a
be a Span
, a[x]
be a span, a[x][y]
be an array?
This comment was marked as outdated.
This comment was marked as outdated.
@BruceForstall I've addressed your feedback. The impl is 2x bigger now, but I agree that it looks better. Diffs |
assert(isIncreasingLoop || iterInfo->IsDecreasingLoop()); | ||
if (!isIncreasingLoop && !iterInfo->IsDecreasingLoop()) | ||
{ | ||
// Normally, we reject weird-looking loops in optIsLoopClonable, but it's not the case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small pre-existing issue, can be reproduced (hits an assert in Checked) in Main via this snippet
Click me
using System;
using System.Runtime.CompilerServices;
class Program : IDisposable
{
public static void Main()
{
for (int i = 0; i < 1200; i++)
{
try
{
Test(new int[100000000], 44, new Program());
Thread.Sleep(16);
}
catch
{
}
}
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void Test(int[] arr, int x, IDisposable d)
{
for (int i = 0; i < x; i--)
{
d.Dispose();
Console.WriteLine(arr[i]);
}
}
public void Dispose()
{
}
}
@EgorBot -amd -arm -profiler using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
public class Bench
{
byte[] _arr1 = new byte[2000];
byte[] _arr2 = new byte[2000];
[Benchmark]
[Arguments(1000)]
public void CopyN(int elems)
{
Span<byte> span1 = _arr1;
Span<byte> span2 = _arr2;
for (int i = 0; i < elems; i++)
span1[i] = span2[i];
}
[Benchmark]
[Arguments(1000)]
public void ReversedIter(int elems)
{
Span<byte> span = _arr1;
// Reversed iteration
for (int i = span.Length - 1; i >= 0; i--)
span[i] = 42;
}
} |
This PR enables loop cloning for Spans
Closes #82946
Closes #110986
Closes #112019
Example:
Current codegen:
New codegen:
Diffs