Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace some stackallocs with collection expressions #93126

Merged
merged 1 commit into from
Oct 6, 2023

Conversation

stephentoub
Copy link
Member

These will result in using an InlineArray rather than a localloc

These will result in using an InlineArray rather than a localloc
@ghost
Copy link

ghost commented Oct 6, 2023

Tagging subscribers to this area: @dotnet/area-meta
See info in area-owners.md if you want to be subscribed.

Issue Details

These will result in using an InlineArray rather than a localloc

Author: stephentoub
Assignees: stephentoub
Labels:

area-Meta

Milestone: -

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Would be good to resolve the styling question I asked on the other PR and update if we decide having the space is better

Copy link
Member

@gfoidl gfoidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub I played around with collection expression, and therefore your PRs are always a welcoming guide to get familiar with new features.

But here something seems strange, where either I miss something or this PR is actually a de-optimization.

will result in using an InlineArray rather than a localloc

When looking at the generated IL or reverse engineered C# with sharplab (main, 12 Oct 2023, so quite actual) I see array allocations instead the use of InlineArrays.

Can you please have a look and clarify?

I didn't comment on all places, to avoid noise.

@@ -24,7 +24,7 @@ public static string[] GetLogicalDrives()
int count = BitOperations.PopCount((uint)drives);

string[] result = new string[count];
Span<char> root = stackalloc char[] { 'A', ':', '\\' };
Span<char> root = ['A', ':', '\\'];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to sharplab, main 12 Oct 2023 this allocates.

Looks like the collection expression only avoid the allocation for ROS, but a ROS can't be used here.

@@ -922,7 +922,7 @@ public override string ToString()
AssertValid();

// Make local copy of data to avoid modifying input.
Span<uint> rgulNumeric = stackalloc uint[4] { _data1, _data2, _data3, _data4 };
Span<uint> rgulNumeric = [_data1, _data2, _data3, _data4];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -323,15 +323,15 @@ public static unsafe void Copy(Array sourceArray, int sourceIndex, Array destina
if (Rank != 2)
ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_Need2DArray);

return InternalGetValue(GetFlattenedIndex(stackalloc int[] { index1, index2 }));
return InternalGetValue(GetFlattenedIndex([index1, index2]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -331,7 +331,7 @@ private static void SurrogateToUpperNLS(char h, char l, out char hr, out char lr
Debug.Assert(char.IsHighSurrogate(h));
Debug.Assert(char.IsLowSurrogate(l));

Span<char> chars = stackalloc char[] { h, l };
ReadOnlySpan<char> chars = [h, l];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub
Copy link
Member Author

stephentoub commented Oct 14, 2023

sharplab is targeting a runtime that doesn't have InlineArray yet. Its main is about roslyn, not runtime.

@gfoidl
Copy link
Member

gfoidl commented Oct 14, 2023

Locally with .NET 8 RC2 I also see the array allocations (<LangVersion>preview</LangVersion> is set).

For example:

static void Test0(int a, int b)
{
    Span<int> foo = [a, b];
    ReadOnlySpan<int> bar = [a, b];
}

emits

.method assembly hidebysig static 
	void '<<Main>$>g__Test0|0_0' (
		int32 a,
		int32 b
	) cil managed 
{
	.custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
		01 00 00 00
	)
	// Method begins at RVA 0x2080
	// Header size: 12
	// Code size: 44 (0x2c)
	.maxstack 5
	.locals init (
		[0] valuetype [System.Runtime]System.Span`1<int32> foo,
		[1] valuetype [System.Runtime]System.ReadOnlySpan`1<int32> bar
	)

	IL_0000: nop
	IL_0001: ldloca.s 0
	IL_0003: ldc.i4.2
	IL_0004: newarr [System.Runtime]System.Int32			// <-- newarr for Span<int> foo = [a, b];
	IL_0009: dup
	IL_000a: ldc.i4.0
	IL_000b: ldarg.0
	IL_000c: stelem.i4
	IL_000d: dup
	IL_000e: ldc.i4.1
	IL_000f: ldarg.1
	IL_0010: stelem.i4
	IL_0011: call instance void valuetype [System.Runtime]System.Span`1<int32>::.ctor(!0[])
	IL_0016: ldloca.s 1
	IL_0018: ldc.i4.2
	IL_0019: newarr [System.Runtime]System.Int32			// <-- newarr for ReadOnlySpan<int> bar = [a, b];
	IL_001e: dup
	IL_001f: ldc.i4.0
	IL_0020: ldarg.0
	IL_0021: stelem.i4
	IL_0022: dup
	IL_0023: ldc.i4.1
	IL_0024: ldarg.1
	IL_0025: stelem.i4
	IL_0026: call instance void valuetype [System.Runtime]System.ReadOnlySpan`1<int32>::.ctor(!0[])
	IL_002b: ret
} // end of method Program::'<<Main>$>g__Test0|0_0'

Or when running:

int a = 3;
int b = 4;

long allocated0 = GC.GetTotalAllocatedBytes(precise: true);
Test0(a, b);
long allocated1 = GC.GetTotalAllocatedBytes(precise: true);
Console.WriteLine(allocated1 - allocated0);

static void Test0(int a, int b)
{
    Span<int> foo = [a, b];
    ReadOnlySpan<int> bar = [a, b];
}

it prints (in Release):

64

but I'd expect with that optimization to be 0.

Or when looking at godbolt (main) I see the two allocations that should not be there.

@stephentoub
Copy link
Member Author

Sounds like your C# compiler is too old.

For your Test0, I get the equivalent of this:

private static void Test0(int a, int b)
{
	<>y__InlineArray2<int> buffer = default(<>y__InlineArray2<int>);
	global::<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 0) = a;
	global::<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 1) = b;
	global::<PrivateImplementationDetails>.InlineArrayAsSpan<<>y__InlineArray2<int>, int>(ref buffer, 2);
	<>y__InlineArray2<int> buffer2 = default(<>y__InlineArray2<int>);
	global::<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer2, 0) = a;
	global::<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer2, 1) = b;
	global::<PrivateImplementationDetails>.InlineArrayAsReadOnlySpan<<>y__InlineArray2<int>, int>(in buffer2, 2);
}

@gfoidl
Copy link
Member

gfoidl commented Oct 14, 2023

Fine, than it's as intended / expected. Thanks for clarification!

Sounds like your C# compiler is too old.

Today .NET 8 (RC2 and RTM (latest 8.0.1xx)) is already too old for this optimization...

@stephentoub
Copy link
Member Author

stephentoub commented Oct 14, 2023

Today .NET 8 (RC2 and RTM (latest 8.0.1xx)) is already too old for this optimization...

That should not be the case. Are you building in VS or from the command line?

Regardless, you need a compiler with this:
dotnet/roslyn#69412

@gfoidl
Copy link
Member

gfoidl commented Oct 14, 2023

Are you building in VS or from the command line?

VS tricked me into this.
With the command line I see now the expected code.
Tried both with the .NET 8 RTM SDK.

Thanks!

@radical
Copy link
Member

radical commented Oct 16, 2023

This broke a bunch of dotnet/performance benchmarks. See #93348 .

@stephentoub
Copy link
Member Author

This broke a bunch of dotnet/performance benchmarks. See #93348 .

Sounds like it's an issue with the mono interpreter not handling [InlineArray]s correctly.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants