Lower GetElement on arm64 to the correct access sequence #104288

tannergooding · 2024-07-02T07:19:11Z

Unlike xarch, where this could be handled fairly trivially in codegen, Arm64 isn't as flexible, so we get better and more correct codegen by explicitly lowering to the correct access sequence instead.

dotnet-policy-service · 2024-07-02T07:19:44Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

tannergooding · 2024-07-03T02:20:35Z

/azp run runtime-nativeaot-outerloop

azure-pipelines · 2024-07-03T02:20:50Z

Azure Pipelines successfully started running 1 pipeline(s).

tannergooding · 2024-07-03T05:57:06Z

/azp run runtime-nativeaot-outerloop

azure-pipelines · 2024-07-03T05:57:17Z

Azure Pipelines successfully started running 1 pipeline(s).

tannergooding · 2024-07-03T18:54:54Z

CC. @dotnet/jit-contrib. This is the "proper" fix for #104232, which was worked around in #104264.

No TP diff, but good asmdiff, such as for Linux Arm64:

Overall (-10,228 bytes)
MinOpts (-4,844 bytes)
FullOpts (-5,384 bytes)

It looks like there are more improvements to be had, but I think those can be handled separately. An example is that we're now generating:

-            ldr     q16, [fp, #-0x68]	// [V98 tmp78]
-            dup     s16, v16.s[1]
-            ldr     q18, [fp, #-0x68]	// [V98 tmp78]
-            dup     s18, v18.s[2]
-            ldr     q19, [fp, #-0x68]	// [V98 tmp78]
-            dup     s19, v19.s[3]
+            sub     x1, fp, #104	// [V98 tmp78]
+            ; byrRegs +[x1]
+            ldr     s16, [x1, #0x04]
+            sub     x1, fp, #104	// [V98 tmp78]
+            ldr     s18, [x1, #0x08]
+            sub     x1, fp, #104	// [V98 tmp78]
+            ldr     s19, [x1, #0x0C]

This is a net improvement since we're only loading scalars rather than repeatedly loading the full vector, but it's a case that probably should've been emitted as:

ldr     s16, [fp, #-0x64]
ldr     s18, [fp, #-0x60]
ldr     s19, [fp, #-0x5C]

I think there's just a general optimization missing here for when the LclAddr itself has an offset and we're adding a constant offset on top (so it's still representable as a single AddrMode).
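To make the arithmetic behind that missing fold concrete, here's a minimal sketch. It is not JIT code: `FoldElementOffset` and `FitsLoadImmediate` are hypothetical helper names made up for illustration, and the range check assumes the two standard Arm64 immediate forms for a scalar load (scaled unsigned 12-bit, or unscaled signed 9-bit).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a JIT API): combine the frame offset of the local
// with the byte offset of the selected element, e.g. -0x68 + 1 * 4 = -0x64.
constexpr int32_t FoldElementOffset(int32_t lclFrameOffset, int32_t index, int32_t elemSize)
{
    return lclFrameOffset + (index * elemSize);
}

// Hypothetical check: can a load of `size` bytes encode `offset` directly?
// Assumes the scaled unsigned 12-bit form or the unscaled signed 9-bit form.
constexpr bool FitsLoadImmediate(int32_t offset, int32_t size)
{
    const bool scaled   = (offset >= 0) && ((offset % size) == 0) && ((offset / size) <= 4095);
    const bool unscaled = (offset >= -256) && (offset <= 255);
    return scaled || unscaled;
}

// The three float elements from the diff above fold to the offsets
// used in the desired single-instruction loads.
static_assert(FoldElementOffset(-0x68, 1, 4) == -0x64, "element 1");
static_assert(FoldElementOffset(-0x68, 2, 4) == -0x60, "element 2");
static_assert(FoldElementOffset(-0x68, 3, 4) == -0x5C, "element 3");
static_assert(FitsLoadImmediate(-0x64, 4), "fits the unscaled signed form");
```

When the folded offset fits, the separate `sub x1, fp, #104` isn't needed; when it doesn't, materializing the address first is still required.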

kunalspathak

LGTM

dotnet-issue-labeler bot added the area-CodeGen-coreclr label Jul 2, 2024

dotnet-policy-service bot assigned tannergooding Jul 2, 2024

tannergooding force-pushed the simd-getelement branch from cc4ebd7 to 2705326 Compare July 2, 2024 07:28

Lower GetElement on arm64 to the correct access sequence

09d80fa

tannergooding force-pushed the simd-getelement branch 3 times, most recently from a3b0300 to 95fdd44 Compare July 2, 2024 07:49

Use constant offset where possible

95fdd44

build-analysis bot mentioned this pull request Jul 2, 2024

[x86] stress failure in RayTracer.GetNaturalColor with DOTNET_JitStress=2 #102590

Closed

tannergooding added 2 commits July 2, 2024 07:32

Ensure that lvaSIMDInitTempVarNum is marked as being used by LclAddrNode

e55da57

Fix assert

a6a4b0e

tannergooding force-pushed the simd-getelement branch from 0d4e0f3 to a6a4b0e Compare July 2, 2024 15:41

tannergooding added 2 commits July 2, 2024 11:08

Create a valid addr mode for Arm64

1d904dc

Don't lower unnecessarily

a71d7c8

This was referenced Jul 3, 2024

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

The job running on agent NetCore-Public ran longer than the maximum time #104044

Closed

tannergooding force-pushed the simd-getelement branch 3 times, most recently from 263de58 to 244c8da Compare July 3, 2024 05:56

Account for index 0 and scale 1

244c8da

This was referenced Jul 3, 2024

Test failure: GC\\Features\\HeapExpansion\\Finalizer\\Finalizer.cmd #102706

Closed

TimeProviderTests.TestProviderTimer failed in CI #103459

Closed

Remove the offset constant node when it's unused

ee2ef31

tannergooding marked this pull request as ready for review July 3, 2024 18:45

kunalspathak approved these changes Jul 5, 2024

tannergooding merged commit b9673cb into dotnet:main Jul 5, 2024
107 checks passed

tannergooding deleted the simd-getelement branch July 5, 2024 17:09

github-actions bot locked and limited conversation to collaborators Aug 5, 2024
