Arm64: addressing modes #61026

Closed
wants to merge 4 commits into from

EgorBo
Member

@EgorBo EgorBo commented Oct 29, 2021

Draft PR. The goal is to achieve this codegen: https://godbolt.org/z/dEerMoE6M
when we access a C# array by a variable index.
Currently the JIT emits the following for it:

            mov     w1, w1 ;; zero extend for index.
            lsl     x1, x1, #2
            add     x1, x1, #16
            ldr     w0, [x0, x1]
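
To make the address arithmetic in the listing above concrete, here is a minimal Python sketch that steps through the four instructions on plain integers. The function name and the 64-bit mask are illustrative assumptions, not JIT code, and reading the `#16` immediate as the offset of the first array element is likewise an assumption taken from the listing.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with wrapping arithmetic


def current_sequence_address(x0, x1):
    """Return the address read by the final ldr in the four-instruction sequence.

    x0 is the array reference and x1 holds the index in its low 32 bits.
    This is a hypothetical model of the listing, not code from the JIT.
    """
    x1 = x1 & 0xFFFFFFFF        # mov w1, w1       (zero-extend the 32-bit index)
    x1 = (x1 << 2) & MASK64     # lsl x1, x1, #2   (scale by 4 bytes per element)
    x1 = (x1 + 16) & MASK64     # add x1, x1, #16  (skip to the first element)
    return (x0 + x1) & MASK64   # ldr w0, [x0, x1] (base plus register offset)
```

Three of the four instructions only compute the offset, which is the work the PR wants to fold into the addressing mode.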

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 29, 2021
@BruceForstall
Member

Incorporating the godbolt data here, you suggest generating:

        add     x8, x0, w1, sxtw #2
        ldr     w0, [x8, #12]

One issue with this form is that we'll always need two instructions inside a loop; if w1 is the loop array index, neither instruction is hoistable.

#35618 suggested a form of generating array base + data offset (as a byref), which is hoistable out of a loop, and then generating ldr w0, [x0, w1, sxtw #2] in the loop body, so you only have one instruction in the loop.
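
The difference between the two forms can be sketched in Python over a flat byte buffer. This is only an illustration: `make_array`, `load_w`, and both `sum_*` functions are hypothetical helpers, the array reference is modeled as an index into a `bytearray`, and `data_offset` is left as a parameter because the thread shows both `#16` and `#12`.

```python
import struct


def sxtw(value):
    """Sign-extend the low 32 bits of value, like the sxtw register extension."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def make_array(values, data_offset):
    """Build a fake array object: data_offset header bytes, then 32-bit elements."""
    memory = bytearray(data_offset + 4 * len(values))
    for i, v in enumerate(values):
        struct.pack_into("<I", memory, data_offset + 4 * i, v)
    return memory


def load_w(memory, address):
    """Model a 32-bit little-endian load."""
    return struct.unpack_from("<I", memory, address)[0]


def sum_two_instruction_form(memory, array_ref, length, data_offset):
    total = 0
    for i in range(length):
        x8 = array_ref + (sxtw(i) << 2)            # add x8, x0, w1, sxtw #2
        total += load_w(memory, x8 + data_offset)  # ldr w0, [x8, #data_offset]
    return total


def sum_hoisted_form(memory, array_ref, length, data_offset):
    data_ptr = array_ref + data_offset             # loop-invariant, computed once
    total = 0
    for i in range(length):
        # ldr w0, [x0, w1, sxtw #2] with x0 holding data_ptr
        total += load_w(memory, data_ptr + (sxtw(i) << 2))
    return total
```

Both functions read the same elements. The second does one address computation per iteration instead of two, because the base-plus-offset part no longer depends on the index.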

@EgorBo
Member Author

EgorBo commented Nov 2, 2021

> Incorporating the godbolt data here, you suggest generating:
>
>         add     x8, x0, w1, sxtw #2
>         ldr     w0, [x8, #12]
>
> One issue with this form is that we'll always need two instructions inside a loop; if w1 is the loop array index, neither instruction is hoistable.
>
> #35618 suggested a form of generating array base + data offset (as a byref), which is hoistable out of a loop, and then generating ldr w0, [x0, w1, sxtw #2] in the loop body, so you only have one instruction in the loop.

Thanks, Bruce, I'm keeping that in mind. It can actually be enabled easily even today in fgMorphArrayIndex. Currently we morph GT_INDEX into
(baseRef + ((indexRef * scaleCns) + dataOffsetCns))
whereas we could change it to
((baseRef + dataOffsetCns) + (indexRef * scaleCns)), provided that is safe from the GC's point of view; (baseRef + dataOffsetCns) can then be hoisted.

Alternatively, we can implement a more generic optimization for patterns like this:

"(invariantTree1 + X) + invariantTree2"  => "X + (invariantTree1  + invariantTree2)"

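A rough sketch of how such a generic rewrite could look, written in Python over tiny tuple-based expression trees rather than real JIT nodes. Everything here is hypothetical: the `("add", left, right)` tree encoding, the `is_invariant` check based on a set of loop variable names, and the function names. It also ignores the GC-safety question raised above.

```python
def is_invariant(tree, loop_variables):
    """A tree is invariant if none of its leaves is a loop variable."""
    if isinstance(tree, tuple):
        _, left, right = tree
        return is_invariant(left, loop_variables) and is_invariant(right, loop_variables)
    return tree not in loop_variables  # leaf: variable name or integer constant


def reassociate(tree, loop_variables):
    """Rewrite (inv1 + X) + inv2 into X + (inv1 + inv2); otherwise return tree unchanged."""
    if isinstance(tree, tuple) and tree[0] == "add":
        _, outer_left, inv2 = tree
        if isinstance(outer_left, tuple) and outer_left[0] == "add":
            _, inv1, x = outer_left
            if (is_invariant(inv1, loop_variables)
                    and is_invariant(inv2, loop_variables)
                    and not is_invariant(x, loop_variables)):
                return ("add", x, ("add", inv1, inv2))
    return tree


def evaluate(tree, env):
    """Evaluate a tree of "add"/"mul" nodes with variable values taken from env."""
    if isinstance(tree, tuple):
        op, left, right = tree
        lhs, rhs = evaluate(left, env), evaluate(right, env)
        return lhs + rhs if op == "add" else lhs * rhs
    return env[tree] if isinstance(tree, str) else tree
```

For example, with `i` as the only loop variable, `("add", ("add", "inv1", ("mul", "i", 4)), "inv2")` becomes `("add", ("mul", "i", 4), ("add", "inv1", "inv2"))`, and the inner `("add", "inv1", "inv2")` subtree no longer depends on `i`, so a later hoisting pass could move it out of the loop.
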
@EgorBo EgorBo closed this Nov 6, 2021
@EgorBo EgorBo reopened this Nov 6, 2021
@EgorBo EgorBo closed this Nov 7, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2021