CoreCLR generates suboptimal codegen with structs passed via multiple registers #89374

Open
MichalPetryka opened this issue Jul 24, 2023 · 4 comments

MichalPetryka (Contributor) commented Jul 24, 2023

Description

As noted in #55357 (comment), on systems using the SysV ABI, when a struct is passed in multiple registers rather than on the stack, CoreCLR cannot fully reason about that usage and in some cases spills the struct to the stack.

Code:

    public static UInt128 Shift(UInt128 i)
    {
        return Unsafe.BitCast<Vector128<byte>, UInt128>(Sse2.ShiftLeftLogical128BitLane(Unsafe.BitCast<UInt128, Vector128<byte>>(i), 1));
    }

Current codegen:

       sub      rsp, 40
       vzeroupper
       mov      qword ptr [rsp+18H], rdi
       mov      qword ptr [rsp+20H], rsi
       vpslldq  xmm0, xmmword ptr [rsp+18H], 1
       vmovaps  xmmword ptr [rsp], xmm0
       mov      rax, qword ptr [rsp]
       mov      rdx, qword ptr [rsp+08H]
       add      rsp, 40
       ret

Expected codegen:

        vmovq   xmm0, rdi
        vpinsrq xmm0, xmm0, rsi, 1
        vpslldq xmm0, xmm0, 1
        vmovq   rax, xmm0
        vpextrq rdx, xmm0, 1
        ret
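
For comparison, here is a minimal sketch of a source-level variant that mirrors the expected sequence: it builds the vector from the two 64-bit halves, shifts it, and extracts the halves again. `ShiftDemo` and `ShiftHalves` are hypothetical names introduced only for this illustration. The sketch assumes .NET 8 or later (for `Unsafe.BitCast`) on a little-endian x64 machine with SSE2. It is not claimed that this form avoids the spill; it only makes the intended data flow explicit.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class ShiftDemo
{
    // The method from the report: reinterpret the UInt128 as a vector,
    // shift the whole 128-bit lane left by one byte, reinterpret back.
    public static UInt128 Shift(UInt128 i)
    {
        return Unsafe.BitCast<Vector128<byte>, UInt128>(
            Sse2.ShiftLeftLogical128BitLane(Unsafe.BitCast<UInt128, Vector128<byte>>(i), 1));
    }

    // Hypothetical variant (an assumption, not part of the report): spell out
    // the register-to-vector moves that the expected codegen performs.
    public static UInt128 ShiftHalves(UInt128 i)
    {
        ulong lower = (ulong)i;           // low 64 bits (rdi under SysV)
        ulong upper = (ulong)(i >> 64);   // high 64 bits (rsi under SysV)

        // Element 0 holds the low half, element 1 the high half.
        Vector128<ulong> v = Vector128.Create(lower, upper);
        Vector128<ulong> shifted = Sse2.ShiftLeftLogical128BitLane(v, 1);

        // The UInt128 constructor takes (upper, lower).
        return new UInt128(shifted.GetElement(1), shifted.GetElement(0));
    }

    public static void Main()
    {
        if (!Sse2.IsSupported)
        {
            Console.WriteLine("SSE2 is not supported on this machine.");
            return;
        }

        UInt128 value = new UInt128(0x0011223344556677UL, 0x8899AABBCCDDEEFFUL);

        // Both forms should agree with each other and with a plain 8-bit shift.
        Console.WriteLine(Shift(value) == ShiftHalves(value));
        Console.WriteLine(Shift(value) == (value << 8));
    }
}
```

Comparing the disassembly of `Shift` and `ShiftHalves` (for example with `DOTNET_JitDisasm`) is one way to see whether either form still goes through the stack on a given runtime build.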

Configuration

Any SysV ABI OS (Linux, macOS)
Current main branch.

@MichalPetryka MichalPetryka added the tenet-performance Performance related issue label Jul 24, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 24, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 24, 2023

ghost commented Jul 24, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@JulieLeeMSFT JulieLeeMSFT added this to the 9.0.0 milestone Jul 28, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jul 28, 2023
JulieLeeMSFT (Member) commented

CC @jakobbotsch

kunalspathak (Member) commented

@jakobbotsch - can you please check if your struct work helped this scenario?

jakobbotsch (Member) commented

I don't think it does -- #96372 (comment) talks a bit about what the long term plan is, but I don't think we'll get to this particular case in 9.0, so I will move it out.
