Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove condition that forbids promotion of HFAs with CUSTOMLAYOUT flag #64863

Closed

Conversation

echesakov
Copy link
Contributor

@echesakov echesakov commented Feb 6, 2022

We have logic that checks if the layout is valid

@ghost ghost assigned echesakov Feb 6, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 6, 2022
@ghost
Copy link

ghost commented Feb 6, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Presumably, we have logic that checks if the layout is valid

Author: echesakovMSFT
Assignees: echesakovMSFT
Labels:

area-CodeGen-coreclr

Milestone: -

@echesakov
Copy link
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -1698,12 +1698,6 @@ bool Compiler::StructPromotionHelper::CanPromoteStructType(CORINFO_CLASS_HANDLE
return false;
}

// Don't struct promote if we have an CUSTOMLAYOUT flag on an HFA type
if (StructHasCustomLayout(typeFlags) && compiler->IsHfa(typeHnd))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to delete or replace the other use StructHasCustomLayout as well?

The CORINFO_FLG_CUSTOMLAYOUT flag on JIT/EE interface has always been very poorly defined. It would be nice to get rid of it, and replace it with well-defined type property or type properties as necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does. The only other user of StructHasCustomLayout is

if (StructHasCustomLayout(typeFlags) && ((typeFlags & CORINFO_FLG_CONTAINS_GC_PTR) == 0))
and based on the comment above the intent was to recognize structs with StructLayout(LayoutKind.Explicit, Size=...). However, in my local testing I found that structs marked with StructLayout(LayoutKind.Auto) also returns true for StructHasCustomLayout (e.g. ValueTuple).

Let me do more debugging here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking at the this and it seems that the original intent for StructHasCustomLayout was to identify LayoutKind.Explicit and prevent such structures from promoting (i.e. enregistering their fields).

I believe this is done to be able to properly copy such structs with explicitlayout attribute (incl. gaps between fields).

However, the way this is implemented in VM has interesting consequences.

For example, suppose we have

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

namespace Runtime_64863
{
    class Program
    {
        struct DefLayout
        {
            ulong x0;
            uint w1;
        }

        [StructLayout(LayoutKind.Auto)]
        struct AutoLayout
        {
            ulong x0;
            uint w1;
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static AutoLayout RetAutoLayout() => default(AutoLayout);

        [MethodImpl(MethodImplOptions.NoInlining)]
        static DefLayout RetDefLayout() => default(DefLayout);

        static void Main(string[] args)
        {
            RetAutoLayout();
            RetDefLayout();
        }
    }
}

On Arm64 DefLayout return type would be promotable

; Assembly listing for method Runtime_64863.Program:RetDefLayout():DefLayout
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; optimized code
; fp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;* V00 loc0         [V00    ] (  0,  0   )  struct (16) zero-ref    multireg-ret ld-addr-op
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [sp+00H]   "OutgoingArgSpace"
;  V02 tmp1         [V02,T00] (  2,  2   )    long  ->   x0         single-def V00.x0(offs=0x00) P-INDEP "field V00.x0 (fldOffset=0x0)"
;  V03 tmp2         [V03,T01] (  2,  2   )     int  ->   x1         single-def V00.w1(offs=0x08) P-INDEP "field V00.w1 (fldOffset=0x8)"
;
; Lcl frame size = 0

G_M26462_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M26462_IG02:              ;; offset=0008H
        AA1F03E0          mov     x0, xzr
        2A1F03E1          mov     w1, wzr
                                                ;; bbWeight=1    PerfScore 1.00
G_M26462_IG03:              ;; offset=0010H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

while AutoLayout return type would not:

; Assembly listing for method Runtime_64863.Program:RetAutoLayout():AutoLayout
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; optimized code
; fp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T00] (  2,  2   )  struct (16) [fp+10H]   do-not-enreg[SR] multireg-ret ld-addr-op
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [sp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 16

G_M39902_IG01:              ;; offset=0000H
        A9BE7BFD          stp     fp, lr, [sp,#-32]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M39902_IG02:              ;; offset=0008H
        4E040FF0          dup     v16.4s, wzr
        3D8007B0          str     q16, [fp,#16] // [V00 loc0]
        F9400BA0          ldr     x0, [fp,#16]  // [V00 loc0]
        F9400FA1          ldr     x1, [fp,#24]  // [V00 loc0+0x08]
                                                ;; bbWeight=1    PerfScore 7.00
G_M39902_IG03:              ;; offset=0018H
        A8C27BFD          ldp     fp, lr, [sp],#32
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

Maybe, I misunderstand what ECMA-335 says but isn't [StructLayout(LayoutKind.Auto)] implied by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, I misunderstand what ECMA-335 says but isn't [StructLayout(LayoutKind.Auto)] implied by default?

Wrt ECMA-335, there is always a StructLayout value written in metadata. This value is not optional.

The C# default for this value is StructLayout.Sequential for structs. So C# writes StructLayout.Sequential into metadata by default, unless it gets overridden by StructLayout attribute.

However, the way this is implemented in VM has interesting consequences.

Feel free to adjust the VM and JIT contract as needed. The existing CUSTOMLAYOUT flag is not very well defined. It would be great if we get something that well defined and easy to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I guess I confused the default value per ECMA

A class marked autolayout indicates that the loader is free to lay out the
class in any way it sees fit; any layout information that might have been specified is
ignored. This is the default.

and what C# and other compilers specify https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.layoutkind?view=net-6.0.

To reduce layout-related problems associated with the Auto value, C#, Visual Basic, and C++ compilers specify Sequential layout for value types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the JIT primarily checks for combination of lvContainsHoles && lvCustomLayout in the code - looks that the intent was to check "a value type is not tightly packed and the "holes" are due to the layout being explicit". The assumption is that the "holes" can contain some information? If the "holes" are due to padding then the JIT can discard the information (as in DefLayout case in my example).

Looking at another example

struct DefLayout2
{
    uint w0;
    ulong x1;
}

this seems to be the case

; Assembly listing for method Runtime_64863.Program:RetDefLayout2():DefLayout2
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; optimized code
; fp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;* V00 loc0         [V00    ] (  0,  0   )  struct (16) zero-ref    multireg-ret ld-addr-op
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [sp+00H]   "OutgoingArgSpace"
;  V02 tmp1         [V02,T00] (  2,  2   )     int  ->   x0         single-def V00.w0(offs=0x00) P-INDEP "field V00.w0 (fldOffset=0x0)"
;  V03 tmp2         [V03,T01] (  2,  2   )    long  ->   x1         single-def V00.x1(offs=0x08) P-INDEP "field V00.x1 (fldOffset=0x8)"
;
; Lcl frame size = 0

G_M35870_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M35870_IG02:              ;; offset=0008H
        2A1F03E0          mov     w0, wzr
        AA1F03E1          mov     x1, xzr
                                                ;; bbWeight=1    PerfScore 1.00
G_M35870_IG03:              ;; offset=0010H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

; Total bytes of code 24, prolog size 8, PerfScore 6.90, instruction count 6, allocated bytes for code 24 (MethodHash=fc8373e1) for method Runtime_64863.Program:RetDefLayout2():DefLayout2
; ============================================================

@echesakov
Copy link
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@echesakov echesakov force-pushed the Remove-CustomLayout-HFA-Condition branch from b2f355f to 678bc17 Compare March 2, 2022 00:19
@echesakov echesakov force-pushed the Remove-CustomLayout-HFA-Condition branch from 678bc17 to 71334fd Compare March 14, 2022 20:57
@echesakov
Copy link
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@echesakov
Copy link
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ghost ghost added the no-recent-activity label Apr 15, 2022
@ghost
Copy link

ghost commented Apr 15, 2022

This pull request has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

@ghost
Copy link

ghost commented Apr 30, 2022

This pull request will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the pull request, but please note that it will be locked if it remains inactive for another 30 days.

@ghost ghost closed this Apr 30, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 30, 2022
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI no-recent-activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants