Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORINFO_FLG_CUSTOMLAYOUT semantics #71711

Closed
jkotas opened this issue Jul 6, 2022 · 9 comments · Fixed by #87917
Closed

CORINFO_FLG_CUSTOMLAYOUT semantics #71711

jkotas opened this issue Jul 6, 2022 · 9 comments · Fixed by #87917
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Milestone

Comments

@jkotas
Copy link
Member

jkotas commented Jul 6, 2022

CORINFO_FLG_CUSTOMLAYOUT flag on the JIT-EE interface has unclear semantics and it is likely disabling optimizations for no good reason in some cases. We should precisely define the semantics of this change and change the implementation to match. Depending on the definition, rename of the flag may be appropriate too.

category:cq
theme:jit-ee-interface
skill-level:expert
cost:small
impact:small

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@jkotas jkotas added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 6, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 6, 2022
@ghost
Copy link

ghost commented Jul 6, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

CORINFO_FLG_CUSTOMLAYOUT flag on the JIT-EE interface has unclear semantics and it is likely disabling optimizations for no good reason in some cases. We should precisely define the semantics of this change and change the implementation to match. Depending on the definition, rename of the flag may be appropriate too.

Author: jkotas
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jkotas
Copy link
Member Author

jkotas commented Jul 6, 2022

cc @AaronRobinsonMSFT @EgorBo

@EgorBo
Copy link
Member

EgorBo commented Jul 6, 2022

Also, related: #64863

@SingleAccretion
Copy link
Contributor

Proposed semantics: the flag is set for a struct if the padding in it can be significant to the user.

  1. The struct had Size or explicit layout in the metadata.
  2. And the struct has "holes".

The proposed name would be CORINFO_FLG_HAS_SIGNIFICANT_PADDING.

Structs with this flag set would not be promotable - this is unlike today, where we do promote them, but then pessimize some "decomposition" transforms, but not all.

@tannergooding
Copy link
Member

What would that mean for something like:

[StructLayout(LayoutKind.Explicit)]
public struct S
{
    [FieldOffset(0)]
    public ulong Bits;

    [FieldOffset(0)]
    public double Value;
}

This has explicit layout, but no "holes". So this would still allow promotion if only one of the two fields was accessed?

A union of say a double and float however would have SIGNIFICANT_PADDING even if only one field was accessed?


This means that "small" (1-4 element) fixed-sized buffers of primitive types always have "holes" given they currently have 1 field (of the primitive type) and are explicitly sized otherwise?

It may mean we want special semantics for tracking "fixed sized buffers" in the future (there discussions on how to remove or reduce metadata bloat from declaring n fields if the feature becomes more prominent and usable in "safe" code).

@SingleAccretion
Copy link
Contributor

What about unions?

Currently unions are never promoted, so the behavior for them would not change.

This means that "small" (1-4 element) fixed-sized buffers of primitive types always have "holes" given they currently have 1 field (of the primitive type) and are explicitly sized otherwise?

Correct.

@tfenise
Copy link
Contributor

tfenise commented Jul 6, 2022

Proposed semantics: the flag is set for a struct if the padding in it can be significant to the user.

  1. The struct had Size or explicit layout in the metadata.
  2. And the struct has "holes".

Consider:

[StructLayout(LayoutKind.Explicit)]
struct Struct1
{
    [FieldOffset(0)]
    int X;

    [FieldOffset(4)]
    byte Y;
}

and

[StructLayout(LayoutKind.Sequential)]
struct Struct2
{
    int X;

    byte Y;
}

Are Struct1 and Struct2 to be treated differently? They appear to be equivalent.
Can the padding in Struct2 really be considered not "significant to the user"? If the padding in Struct1 is considered significant, and Struct1 and Struct2 are basically equivalent, then the padding in Struct2 should also be.

@SingleAccretion
Copy link
Contributor

Can the padding in Struct2 really be considered not "significant to the user"?

It is a choice to not consider it significant. It would be prohibitively expensive to disqualify structs like Struct2 from promotion, e. g. Span<T> would be non-promotable under such rules.

It does create an asymmetry like you highlight. I do not see an easy way out of that. Considering padding in explicitly laid out structs insignificant seems like too dangerous a breaking change.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jul 7, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Jul 7, 2022
@jakobbotsch jakobbotsch modified the milestones: 7.0.0, 8.0.0 Jul 7, 2022
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Apr 30, 2023
Rework the communication so that the JIT asks the EE for struct field
related information relevant to promotion in a single JIT-EE call of a
new function called 'flattenType'.

As a side effect, removes CORINFO_FLG_CUSTOMLAYOUT and
CORINFO_FLG_INDEXABLE_FIELDS (the latter by adding support to inline
arrays in this new function). Also, last but not least, this adds
support for general recursive struct promotion provided the field limit
of 4 is not hit (for now, let's see if there are any more fundamental
issues with this).

Fix dotnet#85511
Fix dotnet#71711
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 30, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 3, 2023
@jakobbotsch jakobbotsch added Priority:2 Work that is important, but not critical for the release Priority:3 Work that is nice to have and removed Priority:2 Work that is important, but not critical for the release labels Jun 20, 2023
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jun 22, 2023
Rework the communication so that the JIT asks the EE for struct field
related information relevant to promotion in a single JIT-EE call of a
new function called getTypeLayout. Crossgen2 can then more easily
provide the information that the JIT is allowed to depend on.

As a side effect, removes CORINFO_FLG_CUSTOMLAYOUT (which fixes dotnet#71711)
and CORINFO_FLG_DONT_DIG_FIELDS.

Fix dotnet#85511
Fix dotnet#71711
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 22, 2023
jakobbotsch added a commit that referenced this issue Jun 27, 2023
Rework the communication so that the JIT asks the EE for struct field
related information relevant to promotion in a single JIT-EE call of a
new function called getTypeLayout. Crossgen2 can then more easily
provide the information that the JIT is allowed to depend on.

As a side effect, removes CORINFO_FLG_CUSTOMLAYOUT (which fixes #71711)
and CORINFO_FLG_DONT_DIG_FIELDS.

Fix #85511
Fix #71711
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 27, 2023
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jun 30, 2023
This fixes the other point in dotnet#71711 by adjusting the check in the VM
for whether types have significant padding or not. It effectively
unifies the VM logic with crossgen2's check.

Example C#:
```csharp
private (long, int) _tuple;

[MethodImpl(MethodImplOptions.NoInlining)]
private void Foo()
{
    _tuple = (5, 10);
}
```

Before:
```asm
;  V00 this         [V00,T00] (  3,  3   )     ref  ->  rcx         this class-hnd single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;  V02 tmp1         [V02    ] (  4,  8   )  struct (16) [rsp+08H]   do-not-enreg[SB] ld-addr-op "NewObj constructor temp"
;  V03 tmp2         [V03,T01] (  3,  5   )    long  ->  [rsp+08H]   do-not-enreg[] "field V02.Item1 (fldOffset=0x0)" P-DEP
;  V04 tmp3         [V04,T02] (  3,  5   )     int  ->  [rsp+10H]   do-not-enreg[] "field V02.Item2 (fldOffset=0x8)" P-DEP
;
; Lcl frame size = 24

G_M52879_IG01:  ;; offset=0000H
       sub      rsp, 24
       vzeroupper
						;; size=7 bbWeight=1 PerfScore 1.25
G_M52879_IG02:  ;; offset=0007H
       vxorps   xmm0, xmm0, xmm0
       vmovups  xmmword ptr [rsp+08H], xmm0
       mov      qword ptr [rsp+08H], 5
       mov      dword ptr [rsp+10H], 10
       vmovups  xmm0, xmmword ptr [rsp+08H]
       vmovups  xmmword ptr [rcx+08H], xmm0
						;; size=38 bbWeight=1 PerfScore 8.33
G_M52879_IG03:  ;; offset=002DH
       add      rsp, 24
       ret
						;; size=5 bbWeight=1 PerfScore 1.25

; Total bytes of code 50, prolog size 7, PerfScore 15.83, instruction count 10, allocated bytes for code 50 (MethodHash=1da13170) for method Program:Foo():this (FullOpts)
```

After:
```asm
;  V00 this         [V00,T00] (  4,  4   )     ref  ->  rcx         this class-hnd single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V02 tmp1         [V02    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op "NewObj constructor temp"
;* V03 tmp2         [V03,T01] (  0,  0   )    long  ->  zero-ref    "field V02.Item1 (fldOffset=0x0)" P-INDEP
;* V04 tmp3         [V04,T02] (  0,  0   )     int  ->  zero-ref    "field V02.Item2 (fldOffset=0x8)" P-INDEP
;
; Lcl frame size = 0

G_M52879_IG01:  ;; offset=0000H
						;; size=0 bbWeight=1 PerfScore 0.00
G_M52879_IG02:  ;; offset=0000H
       mov      qword ptr [rcx+08H], 5
       mov      dword ptr [rcx+10H], 10
						;; size=15 bbWeight=1 PerfScore 2.00
G_M52879_IG03:  ;; offset=000FH
       ret
						;; size=1 bbWeight=1 PerfScore 1.00

; Total bytes of code 16, prolog size 0, PerfScore 4.60, instruction count 3, allocated bytes for code 16 (MethodHash=1da13170) for method Program:Foo():this (FullOpts)
```
jakobbotsch added a commit that referenced this issue Jul 1, 2023
…at is considered "significant padding" (#88238)

* Stop considering auto layout types to have significant padding in the VM. This was already the behavior in crossgen2. This fixes one of the points of #71711.
* Remove special case where we never considered structs with GC pointers to have significant padding. After the above change this has no additional diffs.
* Fix the inline array layout expansion; the numFields of the inline array node was computed incorrectly, and the parent indices of descendants were not updated correctly

Example C#:
```csharp
private (long, int) _tuple;

[MethodImpl(MethodImplOptions.NoInlining)]
private void Foo()
{
    _tuple = (5, 10);
}
```

Before:
```asm
;  V00 this         [V00,T00] (  3,  3   )     ref  ->  rcx         this class-hnd single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;  V02 tmp1         [V02    ] (  4,  8   )  struct (16) [rsp+08H]   do-not-enreg[SB] ld-addr-op "NewObj constructor temp"
;  V03 tmp2         [V03,T01] (  3,  5   )    long  ->  [rsp+08H]   do-not-enreg[] "field V02.Item1 (fldOffset=0x0)" P-DEP
;  V04 tmp3         [V04,T02] (  3,  5   )     int  ->  [rsp+10H]   do-not-enreg[] "field V02.Item2 (fldOffset=0x8)" P-DEP
;
; Lcl frame size = 24

G_M52879_IG01:  ;; offset=0000H
       sub      rsp, 24
       vzeroupper
						;; size=7 bbWeight=1 PerfScore 1.25
G_M52879_IG02:  ;; offset=0007H
       vxorps   xmm0, xmm0, xmm0
       vmovups  xmmword ptr [rsp+08H], xmm0
       mov      qword ptr [rsp+08H], 5
       mov      dword ptr [rsp+10H], 10
       vmovups  xmm0, xmmword ptr [rsp+08H]
       vmovups  xmmword ptr [rcx+08H], xmm0
						;; size=38 bbWeight=1 PerfScore 8.33
G_M52879_IG03:  ;; offset=002DH
       add      rsp, 24
       ret
						;; size=5 bbWeight=1 PerfScore 1.25

; Total bytes of code 50, prolog size 7, PerfScore 15.83, instruction count 10, allocated bytes for code 50 (MethodHash=1da13170) for method Program:Foo():this (FullOpts)
```

After:
```asm
;  V00 this         [V00,T00] (  4,  4   )     ref  ->  rcx         this class-hnd single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V02 tmp1         [V02    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op "NewObj constructor temp"
;* V03 tmp2         [V03,T01] (  0,  0   )    long  ->  zero-ref    "field V02.Item1 (fldOffset=0x0)" P-INDEP
;* V04 tmp3         [V04,T02] (  0,  0   )     int  ->  zero-ref    "field V02.Item2 (fldOffset=0x8)" P-INDEP
;
; Lcl frame size = 0

G_M52879_IG01:  ;; offset=0000H
						;; size=0 bbWeight=1 PerfScore 0.00
G_M52879_IG02:  ;; offset=0000H
       mov      qword ptr [rcx+08H], 5
       mov      dword ptr [rcx+10H], 10
						;; size=15 bbWeight=1 PerfScore 2.00
G_M52879_IG03:  ;; offset=000FH
       ret
						;; size=1 bbWeight=1 PerfScore 1.00

; Total bytes of code 16, prolog size 0, PerfScore 4.60, instruction count 3, allocated bytes for code 16 (MethodHash=1da13170) for method Program:Foo():this (FullOpts)
```
@ghost ghost locked as resolved and limited conversation to collaborators Jul 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Projects
None yet
7 participants