Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constant folding for RuntimeInformation.IsOSPlatform(OSPlatform) #83308

Merged
merged 3 commits into from
Mar 13, 2023

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Mar 11, 2023

Partially fixes #738 where this API was acknowledged as not JIT friendly which I took personally 😄 This PR is based on previous PRs to make this happen, namely #78736 and #80431

void Test()
{
    if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    {
        Console.Write("linux!");
    }
}

Old codegen (pre .NET 8.0):

; Assembly listing for method Program:Test():this
       4883EC28             sub      rsp, 40
       FF153E642200         call     [System.Runtime.InteropServices.OSPlatform:get_Linux()]
       488BC8               mov      rcx, rax
       FF153DA22400         call     [System.OperatingSystem:IsOSPlatform(System.String):bool]
       85C0                 test     eax, eax
       7410                 je       SHORT G_M000_IG04
       48B99858453B02020000 mov      rcx, 0x2023B455898
       FF1561802200         call     [System.Console:Write(System.String)]
G_M000_IG04:                
       90                   nop      
       4883C428             add      rsp, 40
       C3                   ret      
; Total bytes of code 45

New codegen (I'm on Windows):

; Assembly listing for method Program:Test():this
       C3                   ret      
; Total bytes of code 1

Unfortunately, this only works if the operating system is different from current, otherwise the codegen is:

; Assembly listing for method Program:Test():this
       4883EC28             sub      rsp, 40
       48B99C94B9CAD6010000 mov      rcx, 0x1D6CAB9949C
       48B82000200020002000 mov      rax, 0x20002000200020
       480B01               or       rax, qword ptr [rcx]
       48B9770069006E006400 mov      rcx, 0x64006E00690077
       4833C8               xor      rcx, rax
       48B8A294B9CAD6010000 mov      rax, 0x1D6CAB994A2
       48BA2000200020002000 mov      rdx, 0x20002000200020
       480B10               or       rdx, qword ptr [rax]
       48B864006F0077007300 mov      rax, 0x730077006F0064
       4833C2               xor      rax, rdx
       480BC8               or       rcx, rax
       7510                 jne      SHORT G_M36831_IG04
       48B90094B9CAD6010000 mov      rcx, 0x1D6CAB99400
       FF1587D47000         call     [System.Console:Write(System.String)]
G_M36831_IG04:              
       90                   nop      
G_M36831_IG05:              
       4883C428             add      rsp, 40
       C3                   ret      
; Total bytes of code 103

It happens because of some phase ordering issue in JIT, namely JIT unrolls string.Equals("Windows", staticReadonlyField) early in importer and then is unable to fold whatever vectorizer produced -- it can be done actually, but it needs a new JIT-EE API - I might add it separately if it's worth it

Also, this only works for Tier1, so it won't be folded for TC=0/AggressiveOptimization and crossgen/NativeAOT (NativeAOT needs #83043 for that)

We recommend users the new APIs we introduced e.g. OperatingSystem.IsWindows which don't need any JIT help, but it seems we still have plenty of users of the old API, see https://grep.app/search?q=IsOSPlatform&words=true&filter[lang][0]=C%23

UPD: there are some improvements from using GT_ARR_LEN instead of GT_IND in importervectorization.cpp for string's length. It's needed so VN can fold GT_ARR_LEN(FROZEN_OBJ) later.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 11, 2023
@ghost ghost assigned EgorBo Mar 11, 2023
@ghost
Copy link

ghost commented Mar 11, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Partially fixes #738 where this API was acknowledged as not JIT friendly which I took personally 😄 This PR is based on previous PRs to make this happen, namely #78736 and #80431

static bool IsLinux() => RuntimeInformation.IsOSPlatform(OSPlatform.Linux);

Old codegen (pre .NET 8.0):

; Assembly listing for method Program:IsLinux():bool
       4883EC28             sub      rsp, 40
       FF15FE642200         call     [System.Runtime.InteropServices.OSPlatform:get_Linux()]
       488BC8               mov      rcx, rax
       FF1545A02300         call     [System.OperatingSystem:IsOSPlatform(System.String):bool]
       90                   nop      
       4883C428             add      rsp, 40
       C3                   ret      
; Total bytes of code 25

New codegen (I'm on Windows):

; Assembly listing for method Program:IsLinux():bool
       33C0                 xor      eax, eax
       C3                   ret      
; Total bytes of code 3

Unfortunately, this only works if the operating system is different from current, otherwise the codegen is:

; Assembly listing for method Program:IsLinux():bool
       48B85C9272B42C020000 mov      rax, 0x22CB472925C
       48BA2000200020002000 mov      rdx, 0x20002000200020
       480B10               or       rdx, qword ptr [rax]
       48B8770069006E006400 mov      rax, 0x64006E00690077
       4833C2               xor      rax, rdx
       48BA629272B42C020000 mov      rdx, 0x22CB4729262
       48B92000200020002000 mov      rcx, 0x20002000200020
       480B0A               or       rcx, qword ptr [rdx]
       48BA64006F0077007300 mov      rdx, 0x730077006F0064
       4833D1               xor      rdx, rcx
       480BC2               or       rax, rdx
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
       C3                   ret      
; Total bytes of code 82

It happens because of some phase ordering issue in JIT, namely JIT unrolls string.Equals("Windows", staticReadonlyField) early in importer and then is unable to fold whatever vectorizer produced -- it can be done actually, but it needs a new JIT-EE API - I might add it separately if it's worth it

Also, this only works for Tier1, so it won't be folded for TC=0/AggressiveOptimization and crossgen/NativeAOT (NativeAOT needs #83043 for that)

We recommend users the new APIs we introduced e.g. OperatingSystem.IsWindows which don't need any JIT help, but it seems we still have plenty of users of the old API, see https://grep.app/search?q=IsOSPlatform&words=true&filter[lang][0]=C%23

UPD: there are some improvements from using GT_ARR_LEN instead of GT_IND in importervectorization.cpp for string's length. It's needed so VN can fold GT_ARR_LEN(FROZEN_OBJ) later.

Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo
Copy link
Member Author

EgorBo commented Mar 12, 2023

A couple of diffs

The main change is to use GT_ARR_LEN for string length during unrolling. That is then folded to a constant in VN after we fold static readonly struct {str} to a frozen object handle.

Ideally, IND(Add(frozenObj, CNS)) should be foldable too but it needs a new JIT-EE api, something like:

bool readObjectValue(CORINFO_OBJECT_HANDLE obj, uint8_t *buffer, int bufferSize, int valueOffset);

(for immutable objets)

@dotnet/jit-contrib PTAL, simple change.

BasicBlock* remainderBlock = fgSplitBlockAfterStatement(block, stmt);
BasicBlockFlags propagateFlagsToRemainder = block->bbFlags & BBF_GC_SAFE_POINT;
// Conservatively mark all blocks as having IDX_LEN since we don't know where exactly it went
BasicBlockFlags propagateFlagsToAll = block->bbFlags & BBF_HAS_IDX_LEN;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use BBF_COPY_PROPAGATE

Copy link
Member Author

@EgorBo EgorBo Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use BBF_COPY_PROPAGATE

That will decrease throughput at this moment for no benefit - e.g. we'll mark 4 blocks with "has nullcheck" and will have to visit them all in earlyprop. Ideally, QmarkGenTree should store BB flags it needs to propagate IMO. I propose we add flags on demand here since we use QMARK only in a few specific cases (none of them need to propagate flags atm except string-unrolling)

Copy link
Member

@jakobbotsch jakobbotsch Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How bad is it?

If that's the case I think we should have comment about it. Also, can you add asserts that check for GT_NULLCHECK, GT_MDARR_LENGTH, GT_ALLOCOBJ at least then (can use gtTreeContainsOper for this)?

Copy link
Member Author

@EgorBo EgorBo Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add asserts that check for GT_NULLCHECK, GT_MDARR_LENGTH, GT_ALLOCOBJ at least then (can use gtTreeContainsOper for this)?

It will assert anyway if we have a block without e.g. HAS_NULLCHECK with an actuall nullcheck, so no need in an extra assert - if someone adds a new kind of qmark with a need to propagate a new block flag (prior my PR it was never needed) they will hit it. The extra regression from BBF_COPY_PROPAGATE is another +0.02% or so

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will leave it up to you -- although I don't think 0.02% is that much to have this be always correct

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not urgent, let's have a CI run with BBF_COPY_PROPAGATE

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one suggestion

@EgorBo EgorBo merged commit e07a59e into dotnet:main Mar 13, 2023
@EgorBo EgorBo deleted the fold-isosplatform branch March 13, 2023 18:11
@ghost ghost locked as resolved and limited conversation to collaborators Apr 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intrinsify RuntimeInformation.IsOSPlatform
2 participants