Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Unsafe.BitCast for Interlocked.{Compare}Exchange of float/double #87166

Merged
merged 7 commits into from
Jun 7, 2023

Conversation

huoyaoyuan
Copy link
Member

Codegen for coreclr:

        private static float x;
        private static int i;

        public float Variable(float a, float b) => Interlocked.CompareExchange(ref x, a, b);

        public float Constant() => Interlocked.CompareExchange(ref x, 1.0f, 2.0f);

        public float IntField() => Interlocked.CompareExchange(ref Unsafe.As<int, float>(ref i), 1.0f, 2.0f);
; Method CSPlayground.Program:Variable(float,float):float:this
G_M23030_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M23030_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D428B8      ; data for CSPlayground.Program:x
       vmovd    ecx, xmm1
       vmovd    eax, xmm2
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=26 bbWeight=1 PerfScore 24.25

G_M23030_IG03:  ;; offset=001DH
       ret      
; Method CSPlayground.Program:Constant():float:this
G_M63042_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M63042_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D328B8      ; data for CSPlayground.Program:x
       mov      ecx, 0x3F800000
       mov      eax, 0x40000000
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=28 bbWeight=1 PerfScore 20.75

G_M63042_IG03:  ;; offset=001FH
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
; Method CSPlayground.Program:IntField():float:this
G_M13357_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M13357_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D328BC      ; data for CSPlayground.Program:i
       mov      ecx, 0x3F800000
       mov      eax, 0x40000000
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=28 bbWeight=1 PerfScore 20.75

G_M13357_IG03:  ;; offset=001FH
       ret      

I don't know how to see codegen for mono/nativeaot, but as long as they implement Unsafe.As/BitCast correctly this would be very simple.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jun 6, 2023
@ghost
Copy link

ghost commented Jun 6, 2023

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

Codegen for coreclr:

        private static float x;
        private static int i;

        public float Variable(float a, float b) => Interlocked.CompareExchange(ref x, a, b);

        public float Constant() => Interlocked.CompareExchange(ref x, 1.0f, 2.0f);

        public float IntField() => Interlocked.CompareExchange(ref Unsafe.As<int, float>(ref i), 1.0f, 2.0f);
; Method CSPlayground.Program:Variable(float,float):float:this
G_M23030_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M23030_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D428B8      ; data for CSPlayground.Program:x
       vmovd    ecx, xmm1
       vmovd    eax, xmm2
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=26 bbWeight=1 PerfScore 24.25

G_M23030_IG03:  ;; offset=001DH
       ret      
; Method CSPlayground.Program:Constant():float:this
G_M63042_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M63042_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D328B8      ; data for CSPlayground.Program:x
       mov      ecx, 0x3F800000
       mov      eax, 0x40000000
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=28 bbWeight=1 PerfScore 20.75

G_M63042_IG03:  ;; offset=001FH
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
; Method CSPlayground.Program:IntField():float:this
G_M13357_IG01:  ;; offset=0000H
       vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00

G_M13357_IG02:  ;; offset=0003H
       mov      rdx, 0x7FF8C0D328BC      ; data for CSPlayground.Program:i
       mov      ecx, 0x3F800000
       mov      eax, 0x40000000
       lock     
       cmpxchg  dword ptr [rdx], ecx
       vmovd    xmm0, eax
						;; size=28 bbWeight=1 PerfScore 20.75

G_M13357_IG03:  ;; offset=001FH
       ret      

I don't know how to see codegen for mono/nativeaot, but as long as they implement Unsafe.As/BitCast correctly this would be very simple.

Author: huoyaoyuan
Assignees: -
Labels:

area-System.Threading, community-contribution

Milestone: -

/// <param name="comparand">The value that is compared to the value at <paramref name="location1"/>.</param>
/// <returns>The original value in <paramref name="location1"/>.</returns>
/// <exception cref="NullReferenceException">The address of <paramref name="location1"/> is a null pointer.</exception>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggressiveInlining wasn't shown necessary in my first explorations, added just for consistency with uint/ulong ones.

@huoyaoyuan
Copy link
Member Author

Codegen for ARM64:

        public float Constant() => Interlocked.CompareExchange(ref x, 1.0f, 2.0f);

        public int ConstantInteger() => Interlocked.CompareExchange(ref i, 1, 2);
; Method CSPlayground.Program:ConstantInteger():int:this
G_M53319_IG01:  ;; offset=0000H
            stp     fp, lr, [sp, #-0x10]!
            mov     fp, sp
						;; size=8 bbWeight=1 PerfScore 1.50

G_M53319_IG02:  ;; offset=0008H
            movz    x0, #0x28BC      // data for CSPlayground.Program:i
            movk    x0, #0xC0D2 LSL #16
            movk    x0, #0x7FF8 LSL #32
            mov     w1, #1
            mov     w2, #2
            casal   w2, w1, [x0]
            mov     w0, w2
						;; size=28 bbWeight=1 PerfScore 6.00

G_M53319_IG03:  ;; offset=0024H
            ldp     fp, lr, [sp], #0x10
            ret     lr
						;; size=8 bbWeight=1 PerfScore 2.00
; Total bytes of code: 44
; Method CSPlayground.Program:Constant():float:this
G_M63042_IG01:  ;; offset=0000H
            stp     fp, lr, [sp, #-0x10]!
            mov     fp, sp
						;; size=8 bbWeight=1 PerfScore 1.50

G_M63042_IG02:  ;; offset=0008H
            movz    x0, #0x28B8      // data for CSPlayground.Program:x
            movk    x0, #0xC0D3 LSL #16
            movk    x0, #0x7FF8 LSL #32
            mov     w1, #0x3F800000
            mov     w2, #0x40000000
            casal   w2, w1, [x0]
            fmov    s0, w2
						;; size=28 bbWeight=1 PerfScore 6.50

G_M63042_IG03:  ;; offset=0024H
            ldp     fp, lr, [sp], #0x10
            ret     lr
						;; size=8 bbWeight=1 PerfScore 2.00
; Total bytes of code: 44

I don't really understand ARM asm, but since the patterns are almost the same, I think there's no issue.

@vargaz
Copy link
Contributor

vargaz commented Jun 6, 2023

The mono changes look ok.

@EgorBo
Copy link
Member

EgorBo commented Jun 6, 2023

I'd check codegen for x86/arm32 for the double one.
Although, should be fine

@huoyaoyuan
Copy link
Member Author

Double codegen for x86:

G_M6963_IG01:  ;; offset=0000H
       push     ebp
       mov      ebp, esp
       sub      esp, 32
       vzeroupper 
						;; size=9 bbWeight=1 PerfScore 2.50

G_M6963_IG02:  ;; offset=0009H
       vmovsd   xmm0, qword ptr [ebp+10H]
       vmovsd   qword ptr [ebp-08H], xmm0
       mov      ecx, dword ptr [ebp-08H]
       mov      eax, dword ptr [ebp-04H]
       vmovsd   xmm0, qword ptr [ebp+08H]
       vmovsd   qword ptr [ebp-10H], xmm0
       push     eax
       push     ecx
       push     dword ptr [ebp-0CH]
       push     dword ptr [ebp-10H]
       mov      ecx, 0x7FF8BFC728B8      ; data for CSPlayground.Program:x
       call     System.Threading.Interlocked:CompareExchange(byref,long,long):long
       mov      dword ptr [ebp-18H], eax
       mov      dword ptr [ebp-14H], edx
       vmovsd   xmm0, qword ptr [ebp-18H]
       vmovsd   qword ptr [ebp-20H], xmm0
       fld      qword ptr [ebp-20H]
						;; size=63 bbWeight=1 PerfScore 21.75

G_M6963_IG03:  ;; offset=0048H
       mov      esp, ebp
       pop      ebp
       ret      16
						;; size=6 bbWeight=1 PerfScore 2.75
; Method CSPlayground.Program:Constant():double:this
G_M36327_IG01:  ;; offset=0000H
       push     ebp
       mov      ebp, esp
       sub      esp, 16
       vzeroupper 
						;; size=9 bbWeight=1 PerfScore 2.50

G_M36327_IG02:  ;; offset=0009H
       push     0x3FF00000
       push     0
       push     0x40000000
       push     0
       mov      ecx, 0x7FF8BFC528B8      ; data for CSPlayground.Program:x
       call     System.Threading.Interlocked:CompareExchange(byref,long,long):long
       mov      dword ptr [ebp-08H], eax
       mov      dword ptr [ebp-04H], edx
       vmovsd   xmm0, qword ptr [ebp-08H]
       vmovsd   qword ptr [ebp-10H], xmm0
       fld      qword ptr [ebp-10H]
						;; size=43 bbWeight=1 PerfScore 11.75

G_M36327_IG03:  ;; offset=0034H
       mov      esp, ebp
       pop      ebp
       ret      
						;; size=4 bbWeight=1 PerfScore 1.75
; Total bytes of code: 56

The call to CompareExchange isn't inlined, since it's an FCall?

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

@EgorBo
Copy link
Member

EgorBo commented Jun 6, 2023

CI is not feeling well

The call to CompareExchange isn't inlined, since it's an FCall?

yes, that's expected

@MichalPetryka
Copy link
Contributor

I'd check codegen for x86/arm32 for the double one. Although, should be fine

double codegen will likely change with #85562 since that makes it partially handled as an intrinsic in bitcast.

@jkotas jkotas merged commit 1ade285 into dotnet:main Jun 7, 2023
@huoyaoyuan huoyaoyuan deleted the interlocked-bitcast branch June 8, 2023 01:21
@ghost ghost locked as resolved and limited conversation to collaborators Jul 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Threading community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants