Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid WriteBarrier assign for nulling struct #7659

Conversation

benaadams
Copy link
Member

@benaadams benaadams commented Feb 16, 2019

Its default in the .ctors so only needs the two fields set; Uninitalize sets it to default so Initalize again only needs to set the two fields and doesn't need to blank the Cache field.

Workaround for https://github.com/dotnet/coreclr/issues/22661 to skip 5 Jit_ByRefWriteBarriers (and 10 more in called methods + conditionally 4 more; so eliminates up to 19 CORINFO_HELP_ASSIGN_BYREF per request)

Jit_ByRefWriteBarrier shows up at around 2% of total run time in the TE benchmarks from this entry point:

image

+5% in TE Plaintext benchmarks without the stack clear and the additional skips of the interface lookup for revision in this PR.

| Description |       RPS | CPU (%) | Memory (MB) | Avg. Latency (ms) | Ratio |
| ----------- | --------- | ------- | ----------- | ----------------- | ----- |
|     Current | 2,183,750 |      93 |         171 |               3.9 |  1.00 |
|          PR | 2,289,737 |      90 |         169 |               2.8 |  1.05 |

/cc @davidfowl

/cc @kkokosa as I read about these in a book recently :)

@benaadams
Copy link
Member Author

Changes from this more expensive asm complete with rep stosd stack clearing

G_M56063_IG01:
       57                   push     rdi
       56                   push     rsi
       55                   push     rbp
       53                   push     rbx
       4881ECA8000000       sub      rsp, 168
       C5F877               vzeroupper 
       488BF1               mov      rsi, rcx
       488D7C2428           lea      rdi, [rsp+28H]
       B920000000           mov      ecx, 32
       33C0                 xor      rax, rax
       F3AB                 rep stosd 
       488BCE               mov      rcx, rsi
       488BD9               mov      rbx, rcx
       488BEA               mov      rbp, rdx

G_M56063_IG02:
       48896C2468           mov      gword ptr [rsp+68H], rbp
       488D4C2478           lea      rcx, bword ptr [rsp+78H]
       C5F857C0             vxorps   xmm0, xmm0
       C5FA7F01             vmovdqu  qword ptr [rcx], xmm0
       C5FA7F4110           vmovdqu  qword ptr [rcx+16], xmm0
       C5FA7F4120           vmovdqu  qword ptr [rcx+32], xmm0
       488BCD               mov      rcx, rbp
       49BBD811C8D9F87F0000 mov      r11, 0x7FF8D9C811D8
       3909                 cmp      dword ptr [rcx], ecx
       FF15CF52A3FF         call     [IFeatureCollection:get_Revision():int:this]
       89442470             mov      dword ptr [rsp+70H], eax
       488D7B38             lea      rdi, bword ptr [rbx+56]
       488D742468           lea      rsi, bword ptr [rsp+68H]
       E8E5FB675F           call     CORINFO_HELP_ASSIGN_BYREF
       48A5                 movsq    
       E8DEFB675F           call     CORINFO_HELP_ASSIGN_BYREF
       E8D9FB675F           call     CORINFO_HELP_ASSIGN_BYREF
       E8D4FB675F           call     CORINFO_HELP_ASSIGN_BYREF
       E8CFFB675F           call     CORINFO_HELP_ASSIGN_BYREF
       E8CAFB675F           call     CORINFO_HELP_ASSIGN_BYREF 
       E8C5FB675F           call     CORINFO_HELP_ASSIGN_BYREF

To this more svelte number

G_M56075_IG01:
       57                   push     rdi
       56                   push     rsi
       55                   push     rbp
       53                   push     rbx
       4883EC28             sub      rsp, 40
       488BF1               mov      rsi, rcx
       488BFA               mov      rdi, rdx

G_M56075_IG02:
       3936                 cmp      dword ptr [rsi], esi
       488D5E38             lea      rbx, bword ptr [rsi+56]
       488BCF               mov      rcx, rdi
       49BBD81109D7F87F0000 mov      r11, 0x7FF8D70911D8
       3909                 cmp      dword ptr [rcx], ecx
       FF15DF4FA4FF         call     [IFeatureCollection:get_Revision():int:this]
       894308               mov      dword ptr [rbx+8], eax
       488BCB               mov      rcx, rbx
       488BD7               mov      rdx, rdi
       E829F8685F           call     CORINFO_HELP_CHECKED_ASSIGN_REF

@benaadams benaadams force-pushed the Work-around-WriteBarrier-assign-for-nulling-struct branch from b58ef2c to e815690 Compare February 17, 2019 00:26
@benaadams
Copy link
Member Author

benaadams commented Feb 17, 2019

And finally down to one interface call to IFeatureCollection.Revision for Initialize !
(Rather than 3 - 5 calls)

; Assembly listing for method DefaultHttpContext:Initialize(ref):this
; Lcl frame size = 32

G_M56075_IG01:
       57                   push     rdi
       56                   push     rsi
       53                   push     rbx
       4883EC20             sub      rsp, 32
       488BF1               mov      rsi, rcx
       488BFA               mov      rdi, rdx

G_M56075_IG02:
       488BCF               mov      rcx, rdi
       49BBD8113AD7F87F0000 mov      r11, 0x7FF8D73A11D8
       3909                 cmp      dword ptr [rcx], ecx
       FF157654A4FF         call     [IFeatureCollection:get_Revision():int:this]
       8BD8                 mov      ebx, eax
       3936                 cmp      dword ptr [rsi], esi
       488D4E38             lea      rcx, bword ptr [rsi+56]
       895908               mov      dword ptr [rcx+8], ebx
       488BD7               mov      rdx, rdi
       E8BBFC695F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       488B5608             mov      rdx, gword ptr [rsi+8]
       8B0A                 mov      ecx, dword ptr [rdx]
       488D4A18             lea      rcx, bword ptr [rdx+24]
       488B5210             mov      rdx, gword ptr [rdx+16]
       8B02                 mov      eax, dword ptr [rdx]
       4883C238             add      rdx, 56
       488B12               mov      rdx, gword ptr [rdx]
       895908               mov      dword ptr [rcx+8], ebx
       E89CFC695F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       488B5610             mov      rdx, gword ptr [rsi+16]
       8B0A                 mov      ecx, dword ptr [rdx]
       488D4A10             lea      rcx, bword ptr [rdx+16]
       488B5208             mov      rdx, gword ptr [rdx+8]
       8B02                 mov      eax, dword ptr [rdx]
       4883C238             add      rdx, 56
       488B12               mov      rdx, gword ptr [rdx]
       895908               mov      dword ptr [rcx+8], ebx
       E87DFC695F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       488B4E18             mov      rcx, gword ptr [rsi+24]
       4885C9               test     rcx, rcx
       740F                 je       SHORT G_M56075_IG04

G_M56075_IG03:
       4883C108             add      rcx, 8
       895908               mov      dword ptr [rcx+8], ebx
       488BD7               mov      rdx, rdi
       E865FC695F           call     CORINFO_HELP_CHECKED_ASSIGN_REF

G_M56075_IG04:
       488B4E20             mov      rcx, gword ptr [rsi+32]
       4885C9               test     rcx, rcx
       7508                 jne      SHORT G_M56075_IG06

G_M56075_IG05:
       4883C420             add      rsp, 32
       5B                   pop      rbx
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret      

G_M56075_IG06:
       4883C108             add      rcx, 8
       895908               mov      dword ptr [rcx+8], ebx
       488BD7               mov      rdx, rdi
       E845FC695F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       90                   nop      

G_M56075_IG07:
       4883C420             add      rsp, 32
       5B                   pop      rbx
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret      

; Total bytes of code 180, prolog size 7 for method DefaultHttpContext:Initialize(ref):this

@benaadams
Copy link
Member Author

Pre

image

Post

image

Pre (7 Revision calls per request)

image

Post (5 Revision calls per request)

image

@benaadams
Copy link
Member Author

Should take a big chunk out of the 3.7% time now rather than just the 2% time of the WriteBarrier

image

@benaadams benaadams force-pushed the Work-around-WriteBarrier-assign-for-nulling-struct branch from e9b578f to e815690 Compare February 17, 2019 02:49
@benaadams benaadams force-pushed the Work-around-WriteBarrier-assign-for-nulling-struct branch from e815690 to be3a6ba Compare February 17, 2019 02:57
@benaadams
Copy link
Member Author

Squashed, so it was always a perfect change 😉

@davidfowl davidfowl merged commit 4646ed5 into dotnet:master Feb 17, 2019
@benaadams benaadams deleted the Work-around-WriteBarrier-assign-for-nulling-struct branch February 17, 2019 04:10
@benaadams benaadams changed the title Work around WriteBarrier assign for nulling struct Avoid WriteBarrier assign for nulling struct Feb 18, 2019
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants