FilteredByLevel_InsideScope benchmark has regressed #59267
Tagging subscribers to this area: @maryamariyan
After running a git bisect, #41897 appears to be the offending commit.
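For readers following along, a bisect like this is typically driven by rebuilding the runtime and re-running the benchmark at each step. A rough sketch of such a session is below; the SHAs and the build command are placeholders, not the exact ones used here.

```
# Hypothetical bisect session in a dotnet/runtime clone; the SHAs and build
# command are placeholders, not the exact ones used for this investigation.
git bisect start
git bisect bad HEAD                        # current commit shows the regression
git bisect good <last-known-good-commit>   # e.g. a commit from before the suspect range
# At each commit git checks out:
#   1. rebuild the runtime (e.g. .\build.cmd clr+libs -c Release)
#   2. re-run the FilteredByLevel_InsideScope benchmark against that build
#   3. mark the commit based on the measured time:
git bisect good        # or: git bisect bad
# Repeat until git prints the first bad commit, then clean up:
git bisect reset
```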
@jkoritzinsky @AndyAyersMS do you think this regression is related to #41897?
As part of that issue, our PGO data got out of date, so I would not be amazed to have seen a regression then. However, now that we've collected new PGO data for coreclr and clrjit, I would expect that any regression from #41897 would have been fixed once we finally got the PGO collection pipeline back online.
@DrewScoggins could you please validate whether the regression in this issue is gone after the PGO fix?
Seems like we need more analysis here. Does anyone have a profile indicating where this test spends its time? I can't tell offhand if it is in jitted code or the native parts of the runtime. PGO information got updated in January, so if this was PGO related it would have cleared up long ago. So it is probably something else.
@jkoritzinsky I'll assign this issue to you since, as remarked before, it is a regression caused by #41897. Please let us know if you think anything here is related to logging. Thanks!
Tagging subscribers to this area: @dotnet/runtime-infrastructure
There's nothing related to logging in that PR. It was a large infrastructure change to use a different native build orchestrator (Ninja instead of MSBuild); there are little to no managed build changes there. I agree with @AndyAyersMS that we'll need a trace showing where the benchmark spends its time to be able to dig into why there is still a regression today.
@jkoritzinsky Is the action on this in your court, or do you need this trace from someone else?
I can try to get the trace, but I'd prefer if someone who is more familiar with how we have the process set up to run against a recent runtime build could collect it instead.
@adamsitnik Can you help out with getting this started please?
I can get some traces collected and take a first cut at them.
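For reference, one way such a trace could be collected on Windows is BenchmarkDotNet's ETW profiler, run from the MicroBenchmarks project in dotnet/performance. The sketch below is illustrative; the target framework and paths are assumptions rather than the exact setup used for the traces in the next comment.

```
# Illustrative only: run the single benchmark under BenchmarkDotNet's ETW
# profiler so the resulting .etl shows whether time goes to jitted code or
# to native runtime code (coreclr.dll).
git clone https://github.com/dotnet/performance.git
cd performance\src\benchmarks\micro
dotnet run -c Release -f net6.0 -- `
    --filter Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope `
    --profiler ETW
# The trace files land under BenchmarkDotNet.Artifacts and can be opened in PerfView.
```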
So I have collected traces, and you can find them here along with copies of the runtime used to generate them, including PDBs: https://pvscmdupload.blob.core.windows.net/drewtest/59267.zip

My initial investigation shows that the regression is coming from more time spent in native code in coreclr.dll. I am not certain why this orchestration change caused that. My best guess is that we are no longer passing the same set of parameters to the native compiler, and as such are getting different code out. Maybe someone else can speak with more authority on that.

Baseline Disasm:

Compare Disasm:
From my selfish perspective the good news is that this regression apparently hasn't been triggered by my recent test build script cleanup change ;-). Other than that, these days I spend hours poring over iDNA analyses of CoreCLR runtime behavior and I'm trying to use them to better understand CoreCLR runtime startup costs; from that perspective these are certainly super-interesting pointers, thanks Drew! And have a great weekend too, by the way. Tomas
Thanks, @DrewScoggins! @trylek -- does this give you any concern for 6.0 GA, or can we go ahead and move this to the 7.0.0 milestone for further investigation?
Frankly speaking, I don't think we have too much wiggle room. Naturally, putting new features in incurs some perf penalties, and making code paths in the CoreCLR runtime more complex always gnaws at our perf; I recall David Wrighton had to pull off a couple of tricks to compensate for the runtime slowdown due to the newly introduced static virtual methods. The results of Drew's investigation seem to point at subtle changes in JIT behavior, so I guess someone like @AndyAyersMS should chime in as to what is expected vs. unexpected. My personal belief is that .NET 6 pulled off a bunch of cool tricks to boost perf in spite of getting more complex, and the last time I heard, that was also reflected in various benchmarks and third-party developer findings, so from that perspective I believe it's reasonable to postpone completion of the perf regression investigation to .NET 7.
Thanks! I'll move this to 7.0.0 and put it into the codegen area for further triage/investigation. If this is a recognized but acceptable trade-off, that's perfectly fine.
Tagging subscribers to this area: @JulieLeeMSFT
That matches my understanding unless Andy Ayers notices something unexpected w.r.t. JIT behavior.
The only thing I would add is that this change is in native code (coreclr.dll). I am a little suspicious that what looks like different codegen from the native compiler happened just after a big orchestrator change. I assume it was checked during the change, but the flags passed to the native compiler didn't change, right?
I did a little more digging this morning. If you look at the options we are using to compile coreclr.dll, they are different before and after this change. The key difference is the optimization flags that get passed.

Baseline:

Compare:
That change looks to be unintentional. IIRC, there were multiple conflicting optimization flags in the old build (/Ox was specified by CMake, /O1 by our build). The CMake generator for Ninja would put all of the optimization flags on the command line (including the conflicting ones), but the CMake generator for MSBuild looks to always pick the last specified one. Looks like when I was resolving this, I accidentally picked the wrong setting (what the command line from the Ninja and NMake generators would use instead of what the MSBuild generator used). |
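One hypothetical way to confirm which /O switches actually reach cl.exe under each generator is to have CMake record the full compiler command lines. The sketch below uses illustrative paths and is not the exact invocation the runtime build scripts perform.

```
# Sketch: configure the native build with compile_commands.json export (the
# Ninja generator supports it) and search the recorded cl.exe invocations for
# optimization switches to see which /O flags are actually in effect.
cmake -S src\coreclr -B artifacts\obj\coreclr -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
findstr /C:"/O1" /C:"/O2" /C:"/Ox" artifacts\obj\coreclr\compile_commands.json
# The Visual Studio generator does not emit compile_commands.json; building with
# "msbuild <project> -v:detailed" shows which of any conflicting /O flags wins there.
```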
Per Drew's further analysis, this looks like a change in the way a native part of the Windows runtime was optimized, either directly from the flags change above, or indirectly via the impact of that change in conjunction with PGO. @jkoritzinsky does this change impact the entire set of runtime files?

Per the MSVC docs, /O1 and /O2 are composite switches that expand to different sets of individual optimization flags, so the main difference comes down to which of those individual flags the build actually ends up with; we explicitly set some of them ourselves, so perhaps we lost one we were relying on. At any rate I don't have a good explanation for this, but it seems like we ought to look into changing the opt options back to how they were.
This setting does impact the entire set of runtime files. I tried changing the options back and doing a local run, and I didn't see an appreciable performance improvement across the board. I saw a few of these benchmarks get better and a few get worse (some significantly, like 20% worse). I don't have enough confidence that reverting the optimization change will substantially improve the benchmarks without causing issues in others. Here are my results:

main:

Using /O2 across the board:

Using /O1 across the board:
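For anyone reproducing this kind of side-by-side comparison, the ResultsComparer tool in dotnet/performance can diff two folders of BenchmarkDotNet JSON exports. The sketch below uses placeholder artifact paths.

```
# Sketch: diff a baseline run against a run built with the changed /O flags.
# The result folders are placeholders for wherever the BenchmarkDotNet JSON
# exports from the two runs were written.
cd performance\src\tools\ResultsComparer
dotnet run -c Release -- --base C:\perf\results\main --diff C:\perf\results\O1-everywhere --threshold 2%
```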
@AndyAyersMS any suggestions on next steps here?
Recent perf history shows we've regained most of the missing perf (at least on this one test), so if it were me I'd just close this. FWIW, the big drop in March 2022 seems to have come from #65926.
I'll close this then.
It seems to be affecting only Windows (-10% on average). Was detected in DrewScoggins/performance-2#3245 but not reported in the runtime repo (cc @DrewScoggins).

Repro:

```
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net5.0 net6.0 --filter Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope
```

Affected benchmark: Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope(HasISupportLoggingScopeLogger: False, CaptureScopes: True)

Results history: https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_x64_Windows%2010.0.18362%2fMicrosoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope(HasISupportLoggingScopeLogger%3a%20False%2c%20CaptureScopes%3a%20True).html