FilteredByLevel_InsideScope benchmark has regressed #59267
Tagging subscribers to this area: @maryamariyan
After running a git bisect, #41897 appears to be the offending commit.
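For readers following along, a bisect like this is typically driven by rebuilding the runtime and re-running the benchmark at each step. A rough sketch of such a session is below; the SHAs and the build command are placeholders, not the exact ones used here.

```
# Hypothetical bisect session in a dotnet/runtime clone; the SHAs and build
# command are placeholders, not the exact ones used for this investigation.
git bisect start
git bisect bad HEAD                        # current commit shows the regression
git bisect good <last-known-good-commit>   # e.g. a commit from before the suspect range
# At each commit git checks out:
#   1. rebuild the runtime (e.g. .\build.cmd clr+libs -c Release)
#   2. re-run the FilteredByLevel_InsideScope benchmark against that build
#   3. mark the commit based on the measured time:
git bisect good        # or: git bisect bad
# Repeat until git prints the first bad commit, then clean up:
git bisect reset
```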
@jkoritzinsky @AndyAyersMS do you think this regression is related to #41897?
As part of that issue, our PGO data got out of date, so I would not be amazed to have seen a regression then. However, now that we've collected new PGO data for coreclr and clrjit, I would expect that any regression from #41897 would have been fixed once we finally got the PGO collection pipeline back online.
@DrewScoggins could you please validate whether the regression in this issue is gone after the PGO fix?
Seems like we need more analysis here. Does anyone have a profile indicating where this test spends its time? I can't tell offhand if it is in jitted code or the native parts of the runtime. PGO information got updated in January, so if this was PGO related it would have cleared up long ago. So it is probably something else.
@jkoritzinsky I'll assign this issue to you since, as remarked before, it is a regression caused by #41897. Please let us know if you think anything here is related to logging. Thanks!
Tagging subscribers to this area: @dotnet/runtime-infrastructure
There's nothing related to logging in that PR. It was a large infrastructure change to use a different native build orchestrator (Ninja instead of MSBuild); there are little to no managed build changes there. I agree with @AndyAyersMS that we'll need a trace showing where the benchmark spends its time to be able to dig into why there is still a regression today.
@jkoritzinsky Is the action on this in your court, or do you need this trace from someone else?
I can try to get the trace, but I'd prefer if someone who is more familiar with how we have the process set up to run against a recent runtime build could collect it instead.
@adamsitnik Can you help out with getting this started please?
I can get some traces collected and take a first cut at them.
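For reference, one way such a trace could be collected on Windows is BenchmarkDotNet's ETW profiler, run from the MicroBenchmarks project in dotnet/performance. The sketch below is illustrative; the target framework and paths are assumptions rather than the exact setup used for the traces in the next comment.

```
# Illustrative only: run the single benchmark under BenchmarkDotNet's ETW
# profiler so the resulting .etl shows whether time goes to jitted code or
# to native runtime code (coreclr.dll).
git clone https://github.com/dotnet/performance.git
cd performance\src\benchmarks\micro
dotnet run -c Release -f net6.0 -- `
    --filter Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope `
    --profiler ETW
# The trace files land under BenchmarkDotNet.Artifacts and can be opened in PerfView.
```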
So I have collected traces, and you can find them here along with copies of the runtime used to generate them, including PDBs: https://pvscmdupload.blob.core.windows.net/drewtest/59267.zip

My initial investigation shows that the regression is coming from more time spent in native code in coreclr.dll. I am not certain why this orchestration change caused that. My best guess is that we are no longer passing the same set of parameters to the native compiler, and as such are getting different code out. Maybe someone else can speak with more authority on that.

Baseline Disasm:

Compare Disasm:
From my selfish perspective the good news is that this regression apparently hasn't been triggered by my recent test build script cleanup change ;-). Other than that, these days I spend hours poring over iDNA analyses of CoreCLR runtime behavior and I'm trying to use them to better understand CoreCLR runtime startup costs; from that perspective these are certainly super-interesting pointers, thanks Drew! And have a great weekend too, by the way. Tomas
Thanks, @DrewScoggins! @trylek -- does this give you any concern for 6.0 GA, or can we go ahead and move this to the 7.0.0 milestone for further investigation?
Frankly speaking, I don't think we have too much wiggle room. Naturally, putting new features in incurs some perf penalties, and making code paths in the CoreCLR runtime more complex always gnaws at our perf; I recall David Wrighton had to pull off a couple of tricks to compensate for the runtime slowdown due to the newly introduced static virtual methods. The results of Drew's investigation seem to point at subtle changes in JIT behavior, so I guess someone like @AndyAyersMS should chime in as to what is expected vs. unexpected. My personal belief is that .NET 6 pulled off a bunch of cool tricks to boost perf in spite of getting more complex, and the last time I heard, that was also reflected in various benchmarks and third-party developer findings, so from that perspective I believe it's reasonable to postpone completion of the perf regression investigation to .NET 7.
Thanks! I'll move this to 7.0.0 and put it into the codegen area for further triage/investigation. If this is a recognized but acceptable trade-off, that's perfectly fine.
Tagging subscribers to this area: @JulieLeeMSFT
That matches my understanding unless Andy Ayers notices something unexpected w.r.t. JIT behavior.
The only thing I would add is that this change is in native code (coreclr.dll). I am a little suspicious that what looks like different codegen from the native compiler happened just after a big orchestrator change. I assume it was checked during the change, but the flags passed to the native compiler didn't change, right?
I did a little more digging this morning. If you look at the options we are using to compile coreclr.dll, they are different before and after this change. The key difference is the optimization flags that get passed.

Baseline:

Compare:
That change looks to be unintentional. IIRC, there were multiple conflicting optimization flags in the old build (/Ox was specified by CMake, /O1 by our build). The CMake generator for Ninja would put all of the optimization flags on the command line (including the conflicting ones), but the CMake generator for MSBuild looks to always pick the last specified one. Looks like when I was resolving this, I accidentally picked the wrong setting (what the command line from the Ninja and NMake generators would use instead of what the MSBuild generator used). |
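One hypothetical way to confirm which /O switches actually reach cl.exe under each generator is to have CMake record the full compiler command lines. The sketch below uses illustrative paths and is not the exact invocation the runtime build scripts perform.

```
# Sketch: configure the native build with compile_commands.json export (the
# Ninja generator supports it) and search the recorded cl.exe invocations for
# optimization switches to see which /O flags are actually in effect.
cmake -S src\coreclr -B artifacts\obj\coreclr -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
findstr /C:"/O1" /C:"/O2" /C:"/Ox" artifacts\obj\coreclr\compile_commands.json
# The Visual Studio generator does not emit compile_commands.json; building with
# "msbuild <project> -v:detailed" shows which of any conflicting /O flags wins there.
```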
Per Drew's further analysis, this looks like a change in the way a native part of the Windows runtime was optimized, either directly from the flags change above, or indirectly via the impact of that change in conjunction with PGO. @jkoritzinsky does this change impact the entire set of runtime files?

Per the MSVC docs, /O1 and /O2 are composite switches that expand to different sets of individual optimization flags, so the main difference comes down to which of those individual flags the build actually ends up with; we explicitly set some of them ourselves, so perhaps we lost one we were relying on. At any rate I don't have a good explanation for this, but it seems like we ought to look into changing the opt options back to how they were.
This setting does impact the entire set of runtime files. I tried changing the options back and doing a local run, and I didn't see an appreciable performance improvement across the board. I saw a few of these benchmarks get better and a few get worse (some significantly, like 20% worse). I don't have enough confidence that reverting the optimization change will substantially improve the benchmarks without causing issues in others. Here are my results:

main:

Using /O2 across the board:

Using /O1 across the board:
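For anyone reproducing this kind of side-by-side comparison, the ResultsComparer tool in dotnet/performance can diff two folders of BenchmarkDotNet JSON exports. The sketch below uses placeholder artifact paths.

```
# Sketch: diff a baseline run against a run built with the changed /O flags.
# The result folders are placeholders for wherever the BenchmarkDotNet JSON
# exports from the two runs were written.
cd performance\src\tools\ResultsComparer
dotnet run -c Release -- --base C:\perf\results\main --diff C:\perf\results\O1-everywhere --threshold 2%
```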
@AndyAyersMS any suggestions on next steps here?
Recent perf history shows we've regained most of the missing perf (at least on this one test), so if it were me I'd just close this. FWIW, the big drop in March 2022 seems to have come from #65926.
I'll close this then.
It seems to be affecting only Windows (-10% on average). Was detected in DrewScoggins/performance-2#3245 but not reported in the runtime repo (cc @DrewScoggins).

Repro:

```
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net5.0 net6.0 --filter Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope
```

Affected benchmark: Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope(HasISupportLoggingScopeLogger: False, CaptureScopes: True)

Results history: https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_x64_Windows%2010.0.18362%2fMicrosoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope(HasISupportLoggingScopeLogger%3a%20False%2c%20CaptureScopes%3a%20True).html