JIT: more unexpected dynamic PGO schema mismatches #85856
…e data Always try and merge "branch to next" blocks when building the initial flow graph if BBINSTR or BBOPT is set. Fixes dotnet#85856.
Recollecting ASP.NET now that #85960 is in -- will see if I can get through that collection without hitting more cases.
Still seeing cases of this -- when
The JIT will sometimes decide to instrument a Tier0 method even if `BBINSTR` is not passed by the VM (this is enabled when the VM passes `BBINSTR_IF_LOOPS` so that we can provide some PGO data to OSR methods). In such cases we build the flow graph and then decide to instrument, so the flow graph shape may differ from the case where we know up front that we are going to instrument or optimize. Remedy this by also running the early branch to next flow graph opt when a Tier0 JIT is passed `BBINSTR_IF_LOOPS`. Addresses another case of dotnet#85856.
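As a rough sketch of the condition this change describes (not the actual JIT code): the early "branch to next" opt runs when we know up front that we are instrumenting or optimizing, and now also for a Tier0 JIT that is passed `BBINSTR_IF_LOOPS`. The struct `ToyJitFlags` and the function `RunsEarlyBranchToNextOpt` are hypothetical names invented for illustration; only the flag names come from the thread.

```cpp
// Hypothetical stand-in for the flags the VM passes to the JIT.
// Field names mirror the flags mentioned in the thread; the struct
// itself is an assumption, not a real JIT type.
struct ToyJitFlags
{
    bool bbInstr;        // BBINSTR: instrument this method
    bool bbOpt;          // BBOPT: optimize using profile data
    bool tier0;          // this is a Tier0 compilation
    bool bbInstrIfLoops; // BBINSTR_IF_LOOPS: the JIT may decide to instrument later
};

// Returns true when the flow graph should be built with the early
// "branch to next" opt applied, so that its shape matches the shape
// seen when instrumentation or optimization is known up front.
constexpr bool RunsEarlyBranchToNextOpt(const ToyJitFlags& f)
{
    return f.bbInstr || f.bbOpt || (f.tier0 && f.bbInstrIfLoops);
}
```

The point of the extra term is that a Tier0 method which might be instrumented after the flow graph is built gets the same initial graph shape as one that is instrumented from the start.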
I recollected ASP.NET after #85873 and don't see any more mismatches. Will wait for the normal collection (which kicks off today) to refresh the others and then see if they're all clean too. If so then I'll put up the PR to turn on assertion checking for mismatches.
Enabled the assert in #85898 so think we're in good shape now.
I had hoped that #85805 would have fixed all the cases where we have dynamic PGO data available but can't match up the schema with the current flow graph and so throw away perfectly good PGO data.
But that's not the case. The initial flow graph for a method may be slightly different, depending on whether or not the method is an inlinee. One culprit is this bit of code (there may be more I haven't spotted yet):
runtime/src/coreclr/jit/fgbasic.cpp
Lines 2975 to 2980 in e61e022
So for inlinees only, we will not break blocks because of jump to next. This causes divergence like the following:
We have been relying on the fact that if we build our instrumentation plan super early we always get the same flow graph.
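To make the divergence concrete, here is a toy model (not the JIT's flow graph builder, and not the example from the original issue) of how treating a "branch to next" differently changes where blocks start. `ToyInstr` and `FindBlockStarts` are hypothetical names; the only idea taken from the thread is that one mode does not break blocks at a jump to the next instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction: either falls through or branches to `target`,
// which is an index into the instruction vector.
struct ToyInstr
{
    bool        isBranch;
    std::size_t target; // only meaningful when isBranch is true
};

// Returns the indices of the instructions that start a basic block.
// When ignoreBranchToNext is true, a branch whose target is the very
// next instruction is treated like fall-through and creates no new
// block boundary (the inlinee-style behavior described above).
std::set<std::size_t> FindBlockStarts(const std::vector<ToyInstr>& code, bool ignoreBranchToNext)
{
    std::set<std::size_t> starts;
    if (code.empty())
    {
        return starts;
    }
    starts.insert(0); // the first instruction always starts a block

    for (std::size_t i = 0; i < code.size(); i++)
    {
        const ToyInstr& ins = code[i];
        if (!ins.isBranch)
        {
            continue;
        }
        assert(ins.target < code.size());

        const bool toNext = (ins.target == i + 1);
        if (toNext && ignoreBranchToNext)
        {
            continue; // no block break for a jump to next
        }

        starts.insert(ins.target); // a branch target starts a block
        if (i + 1 < code.size())
        {
            starts.insert(i + 1); // so does the instruction after a branch
        }
    }
    return starts;
}
```

For a four-instruction sequence whose second instruction branches to the third, the two modes produce different sets of block starts. Any per-block profile schema recorded against one layout then has no counterpart in the other, which is the kind of mismatch described in this issue.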
Seems like the pragmatic thing to do is to always do this optimization if we're optimizing or instrumenting, and not just for inlinees. There is matching logic in the importer and perhaps elsewhere so this should probably be encapsulated into a helper.
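A helper along these lines might look like the sketch below. Everything here is hypothetical (`ToyCompileInfo`, `MergeBranchToNextOld`, `MergeBranchToNextProposed`), and the "old" rule is simplified to just the inlinee check; the real condition in `fgbasic.cpp` may involve more than that.

```cpp
// Hypothetical stand-in for the state the JIT consults when building
// the initial flow graph. Not a real JIT type.
struct ToyCompileInfo
{
    bool isInlinee;     // compiling this method as an inline candidate
    bool optimizing;    // we are optimizing
    bool instrumenting; // we are instrumenting
};

// Simplified version of the behavior described above: only inlinees
// skip the block break at a jump to next.
constexpr bool MergeBranchToNextOld(const ToyCompileInfo& info)
{
    return info.isInlinee;
}

// Proposed rule: apply the opt whenever we are optimizing or
// instrumenting, whether or not the method is an inlinee.
constexpr bool MergeBranchToNextProposed(const ToyCompileInfo& info)
{
    return info.optimizing || info.instrumenting;
}
```

With the old rule a method gets a different answer as a root than as an inlinee; with the proposed rule the answer depends only on whether we are optimizing or instrumenting, so both views of the method agree. Having the flow graph builder and the importer call the same helper keeps the two in sync.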
Doing that will (temporarily) break the ability to ingest static PGO for root methods that have a branch to next like this, hopefully just a few of them. SPMI will have a number of diffs as well, covering both cases where we no longer read PGO data for root methods and cases where we do read PGO data for inlinees.