Fix interaction of PGO and partial compilation #66101
Comments
Tagging subscribers to this area: @JulieLeeMSFT
Clean up properly if we fail to instrument because of a schema allocation failure. This fixes dotnet#65992. More work is needed to ensure schema allocation does not fail. This is tracked by dotnet#66101.
Future for now, as there aren't any specific plans yet on when we might try to enable partial compilation.
Enable edge-based profiles for OSR, partial compilation, and optimized plus instrumented cases.
For OSR this requires deferring flow graph modifications until after we have built the initial probe list, so that the initial list reflects the entirety of the method. This set of candidate edge probes is thus the same no matter how the method is compiled. A given compile may schematize a subset of these probes and materialize a subset of what gets schematized; this is tolerated by the PGO mechanism provided that the initial instrumented jitting produces a schema which is a superset of the schema produced by any subsequent instrumented rejitting. This is normally the case.
Partial compilation may still need some work to ensure full schematization, but it is currently off by default. Will address this subsequently.
For optimized compiles we give the EfficientEdgeCountInstrumentor the same kind of probe relocation abilities that we have in the BlockCountInstrumentor. In particular we need to move probes that might appear in return blocks that follow implicit tail call blocks, since those return blocks must remain empty. The details on how we do this are a bit different but the idea is the same: we create duplicate copies of any probe that was going to appear in the return block and instead instrument each pred. If the pred reached the return via a critical edge, we split the edge and put the probe there. This analysis relies on cheap preds, so to ensure we can use them we move all the critical edge splitting so it happens before we need the cheap pred lists.
The ability to do block profiling is retained but will no longer be used without special config settings.
There were also a few bug fixes in the spanning tree visitor. It must visit a superset of the blocks we end up importing and was missing visits in some cases.
This should improve jit time and code quality for instrumented code.
Fixes dotnet#47942. Fixes dotnet#66101. Contributes to dotnet#74873.
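The probe relocation described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the JIT's code: the flow graph is a plain dict of successor lists, the function name `relocate_return_probe` and the `split_*` block names are hypothetical, and the sketch splits an edge whenever the pred has other successors (which is the critical edge case once the return block has several preds).

```python
def relocate_return_probe(succs, ret):
    """Decide where copies of a probe go when `ret` must stay empty.

    `succs` maps each block name to its list of successors (hypothetical
    representation). Returns the updated successor map and the list of
    blocks that each receive one copy of the probe.
    """
    new_succs = {block: list(targets) for block, targets in succs.items()}
    preds = [block for block, targets in succs.items() if ret in targets]
    probe_blocks = []
    for pred in preds:
        if len(succs[pred]) > 1:
            # The pred can also go elsewhere, so a probe placed in it would
            # overcount. Split the pred -> ret edge and put the probe in
            # the new block instead.
            split = f"split_{pred}_{ret}"
            new_succs[pred] = [split if t == ret else t for t in new_succs[pred]]
            new_succs[split] = [ret]
            probe_blocks.append(split)
        else:
            # The pred always flows to ret, so it can host the probe itself.
            probe_blocks.append(pred)
    return new_succs, probe_blocks


# Made-up flow graph: "ret" is reached from "a", "b", and "c";
# "b" can also branch to "c", so the edge b -> ret needs splitting.
graph = {
    "entry": ["a", "b"],
    "a": ["ret"],
    "b": ["ret", "c"],
    "c": ["ret"],
    "ret": [],
}
new_graph, probe_blocks = relocate_return_probe(graph, "ret")
```

In this model the copies together count exactly the entries into the return block, while the return block itself receives no probe.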
We currently assume that a Tier0+BBINSTR jit can anticipate all the (root level) probes that might be needed during a subsequent OSR+BBINSTR rejit. But this assumption breaks down with partial compilation as the Tier0 jit only sees a subset of the full method.
As a result, the OSR+BBINSTR schema is incompatible and OSR instrumentation fails. We need to fix this.
Things work out for normal OSR because the Tier0 jit importation is a superset, and if OSR imports less, we consider OSR subset schema to be compatible with the Tier0 schema.
See notes on #65992 for more context.
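To make the compatibility rule concrete, here is a rough Python model of the subset check. It is a sketch only: the `(IL offset, kind)` probe key, the helper names, and the offsets are all hypothetical and do not reflect the runtime's actual schema layout.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical probe key: (IL offset, probe kind).
ProbeKey = Tuple[int, str]


def build_schema(imported_offsets: Iterable[int],
                 kind: str = "block_count") -> FrozenSet[ProbeKey]:
    """Model a root-level schema as one probe per imported IL offset."""
    return frozenset((offset, kind) for offset in imported_offsets)


def is_compatible(tier0_schema: FrozenSet[ProbeKey],
                  rejit_schema: FrozenSet[ProbeKey]) -> bool:
    """A rejit schema is usable only if every probe it needs was already
    allocated by the Tier0 schema, i.e. it is a subset."""
    return rejit_schema <= tier0_schema


# Made-up IL offsets for the blocks of one method.
full_method = [0x00, 0x10, 0x20, 0x30]

# Normal OSR: Tier0 imports the whole method, OSR imports only part of it.
tier0_full = build_schema(full_method)
osr = build_schema([0x20, 0x30])

# Partial compilation: Tier0 only sees some blocks, so its schema is smaller.
tier0_partial = build_schema([0x00, 0x10])

normal_osr_ok = is_compatible(tier0_full, osr)        # subset: compatible
partial_comp_ok = is_compatible(tier0_partial, osr)   # not a subset: fails
missing = sorted(osr - tier0_partial)                 # probes Tier0 never allocated
```

With a full Tier0 import the OSR schema is a subset and the check passes; with a partial Tier0 import the OSR probes at the later offsets have no slot, which is the failure described in this issue.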