Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix interaction of PGO and partial compilation #66101

Closed
Tracked by #74873
AndyAyersMS opened this issue Mar 2, 2022 · 2 comments · Fixed by #80481
Closed
Tracked by #74873

Fix interaction of PGO and partial compilation #66101

AndyAyersMS opened this issue Mar 2, 2022 · 2 comments · Fixed by #80481
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@AndyAyersMS
Copy link
Member

We currently assume that a Tier0+BBINSTR jit can anticipate all the (root level) probes that might be needed during a subsequent OSR+BBINSTR rejit. But this assumption breaks down with partial compilation as the Tier0 jit only sees a subset of the full method.

As a result, the OSR+BBINSTR schema is incompatible and OSR instrumentation fails. We need to fix this.

Things work out for normal OSR because the Tier0 jit importation is a superset, and if OSR imports less, we consider OSR subset schema to be compatible with the Tier0 schema.

See notes on #65992 for more context.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Mar 2, 2022
@ghost
Copy link

ghost commented Mar 2, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

We currently assume that a Tier0+BBINSTR jit can anticipate all the (root level) probes that might be needed during a subsequent OSR+BBINSTR rejit. But this assumption breaks down with partial compilation as the Tier0 jit only sees a subset of the full method.

As a result, the OSR+BBINSTR schema is incompatible and OSR instrumentation fails. We need to fix this.

Things work out for normal OSR because the Tier0 jit importation is a superset, and if OSR imports less, we consider OSR subset schema to be compatible with the Tier0 schema.

See notes on #65992 for more context.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Mar 2, 2022
Cleanup properly if we fail to instrument because of a schema allocation
failure. This fixes dotnet#65992.

More work is needed to ensure schema allocation does not fail. This
is tracked by dotnet#66101.
AndyAyersMS added a commit that referenced this issue Mar 3, 2022
Cleanup properly if we fail to instrument because of a schema allocation
failure. This fixes #65992.

More work is needed to ensure schema allocation does not fail. This
is tracked by #66101.
@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Mar 7, 2022
@AndyAyersMS AndyAyersMS added this to the Future milestone Mar 7, 2022
@AndyAyersMS
Copy link
Member Author

Future for now, as there aren't any specific plans yet on when we might try and enable partial compilation.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Jan 11, 2023
Enable edge based profiles for OSR, partial compilation, and optimized plus
instrumented cases.

For OSR this requires deferring flow graph modifications until after we have
built the initial probe list, so that the initial list reflects the entirety
of the method. This set of candidate edge probes is thus the same no matter how
the method is compiled. A given compile may schematize a subset of these probes
and materialize a subset of what gets schematized; this is tolerated by the PGO
mechanism provided that the initial instrumented jitting produces a schema which
is a superset of the schema produced by any subsequent instrumented rejitting.
This is normally the case.

Partial compilation may still need some work to ensure full schematization but
it is currently off by default. Will address this subsequently.

For optimized compiles we give the EfficientEdgeCountInstrumentor the same kind
of probe relocation abilities that we have in the BlockCountInstrumentor. In
particular we need to move probes that might appear in return blocks that follow
implicit tail call blocks, since those return blocks must remain empty.

The details on how we do this are a bit different but the idea is the same: we
create duplicate copies of any probe that was going to appear in the return block
and instead instrument each pred. If the pred reached the return via a critical
edge, we split the edge and put the probe there. This analysis relies on cheap
preds, so to ensure we can use them we move all the critial edge splitting so it
happens before we need the cheap pred lists.

The ability to do block profiling is retained but will no longer be used without
special config settings.

There were also a few bug fixes in the spanning tree visitor. It must visit a
superset of the blocks we end up importing and was missing visits in some cases.

This should improve jit time and code quality for instrumented code.

Fixes dotnet#47942.
Fixes dotnet#66101.
Contributes to dotnet#74873.
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 11, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 13, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Feb 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant