Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mitigation for a GC Stress race after an inline pinvoke #38246

Merged
merged 1 commit into from
Jun 23, 2020

Conversation

AndyAyersMS
Copy link
Member

In the post-call part of a pinvoke inline call frame, it's not safe
to start a stress mode GC in the window between checking
g_TrapReturningThreads and the call to CORINFO_HELP_STOP_FOR_GC.

The call instruction is already getting special treatement, but there may
be other instructions between the check and call. Instead of trying
to pattern match them all, suppress GC stress if g_TrapReturningThreads
is true, the thread is in cooperative mode, and there's an active inline
call frame.

Closes #37236.

In the post-call part of a pinvoke inline call frame, it's not safe
to start a stress mode GC in the window between checking
`g_TrapReturningThreads` and the call to `CORINFO_HELP_STOP_FOR_GC`.

The call instruction is already getting special treatement, but there may
be other instructions between the check and call. Instead of trying
to pattern match them all, suppress GC stress if `g_TrapReturningThreads`
is true, the thread is in cooperative mode, and there's an active inline
call frame.

Closes dotnet#37236.
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 23, 2020
@AndyAyersMS
Copy link
Member Author

@jkotas PTAL

With this change, the test case from #37236 has now run ~5000 times without a failure; without the fix I was able to repro after less than 1000 tries. Will keep it going a bit further.

@AndyAyersMS
Copy link
Member Author

GC count looks comparable before/after fix:

;; Before
GC Counts: Gen0=16546, Gen1=16557, Gen2=1367
GC Counts: Gen0=16389, Gen1=16400, Gen2=1459
GC Counts: Gen0=16477, Gen1=16488, Gen2=1764

;; After
GC Counts: Gen0=16564, Gen1=16575, Gen2=1329
GC Counts: Gen0=16505, Gen1=16516, Gen2=1238
GC Counts: Gen0=16455, Gen1=16466, Gen2=1678

@jkotas
Copy link
Member

jkotas commented Jun 23, 2020

You can try to delete all special-casing of JIT_RareDisableHelper from gccover.cpp. This check should be able to handle all situations that the special-casing was for. If we find situations that this check is not able to handle, it means the fix is incomplete.

@jkotas
Copy link
Member

jkotas commented Jun 23, 2020

Thank you!

@AndyAyersMS
Copy link
Member Author

You can try to delete all special-casing of JIT_RareDisableHelper

I was thinking the same thing. Will try it as a follow-up.

@jkotas jkotas merged commit d6b8109 into dotnet:master Jun 23, 2020
AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Jun 24, 2020
We shouldn't need this anymore as the case it protects against should be
covered by the new check added in dotnet#38246.
AndyAyersMS added a commit that referenced this pull request Jun 25, 2020
…#38317)

We shouldn't need this anymore as the case it protects against should be
covered by the new check added in #38246.
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test failure: Interop\\COM\\NativeClients\\Events\\Events.cmd
3 participants