-
Notifications
You must be signed in to change notification settings - Fork 5.2k
JIT: Move Thread sync and exec context save/restore to happen around runtime async bodies #121448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
Still not ready -- needs more testing, and I need to introduce the test case. But this is a heads up that some changes are incoming around the save/restore behavior. |
|
This doesn't feel right around Sync/Execution save and restore at suspension points. It looks correct for save/restore around the initial call, but the async callsite on suspension behavior looks wrong to me. Let me see if I can write up a few test cases. Do we have a convenient async testbed for these sorts of tests? In general, I think save/restore at the top level makes sense, but using that context saved at the top level during a suspend as the "captured" context I think is wrong. |
src/tests/async is where they live, there is a synchronization context test.
We aren't doing that, we are getting new contexts from |
|
@jakobbotsch Ah, I was confused by some of the terminology around the JIT here. What happens if there is an exception while restoring the old contexts on suspension? Are we still inside the try/finally which will ALSO attempt to restore the old contexts? Notably, restoring a context involves running arbitrary code since you can have an AsyncLocal with value changed notifications, and what is the behavior if they throw? I don't want us to try to run the ExecutionContext restore twice. I have no idea what that would do. Now that I've looked at this, I think otherwise its all good. |
No, suspension/resumption code is not inside the try clause. So if |
|
The failures all look like infra issues (QUIC failure, OSX timeouts and "The SSL connection could not be established" failures). cc @dotnet/jit-contrib PTAL @AndyAyersMS @VSadov |
| DISPSTMT(restoreStmt); | ||
| for (BasicBlock* block : Blocks()) | ||
| { | ||
| AddContextArgsToAsyncCalls(block); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two new well known args for the synchronization context and execution context. This is modelling the fact that there is a use of these two values when we later expand the async call; we need to restore the Thread fields on the suspension path from these values.
| CORINFO_CALL_INFO callInfo = {}; | ||
| callInfo.hMethod = captureCall->gtCallMethHnd; | ||
| callInfo.methodFlags = info.compCompHnd->getMethodAttribs(callInfo.hMethod); | ||
| impMarkInlineCandidate(captureCall, MAKE_METHODCONTEXT(callInfo.hMethod), false, &callInfo, compInlineContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially was adding the try-fault and doing return merging from fgAddInternal + morph, but I really want to be able to inline these helpers, so I moved it here and taught the pass to do its own return merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a stress mode even if CORINFO_ASYNC_SAVE_CONTEXTS is not set that runs the return merging logic anyways? Would require adding a stress mode arg to CreateReturnBB.
| private static void CaptureContinuationContext(SynchronizationContext syncCtx, ref object context, ref ContinuationFlags flags) | ||
| private static void CaptureContinuationContext(ref object continuationContext, ref ContinuationFlags flags) | ||
| { | ||
| SynchronizationContext? syncCtx = Thread.CurrentThreadAssumedInitialized._synchronizationContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now just get it from the Thread object here; it is relying on the fact that if the callee was runtime async, then it has restored the fields already. This is matching the behavior as Task.UnsafeSetContinuationForAwait which also happens as the continuation gets created. (And importantly, when the callee was not async we get a possibly-changed sync context here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CurrentThreadAssumedInitialized was safe here because any entry into runtime async would go through a Task-returning thunk, which would touch CurrentThread.
If the thunk no longer captures/restores, this might need to use CurrentThread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove it from the thunks then we should just replace it by an explicit call to initialize the current thread. For now we can just keep it in the thunks though.
| // and generally never on stack. | ||
| // | ||
| regNumber CallArgs::GetCustomRegister(Compiler* comp, CorInfoCallConvExtension cc, WellKnownArg arg) | ||
| bool CallArgs::GetCustomRegister(Compiler* comp, CorInfoCallConvExtension cc, WellKnownArg arg, regNumber* reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to add the possibility of well known args that do not affect the ABI computation. That allows adding arbitrary uses that do not affect how other args are passed.
I'm using it for the new AsyncExecutionContext and AsyncSynchronizationContext, and I also put StackArrayLocal under this category (even though I think the call is always expanded out).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the async context (ExecutionContext and SynchronizationContext) handling for runtime async methods. The changes move from a per-await-call context save/restore model to a method-level context management approach, simplifying the implementation and enabling runtime async support by default.
Key Changes
- Method-level context management: ExecutionContext and SynchronizationContext are now saved once at method entry and restored at method exit/exception, rather than around each individual async call.
- Simplified async call handling: Removed per-call suspended indicator logic and execution context tracking fields from
AsyncCallInfo. - Enabled by default: Runtime async support is now enabled by default via the
RuntimeAsyncconfiguration flag.
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/tests/async/synchronization-context/synchronization-context.cs | Adds test for async-to-sync context switching behavior |
| src/tests/async/Directory.Build.targets | Removes file that disabled async tests for certain configurations |
| src/coreclr/vm/method.hpp | Adds RequiresAsyncContextSaveAndRestore() method and fixes typo in comment |
| src/coreclr/vm/jitinterface.cpp | Passes CORINFO_ASYNC_SAVE_CONTEXTS flag to JIT for eligible methods |
| src/coreclr/jit/morph.cpp | Refactors GetCustomRegister to return bool and updates well-known arg names |
| src/coreclr/jit/lclmorph.cpp | Removes async suspended indicator local handling |
| src/coreclr/jit/importercalls.cpp | Simplifies async call setup by removing per-call context save/restore logic |
| src/coreclr/jit/importer.cpp | Removes tailcall restrictions for task awaits and adds new context args to inline handling |
| src/coreclr/jit/gentree.h | Refactors AsyncCallInfo structure and renames well-known args |
| src/coreclr/jit/gentree.cpp | Updates GetCustomRegister signature and removes obsolete methods |
| src/coreclr/jit/flowgraph.cpp | Minor comment punctuation fix |
| src/coreclr/jit/fginline.cpp | Corrects type and preserves async flag for inlined async calls |
| src/coreclr/jit/compiler.hpp | Removes async suspended indicator local handling from visitor |
| src/coreclr/jit/compiler.h | Adds method-level context locals and removes per-call tracking field |
| src/coreclr/jit/async.h | Renames field and adds RestoreContexts method |
| src/coreclr/jit/async.cpp | Major refactoring: implements method-level context save/restore with try/fault wrapper |
| src/coreclr/inc/corinfo.h | Adds CORINFO_ASYNC_SAVE_CONTEXTS flag |
| src/coreclr/inc/clrconfigvalues.h | Enables runtime async by default |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Updates parameter names and removes sync context argument from continuation context capture |
| // Create IR to restore contexts on suspension. | ||
| // | ||
| // Parameters: | ||
| // block - Block that contains the async calll |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "calll" should be "call".
| // block - Block that contains the async calll | |
| // block - Block that contains the async call |
|
|
||
| // Runtime-async | ||
| RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_RuntimeAsync, W("RuntimeAsync"), 0, "Enables runtime async method support") | ||
| RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_RuntimeAsync, W("RuntimeAsync"), 1, "Enables runtime async method support") |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] This change enables runtime async by default (changing from 0 to 1). While this may be intentional for this PR, consider whether this should be behind a feature flag or enabled gradually, as it represents a significant behavior change that could impact production systems. Ensure this change is properly documented and communicated.
| RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_RuntimeAsync, W("RuntimeAsync"), 1, "Enables runtime async method support") | |
| RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_RuntimeAsync, W("RuntimeAsync"), 0, "Enables runtime async method support") |
| // We cannot tailcall ValueTask returning methods as we need to preserve | ||
| // the Continuation instance for ValueTaskSource handling (the BCL needs | ||
| // to look at continuation.Next). We cannot easily differentiate between | ||
| // ValueTask and Task here, so we just disable it more generally. | ||
| if ((prefixFlags & PREFIX_IS_TASK_AWAIT) != 0) | ||
| { | ||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is moved to the same place as the check for CORINFO_FLG_SYNCH, since it is similar in spirit (we need to introduce try-finally, so we cannot tailcall).
| asyncInfo.ExecutionContextHandling = ExecutionContextHandling::None; | ||
| asyncInfo.ContinuationContextHandling = ContinuationContextHandling::None; | ||
| asyncInfo.SaveAndRestoreSynchronizationContextField = false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could probably keep this for IL instantiating stubs, but since the case is rare this is probably ok.
AndyAyersMS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JIT changes LGTM, left some notes on (adding) comments.
| ValidateNoAsyncSavesNecessaryInStatement(stmt); | ||
| continue; | ||
| } | ||
| EHblkDsc* newEntry = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to call out that since this new EH region encloses all others it will be placed at the end of the EH table, so setting XTnew to the current table length is the correct thing to do.
| CORINFO_CALL_INFO callInfo = {}; | ||
| callInfo.hMethod = captureCall->gtCallMethHnd; | ||
| callInfo.methodFlags = info.compCompHnd->getMethodAttribs(callInfo.hMethod); | ||
| impMarkInlineCandidate(captureCall, MAKE_METHODCONTEXT(callInfo.hMethod), false, &callInfo, compInlineContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a stress mode even if CORINFO_ASYNC_SAVE_CONTEXTS is not set that runs the return merging logic anyways? Would require adding a stress mode arg to CreateReturnBB.
| { | ||
| newReturnBB->bbWeight = newReturnBB->computeIncomingWeight(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(fgReturnCount == 1)? (may be too annoying; I know sometimes we have trouble counting...)
| { | ||
| tree = tree->AsLclVarCommon()->Data(); | ||
| } | ||
| // Create try-fault structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe mention that this is actually a try/finally but you're going to manually inline the finally and will only need one copy of it thanks to the return merging logic below.
|
Do we still need context store/restore in task-returning thunks after this? CC: @eduardo-vp |
| canTailCall = false; | ||
| szCanTailCallFailReason = "Caller is Reverse P/Invoke"; | ||
| } | ||
| else if (compIsAsync()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can non-variant/infrastructure helpers use tailcalls?
For the existing helpers, none of the calls that they contain would fit the tailcall pattern, so the question is mostly theoretical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this check could be for CORINFO_ASYNC_SAVE_CONTEXTS instead. Instantiating stubs could maybe benefit.
It would require us to keep the other code I pointed at above (#121448 (comment))
|
|
||
| [MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
| private static void RestoreContexts(bool suspended, ExecutionContext? previousExecCtx, SynchronizationContext? previousSyncCtx) | ||
| private static void RestoreContexts(bool started, ExecutionContext? previousExecCtx, SynchronizationContext? previousSyncCtx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a comment about meaning of "started". As I understand it is an indicator of whether we see a normal/initial entry into the method vs. we are resuming.
It was not obvious on the first read when I saw this method's changes.
This fixes an issue where there was a behavioral difference between async1 and async2 for async to sync runtime async calls. Before this PR the JIT would always save and restore contexts around awaits of task-returning methods, regardless of whether they were implemented as runtime async functions or not. That does not match async1: in async1, the context save and restore happens around the body in each async method. The former logic was an optimization, but the optimization is only correct for runtime async to runtime async calls.
We cannot in general know if a callee is implemented by runtime async or not, so we have to move the context save/restore to happen around the body of all runtime async methods.
An example test program that saw the behavioral difference before is the following:
Async1: 123 124
Runtime async before: 123 123
Runtime async after: 123 124