Skip to content

Conversation

janvorli
Copy link
Member

Currently, the code detecting whether an exception should be propagated into native frames and not reported as unhandled relies on presence of the DebuggerU2MFrame with "CatchAllExceptions" set. Recent fixes due to this marking missing causing problems revealed that it would be actually better to do the detection in an inverse manner. That means always propagate the exception to native frames unless there is an explicit marker indicating that the exception needs to be reported as unhandled. Such a marker should only mark the main entry point for an application and a managed thread entry. All other places should just let the exception flow into the native code.

This change fixes it by creating a new kind of explicit frame to mark the code paths that should report unhandled exception. It also removes recent fixes in this area that are no longer needed.

Currently, the code detecting whether an exception should be propagated
into native frames and not reported as unhandled relies on presence of
the DebuggerU2MFrame with "CatchAllExceptions" set. Recent fixes due to
this marking missing causing problems revealed that it would be actually
better to do the detection in an inverse manner. That means always
propagate the exception to native frames unless there is an explicit
marker indicating that the exception needs to be reported as unhandled.
Such a marker should only mark the main entry point for an application
and a managed thread entry. All other places should just let the
exception flow into the native code.

This change fixes it by creating a new kind of explicit frame to mark
the code paths that should report unhandled exception.
@janvorli janvorli added this to the 10.0.0 milestone Jul 14, 2025
@janvorli janvorli self-assigned this Jul 14, 2025
@Copilot Copilot AI review requested due to automatic review settings July 14, 2025 19:12
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR changes the approach for detecting unhandled exceptions in the CoreCLR runtime. Instead of relying on the presence of DebuggerU2MCatchHandlerFrame with "CatchAllExceptions" to determine exception propagation behavior, it introduces a new explicit marker frame that indicates when exceptions should be reported as unhandled.

Key changes include:

  • Introduction of UnhandledExceptionMarkerFrame for explicit unhandled exception marking
  • Removal of the CatchesAllExceptions functionality from DebuggerU2MCatchHandlerFrame
  • Updated exception handling logic to use the new marker frame approach

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/vm/FrameTypes.h Adds new frame type UnhandledExceptionMarkerFrame
src/coreclr/vm/frames.h Defines the new UnhandledExceptionMarkerFrame class and removes CatchesAllExceptions from DebuggerU2MCatchHandlerFrame
src/coreclr/vm/threads.cpp Updates managed thread dispatch to use new marker frame
src/coreclr/vm/corhost.cpp Adds unhandled exception marker frame to assembly execution
src/coreclr/vm/exceptionhandling.cpp Updates exception handling logic to check for new marker frame
src/coreclr/vm/jitinterface.cpp Removes debugger catch handler frame usage
src/coreclr/vm/interoplibinterface_comwrappers.cpp Removes debugger catch handler frame usage
src/coreclr/vm/dispatchinfo.cpp Updates constructor call for simplified DebuggerU2MCatchHandlerFrame

{
LIMITED_METHOD_DAC_CONTRACT;
return m_catchesAllExceptions;
}
Copy link
Preview

Copilot AI Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UnhandledExceptionMarkerFrame class is missing the GetTransitionType() method override that is present in DebuggerU2MCatchHandlerFrame. This inconsistency may cause issues with frame type identification during stack walks.

Copilot uses AI. Check for mistakes.

@jkotas
Copy link
Member

jkotas commented Jul 14, 2025

the main entry point for an application and a managed thread entry

And finalizer thread?

@janvorli
Copy link
Member Author

And finalizer thread?

Ah, yes, that one needs it too.

@janvorli
Copy link
Member Author

And finalizer thread?

Ah, yes, that one needs it too.

Actually, not. The finalizer thread uses ManagedThreadBase_DispatchOuter that already has that treatment.

@janvorli
Copy link
Member Author

/azp run runtime-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

bool isNotHandledByRuntime =
(pFrame == FRAME_TOP) ||
(IsTopmostDebuggerU2MCatchHandlerFrame(pFrame) && !((DebuggerU2MCatchHandlerFrame*)pFrame)->CatchesAllExceptions())
bool reportUnhandledException =
Copy link
Member

@jkotas jkotas Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about DebuggerU2MCatchHandlerFrame above needs updating

FRAME_TYPE_NAME(DebuggerExitFrame)
FRAME_TYPE_NAME(DebuggerU2MCatchHandlerFrame)
FRAME_TYPE_NAME(ExceptionFilterFrame)
FRAME_TYPE_NAME(UnhandledExceptionMarkerFrame)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the new Frame type need to be hooked up into cDac?

cc @max-charlamb

Copy link
Member

@jkotas jkotas Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I am wondering whether it would be better to have a managed code at the root of these threads so that the unhandled exception handling for them works the same way as exception handling for external threads. Would it allow us to get rid of this special casing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand how that would work and how it would work the same way as for external threads. For external threads, we assume that the thread will either catch it or the exception will be reported as c++ unhandled. So we let it flow into the native code. But we cannot do that for the unhandled exceptions that we want to report as such, because at the point we would reach some top level catch, the stack frames of the throwing location would be gone. So when we would break into the debugger or have a dump caused by the unhandled exception, there would be no evidence on why it happened.

Copy link
Member

@jkotas jkotas Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is an unhandled external exception thrown on external thread (and the external thread does not have any catch in native code), we will end up reporting it like a regular unhandled managed exception as far as I can tell.

I have tried this: https://gist.github.com/jkotas/2d4f15056768c1b221a78684b4245fec

Console output:

Thread 24600 started.
Unhandled exception. System.Exception: My exception
   at Program.ThreadFunction(IntPtr lpParameter) in C:\repro\Program.cs:line 74

Stacktrace in the crash dump is - notice that it includes ThreadFunction where the exception was thrown originally:

kernel32!WerpReportFault+0xc5 [onecore\windows\feedback\core\faultrep\lib\faultrep.cpp @ 1384] 
KERNELBASE!UnhandledExceptionFilter+0x34c [minkernel\kernelbase\xcpt.c @ 701] 
ntdll!RtlpThreadExceptionFilter+0x2e [minkernel\ldr\rtlstrt.c @ 958] 
ntdll!RtlUserThreadStart$filt$0+0x3f [minkernel\ldr\rtlstrt.c @ 1185] 
ntdll!__C_specific_handler+0x93 [minkernel\crts\crtw32\misc\riscchandler.c @ 446] 
ntdll!RtlpExecuteHandlerForException+0xf [minkernel\ntos\rtl\amd64\xcptmisc.asm @ 132] 
ntdll!RtlDispatchException+0x437 [minkernel\ntos\rtl\amd64\exdsptch.c @ 680] 
ntdll!RtlRaiseException+0x206 [minkernel\ntos\rtl\amd64\raise.c @ 240] 
KERNELBASE!RaiseException+0x8a [minkernel\kernelbase\xcpt.c @ 954] 
coreclr!SfiNext+0x1e5 [D:\a\_work\1\s\src\runtime\src\coreclr\vm\exceptionhandling.cpp @ 4048] 
System_Private_CoreLib!System.Runtime.EH.DispatchEx+0x48e [_/src/runtime/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 900] 
System_Private_CoreLib!System.Runtime.EH.RhThrowEx+0x2d [_/src/runtime/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 673] 
coreclr!CallDescrWorkerInternal+0x83 [D:\a\_work\1\s\src\runtime\src\coreclr\vm\amd64\CallDescrWorkerAMD64.asm @ 74] 
coreclr!DispatchCallSimple+0x66 [D:\a\_work\1\s\src\runtime\src\coreclr\vm\callhelpers.cpp @ 238] 
coreclr!DispatchManagedException+0x1f9 [D:\a\_work\1\s\src\runtime\src\coreclr\vm\exceptionhandling.cpp @ 1665] 
coreclr!IL_Throw+0x10e [D:\a\_work\1\s\src\runtime\src\coreclr\vm\jithelpers.cpp @ 852] 
repro!Program.ThreadFunction+0x6b
repro!ILStubClass.IL_STUB_ReversePInvoke(Int64)+0x87
kernel32!BaseThreadInitThunk+0x17 [clientcore\base\win32\client\thread.c @ 77] 
ntdll!RtlUserThreadStart+0x2c [minkernel\ldr\rtlstrt.c @ 1184] 

I do not see what we are getting by special casing finalizer and other threads created by the runtime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to understand why this is so complicated. If it is too complicated for no good reason and it is too risky to simplify for .NET 10, it is fine to postpone the simplification to .NET 11.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would not let the exception flow to the native caller on Unix. The idea is that the finalizer and other threads would start with a regular reverse P/Invoke transition (same way as on Native AOT). Unhandled exceptions stop at reverse P/Invoke transition on non-Windows. I believe that we preserve the stacktrace of the original fault when exception goes unhandled at reverse P/Invoke transition on Unix.

In other words, the main reason for this special casing seems to be that these few places do not enter the runtime via a regular reverse P/Invoke transition. They use a custom scheme that we need to compensate for.

I can believe that the diagnostic tests depend on implementation details and changing anything is going to break them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how it would work for the Main method entry. We call the Main via CallDescrWorker. Where would we fit in the reverse pinvoke transition?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime C/C++ code would call the reverse PInvoke directly, without CallDescrWorker. The reverse PInvoke would take function pointer that points to Main method, some information about Main method signature (there are a few possible shapes - returning int vs. returning void, etc.), and the arguments to pass to main method.

The extra reverse PInvoke may impact diagnostic as discussed above. It would be show up in stacktraces, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your example works because we report the exception as unhandled in the EH and then re-raise it to flow into the native code to give the external native code a chance to catch it. If we didn't treat it as unhandled, the EH would currently unwind stack upto the native code frame and raise the exception from there. That's the way we use for CallDescrWorker. But that way looses the managed frames that triggered the exception.

If we used the same mechanism as we currently use for re-raising the exception when we consider it unhandled, but wanted to do it for any reverse pinvoke transition without checking if there are managed frames on the stack on top of that, then we would have a problem. This exception is re-raised when there are all the managed frames upto the one that has thrown the exception still on the stack. So re-raising from that point requires us to make sure that when the exception calls ProcessCLRException for all managed frames below the native frame, we don't do anything. That's currently done by marking the thread by TSNC_ProcessedUnhandledException. But if the exception would end up flowing through the native frames uncaught and then reach managed code again, we want the ProcessCLRException to start doing its job again past that point.
I am not trying to say we cannot do that, but it complicates things and will require some extra state to store the address of the boundary frame that would decide whether the ProcessCLRException should ignore the exception for a given frame or not. And in case the exception will be caught in the native code between those blocks of managed code, the state will become stale and we would need to clean it up at a right time later.


#ifdef LOG_EXECUTABLE_ALLOCATOR_STATISTICS
ExecutableAllocator::DumpHolderUsage();
ExecutionManager::DumpExecutionManagerUsage();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ExecuteInDefaultAppDomain below need the same change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've missed that one.

unhandledExceptionMarkerFrame.Push(pThread);
}

INSTALL_UNHANDLED_MANAGED_EXCEPTION_TRAP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the unhandledExceptionMarkerFrame be coupled with INSTALL_UNHANDLED_MANAGED_EXCEPTION_TRAP? They seem to be handling same scenario.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion, it looks like it would make sense to put the frame creation / destruction into the INSTALL_UNHANDLED_MANAGED_EXCEPTION_TRAP / UNINSTALL_UNHANDLED_MANAGED_EXCEPTION_TRAP

@janvorli
Copy link
Member Author

Closing this since the known issues were handled by another PR.

@janvorli janvorli closed this Aug 11, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Sep 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants