AaronRobinsonMSFT
Member

The ExceptionHandling.RaiseAppDomainUnhandledExceptionEvent() API may be called from multiple threads, but only one of them should trigger the event handlers.

Fixes #117840

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds thread safety to the ExceptionHandling.RaiseAppDomainUnhandledExceptionEvent() method to ensure only one thread can trigger the unhandled exception event handlers while other threads wait for completion.

  • Introduces a volatile static field to track exception handling state across threads
  • Implements atomic operations with waiting mechanism to serialize event handler execution
  • Adds test infrastructure using UnsafeAccessor to reset the static field after tests
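The pattern summarized in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the type `UnhandledExceptionGate`, the field `s_state`, the counter `HandlerInvocations`, and the three-state encoding are all assumptions made for the example. Only `Interlocked.CompareExchange` and `SpinWait` are real .NET APIs.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the runtime type; names and states are assumptions.
static class UnhandledExceptionGate
{
    // 0 = not started, 1 = handlers running, 2 = handlers finished.
    private static volatile int s_state;

    // Counts how many callers actually ran the handlers (for the demo only).
    public static int HandlerInvocations;

    public static void Raise(object exception, Action<object> handlers)
    {
        // Exactly one caller wins the 0 -> 1 transition and runs the handlers.
        if (Interlocked.CompareExchange(ref s_state, 1, 0) == 0)
        {
            try
            {
                Interlocked.Increment(ref HandlerInvocations);
                handlers(exception);
            }
            finally
            {
                s_state = 2; // Release the waiting threads.
            }
            return;
        }

        // Every other caller waits until the winner has finished.
        // Note: a re-entrant call from inside a handler would spin forever here;
        // a real implementation has to treat recursion separately.
        SpinWait spinner = default;
        while (s_state != 2)
        {
            spinner.SpinOnce();
        }
    }
}

static class Program
{
    static void Main()
    {
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
                UnhandledExceptionGate.Raise(new Exception("boom"), _ => Thread.Sleep(50)));
            threads[i].Start();
        }

        foreach (Thread t in threads)
        {
            t.Join();
        }

        Console.WriteLine(UnhandledExceptionGate.HandlerInvocations); // prints 1
    }
}
```

Four threads race into `Raise`, but only the thread that wins the compare-exchange runs the handlers; the others return once the state reaches 2.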

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File                  Description
ExceptionHandling.cs  Adds thread synchronization logic using a volatile field and atomic operations
RaiseEvent.cs         Adds a test helper using UnsafeAccessor to reset static state between tests
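For readers unfamiliar with the test-side technique, here is a small sketch of resetting a private static field through `UnsafeAccessor` (available in .NET 8 and later). The `Gate` type, its `s_state` field, and the `TestHelpers` class are hypothetical stand-ins; the actual field name and owning type used in RaiseEvent.cs are not shown in this thread.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type standing in for the runtime class that holds the static state.
sealed class Gate
{
    private static int s_state = 2; // Pretend the handlers have already run.

    public static bool IsDone => s_state == 2;
}

static class TestHelpers
{
    // Returns a ref to the private static field Gate.s_state.
    // The parameter only identifies the owning type; its value is not used.
    [UnsafeAccessor(UnsafeAccessorKind.StaticField, Name = "s_state")]
    private static extern ref int GetState(Gate unused);

    // Puts the static state back to its initial value between tests.
    public static void Reset() => GetState(null!) = 0;
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Gate.IsDone); // prints True
        TestHelpers.Reset();
        Console.WriteLine(Gate.IsDone); // prints False
    }
}
```

The accessor avoids reflection and lets each test start from a clean state even though the production field is private.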

Contributor

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

Member

@jkotas jkotas left a comment

Thanks!

@AaronRobinsonMSFT AaronRobinsonMSFT requested a review from jkotas July 19, 2025 16:37
@AaronRobinsonMSFT
Member Author

AaronRobinsonMSFT commented Jul 19, 2025

@jkotas @tommcdon Would you like me to remove the locking additions in RuntimeExceptionHelpers.SerializeCrashInfo(), introduced in #117832?

@jkotas
Member

jkotas commented Jul 19, 2025

> Would you like me to remove the locking additions in RuntimeExceptionHelpers.SerializeCrashInfo(), introduced in #117832?

It should stay. RuntimeExceptionHelpers.SerializeCrashInfo can be called by FailFast without going through OnUnhandledException. We want to prevent the race condition where one thread calls FailFast while another thread hits a regular unhandled exception.

@jkotas
Member

jkotas commented Jul 19, 2025

Thoughts on https://dev.azure.com/dnceng-public/public/_build/results?buildId=1100021&view=ms.vss-test-web.build-test-results-tab&runId=30092706&resultId=100479&paneView=debug

As far as I can tell, there are no unhandled exceptions in this test; the exceptions are handled in native code. It looks like a pre-existing bug to me that the UnhandledException event is invoked in this test. @janvorli thoughts about this?

@jkotas
Member

jkotas commented Jul 19, 2025

> It looks like a pre-existing bug to me that the UnhandledException event is invoked in this test

Related #117620

@janvorli
Member

Let me take a look at why we invoke the handler.

@janvorli
Member

Currently, any managed exception that passes through a reverse P/Invoke transition with no more managed frames on the stack is reported as unhandled, because we cannot know whether the native caller would handle it. We invoke InternalUnhandledExceptionFilter_Worker on the first pass; then, on the second pass on Windows, we re-raise the exception to let it flow into the external code in case that code ends up handling it.

As discussed in #117620, we want to change that behavior to let the exception always flow into the external native calling code on Windows. However, there are some gotchas. Let me discuss those in #117620, though.

@AaronRobinsonMSFT
Member Author

> As discussed in #117620, we want to change that behavior to let the exception always flow into the external native calling code on Windows. However, there are some gotchas. Let me discuss those in #117620, though.

Thanks JanV. I'll hold off on this PR until we get some resolution here.

@AaronRobinsonMSFT AaronRobinsonMSFT marked this pull request as draft July 21, 2025 16:12
@janvorli
Member

> I'll hold off on this PR until we get some resolution here.

@AaronRobinsonMSFT I actually cannot repro the CI failures locally with your branch checked out. While the failing test reports the exception as unhandled via the event, it still passes for me.

@AaronRobinsonMSFT
Member Author

> > I'll hold off on this PR until we get some resolution here.
>
> @AaronRobinsonMSFT I actually cannot repro the CI failures locally with your branch checked out. While the failing test reports the exception as unhandled via the event, it still passes for me.

I believe the failure is Windows-only; the test passes for me too on non-Windows. I've not tried Windows yet. I think there might be some trickiness here with the predicate for the FailFast: it needs the same thread. Is it possible this is a flaky issue? Let me try re-triggering the CI and see if it fails again.

@AaronRobinsonMSFT
Member Author

AaronRobinsonMSFT commented Jul 21, 2025

@janvorli This hit the first time I ran it locally on Windows Debug.

...\runtime\artifacts\tests\coreclr\windows.x64.Debug\Exceptions\ForeignThread\ForeignThreadExceptions>ForeignThreadExceptions.cmd
BEGIN EXECUTION
 "...\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\corerun.exe" -p "System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true"  ForeignThreadExceptions.dll 
Caught exception thrown in a function called by a delegate called through Reverse PInvoke.
Caught exception thrown in a delegate called through Reverse PInvoke on a foreign thread.
Caught hardware exception in a delegate called through Reverse PInvoke on a foreign thread.
Unhandled exception. System.Exception: Exception unhandled in any managed code
   at ForeignThreadExceptionsTest.<>c.<RunTest>b__5_3() in ...\runtime\src\tests\Exceptions\ForeignThread\ForeignThreadExceptions.cs:line 69
Caught exception once
Process terminated.
OnUnhandledException called recursively
   at System.Environment.FailFast(System.Runtime.CompilerServices.StackCrawlMarkHandle, System.String, System.Runtime.CompilerServices.ObjectHandleOnStack, System.String)
   at System.Environment.FailFast(System.Threading.StackCrawlMark ByRef, System.String, System.Exception, System.String)
   at System.Environment.FailFast(System.String)
   at System.AppContext.OnUnhandledException(System.Object)
   at System.Runtime.ExceptionServices.InternalCalls.RhpSfiNext(System.Runtime.StackFrameIterator ByRef, UInt32*, Boolean*, Boolean*)   
   at System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)
   at System.Runtime.EH.RhThrowEx(System.Object, ExInfo ByRef)
   at ForeignThreadExceptionsTest+<>c.<RunTest>b__5_3()
Expected: 100
Actual: -2146232797
END EXECUTION - FAILED
FAILED

@janvorli
Member

@AaronRobinsonMSFT I wonder what's different for me. I have checked out your branch and built it on Windows x64 in Debug, yet I don't get the crash, either with or without a debugger attached. I'll try to rebuild everything...

@janvorli
Member

I cannot repro it even after a full clean build of both the repo and tests

@AaronRobinsonMSFT
Member Author

> I cannot repro it even after a full clean build of both the repo and tests

Okay. It might have to wait until tomorrow, but what can I get for you? Full heap DMP? Anything else?

@janvorli
Member

> Full heap DMP? Anything else?

Dump plus pdbs of the build would be great, thank you!

@janvorli
Member

@AaronRobinsonMSFT Sigh, my mistake. Although I had pulled down your branch, I had accidentally checked out a different one. I am sorry for the confusion.

@janvorli
Member

The problem causing the failure is that the test really does cause an unhandled exception twice on the same thread. The foreign thread catches the exception, which the runtime considers unhandled because it has flowed into external native code with no more managed frames. The thread then makes another reverse P/Invoke call, which throws again, causing a second "unhandled" exception.

@janvorli
Member

Thus your logic thinks there was a recursion. To fix that on the EH side, the processing of exceptions that escape into foreign native code with no more managed frames on top of it will need to be changed so that they are no longer considered unhandled.
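To make the false positive concrete, here is a simplified model of a same-thread recursion check. It assumes the check keys on the ID of the thread that first reported an unhandled exception; that assumption, along with the names `RecursionCheckModel`, `s_reportingThreadId`, and `Report`, is made up for illustration and is not taken from the runtime source.

```csharp
using System;
using System.Threading;

// Hypothetical model of the recursion predicate; not the runtime's code.
static class RecursionCheckModel
{
    // Managed ID of the thread that first reported an unhandled exception; 0 = none yet.
    private static int s_reportingThreadId;

    // Returns "first", "recursive", or "other-thread".
    public static string Report()
    {
        int me = Environment.CurrentManagedThreadId;
        int prior = Interlocked.CompareExchange(ref s_reportingThreadId, me, 0);
        if (prior == 0)
        {
            return "first";
        }

        // A second report from the same thread looks like recursion,
        // even if the first exception was already caught by native code.
        return prior == me ? "recursive" : "other-thread";
    }
}

static class Program
{
    static void Main()
    {
        // The first exception escapes into native code, which catches it.
        Console.WriteLine(RecursionCheckModel.Report()); // prints first

        // The same thread calls back into managed code and throws again.
        Console.WriteLine(RecursionCheckModel.Report()); // prints recursive
    }
}
```

In this model the second report is sequential, not nested, yet it is indistinguishable from recursion, which matches the "OnUnhandledException called recursively" FailFast in the log above.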

@jkotas
Member

jkotas commented Jul 22, 2025

> it will need to be changed so that they are no longer considered unhandled.

The current behavior is a regression introduced by EH unification and follow-up bug fixes. These exceptions should be considered handled, as they were in .NET Framework and in .NET Core before EH unification. Is this correct?

@janvorli
Member

> The current behavior is a regression introduced by EH unification and follow-up bug fixes. These exceptions should be considered handled, as they were in .NET Framework and in .NET Core before EH unification. Is this correct?

Right, although there will always be some differences in EH because the new EH does not use SEH. The question is which way of handling those differences is most acceptable. I originally thought that treating exceptions escaping the last managed frame on a foreign thread as unhandled from the managed code's point of view was reasonable, but the recent issues related to that have proven me wrong.

@jkotas
Member

jkotas commented Aug 8, 2025

@AaronRobinsonMSFT I think this one is good to go in now.

@AaronRobinsonMSFT AaronRobinsonMSFT marked this pull request as ready for review August 8, 2025 22:59
@AaronRobinsonMSFT AaronRobinsonMSFT merged commit 3230699 into dotnet:main Aug 8, 2025
136 of 138 checks passed
@AaronRobinsonMSFT AaronRobinsonMSFT deleted the runtime_117840 branch August 8, 2025 23:01
@github-actions github-actions bot locked and limited conversation to collaborators Sep 8, 2025
Development

Successfully merging this pull request may close these issues.

ExceptionHandling.RaiseAppDomainUnhandledExceptionEvent: Prevent handlers from being invoked multiple times
4 participants