Skip to content

Conversation

@rcj1
Copy link
Contributor

@rcj1 rcj1 commented Dec 24, 2025

This PR implements exception stack trace collection in runtime-async stacks for both coreclr and nativeAOT.

  • Augmenting stack unwinding logic in AsyncHelpers.CoreCLR.cs to append frames from runtime-async async-await chain to exception stacktrace using DiagnosticIP of Continuation.

  • Implements ThrowExact helper to throw exception that has been stored in Continuation, while keeping its existing stack trace intact. Accomplished by creating throw helpers that set ExKind.RethrowFlag. Would be open to rename "RethrowFlag" and companions to "NoEraseFlag" or something similar, or leave as is.

  • ILC changes: Ensuring that stack trace records are emitted for runtime-async methods, and hidden for resumption stubs.

  • Testing: Added tests for line/method/document correctness including v1-v2 chaining.

@dotnet/ilc-contrib

jakobbotsch and others added 30 commits October 1, 2025 14:54
- Add new JIT-EE API to report back debug information about the
  generated state machine and continuations
- Refactor debug info storage on VM side to be more easily extensible.
  The new format has either a thin or fat header. The fat header is used
  when we have either uninstrumented bounds, patchpoint info, rich debug
  info or async debug info, and stores the blob sizes of all of those
  components in addition to the bounds and vars.
- Add new async debug information to the storage on the VM side
- Set get target method desc for async resumption stubs, to be used for
  mapping from continuations back to the async IL function that it will
  resume.
_ASSERTE(gc.pException != NULL);

// populate exception with information from the continuation object
OBJECTHANDLE handle = AppDomain::GetCurrentDomain()->CreateHandle(gc.pException);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this GC handle going to be deleted? It looks like a leak. We may need StackTraceInfo::AppendElement overload that takes OBJECTREF. It would be wasteful to create GC handle just to pass an object reference to a method as an argument.

if (method.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute")
|| (method.OwningType is MetadataType mdType && mdType.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute")))
|| (method.OwningType is MetadataType mdType && mdType.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute"))
|| (method is ILCompiler.AsyncResumptionStub)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichalStrehovsky Could you please double check changes in this method?

We have all sort of stubs. It looks odd that we check for AsyncResumptionStub and no other stubs here.

Copy link
Contributor Author

@rcj1 rcj1 Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In coreclr, we filter these out of stacktraces with IsDiagnosticsHidden() in StackTraceInfo::AppendElement, which filters out IL stubs and async thunks. Since we do not pre-filter the stacktrace in nativeaot, we use the hidden attribute; I do believe we need this check, but we may want more.

For example, with the following test app

using System;

class Program
{
    static void Main()
    {
        Action a = () => Console.WriteLine("A");
        Action b = () => throw new Exception("This i");

        Action m = a + b;
        m();
    }
} 

We get the following stacktrace

at Program.<>c.<Main>b__0_1() + 0x27
at aot!<BaseAddress>+0x22046d
at Program.Main() + 0x9f

where aot!+0x22046d is the delegate multicast stub. How should these look - maybe we broaden this to ILStubMethod to match coreclr?

@rcj1
Copy link
Contributor Author

rcj1 commented Jan 24, 2026

The new tests are failing on NativeAOT on Unix, e.g.:

2026-01-24T04:26:55.9401240Z   Illegal text-relocations:
2026-01-24T04:26:55.9405630Z     text-relocation in '_System_Diagnostics_StackTrace_Tests_System_Diagnostics_V1Methods__<AsyncCallable>System_Diagnostics_StackTrace_Tests_System_Diagnostics_V1Methods__Test0'+0x160 (/Users/runner/work/1/s/artifacts/obj/System.Diagnostics.StackTrace.Tests/Release/net11.0/native/System.Diagnostics.StackTrace.Tests.o) to '_System_Diagnostics_StackTrace_Tests_System_Diagnostics_V1Methods__<AsyncCallable>System_Diagnostics_StackTrace_Tests_System_Diagnostics_V1Methods__Test0'

They need to be disabled or we need to wait for @jakobbotsch fix (#123433).

Ok I can add that into the platform detection condition with a comment on this issue

@jkotas
Copy link
Member

jkotas commented Jan 24, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

// Fetch the stacktrace and the keepAlive array from the exception object. It returns clones of those arrays in case the
// stack trace was created by a different thread.
((EXCEPTIONREF)ObjectFromHandle(hThrowable))->GetStackTrace(gc.stackTrace.m_pStackTraceArray, &gc.pKeepAliveArray, pThread);
((EXCEPTIONREF)pThrowable)->GetStackTrace(gc.stackTrace.m_pStackTraceArray, &gc.pKeepAliveArray, pThread);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of CI legs failed due to infra issues, but the few that completed has test failures that suggest GC heap corruption.

I think you need to GCPROTECT the throwable. You may want to move the GCPROTECT up towards the start of the method so that there is still just one GCPROTECT in the whole method.

@jkotas
Copy link
Member

jkotas commented Jan 25, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

_ASSERTE(IsException(pMT));

PTR_ThreadExceptionState pCurTES = pThread->GetExceptionState();
// Check if the flag indicating foreign exception raise has been setup or not,
Copy link
Member

@jkotas jkotas Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Could you please review these changes? In particular, the changes around IsRaisingForeignException flag for runtime async frames. Do we need to worry about it at all for runtime async frames?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe so. This flag is only set during ExceptionDispatchInfo.Throw(), so the only place it would matter is if Edi.Throw() is immediately above a runtime-async frame in a stacktrace. However, above a runtime-async frame we will either have another runtime-async frame, or DispatchContinuations, the only call site of this method.

@rcj1
Copy link
Contributor Author

rcj1 commented Jan 25, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@rcj1
Copy link
Contributor Author

rcj1 commented Jan 26, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

}
}

#if !(NATIVEAOT && !WINDOWS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt that this is going to work. The build of libraries partition is runtime flavor agnostic. The only place under libraries where you can use NATIVEAOT ifdef is CoreLib - that's because the CoreLib is not actually built from under libraries, but from runtime flavor specific location (e.g. src\coreclr\System.Private.CoreLib or src\coreclr\nativeaot\System.Private.CoreLib).

Building libraries tests for NativeAOT can be disabled at project granularity https://github.com/dotnet/runtime/blob/main/src/libraries/tests.proj#L401

@rcj1
Copy link
Contributor Author

rcj1 commented Jan 26, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@rcj1
Copy link
Contributor Author

rcj1 commented Jan 26, 2026

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).


mov r8d, 1 // r8d = ExKind.Throw

LOCAL_LABEL(RhpThrowExImpl):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be marked as LOCAL_LABEL but rather ALTERNATE_ENTRY as it is called from outside of this function. For all architectures.

if (method.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute")
|| (method.OwningType is MetadataType mdType && mdType.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute")))
|| (method.OwningType is MetadataType mdType && mdType.HasCustomAttribute("System.Diagnostics", "StackTraceHiddenAttribute"))
|| (method is Internal.IL.Stubs.ILStubMethod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why the IL stubs were not a problem before this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #122722 (comment), they did show up as unknowns but this should filter them out.

_ASSERTE(pException != NULL);
// populate exception with information from the continuation object
EECodeInfo codeInfo((PCODE)diagnosticIP);
if (codeInfo.IsValid())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a valid case when the codeInfo is not valid here? If not, it would be nice to add _ASSERTE here.

return;
}

AppendElementImpl(pThrowable, currentIP, currentSP, pFunc, pCf, pThread, fRaisingForeignException);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it cannot happen now, if some GC triggering call gets included between the point we get the pThrowable and this call, GC gets triggered and moves the object, we could end up passing stale pThrowable to the call. It would be safer to use ObjectFromHandle(hThrowable) as argument here.

#ifndef HAVE_EXKIND_H
#define HAVE_EXKIND_H

#include <cstdint>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is seems that since you've moved it to the common.h, it is not needed here anymore, according to @am11's comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that was just regarding @am11

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants