Change in exception handling in 9.0 #101772

Closed
amcasey opened this issue May 1, 2024 · 24 comments · Fixed by #104531
Labels: area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI


@amcasey
Member

amcasey commented May 1, 2024

Description

If you have a try-catch-catch, I don't believe C# allows the second catch to handle exceptions thrown by the first catch. At least, that's the behavior I'm seeing in 8.0. In 9.0, there seem to be cases where the second catch can handle them.

Reproduction Steps

try
{
    Console.WriteLine("Try");
    throw new NotSupportedException();
}
catch (NotSupportedException)
{
    Console.WriteLine("NotSupportedException");
    throw new Exception("Repro failed");
}
catch (InvalidOperationException)
{
    Console.WriteLine("InvalidOperationException");
    try
    {
        System.Diagnostics.Debug.Fail("How did we get here?");
    }
    finally // Required for repro
    {
        System.Diagnostics.Debug.Fail("How did we get here?");
    }
}
catch (Exception)
{
    Console.WriteLine("Exception");
    Console.WriteLine("Repro succeeded");
}
finally
{
    Console.WriteLine("Finally");
}

Expected behavior

8.0 printed

Try
NotSupportedException
Unhandled exception. System.Exception: Repro failed [SNIP]
Finally

Actual behavior

9.0.100-preview.5.24229.2 prints

Try
NotSupportedException
Exception
Repro succeeded
Finally

Regression?

No response

Known Workarounds

No response

Configuration

9.0.100-preview.5.24229.2 (in aspnetcore repo)
Win11 22631.3447 on x64 (haven't tried others)

Other information

The nested try-finally in the unreachable catch block seems essential to the repro.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label May 1, 2024
@stephentoub
Member

cc: @janvorli, is this related to your work?

@amcasey
Member Author

amcasey commented May 1, 2024

It goes back to the 8.0 behavior with set DOTNET_LegacyExceptionHandling=1 (which I learned about just now).

amcasey added a commit to amcasey/aspnetcore that referenced this issue May 1, 2024
To confirm that CI catches dotnet/runtime#101772
@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label May 2, 2024
@mangod9 mangod9 added this to the 9.0.0 milestone May 2, 2024
@janvorli
Member

janvorli commented May 6, 2024

I'll look into it; the .NET 9 behavior with the new EH implementation is clearly incorrect.

@janvorli
Member

janvorli commented May 6, 2024

I have investigated it. It turns out that NativeAOT shares the same issue when compiled for Debug. The Release build doesn't have this issue, even in coreclr. The issue is caused by the JIT's ordering of exception clauses. In this specific case, the clause for the finally marked by the "// Required for repro" comment is placed in between the clauses of the main try block. When the new EH and the NativeAOT EH unwind out of the "catch", the unwinder moves to the frame of the first throw. It then wants to skip all clauses that belong to the same try block as the catch clause that was invoked to handle the first exception. But the inserted clause from the other try block causes it to believe there are no more clauses for the same try.
Here are the clauses from a debug build, as printed by logging I added:

  EH clause to consider curIdx=0, _tryStartOffset=2F, _tryEndOffset=75, isSameTry=False, clauseKind=RH_EH_CLAUSE_TYPED
  EH clause to consider curIdx=1, _tryStartOffset=152, _tryEndOffset=166, isSameTry=False, clauseKind=RH_EH_CLAUSE_FAULT
  EH clause to consider curIdx=2, _tryStartOffset=2F, _tryEndOffset=75, isSameTry=False, clauseKind=RH_EH_CLAUSE_TYPED
  EH clause to consider curIdx=3, _tryStartOffset=2F, _tryEndOffset=75, isSameTry=True, clauseKind=RH_EH_CLAUSE_TYPED
  EH clause to consider curIdx=4, _tryStartOffset=2F, _tryEndOffset=76, isSameTry=False, clauseKind=RH_EH_CLAUSE_FAULT
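
To make the failure mode concrete, here is a minimal C++ sketch of the clause-skipping walk described above, run over the five logged clauses. It is a simplified model, not the runtime's code: the `EhClause` struct, `FindHandler`, the mapping of the typed clauses to the three catch blocks, and the throw offset `0x40` are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of one reported EH clause.
enum class ClauseKind { Typed, Fault };

struct EhClause {
    uint32_t tryStart;
    uint32_t tryEnd;
    bool isSameTry;
    ClauseKind kind;
    bool catchesSystemException; // assumption: true only for catch (Exception)
};

// Returns the index of the clause that handles an exception thrown from the
// handler of clause idxStart, or -1 if no clause in this frame handles it.
// codeOffset is the (assumed) native offset of the original throw in the try.
int FindHandler(const std::vector<EhClause>& clauses, int idxStart, uint32_t codeOffset) {
    bool skipping = true;
    uint32_t lastTryStart = 0;
    uint32_t lastTryEnd = 0;
    for (int i = 0; i < static_cast<int>(clauses.size()); ++i) {
        const EhClause& c = clauses[i];
        if (skipping) {
            if (i <= idxStart) { // clauses up to the one already dispatched
                lastTryStart = c.tryStart;
                lastTryEnd = c.tryEnd;
                continue;
            }
            // Keep skipping only while the clause is for the identical try.
            if (c.tryStart == lastTryStart && c.tryEnd == lastTryEnd && c.isSameTry)
                continue;
            skipping = false; // any other clause ends the skip
        }
        if (c.kind != ClauseKind::Typed) continue;
        if (codeOffset < c.tryStart || codeOffset >= c.tryEnd) continue;
        if (c.catchesSystemException) return i;
    }
    return -1;
}

// Clause order and flags exactly as in the Debug log above.
std::vector<EhClause> DebugOrder() {
    return {
        {0x2F, 0x75, false, ClauseKind::Typed, false},  // catch (NotSupportedException)
        {0x152, 0x166, false, ClauseKind::Fault, false}, // nested "required for repro" finally
        {0x2F, 0x75, false, ClauseKind::Typed, false},  // catch (InvalidOperationException)
        {0x2F, 0x75, true, ClauseKind::Typed, true},    // catch (Exception)
        {0x2F, 0x76, false, ClauseKind::Fault, false},  // outer finally
    };
}

// Hypothetical order with the same-try clauses kept contiguous.
std::vector<EhClause> ContiguousOrder() {
    return {
        {0x2F, 0x75, false, ClauseKind::Typed, false},
        {0x2F, 0x75, true, ClauseKind::Typed, false},
        {0x2F, 0x75, true, ClauseKind::Typed, true},
        {0x152, 0x166, false, ClauseKind::Fault, false},
        {0x2F, 0x76, false, ClauseKind::Fault, false},
    };
}
```

Under these assumptions, `FindHandler(DebugOrder(), 0, 0x40)` returns 3 (the `catch (Exception)` clause), while `FindHandler(ContiguousOrder(), 0, 0x40)` returns -1, meaning the second exception is not handled in this frame, which matches the 8.0 behavior.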

cc: @dotnet/jit-contrib

@JulieLeeMSFT JulieLeeMSFT added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-ExceptionHandling-coreclr labels May 7, 2024
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@JamesNK
Member

JamesNK commented May 7, 2024

FYI there is an aspnetcore test that is impacted by this - dotnet/aspnetcore#55564. I'm guessing that's how Andrew noticed it.

It only fails in a local test run because CI isn't DEBUG. We can leave it failing until there is a fix and then check it's resolved.

@amanasifkhalid
Member

amanasifkhalid commented May 8, 2024

From the JIT dump, this is the current ordering of the funclet section:

BB07 [0002]  1  4  0                       0    [012..02A)                           (throw ) T4 H0 F catch { }   i rare keep hascall gcsafe flet newobj
BB12 [0005]  2  4  1 BB10                  0    [046..054)-> BB11(1)                 (finret) T4 H1 F finally { } i rare keep hascall gcsafe flet
BB08 [0003]  1  4  2                       1    [02A..037)-> BB09(1)                 (always) T4 H2 F catch {     i keep hascall gcsafe flet
BB09 [0004]  1  1  2 BB08                  0    [037..046)-> BB10(1)                 (always) T1 H2   try { }     i rare keep hascall gcsafe
BB10 [0014]  1  4  2 BB09                  0    [???..???)-> BB12(1)                 (callf ) T4 H2               i rare internal
BB11 [0015]  1  4  2 BB12                  0    [???..???)-> BB13(1)                 (callfr) T4 H2               i rare internal
BB13 [0006]  1  4  2 BB11                  0    [054..057)-> BB15(1)                 ( cret ) T4 H2   }           i rare
BB14 [0007]  1  4  3                       1    [057..072)-> BB15(1)                 ( cret ) T4 H3 F catch { }   i keep hascall gcsafe flet
BB18 [0009]  2     4 BB16                  1    [074..082)-> BB17(1)                 (finret)    H4 F finally { } i keep hascall gcsafe flet

H1 is the "required for repro" finally handler. The handlers H0, H2, and H3 correspond to the outer try. H4 is just the last finally handler, which prints "Finally".

I tried changing the JIT's funclet creation logic to sort handlers based on their corresponding try regions' indices, such that handlers for T0 come first, then handlers for T1 come next, etc. Here's the new funclet layout:

BB07 [0002]  1  4  0                       0    [012..02A)                           (throw ) T4 H0 F catch { }   i LIR rare keep label hascall gcsafe flet newobj
BB08 [0003]  1  4  2                       1    [02A..037)-> BB09(1)                 (always) T4 H2 F catch {     i LIR keep label hascall gcsafe flet
BB09 [0004]  1  1  2 BB08                  0    [037..046)-> BB10(1)                 (always) T1 H2   try { }     i LIR rare keep label hascall gcsafe
BB10 [0014]  1  4  2 BB09                  0    [???..???)-> BB12(1)                 (callf ) T4 H2               i LIR rare internal label
BB11 [0015]  1  4  2 BB12                  0    [???..???)-> BB13(1)                 (callfr) T4 H2               i LIR rare internal
BB13 [0006]  1  4  2 BB11                  0    [054..057)-> BB15(1)                 ( cret ) T4 H2   }           i LIR rare label
BB14 [0007]  1  4  3                       1    [057..072)-> BB15(1)                 ( cret ) T4 H3 F catch { }   i LIR keep label hascall gcsafe flet
BB12 [0005]  2  4  1 BB10                  0    [046..054)-> BB11(1)                 (finret) T4 H1 F finally { } i LIR rare keep label hascall gcsafe flet
BB18 [0009]  2     4 BB16                  1    [074..082)-> BB17(1)                 (finret)    H4 F finally { } i LIR keep label hascall gcsafe flet

With this layout, where the handlers for each try region are contiguous, I'm still getting the same incorrect behavior. @janvorli how is isSameTry computed? I'm assuming when the runtime searches for an exception handler, it iterates in the order the handlers are placed in memory, right?

@AndyAyersMS
Member

@amanasifkhalid is this a block layout issue or an EH region descriptor issue?

@amanasifkhalid
Member

@AndyAyersMS between the two, I think the latter is more likely? Unless I'm misunderstanding the invariants around how EH descriptors should be maintained when creating funclets, I don't see any obvious issues. Before funclet creation, the blocklist looks like this:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    [???..???)-> BB02(1)                 (always)                     i internal hascall
BB02 [0017]  1       BB01                  1    [???..???)-> BB04(0.5),BB03(0.5)     ( cond )                     internal
BB03 [0018]  1       BB02                  0.50 [???..???)-> BB04(1)                 (always)                     internal
BB04 [0016]  2       BB02,BB03             0    [???..???)-> BB05(1)                 (always)                     i rare internal hascall
BB05 [0011]  1  4    BB04                  0    [000..000)-> BB06(1)                 (always) T4      try {       i rare keep internal
BB06 [0001]  1  0    BB05                  0    [000..012)                           (throw ) T0      try { try { try { } } } i rare keep hascall gcsafe newobj
BB07 [0002]  1  4  0                       0    [012..02A)                           (throw ) T4 H0   catch { }   i rare keep hascall gcsafe newobj
BB08 [0003]  1  4  2                       1    [02A..037)-> BB09(1)                 (always) T4 H2   catch {     i keep hascall gcsafe
BB09 [0004]  1  1  2 BB08                  0    [037..046)-> BB10(1)                 (always) T1 H2   try { }     i rare keep hascall gcsafe
BB10 [0014]  1  4  2 BB09                  0    [???..???)-> BB12(1)                 (callf ) T4 H2               i rare internal
BB11 [0015]  1  4  2 BB12                  0    [???..???)-> BB13(1)                 (callfr) T4 H2               i rare internal
BB12 [0005]  2  4  1 BB10                  0    [046..054)-> BB11(1)                 (finret) T4 H1   finally { } i rare keep hascall gcsafe
BB13 [0006]  1  4  2 BB11                  0    [054..057)-> BB15(1)                 ( cret ) T4 H2   }           i rare
BB14 [0007]  1  4  3                       1    [057..072)-> BB15(1)                 ( cret ) T4 H3   catch { }   i keep hascall gcsafe
BB15 [0008]  2  4    BB13,BB14             1    [072..074)-> BB16(1)                 (always) T4      }           i
BB16 [0012]  1       BB15                  1    [???..???)-> BB18(1)                 (callf )                     i internal
BB17 [0013]  1       BB18                  1    [???..???)-> BB19(1)                 (callfr)                     i internal
BB18 [0009]  2     4 BB16                  1    [074..082)-> BB17(1)                 (finret)    H4   finally { } i keep hascall gcsafe
BB19 [0010]  1       BB17                  1    [082..083)                           (return)                     i
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

And afterwards:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    [???..???)-> BB02(1)                 (always)                     i internal hascall
BB02 [0017]  1       BB01                  1    [???..???)-> BB04(0.5),BB03(0.5)     ( cond )                     internal
BB03 [0018]  1       BB02                  0.50 [???..???)-> BB04(1)                 (always)                     internal
BB04 [0016]  2       BB02,BB03             0    [???..???)-> BB05(1)                 (always)                     i rare internal hascall
BB05 [0011]  1  4    BB04                  0    [000..000)-> BB06(1)                 (always) T4      try {       i rare keep internal
BB06 [0001]  1  0    BB05                  0    [000..012)                           (throw ) T0      try { try { try { } } } i rare keep hascall gcsafe newobj
BB15 [0008]  2  4    BB13,BB14             1    [072..074)-> BB16(1)                 (always) T4      }           i
BB16 [0012]  1       BB15                  1    [???..???)-> BB18(1)                 (callf )                     i internal
BB17 [0013]  1       BB18                  1    [???..???)-> BB19(1)                 (callfr)                     i internal
BB19 [0010]  1       BB17                  1    [082..083)                           (return)                     i
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB07 [0002]  1  4  0                       0    [012..02A)                           (throw ) T4 H0 F catch { }   i rare keep hascall gcsafe flet newobj
BB12 [0005]  2  4  1 BB10                  0    [046..054)-> BB11(1)                 (finret) T4 H1 F finally { } i rare keep hascall gcsafe flet
BB08 [0003]  1  4  2                       1    [02A..037)-> BB09(1)                 (always) T4 H2 F catch {     i keep hascall gcsafe flet
BB09 [0004]  1  1  2 BB08                  0    [037..046)-> BB10(1)                 (always) T1 H2   try { }     i rare keep hascall gcsafe
BB10 [0014]  1  4  2 BB09                  0    [???..???)-> BB12(1)                 (callf ) T4 H2               i rare internal
BB11 [0015]  1  4  2 BB12                  0    [???..???)-> BB13(1)                 (callfr) T4 H2               i rare internal
BB13 [0006]  1  4  2 BB11                  0    [054..057)-> BB15(1)                 ( cret ) T4 H2   }           i rare
BB14 [0007]  1  4  3                       1    [057..072)-> BB15(1)                 ( cret ) T4 H3 F catch { }   i keep hascall gcsafe flet
BB18 [0009]  2     4 BB16                  1    [074..082)-> BB17(1)                 (finret)    H4 F finally { } i keep hascall gcsafe flet
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Here's the EH table:

index  eTry, eHnd
  0  ::   2        - Try at BB06..BB06 [000..012), Handler at BB07..BB07 [012..02A)
  1  ::   4     2  - Try at BB09..BB09 [037..046), Finally at BB12..BB12 [046..054)
  2  ::   3        - Try at BB06..BB06 [000..012), Handler at BB08..BB13 [02A..057)
  3  ::   4        - Try at BB06..BB06 [000..012), Handler at BB14..BB14 [057..072)
  4  ::            - Try at BB05..BB15 [000..074), Finally at BB18..BB18 [074..082)

One thing that looks weird to me is H1 is nested in H2, but after funclet creation, it ends up right before the beginning of H2, and we don't move the start of H2 back to include H1. In fgCreateFunclets, we move funclets to the end of the method starting from the smallest EH index, and nested EH regions are expected to have smaller indices than their parent regions, so we end up moving the nested funclets out of their parent funclets, and placing the parent funclets after them in the blocklist. So I'm guessing funclets have to be "flat", i.e. no nested handler regions, right?

On a side note, we might be able to simplify fgCreateFunclets so that it moves all the funclets to the end of the method in one pass, and then fix up the EH descriptors in a separate pass, similar to what the new block layout algorithm does. Right now, fgRelocateEHRange's logic for adjusting EH descriptors looks more complicated/expensive than it has to be.

@AndyAyersMS
Member

If this is a regression vs 8.0 then comparing jit dumps for 8 & 9 might prove instructive. Perhaps there was an implicit constraint somewhere that now needs to be explicit, given the changes we've made over the past few months?

@amanasifkhalid
Member

amanasifkhalid commented May 10, 2024

I took a look at the dumps for .NET 8 vs 9, and I don't see any diffs in the EH descriptors or funclet layout (aside from some JIT-specific semantic differences, like the absence of BBJ_NONE in .NET 9, and the introduction of BBJ_CALLFINALLYRET). Here's .NET 8's final layout:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [???..???)                                     i internal label hascall LIR 
BB02 [0017]  1       BB01                  1       [???..???)-> BB04 ( cond )                     internal LIR 
BB03 [0018]  1       BB02                  0.50    [???..???)                                     internal LIR 
BB04 [0016]  2       BB02,BB03             0       [???..???)                                     i internal rare label hascall LIR 
BB05 [0011]  1  4    BB04                  0       [000..000)                 T4      try {       keep i internal try rare label LIR 
BB06 [0001]  1  0    BB05                  0       [000..012)        (throw ) T0      try { try { try { } } } keep i try rare label hascall gcsafe newobj LIR 
BB15 [0008]  2  4    BB13,BB14             1       [072..074)-> BB16 (always) T4      }           i label LIR 
BB16 [0012]  1       BB15                  1       [???..???)-> BB18 (callf )                     i internal label LIR 
BB17 [0013]  1       BB18                  1       [???..???)-> BB19 (ALWAYS)                     i internal LIR KEEP 
BB19 [0010]  1       BB17                  1       [082..083)        (return)                     i label LIR 
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB07 [0002]  1  4  0                       0       [012..02A)        (throw ) T4 H0 F catch { }   keep i rare label hascall gcsafe flet newobj LIR 
BB12 [0005]  2  4  1 BB10                  0       [046..054)        (finret) T4 H1 F finally { } keep i rare label hascall gcsafe flet LIR 
BB08 [0003]  1  4  2                       1       [02A..037)                 T4 H2 F catch {     keep i label hascall gcsafe flet LIR 
BB09 [0004]  1  1  2 BB08                  0       [037..046)-> BB10 (always) T1 H2   try { }     keep i try rare label hascall gcsafe LIR 
BB10 [0014]  1  4  2 BB09                  0       [???..???)-> BB12 (callf ) T4 H2               i internal rare label LIR 
BB11 [0015]  1  4  2 BB12                  0       [???..???)-> BB13 (ALWAYS) T4 H2               i internal rare LIR KEEP 
BB13 [0006]  1  4  2 BB11                  0       [054..057)-> BB15 ( cret ) T4 H2   }           i rare label LIR 
BB14 [0007]  1  4  3                       1       [057..072)-> BB15 ( cret ) T4 H3 F catch { }   keep i label hascall gcsafe flet LIR 
BB18 [0009]  2     4 BB16                  1       [074..082)        (finret)    H4 F finally { } keep i label hascall gcsafe flet LIR 
-----------------------------------------------------------------------------------------------------------------------------------------

And on .NET 9:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    [???..???)-> BB02(1)                 (always)                     i LIR internal label hascall
BB02 [0017]  1       BB01                  1    [???..???)-> BB04(0.5),BB03(0.5)     ( cond )                     LIR internal
BB03 [0018]  1       BB02                  0.50 [???..???)-> BB04(1)                 (always)                     LIR internal
BB04 [0016]  2       BB02,BB03             0    [???..???)-> BB05(1)                 (always)                     i LIR rare internal label hascall
BB05 [0011]  1  4    BB04                  0    [000..000)-> BB06(1)                 (always) T4      try {       i LIR rare keep internal label
BB06 [0001]  1  0    BB05                  0    [000..012)                           (throw ) T0      try { try { try { } } } i LIR rare keep label hascall gcsafe newobj
BB15 [0008]  2  4    BB13,BB14             1    [072..074)-> BB16(1)                 (always) T4      }           i LIR label
BB16 [0012]  1       BB15                  1    [???..???)-> BB18(1)                 (callf )                     i LIR internal label
BB17 [0013]  1       BB18                  1    [???..???)-> BB19(1)                 (callfr)                     i LIR internal
BB19 [0010]  1       BB17                  1    [082..083)                           (return)                     i LIR label
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB07 [0002]  1  4  0                       0    [012..02A)                           (throw ) T4 H0 F catch { }   i LIR rare keep label hascall gcsafe flet newobj
BB12 [0005]  2  4  1 BB10                  0    [046..054)-> BB11(1)                 (finret) T4 H1 F finally { } i LIR rare keep label hascall gcsafe flet
BB08 [0003]  1  4  2                       1    [02A..037)-> BB09(1)                 (always) T4 H2 F catch {     i LIR keep label hascall gcsafe flet
BB09 [0004]  1  1  2 BB08                  0    [037..046)-> BB10(1)                 (always) T1 H2   try { }     i LIR rare keep label hascall gcsafe
BB10 [0014]  1  4  2 BB09                  0    [???..???)-> BB12(1)                 (callf ) T4 H2               i LIR rare internal label
BB11 [0015]  1  4  2 BB12                  0    [???..???)-> BB13(1)                 (callfr) T4 H2               i LIR rare internal
BB13 [0006]  1  4  2 BB11                  0    [054..057)-> BB15(1)                 ( cret ) T4 H2   }           i LIR rare label
BB14 [0007]  1  4  3                       1    [057..072)-> BB15(1)                 ( cret ) T4 H3 F catch { }   i LIR keep label hascall gcsafe flet
BB18 [0009]  2     4 BB16                  1    [074..082)-> BB17(1)                 (finret)    H4 F finally { } i LIR keep label hascall gcsafe flet
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

The diffs in the usage of the BBF_KEEP_BBJ_ALWAYS flag could be suspect, but the codegen and GC/EH tables are identical, which leads me to believe this might not be JIT-related.

@janvorli
Member

@janvorli how is isSameTry computed?

@amanasifkhalid the isSameTry is the CORINFO_EH_CLAUSE_SAMETRY flag. That also needs to be correct in order to make things work.

@janvorli
Member

@janvorli how is isSameTry computed? I'm assuming when the runtime searches for an exception handler, it iterates in the order the handlers are placed in memory, right?

I am sorry for the late response. I wrote my response here a few days ago, but apparently I forgot to press the comment button.
The isSameTry is the flag CORINFO_EH_CLAUSE_SAMETRY that JIT sets here:

// CORINFO_EH_CLAUSE_SAMETRY flag means that the current clause covers same
// try block as the previous one. The runtime cannot reliably infer this information from
// native code offsets because of different try blocks can have same offsets. Alternative
// solution to this problem would be inserting extra nops to ensure that different try
// blocks have different offsets.
if (EHblkDsc::ebdIsSameTry(HBtab, HBtab - 1))
{
    // The SAMETRY bit should only be set on catch clauses. This is ensured in IL, where only 'catch' is
    // allowed to be mutually-protect. E.g., the C# "try {} catch {} catch {} finally {}" actually exists in
    // IL as "try { try {} catch {} catch {} } finally {}".
    assert(HBtab->HasCatchHandler());
    flags = (CORINFO_EH_CLAUSE_FLAGS)(flags | CORINFO_EH_CLAUSE_SAMETRY);
}

It needs to be set correctly too in order to make things work.
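
As a rough illustration of why the flag comes out the way the earlier log shows, the following C++ sketch applies the "same try as the previous entry" rule to the repro's EH table. It is a simplified model under stated assumptions: `EhEntry`, `ComputeSameTryFlags`, and `ReproTable` are hypothetical names, and try identity is approximated by comparing the IL ranges from the EH table rather than the JIT's block pointers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one JIT EH table entry: only the IL range of its try.
struct EhEntry {
    uint32_t tryBeg;
    uint32_t tryEnd;
};

// Marks entry i as "same try" when it protects the same try range as entry i - 1.
// Entry 0 has no predecessor, so it never gets the flag.
std::vector<bool> ComputeSameTryFlags(const std::vector<EhEntry>& table) {
    std::vector<bool> flags(table.size(), false);
    for (std::size_t i = 1; i < table.size(); ++i) {
        flags[i] = table[i].tryBeg == table[i - 1].tryBeg &&
                   table[i].tryEnd == table[i - 1].tryEnd;
    }
    return flags;
}

// Try IL ranges taken from the EH table posted earlier in this thread.
std::vector<EhEntry> ReproTable() {
    return {
        {0x000, 0x012}, // index 0: catch
        {0x037, 0x046}, // index 1: nested finally
        {0x000, 0x012}, // index 2: catch
        {0x000, 0x012}, // index 3: catch
        {0x000, 0x074}, // index 4: outer finally
    };
}
```

In this model the flags for the repro table are false, false, false, true, false, which lines up with the `isSameTry` values in the logged clauses: the entry at index 2 follows the nested finally, so it is not marked even though it protects the same try as index 0.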

@amanasifkhalid
Member

@janvorli no worries, thank you for pointing this snippet out! I'll take another look at this later today

@JamesNK
Member

JamesNK commented Jun 28, 2024

What is the status of this issue? ASP.NET Core local tests are still impacted by the regression.

It seems like a pretty low-level problem. Leaving it to the end of .NET 9 dev cycle doesn't seem like a good idea.

@amanasifkhalid
Member

amanasifkhalid commented Jun 28, 2024

@janvorli I tweaked the JIT to propagate the CORINFO_EH_CLAUSE_SAMETRY flag even when the catch handlers that map to the same try region aren't contiguous in terms of EH indices, though looking at the runtime side, this isn't enough to fix the problem:

// Now, we continue skipping while the try region is identical to the one that invoked the
// previous dispatch.
if ((ehClause._tryStartOffset == lastTryStart) && (ehClause._tryEndOffset == lastTryEnd)
#if !NATIVEAOT
    && (ehClause._isSameTry)
#endif
)
    continue;

The isSameTry check isn't done on NativeAOT, and either way, the try offset comparison will short-circuit the if-statement once we get to the problematic finally clause. I don't think we can force this finally clause to have a higher index because of the JIT invariant that nested EH regions must have indices smaller than their parent regions. I don't know if this invariant applies to the runtime's EH semantics, though I'm assuming we cannot change the order in which the JIT reports EH clauses to the runtime to fix this?

During unwinding, if we are trying to skip over try regions with the same offsets as the last try region, and we encounter an ehClause where ehClause._tryStartOffset > lastTryStart, then there's no way this clause can handle the exception, since ehClause maps to a try region that is not the same as the last one, and does not wrap the last one, right? So we should be able to skip over it?
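
The question above can be sketched in C++ as an extra condition in the skip loop. This is only a model of the idea being asked about, not a change that exists in the runtime: `Clause`, `FindHandlerNativeAotStyle`, the `tolerateLaterTry` switch, and the throw offset `0x40` are hypothetical, and the model follows the NativeAOT path of the snippet (no `isSameTry` check).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified clause model (NativeAOT-style: no isSameTry flag).
struct Clause {
    uint32_t tryStart;
    uint32_t tryEnd;
    bool typed;                  // typed catch (true) or fault/finally (false)
    bool catchesSystemException; // assumption: true only for catch (Exception)
};

// Walks the clauses after the handler of clause idxStart threw. When
// tolerateLaterTry is set, a clause whose try starts after the last try is
// also skipped, as proposed above. Returns the handling clause index or -1.
int FindHandlerNativeAotStyle(const std::vector<Clause>& clauses, int idxStart,
                              uint32_t codeOffset, bool tolerateLaterTry) {
    bool skipping = true;
    uint32_t lastTryStart = 0;
    uint32_t lastTryEnd = 0;
    for (int i = 0; i < static_cast<int>(clauses.size()); ++i) {
        const Clause& c = clauses[i];
        if (skipping) {
            if (i <= idxStart) {
                lastTryStart = c.tryStart;
                lastTryEnd = c.tryEnd;
                continue;
            }
            if (c.tryStart == lastTryStart && c.tryEnd == lastTryEnd)
                continue; // identical try region
            if (tolerateLaterTry && c.tryStart > lastTryStart)
                continue; // proposed: cannot be the same try or an enclosing one
            skipping = false;
        }
        if (!c.typed) continue;
        if (codeOffset < c.tryStart || codeOffset >= c.tryEnd) continue;
        if (c.catchesSystemException) return i;
    }
    return -1;
}

// Offsets and kinds from the Debug clause log earlier in this thread.
std::vector<Clause> LoggedClauses() {
    return {
        {0x2F, 0x75, true, false},   // catch (NotSupportedException)
        {0x152, 0x166, false, false}, // nested finally
        {0x2F, 0x75, true, false},   // catch (InvalidOperationException)
        {0x2F, 0x75, true, true},    // catch (Exception)
        {0x2F, 0x76, false, false},  // outer finally
    };
}
```

With `tolerateLaterTry` off, the model picks clause 3 for the logged order (the incorrect behavior); with it on, the walk skips past the nested finally and the remaining same-try catches and returns -1. Whether such a rule is safe for every clause layout is exactly the open question in the comment.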

@amanasifkhalid
Member

Just to confirm nothing relevant on the JIT side has changed, I can reproduce this issue on .NET 8 when targeting NativeAOT in Debug -- the ordering of the EH clauses is the same as above.

@amanasifkhalid
Member

Also just to clarify on the behavioral diffs between Debug and Release, when optimizing, the JIT is able to remove the "required for repro" finally clause during its empty try removal opt pass (Compiler::fgRemoveEmptyTry); in Debug builds, this pass doesn't run. If we tweak the problematic try-finally to have code with side effects in Release builds, such as by replacing the Debug.Fails with Console.WriteLine, the JIT cannot optimize away the try-finally, so the incorrect EH behavior reproduces in Release builds. This JIT behavior is the same between .NET 8 and 9.

@janvorli I don't think this can be fixed on the codegen side. I'm guessing the new/NativeAOT unwinder needs to be able to tolerate sibling EH clauses being interleaved with "same try" clauses when skipping frames -- would it be too costly to change the unwinder to tolerate this?

@amanasifkhalid amanasifkhalid removed the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 5, 2024
@jkotas
Member

jkotas commented Jul 5, 2024

I'm assuming we cannot change the order in which the JIT reports EH clauses to the runtime to fix this?

It should be fine to change the order in which the JIT encodes EH clauses (without affecting any of the internal JIT invariants). The current EH clause encoding in genReportEH is not a 1:1 mapping of the JIT EH table either.

If it helps, #88072 (comment) has more context about the motivation for CORINFO_EH_CLAUSE_SAMETRY flag.

@jkotas
Member

jkotas commented Jul 5, 2024

The isSameTry check isn't done on NativeAOT,

For native AOT, the equivalent check is done earlier at build time:

// If the previous clause has same try offset and length as the current clause,
// but belongs to a different try block (CORINFO_EH_CLAUSE_SAMETRY is not set),
// emit a special marker to allow runtime distinguish this case.
if ((previousClause.TryOffset == clause.TryOffset) &&
    (previousClause.TryLength == clause.TryLength) &&
    ((clause.Flags & CORINFO_EH_CLAUSE_FLAGS.CORINFO_EH_CLAUSE_SAMETRY) == 0))

It would be a good idea to add a comment.

@jkotas
Member

jkotas commented Jul 5, 2024

IMHO, this should be fixed in the JIT. I do not think that it is worth it to be changing the invariants of the EH encodings to fix this bug.

@amanasifkhalid
Member

It should be fine to change the order in which the JIT encodes EH clauses (without affecting any of the internal JIT invariants). The current EH clause encoding in genReportEH is not a 1:1 mapping of the JIT EH table either.

Thank you for clarifying this! With that in mind, I agree it's easier to fix this on the JIT side. I've opened #104531 to adjust the order in which EH clauses are reported to the VM. With those changes, I can no longer repro the incorrect behavior from above; here's what the EH table reported to the VM now looks like (EH#3 is the problematic try-finally):

EH#0: try [002B..0071) handled by [0082..00FA) (class: 100000E)
EH#1: try [002B..0071) handled by [0126..017B) (class: 100000F) same try
EH#2: try [002B..0071) handled by [017B..01C3) (class: 1000010) same try
EH#3: try [014E..0162) handled by [00FA..0126) (finally)
EH#4: try [002B..0072) handled by [01C3..01EF) (finally)
EH#5: try [0082..00FA) handled by [01C3..01EF) (finally) duplicated
EH#6: try [00FA..0126) handled by [01C3..01EF) (finally) duplicated
EH#7: try [0126..017B) handled by [01C3..01EF) (finally) duplicated
EH#8: try [017B..01C3) handled by [01C3..01EF) (finally) duplicated
EH#9: try [0072..0072) handled by [0072..007B) (finally) cloned finally
EH#10: try [0162..0162) handled by [0162..016C) (finally) cloned finally
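
One way to picture the reordering is the C++ sketch below, which reports every clause for a try region right after the first clause for that region. It is not the implementation in #104531: `EhEntry`, `Reported`, and `ReportContiguously` are hypothetical names, try identity is approximated by IL ranges, and the duplicated and cloned-finally clauses from the table above are not modeled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a JIT EH table entry: only the IL range of its try.
struct EhEntry {
    uint32_t tryBeg;
    uint32_t tryEnd;
};

// One reported clause: which table entry it came from, and its same-try flag.
struct Reported {
    std::size_t tableIndex;
    bool sameTry;
};

// Reports entries in table order, but pulls forward any later entry that
// protects the same try so that same-try clauses end up contiguous.
std::vector<Reported> ReportContiguously(const std::vector<EhEntry>& table) {
    std::vector<Reported> out;
    std::vector<bool> done(table.size(), false);
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (done[i]) continue;
        out.push_back({i, false}); // first clause of this try region
        done[i] = true;
        for (std::size_t j = i + 1; j < table.size(); ++j) {
            if (!done[j] && table[j].tryBeg == table[i].tryBeg &&
                table[j].tryEnd == table[i].tryEnd) {
                out.push_back({j, true}); // same try as the clause before it
                done[j] = true;
            }
        }
    }
    return out;
}

// Try IL ranges from the JIT EH table posted earlier in this thread.
std::vector<EhEntry> ReproTable() {
    return {
        {0x000, 0x012}, // index 0: catch
        {0x037, 0x046}, // index 1: nested finally
        {0x000, 0x012}, // index 2: catch
        {0x000, 0x012}, // index 3: catch
        {0x000, 0x074}, // index 4: outer finally
    };
}
```

For the repro table this model reports entries in the order 0, 2, 3, 1, 4, with the same-try flag on the second and third clauses. That has the same shape as EH#0 through EH#4 above: three clauses for one try, then the nested finally, then the outer finally.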

@amanasifkhalid amanasifkhalid added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-ExceptionHandling-coreclr labels Jul 8, 2024
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@JamesNK
Member

JamesNK commented Jul 17, 2024

FYI the ASP.NET Core tests now pass.

@github-actions github-actions bot locked and limited conversation to collaborators Aug 16, 2024
9 participants