Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Linq.Expressions CompileDeepTree_NoStackOverflowFast fails with Stack Overflow #53309

Closed
SingleAccretion opened this issue May 26, 2021 · 1 comment · Fixed by #53342
Closed
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@SingleAccretion
Copy link
Contributor

SingleAccretion commented May 26, 2021

Recent runs that observed the failure: first, second. Test in question.

To diagnose the cause, I used the following snippet:

var flags = BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.Instance;
var t = new Thread(x => RuntimeHelpers.PrepareMethod(typeof(Program).GetMethod("Problem", flags).MethodHandle), X * 1024);
t.Start();

class Program
{
    // Pseudo syntax, I used InlineIL.Fody
    static int Problem()
    {
        ldc.i4.0
    
        // Repeated 100 times
        ldc.i4.1
        add
    
        ret
    }
}

Release runtime + Release Jit were tested (since that's what the CI was using), with tiered compilation off.

git bisect showed the following:
- Commits before d435388 only needeed X of 65 to succeed.
- Commits after and including d435388 needed X of 129 and above to succeed, matching the timing of the failures in CI (this test succeeded on 21st of May).

This test has a history of failing on Debug Jits (indeed, an attribute pointing to just that) - see #21374. But now, apparently, it needs 129 KB of stack even in Release. I am not sure what is the proper fix for this.

cc @AndyAyersMS

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels May 26, 2021
@AndyAyersMS
Copy link
Member

@jakobbotsch want to take this one?

@jakobbotsch jakobbotsch self-assigned this May 26, 2021
@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label May 26, 2021
@jakobbotsch jakobbotsch added this to the 6.0.0 milestone May 26, 2021
@jakobbotsch
Copy link
Member

The issue is that the native PGO data we are passing after d435388 causes MSVC to use more stack space in fgMorphSmpOp. Before that commit the stack frame for fgMorphSmpOp needs 0x2A0 bytes, while after it needs 0x340 bytes. As #21374 points out, the stack overflow has fgMorphSmpOp and fgMorphTree calling each other recursively, once for each GT_ADD we end up with (which is limited, see below). fgMorphTree uses 0xF0 bytes of stack in both builds, so in the failing we need 100*(0x340+0xF0) = 105 KB of stack plus what's needed for the rest of the frames.

The importer already limits the depth of trees it can create:

/* We need to restrict the max tree depth as many of the Compiler
functions are recursive. We do this by spilling the stack */
if (verCurrentState.esStackDepth)
{
/* Has it been a while since we last saw a non-empty stack (which
guarantees that the tree depth isnt accumulating. */
if ((opcodeOffs - lastSpillOffs) > MAX_TREE_SIZE && impCanSpillNow(prevOpcode))
{
impSpillStackEnsure();
lastSpillOffs = opcodeOffs;
}
}
else
{
lastSpillOffs = opcodeOffs;
impBoxTempInUse = false; // nothing on the stack, box temp OK to use again
}

We could decrease this limit, though that seems unnecessary. The original point of this test was to have a fast inner loop test corresponding to the slower outer loop one above it. However, in #21374 this new test was also moved to outer loop, so it does not seem like there is much point in having it anymore.

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue May 27, 2021
The manually assigned 128 KiB stack here is bordering on the amount of
stack space that release JIT needs to JIT functions with deep trees.
Since the JIT already has a limit on the depth of trees it creates

https://github.com/dotnet/runtime/blob/44f050acc0f7791d6cd7ac772945444912bcf299/src/coreclr/jit/importer.cpp#L11234-L11252

it seems unnecessary to pressimize the JIT further to allow this test to
pass. Also, the test was originally added as a
faster equivalent to the test above it that could run in inner loop
(dotnet/corefx#14444), but it was subsequently
moved to outer loop (dotnet/corefx#19563), so
now it does not seem like there's much point in retaining it.

Fix dotnet#53309
@ghost ghost locked as resolved and limited conversation to collaborators Jun 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants