Secondary frame stackoverflow test failure on linux #107226

Closed
am11 opened this issue Aug 31, 2024 · 2 comments · Fixed by #107264
Labels: area-ExceptionHandling-coreclr, in-pr (there is an active PR which will close this issue when it is merged), os-linux (Linux OS, any supported distro)
Milestone: 10.0.0

am11 (Member) commented Aug 31, 2024

This test:

public static void TestStackOverflowSmallFrameSecondaryThread()

is failing in the runtime-coreclr superpmi-collect-test pipeline on Linux x64, arm64, and arm:

Running stackoverflow test(largeframe main)
"Stack overflow."
"Repeated 42 times:"
"--------------------------------"
"   at TestStackOverflow.Program.InfiniteRecursionB2()"
"   at TestStackOverflow.Program.InfiniteRecursionA2()"
"   at TestStackOverflow.Program.InfiniteRecursionC2()"
"--------------------------------"
"   at TestStackOverflow.Program.InfiniteRecursionB2()"
"   at TestStackOverflow.Program.InfiniteRecursionA2()"
"   at TestStackOverflow.Program.Test(Boolean)"
"   at TestStackOverflow.Program.Main(System.String[])"
""
Running stackoverflow test(smallframe secondary)
"Stack overflow."
"Stack overflow."
""
System.Exception: Missing "TestStackOverflow.Program.Test" method frame
   at TestStackOverflow.Program.TestStackOverflowSmallFrameSecondaryThread()
   at __GeneratedMainWrapper.Main()
Expected: 100
Actual: 101
END EXECUTION - FAILED

cc @janvorli

am11 added the os-linux and area-ExceptionHandling-coreclr labels on Aug 31, 2024
dotnet-policy-service (bot) added the untriaged label (new issue has not been triaged by the area owner) on Aug 31, 2024
janvorli (Member) commented Sep 2, 2024

The problem is that the SIGSEGV handler in superpmi's libsuperpmi-shim-collector.so is called first, and GetCurrentPalThread returns NULL in that library because the PAL threads were created by libcoreclr.so. As a result, the sigsegv_handler prints "Stack overflow" and calls PROCAbort.
I am testing a fix that just calls the previous signal handler when GetCurrentPalThread returns NULL.
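
For readers unfamiliar with signal handler chaining, the idea behind the fix described above can be sketched in plain C. This is only an illustration under stated assumptions, not the runtime's code: `get_current_pal_thread`, `runtime_handler`, `shim_handler`, and the other names are hypothetical stand-ins, and SIGUSR1 replaces SIGSEGV so the path can be exercised with `raise()` instead of a real stack overflow (which would also need an alternate signal stack).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* SIGUSR1 stands in for SIGSEGV so the demo can be triggered safely with raise(). */
#define DEMO_SIGNAL SIGUSR1

/* Hypothetical stand-in for the PAL per-thread lookup; NULL means this module
 * has never registered the current thread. (__thread is a GCC/Clang extension.) */
static __thread void *tls_pal_thread = NULL;
static void *get_current_pal_thread(void) { return tls_pal_thread; }

static struct sigaction g_previous_action;            /* handler installed before ours */
static volatile sig_atomic_t g_runtime_handler_calls; /* times the earlier handler ran */
static volatile sig_atomic_t g_shim_handled;          /* times the later handler kept the signal */

/* Stands in for the handler installed first (the one in libcoreclr.so in the issue). */
static void runtime_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)info; (void)ctx;
    g_runtime_handler_calls++;
}

/* Forward the signal to whatever handler was registered before ours. */
static void call_previous_handler(int sig, siginfo_t *info, void *ctx)
{
    if (g_previous_action.sa_flags & SA_SIGINFO) {
        g_previous_action.sa_sigaction(sig, info, ctx);
    } else if (g_previous_action.sa_handler != SIG_DFL &&
               g_previous_action.sa_handler != SIG_IGN) {
        g_previous_action.sa_handler(sig);
    }
}

/* Stands in for the handler installed last, which therefore runs first. */
static void shim_handler(int sig, siginfo_t *info, void *ctx)
{
    if (get_current_pal_thread() == NULL) {
        /* This module does not know the thread: defer instead of aborting. */
        call_previous_handler(sig, info, ctx);
        return;
    }
    g_shim_handled++;
}

/* Installs the "runtime" handler, then the "shim" handler on top of it,
 * remembering the replaced action so it can be chained to later. */
int install_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;

    sa.sa_sigaction = runtime_handler;
    if (sigaction(DEMO_SIGNAL, &sa, NULL) != 0)
        return -1;

    sa.sa_sigaction = shim_handler;
    return sigaction(DEMO_SIGNAL, &sa, &g_previous_action);
}

/* Raises the signal on a thread the shim has never seen and returns how many
 * times the earlier handler ended up running. */
int demo_unregistered_thread(void)
{
    int rc = install_handlers();
    assert(rc == 0);
    (void)rc;
    raise(DEMO_SIGNAL);
    return (int)g_runtime_handler_calls;
}
```

Calling `demo_unregistered_thread()` from a `main` function exercises the chained path: the later handler sees no registered thread and hands the signal to the handler saved by `sigaction`, rather than treating the situation as fatal.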

janvorli (Member) commented Sep 2, 2024

It is actually a bit more involved. GetCurrentPalThread sometimes returns non-NULL even when called from the libsuperpmi-shim-collector.so SIGSEGV handler. That happens when superpmi code was called on that thread before the crash occurred, which means we had jitted something on that thread. In that case, a PAL thread instance also gets created in the context of libsuperpmi-shim-collector.so.
Nevertheless, it doesn't change the fix I mentioned above. I just wanted to share the full story here. If the PAL thread got registered, we still end up falling back to the SIGSEGV handler in libcoreclr.so, because superpmi doesn't implement the hardware exception handling callbacks that could handle the stack overflow.
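
The two cases described above (thread never seen by the shim, and thread registered lazily because the JIT ran on it) can be modeled with a small C sketch. Everything here is a hypothetical model for illustration: `shim_thread_t`, `shim_on_jit_request`, `shim_decide_on_fault`, and the callback pointer are assumptions, not PAL or superpmi APIs. The `HANDLE_IN_THIS_MODULE` branch in particular is an assumed counterpart that the issue says superpmi never reaches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread record kept by one module's private copy of the PAL. */
typedef struct { int registered; } shim_thread_t;

/* Hypothetical hardware-exception callback type. */
typedef int (*hw_exception_callback_t)(void *fault_context);

typedef enum { CHAIN_TO_PREVIOUS_HANDLER, HANDLE_IN_THIS_MODULE } fault_action_t;

/* Thread-local state: each thread starts out unknown to this module.
 * (__thread is a GCC/Clang extension.) */
static __thread shim_thread_t tls_record;
static __thread shim_thread_t *tls_current = NULL;

/* Assumption mirroring the issue: this module never installs such a callback. */
static hw_exception_callback_t g_hw_exception_callback = NULL;

shim_thread_t *shim_get_current_thread(void) { return tls_current; }

/* Stands in for the JIT entering the shim on this thread: the thread is
 * registered lazily on first use. */
void shim_on_jit_request(void)
{
    if (tls_current == NULL) {
        tls_record.registered = 1;
        tls_current = &tls_record;
    }
}

/* Decides what the shim's fault handler should do for the current thread. */
fault_action_t shim_decide_on_fault(void)
{
    if (shim_get_current_thread() == NULL)
        return CHAIN_TO_PREVIOUS_HANDLER;   /* thread unknown to this module */
    if (g_hw_exception_callback == NULL)
        return CHAIN_TO_PREVIOUS_HANDLER;   /* known thread, but nothing can handle it here */
    return HANDLE_IN_THIS_MODULE;
}

/* Walks through both cases on the calling thread; returns 1 if both chain. */
int demo_both_paths_chain(void)
{
    assert(shim_get_current_thread() == NULL);
    fault_action_t before = shim_decide_on_fault();

    shim_on_jit_request();                  /* the "we jitted something on that thread" case */
    assert(shim_get_current_thread() != NULL);
    fault_action_t after = shim_decide_on_fault();

    return before == CHAIN_TO_PREVIOUS_HANDLER && after == CHAIN_TO_PREVIOUS_HANDLER;
}
```

In this model the lookup result changes once the thread has passed through the shim, but the outcome does not: both paths end at the previously installed handler, which matches the observation that the extra case does not change the fix.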

dotnet-policy-service (bot) added the in-pr label on Sep 2, 2024
janvorli removed the untriaged label on Sep 3, 2024
janvorli added this to the 10.0.0 milestone on Sep 3, 2024
github-actions (bot) locked and limited conversation to collaborators on Oct 6, 2024