Test failure: JIT/jit64/mcc/interop/mcc_i03/mcc_i03.cmd #108640
@jakobbotsch, PTAL.
@am11 Where can I find the symbols for the ILC compiled crossgen2 in Core_Root? There is a dump available for this crash in crossgen2.exe, but I cannot find any symbols.
The AV is inside the stack allocation probe helper, so it seems the problem here is a stack overflow. The stack trace is quite long indeed: It seems to be stuck in some repeated EH dispatching state, perhaps due to bad EH generated by the JIT. Hard to investigate further without symbols. cc @janvorli
I think we are not copying symbols in CORE_ROOT right now. Meanwhile, we can take the git SHA from file properties, check out runtime and build
I'll try to repro it locally
Do we get artifacts/bin in repro? (I don't recall; it has been a while since I downloaded the dumps.)
I'm somewhat skeptical that this will be simple (for one, I think my local toolset version is not exactly the same as the one CI uses to build).
No, that's not included.
I was able to repro this with a local build by running the test in a loop while also loading my system (in my case by running SPMI replay in the background). Here is the full stack trace with symbols: https://gist.github.com/jakobbotsch/df0320ec2e72066a6f0d97a23527e65c It looks like some sort of crash while trying to allocate memory in
This issue is a bug in UCRT. When UCRT is statically linked into an exe, it deinitializes some of the locks it uses at exit. When a secondary thread is still running and calls into UCRT, e.g. to invoke malloc, the code tries to take the deinitialized lock and crashes. In our case, the main thread is on the exit path and a secondary thread, one of the background GC threads, has just entered its thread function and ends up calling malloc from PalSetCurrentThreadName. The following simple repro C app crashes with the same problem (UCRT must be the debug version and linked statically):

#include <stdio.h>
#include <stdlib.h>
#include <Windows.h>

// Secondary thread: keeps calling into the UCRT heap while the process exits.
DWORD __stdcall ThreadFunction(void* lpThreadParameter)
{
    for (int i = 0; i < 1000000; i++)
    {
        malloc(31);
        Sleep(1);
    }
    return 0;
}

int main(int argc, char** argv)
{
    CreateThread(NULL, 0, ThreadFunction, NULL, 0, NULL);
    // exit() runs UCRT cleanup while the secondary thread is still in malloc.
    exit(0);
}

The issue seems to have been fixed in UCRT in March 2024, but the UCRT we are using still doesn't contain the fix.
Actually, it looks like the fix that was made in UCRT was for something slightly different.
I've discussed this with the UCRT devs and this is actually expected behavior. If a graceful shutdown is expected, the UCRT expects any secondary threads to be either shut down already or blocked for the duration of the shutdown.
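For illustration, here is a minimal sketch of the "shut the thread down before exiting" option described above. It is not what the runtime does. It uses portable C11 threads and atomics rather than the Win32 calls from the repro, and the names stop_requested, allocations_done, worker and run_worker_then_stop are hypothetical, made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <threads.h>

/* Hypothetical shared state: set by the main thread to ask the worker to stop. */
static atomic_bool stop_requested;
/* Hypothetical counter of completed allocations, used only to observe progress. */
static atomic_int allocations_done;

/* Worker that keeps calling into the CRT heap until it is asked to stop. */
static int worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested))
    {
        void *p = malloc(31);
        free(p);
        atomic_fetch_add(&allocations_done, 1);
        thrd_yield();
    }
    return 0;
}

/* Starts the worker, waits for at least one allocation, then requests a stop
   and joins the thread. Returns 0 on success and -1 on failure. After a
   successful return no secondary thread is running, so the caller can go on
   to call exit() without racing against CRT cleanup. */
int run_worker_then_stop(void)
{
    thrd_t thread;
    int result = -1;

    atomic_store(&stop_requested, false);
    atomic_store(&allocations_done, 0);

    if (thrd_create(&thread, worker, NULL) != thrd_success)
        return -1;

    /* Let the worker make progress before asking it to stop. */
    while (atomic_load(&allocations_done) == 0)
        thrd_yield();

    atomic_store(&stop_requested, true);

    /* Joining guarantees the worker has left malloc/free for good. */
    if (thrd_join(thread, &result) != thrd_success)
        return -1;

    return result;
}
```

A caller would invoke run_worker_then_stop() and call exit(0) only after it returns 0. The trade-off is that every secondary thread needs a stop signal it actually polls, which is hard to guarantee for threads that have only just started, such as the background GC thread in this crash.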
I've run into a different flavor of this crash in a test run just yesterday. A background thread had this:
While the main thread was already trying to exit:
I'll send a PR to exit with quick_exit to improve reliability of our debug/checked runs.
What is this background thread trying to do? Is there a runaway test that is missing a wait for the async operations to finish before completing the test?
Is this a recent UCRT behavior change? It is surprising that we have not seen this issue before.
From the stack it looks like some xunit infrastructure stuff, not test code. The dump should still be available with
We have seen these issues occasionally before: dotnet/runtimelab#1487 (comment). It appears that some recent changes in the CRT implementation make them more likely to be hit. It is unfortunate that this UCRT crash is considered "by design".
Failed in: runtime-coreclr r2r 20241006.1
Failed tests:
Error message:
Stack trace: