segfault when calling MarshalNative::GCHandleInternalGet in System.Net.Requests.Tests in rolling CI #69125

Closed
jakobbotsch opened this issue May 10, 2022 · 24 comments
Labels: area-System.Net.Security, blocking-clean-ci (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms')


@jakobbotsch
Member

Pipeline: https://dev.azure.com/dnceng/public/_build/results?buildId=1761915&view=results
Job: Libraries Test Run checked coreclr Linux x64 Release
Log: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-c09b54c3fa68465384/System.Net.Requests.Tests/1/console.51c5c155.log?helixlogtype=result

/datadisks/disk1/work/AA5E0919/w/BBDA0A1B/e /datadisks/disk1/work/AA5E0919/w/BBDA0A1B/e
  Discovering: System.Net.Requests.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Net.Requests.Tests (found 348 of 367 test cases)
  Starting:    System.Net.Requests.Tests (parallel test collections = on, max threads = 2)
    System.Net.Tests.FtpWebRequestTest.Ftp_AppendFile [SKIP]
      Condition(s) not met: "LocalServerAvailable"
    System.Net.Tests.FtpWebRequestTest.Ftp_RenameFile [SKIP]
      Condition(s) not met: "LocalServerAvailable"
    System.Net.Tests.FtpWebRequestTest.Ftp_LargeFile [SKIP]
      Condition(s) not met: "LocalServerAvailable"
    System.Net.Tests.FtpWebRequestTest.Ftp_CreateAndDelete [SKIP]
      Condition(s) not met: "LocalServerAvailable"
    System.Net.Tests.FtpWebRequestTest.Ftp_RenameFileSubDir_Success [SKIP]
      Condition(s) not met: "LocalServerAvailable"
    System.Net.Tests.FtpWebRequestTest.Ftp_MakeAndRemoveDir_Success [SKIP]
      Condition(s) not met: "LocalServerAvailable"
./RunTests.sh: line 168: 10507 Segmentation fault      (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Net.Requests.Tests.runtimeconfig.json --depsfile System.Net.Requests.Tests.deps.json xunit.console.dll System.Net.Requests.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/datadisks/disk1/work/AA5E0919/w/BBDA0A1B/e
----- end Tue May 10 09:05:08 UTC 2022 ----- exit code 139 ----------------------------------------------------------
exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.
ulimit -c value: unlimited
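
The `exit code 139` in the log follows the usual shell convention of reporting a process killed by a signal as 128 plus the signal number. A minimal C# sketch of that arithmetic is below; `ExitCodes` and `SignalFromExitCode` are hypothetical names introduced for illustration and are not part of the test harness.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ExitCodes
{
    // Shells report "terminated by signal N" as exit status 128 + N.
    // Returns the signal number, or null when the status does not
    // look like a signal exit.
    public static int? SignalFromExitCode(int exitCode)
    {
        return exitCode > 128 && exitCode < 256
            ? exitCode - 128
            : (int?)null;
    }
}
```

Under this convention `ExitCodes.SignalFromExitCode(139)` yields 11, which is the signal number of SIGSEGV on Linux x64.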

The dump shows segfault in Object::ValidateInner as part of MarshalNative::GCHandleInternalGet:

* thread #1, name = 'dotnet', stop reason = signal SIGSEGV
  * frame #0: 0x00007fc745dff6ed libcoreclr.so`Object::ValidateInner(int, int, int) [inlined] Object::GetGCSafeMethodTable(this=0x0000000000000007) const at object.h:446:59
    frame #1: 0x00007fc745dff6ed libcoreclr.so`Object::ValidateInner(this=0x0000000000000007, bDeep=YES, bVerifyNextHeader=YES, bVerifySyncBlock=YES) at object.cpp:528
    frame #2: 0x00007fc745e0135d libcoreclr.so`OBJECTREF::OBJECTREF(Object*) [inlined] Object::Validate(this=0x0000000000000007, bDeep=YES, bVerifyNextHeader=YES, bVerifySyncBlock=YES) at object.cpp:508:9
    frame #3: 0x00007fc745e012b0 libcoreclr.so`OBJECTREF::OBJECTREF(this=0x00007f85fecf9760, pObject=0x0000000000000007) at object.cpp:1131
    frame #4: 0x00007fc745f70130 libcoreclr.so`MarshalNative::GCHandleInternalGet(OBJECTHANDLE__*) [inlined] ObjectFromHandle(handle=0x00007fc7476e6a40) at gchandleutilities.h:42:24
    frame #5: 0x00007fc745f70104 libcoreclr.so`MarshalNative::GCHandleInternalGet(handle=0x00007fc7476e6a40) at marshalnative.cpp:534
    frame #6: 0x00007fc6cf648f26
    frame #7: 0x00007fc6c8310356 libssl.so.1.1`SSL_set_fd + 86

Top frames of dumpstack:

(lldb) dumpstack
OS Thread Id: 0x296f (1)
TEB information is not available so a stack size of 0xFFFF is assumed
Current frame: libcoreclr.so!Object::ValidateInner(int, int, int) + 0x1ad [/__w/1/s/src/coreclr/vm/object.h:446]
Child-SP         RetAddr          Caller, Callee
00007F85FECF9710 00007fc745e0135d libcoreclr.so!OBJECTREF::OBJECTREF(Object*) + 0x11d [/__w/1/s/src/coreclr/vm/object.cpp:1132], calling libcoreclr.so!Object::ValidateInner(int, int, int) [/__w/1/s/src/coreclr/vm/object.cpp:513]
00007F85FECF9740 00007fc745f70130 libcoreclr.so!MarshalNative::GCHandleInternalGet(OBJECTHANDLE__*) + 0x50 [/__w/1/s/src/coreclr/vm/gchandleutilities.h:44], calling libcoreclr.so!OBJECTREF::OBJECTREF(Object*) [/__w/1/s/src/coreclr/vm/object.cpp:1117]
00007F85FECF9780 00007fc6cf648f26 (MethodDesc 00007fc6cf5c4370 + 0x66 Interop+OpenSsl.NewSessionCallback(IntPtr, IntPtr)), calling 00007fc745f700e0 (stub for System.Runtime.InteropServices.GCHandle.InternalGet(IntPtr))
00007F85FECF97B8 00007fc6cf648ef8 (MethodDesc 00007fc6cf5c4370 + 0x38 Interop+OpenSsl.NewSessionCallback(IntPtr, IntPtr)), calling libcoreclr.so!JIT_PInvokeBegin [/__w/1/s/src/coreclr/pal/inc/unixasmmacrosamd64.inc:896]
00007F85FECF9820 00007fc6c8310356 libssl.so.1.1!SSL_set_fd + 0x56
00007F85FECF9850 00007fc6c8329131 libssl.so.1.1!___lldb_unnamed_symbol512$$libssl.so.1.1 + 0x3e1, calling libssl.so.1.1!SSL_get_rfd + 0x10
00007F85FECF98E0 00007fc6c832a9e5 libssl.so.1.1!___lldb_unnamed_symbol521$$libssl.so.1.1 + 0x6b5, calling libssl.so.1.1!___lldb_unnamed_symbol511$$libssl.so.1.1 + 0x170
00007F85FECF9990 00007fc7463675f7 libcoreclr.so!GetCurrentThreadId + 0x77 [/__w/1/s/src/coreclr/pal/src/include/pal/thread.hpp:781], calling libcoreclr.so!__tls_get_addr
00007F85FECF99F0 00007fc6c82fce70 libssl.so.1.1!___lldb_unnamed_symbol114$$libssl.so.1.1 + 0x130
00007F85FECF9A40 00007fc745d78788 libcoreclr.so!Frame::Pop(Thread*) + 0xf8 [/__w/1/s/src/coreclr/vm/frames.cpp:447], calling libcoreclr.so!Thread::SetFrame(Frame*) [/__w/1/s/src/coreclr/vm/threads.cpp:211]
00007F85FECF9A90 00007fc6c8303995 libssl.so.1.1!___lldb_unnamed_symbol152$$libssl.so.1.1 + 0xb45
00007F85FECF9AA0 00007fc6c8303950 libssl.so.1.1!___lldb_unnamed_symbol152$$libssl.so.1.1 + 0xb00, calling libssl.so.1.1!EC_KEY_get_conv_form
00007F85FECF9AE0 00007fc6c830e252 libssl.so.1.1!ERR_load_SSL_strings + 0x22
00007F85FECF9B00 00007f861c275cc9 libcrypto.so.1.1!___lldb_unnamed_symbol1922$$libcrypto.so.1.1 + 0xf9, calling libcrypto.so.1.1 + 0xffffffff
00007F85FECF9B10 00007f861c1ee2e3 libcrypto.so.1.1!ERR_add_error_data + 0x13, calling libcrypto.so.1.1!___lldb_unnamed_symbol1922$$libcrypto.so.1.1 + 0x130
00007F85FECF9B40 00007fc6c830e373 libssl.so.1.1!___lldb_unnamed_symbol287$$libssl.so.1.1 + 0xb3, calling libssl.so.1.1!SSL_CONF_CTX_set_ssl_ctx + 0x50
00007F85FECF9B60 00007fc6c858ca32 libSystem.Security.Cryptography.Native.OpenSsl.so!CryptoNative_SslRead + 0x32 [/__w/1/s/src/native/libs/System.Security.Cryptography.Native/pal_ssl.c:458]
00007F85FECF9B90 00007fc6cf6408fc (MethodDesc 00007fc6cf8629b0 + 0x7c ILStubClass.IL_STUB_PInvoke(IntPtr, Byte*, Int32, SslErrorCode*))
00007F85FECF9BD0 00007fc6cf6408fc (MethodDesc 00007fc6cf8629b0 + 0x7c ILStubClass.IL_STUB_PInvoke(IntPtr, Byte*, Int32, SslErrorCode*))
00007F85FECF9C20 00007fc6cf644886 (MethodDesc 00007fc6cf5c4de8 + 0xa6 Interop+Ssl.SslRead(Microsoft.Win32.SafeHandles.SafeSslHandle, Byte ByRef, Int32, SslErrorCode ByRef)), calling 00007fc6cf3a4678 (stub for Interop+Ssl.<SslRead>g____PInvoke__|22_0(IntPtr, Byte*, Int32, SslErrorCode*))
00007F85FECF9CA0 00007fc6cf644557 (MethodDesc 00007fc6cf5c4230 + 0xb7 Interop+OpenSsl.Decrypt(Microsoft.Win32.SafeHandles.SafeSslHandle, System.Span`1<Byte>, SslErrorCode ByRef)), calling 00007fc6cf614300
00007F85FECF9D40 00007fc6cf6441b1 (MethodDesc 00007fc6cf5a5d70 + 0x91 System.Net.Security.SslStreamPal.DecryptMessage(System.Net.Security.SafeDeleteSslContext, System.Span`1<Byte>, Int32 ByRef, Int32 ByRef)), calling 00007fc6cf6142b8
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 10, 2022
@ghost

ghost commented May 10, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: jakobbotsch
Assignees: -
Labels:

area-System.Net, untriaged

Milestone: -

@jakobbotsch jakobbotsch added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label May 10, 2022
@jakobbotsch
Member Author

This looks the same as #68037, so it looks like that error is persistent.

@ghost

ghost commented May 10, 2022

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: jakobbotsch
Assignees: -
Labels:

area-System.Net, area-System.Net.Security, blocking-clean-ci, untriaged

Milestone: -

@danmoseley
Member

Is it always ubuntu.1804? If so, is it #48411? Essentially we know we have a potential segfault on 1804 and can't fix it.

If so, we could disable the tests for 1804, I guess, assuming we have 20.04 in the matrix as well. cc @bartonjs

@rzikm
Member

rzikm commented May 10, 2022

Yes, it is always Ubuntu 18.04, and my guess is that it is related to the problem in the issue you mentioned. However, in this case, the crash happens some time before the main thread exits IMO, since the job output does not contain the summary line (total tests run, tests skipped, etc.).

@jkotas
Member

jkotas commented May 10, 2022

However, in this case, the crash happens some time before the main thread exits

Then it means that it is a different problem from #48411.

@bartonjs
Member

(lldb) dumpstack
OS Thread Id: 0x296f (1)
TEB information is not available so a stack size of 0xFFFF is assumed
Current frame: libcoreclr.so!Object::ValidateInner(int, int, int) + 0x1ad [/__w/1/s/src/coreclr/vm/object.h:446]
Child-SP         RetAddr          Caller, Callee
00007F85FECF9710 00007fc745e0135d libcoreclr.so!OBJECTREF::OBJECTREF(Object*) + 0x11d [/__w/1/s/src/coreclr/vm/object.cpp:1132], calling libcoreclr.so!Object::ValidateInner(int, int, int) [/__w/1/s/src/coreclr/vm/object.cpp:513]
00007F85FECF9740 00007fc745f70130 libcoreclr.so!MarshalNative::GCHandleInternalGet(OBJECTHANDLE__*) + 0x50 [/__w/1/s/src/coreclr/vm/gchandleutilities.h:44], calling libcoreclr.so!OBJECTREF::OBJECTREF(Object*) [/__w/1/s/src/coreclr/vm/object.cpp:1117]
00007F85FECF9780 00007fc6cf648f26 (MethodDesc 00007fc6cf5c4370 + 0x66 Interop+OpenSsl.NewSessionCallback(IntPtr, IntPtr)), calling 00007fc745f700e0 (stub for System.Runtime.InteropServices.GCHandle.InternalGet(IntPtr))
00007F85FECF97B8 00007fc6cf648ef8 (MethodDesc 00007fc6cf5c4370 + 0x38 Interop+OpenSsl.NewSessionCallback(IntPtr, IntPtr)), calling libcoreclr.so!JIT_PInvokeBegin [/__w/1/s/src/coreclr/pal/inc/unixasmmacrosamd64.inc:896]
00007F85FECF9820 00007fc6c8310356 libssl.so.1.1!SSL_set_fd + 0x56

This failed in

[UnmanagedCallersOnly]
// Invoked from OpenSSL when new session is created.
// We attached GCHandle to the SSL so we can find back SafeSslContextHandle holding the cache.
// New session has refCount of 1.
// If this function returns 0, OpenSSL will drop the refCount and discard the session.
// If we return 1, the ownership is transferred to us and we will need to call SessionFree().
private static unsafe int NewSessionCallback(IntPtr ssl, IntPtr session)
{
    Debug.Assert(ssl != IntPtr.Zero);
    Debug.Assert(session != IntPtr.Zero);

    IntPtr ptr = Ssl.SslGetData(ssl);
    Debug.Assert(ptr != IntPtr.Zero);
    GCHandle gch = GCHandle.FromIntPtr(ptr);

    SafeSslContextHandle? ctxHandle = gch.Target as SafeSslContextHandle;
    // There is no relation between SafeSslContextHandle and SafeSslHandle so the handle
    // may be released while the ssl session is still active.
    if (ctxHandle != null && ctxHandle.TryAddSession(Ssl.SslGetServerName(ssl), session))
    {
        // offered session was stored in our cache.
        return 1;
    }

    // OpenSSL will destroy session.
    return 0;
}
, which feels like it means #64369 made some bad assumptions.

@jkotas
Member

jkotas commented May 10, 2022

This looks like the bad assumption:

// There is no relation between SafeSslContextHandle and SafeSslHandle so the handle
// may be released while the ssl session is still active.

GCHandles are an unmanaged resource. It is not OK to access them once they have been freed.
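
To make the lifetime rule concrete, here is a minimal single-threaded sketch of one possible discipline: clear the copy of the handle value that native code can see before calling `Free()`, and have the lookup treat a zero value as "owner is gone". `FakeNativeSlot`, `GCHandleLifetimeSketch`, `Resolve` and `Demo` are hypothetical names standing in for the SSL data slot and the callback; this is not the code in dotnet/runtime and not necessarily the fix that was applied for this issue.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the native slot that the callback reads
// (the role Ssl.SslGetData plays in the snippet quoted above).
public sealed class FakeNativeSlot
{
    public IntPtr Data;
}

public static class GCHandleLifetimeSketch
{
    // Callback-side lookup: only turn the value back into a GCHandle
    // when the slot still holds a live handle value.
    public static object Resolve(FakeNativeSlot slot)
    {
        IntPtr ptr = slot.Data;
        if (ptr == IntPtr.Zero)
        {
            return null; // owner already detached, nothing to dereference
        }
        return GCHandle.FromIntPtr(ptr).Target;
    }

    public static bool Demo()
    {
        var owner = new object();
        var slot = new FakeNativeSlot();

        // Publish the handle value to the "native" side.
        GCHandle handle = GCHandle.Alloc(owner);
        slot.Data = GCHandle.ToIntPtr(handle);
        bool foundWhileAlive = ReferenceEquals(Resolve(slot), owner);

        // Teardown order matters: detach first, then free.
        // After Free(), the old IntPtr value must never be passed to
        // GCHandle.FromIntPtr again.
        slot.Data = IntPtr.Zero;
        handle.Free();
        bool goneAfterFree = Resolve(slot) == null;

        return foundWhileAlive && goneAfterFree;
    }
}
```

In this sketch `GCHandleLifetimeSketch.Demo()` returns `true`. It deliberately ignores threading: if a callback can run concurrently with teardown, clearing the slot and freeing the handle would also need synchronization with the reader.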

@jkotas
Member

jkotas commented May 10, 2022

cc @wfurt

@rzikm
Member

rzikm commented May 10, 2022

I remember when I investigated #68037, the actual ptr that we tried to convert to GCHandle was 0x1, so I assume something got corrupted along the way, see #68037 (comment)

@wfurt
Member

wfurt commented May 10, 2022

I will take a look. If the memory is freed, anything can be written to it afterward...

@wfurt wfurt self-assigned this May 10, 2022
@wfurt wfurt added this to the 7.0.0 milestone May 10, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label May 10, 2022
@rzikm
Member

rzikm commented May 17, 2022

@danmoseley
Member

This is segfaulting about 2x a day.

WorkItems
| where FriendlyName == "System.Net.Requests.Tests"
| where Queued > ago(7d)
| where Status == "BadExit"
| where ExitCode  == 139
| join Jobs on JobId
| project
  Queued,
  FriendlyName, ExitCode,
  ConsoleUri,
  PhaseName = tostring(parse_json(Properties)["System.PhaseName"]),
  OS = tostring(parse_json(Properties)["operatingSystem"]),
  Pipeline = tostring(parse_json(Properties).DefinitionName),
  BuildId = tostring(parse_json(Properties).BuildId),
  QueueName, Source
| where Pipeline == "runtime"
| Queued | FriendlyName | ExitCode | ConsoleUri | PhaseName | OS | Pipeline | BuildId | QueueName | Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2022-06-08 16:09:59.3080000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-750d4d8ac641453b8e/System.Net.Requests.Tests/1/console.6f856297.log?helixlogtype=result | libraries_test_run_release_mono_Linux_x64_Debug | (Centos.7.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-7-mlnet-helix-20220601183719-dde38af | runtime | 1813588 | ubuntu.1604.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:00.3530000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-54d5c9d134f94b9dbb/System.Net.Requests.Tests/1/console.fdb327de.log?helixlogtype=result | libraries_test_run_release_mono_Linux_x64_Debug | RedHat.7.Amd64.Open | runtime | 1813588 | redhat.7.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:01.8500000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-994c60fb01e64929bc/System.Net.Requests.Tests/1/console.4ae6a658.log?helixlogtype=result | libraries_test_run_release_mono_Linux_x64_Debug | (Debian.10.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-20210304164434-56c6673 | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:03.6660000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-d860a09d7b08433884/System.Net.Requests.Tests/1/console.d3a38823.log?helixlogtype=result | libraries_test_run_release_mono_Linux_x64_Debug | Ubuntu.1804.Amd64.Open | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:28.2940000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-798cf7dbb50745c5b0/System.Net.Requests.Tests/1/console.1efaab5a.log?helixlogtype=result | libraries_test_run_release_mono_interpreter_Linux_x64_Debug | (Debian.10.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-20210304164434-56c6673 | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:38.7310000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-5aa3373dba5f4d5f8b/System.Net.Requests.Tests/1/console.7c1352fd.log?helixlogtype=result | libraries_test_run_release_coreclr_Linux_x64_Debug | (Centos.7.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-7-mlnet-helix-20220601183719-dde38af | runtime | 1813588 | ubuntu.1604.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:41.3250000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-4a90f871403a4eb288/System.Net.Requests.Tests/1/console.6a644d80.log?helixlogtype=result | libraries_test_run_release_coreclr_Linux_x64_Debug | RedHat.7.Amd64.Open | runtime | 1813588 | redhat.7.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:42.3570000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-4826f19068c141648c/System.Net.Requests.Tests/1/console.1a4cdcdd.log?helixlogtype=result | libraries_test_run_release_coreclr_Linux_x64_Debug | (Debian.10.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-20210304164434-56c6673 | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:10:43.4970000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-81f69b8b0f3e4d25a2/System.Net.Requests.Tests/1/console.dbd721c3.log?helixlogtype=result | libraries_test_run_release_coreclr_Linux_x64_Debug | Ubuntu.1804.Amd64.Open | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-08 16:11:03.1150000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70111-merge-2f41ebf2ec514d7290/System.Net.Requests.Tests/1/console.fb0df240.log?helixlogtype=result | libraries_test_run_checked_coreclr_Linux_x64_Release | Ubuntu.1804.Amd64.Open | runtime | 1813588 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70111/merge |
| 2022-06-13 13:53:51.0110000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-61049-merge-ac0d5573a448434cb9/System.Net.Requests.Tests/1/console.8806cc9c.log?helixlogtype=result | libraries_test_run_checked_coreclr_Linux_x64_Release | Ubuntu.1804.Amd64.Open | runtime | 1821288 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/61049/merge |
| 2022-06-13 14:01:47.1050000 | System.Net.Requests.Tests | 139 | https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70249-merge-24402ab56fab40a4a9/System.Net.Requests.Tests/1/console.e188ee53.log?helixlogtype=result | libraries_test_run_checked_coreclr_Linux_x64_Release | Ubuntu.1804.Amd64.Open | runtime | 1821306 | ubuntu.1804.amd64.open.rt | pr/public/dotnet/runtime/refs/pull/70249/merge |
2022-06-14 13:29:18.4350000 System.Net.Requests.Tests 139 https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70614-merge-fc3f53944daa474fa3/System.Net.Requests.Tests/1/console.c430cd0c.log?helixlogtype=result libraries_test_run_checked_coreclr_Linux_x64_Release Ubuntu.1804.Amd64.Open runtime 1823524 ubuntu.1804.amd64.open.rt pr/public/dotnet/runtime/refs/pull/70614/merge
2022-06-14 20:07:56.9810000 System.Net.Requests.Tests 139 https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-70740-merge-432d91b4703a42079b/System.Net.Requests.Tests/1/console.09544009.log?helixlogtype=result libraries_test_run_checked_coreclr_Linux_x64_Release Ubuntu.1804.Amd64.Open runtime 1823941 ubuntu.1804.amd64.open.rt pr/public/dotnet/runtime/refs/pull/70740/merge
2022-06-14 21:10:24.4870000 System.Net.Requests.Tests 139 https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-046b0e4edd044ce4a6/System.Net.Requests.Tests/1/console.a9dfa1d1.log?helixlogtype=result libraries_test_run_checked_coreclr_Linux_x64_Release Ubuntu.1804.Amd64.Open runtime 1824423 ubuntu.1804.amd64.open.rt ci/public/dotnet/runtime/refs/heads/main
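
Every row above reports 139 in the third column. Assuming that column is the exit status of the work item's shell script, it can be decoded with the usual POSIX shell convention that a process killed by signal N is reported as 128 + N. A minimal sketch follows; the helper name `describe_exit_code` is made up for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a shell exit status to a signal name when it encodes one."""
    # POSIX shells report "terminated by signal N" as 128 + N.
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return f"unknown signal {code - 128}"
    return f"exit status {code}"


if __name__ == "__main__":
    # 139 - 128 = 11, which is SIGSEGV on Linux.
    print(describe_exit_code(139))
```

This only confirms that each run ended with a segmentation fault. It does not show that the crashes share a root cause.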

@wfurt
Member

wfurt commented Jun 22, 2022

I don't think all the failures are related. For example:

datadisks/disk1/work/A6300948/w/AA250977/e /datadisks/disk1/work/A6300948/w/AA250977/e
Failed to load /datadisks/disk1/work/A6300948/p/shared/Microsoft.NETCore.App/7.0.0/libcoreclr.so, error: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /datadisks/disk1/work/A6300948/p/shared/Microsoft.NETCore.App/7.0.0/libcoreclr.so)
./RunTests.sh: line 168: 14849 Segmentation fault      (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Net.Requests.Tests.runtimeconfig.json --depsfile System.Net.Requests.Tests.deps.json xunit.console.dll System.Net.Requests.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/datadisks/disk1/work/A6300948/w/AA250977/e

This one fails to load because of a glibc version mismatch. Also, some of the failures above are reported on OSes with OpenSSL 1.0, and the code that uses GCHandle should not even run on them.

I spent a fair amount of time trying to reproduce it (so I can verify possible changes), but no luck so far.
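
One way to spot this kind of mismatch before a run is to compare the host's glibc version with the `GLIBC_2.33` requirement from the error message. The sketch below is only an illustration: `parse_version` and `glibc_satisfies` are hypothetical helpers, and `platform.libc_ver()` may return empty strings on hosts where the C library cannot be detected.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.33' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(host: str, required: str) -> bool:
    """Return True when the host glibc is at least the required version."""
    return parse_version(host) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        # "2.33" is taken from the loader error quoted above.
        print(version, glibc_satisfies(version, "2.33"))
    else:
        print("glibc version not detected")
```

For example, `glibc_satisfies("2.27", "2.33")` is false, so a `libcoreclr.so` built against glibc 2.33 would fail to load on such a host.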

@jkotas
Member

jkotas commented Jun 22, 2022

> This one fails to load because of a glibc version mismatch

These failures are from #70111, which is a work in progress.

> I spent a fair amount of time trying to reproduce it (so I can verify possible changes), but no luck so far.

Yes, crashes like this are often hard to reproduce because they depend on specific timing, and the timing on your local machine is different.

Do you have a theory on what may be the problem? Have you tried to write a targeted test that hits the problem with high probability?

@richlander
Member

> These failures are from #70111, which is a work in progress.

Help me understand how one unmerged PR can affect another. I didn't think that was possible.

@jkotas
Member

jkotas commented Jul 31, 2022

The Kusto query from #69125 (comment) covers test failures in System.Net.Requests.Tests that occurred in the last 7 days, irrespective of whether the PR was merged. We often run these engineering system data queries over all test results. This gives us a more complete and more up-to-date picture of where the test is failing, but it may return some false positives that one has to be mindful of.
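
To separate rolling-build failures from failures in unmerged PRs, the last column of the query rows can be classified by its prefix. This is a rough sketch that assumes the two shapes seen in the rows above (`pr/.../refs/pull/<number>/merge` and `ci/.../refs/heads/<branch>`); the function name `classify_source` is made up for illustration.

```python
import re

# Patterns inferred from the source values in the query output above.
PR_SOURCE = re.compile(r"^pr/.+/refs/pull/(\d+)/merge$")
ROLLING_SOURCE = re.compile(r"^ci/.+/refs/heads/(.+)$")


def classify_source(source: str) -> tuple:
    """Return ('pr', number), ('rolling', branch) or ('unknown', source)."""
    match = PR_SOURCE.match(source)
    if match:
        return ("pr", match.group(1))
    match = ROLLING_SOURCE.match(source)
    if match:
        return ("rolling", match.group(1))
    return ("unknown", source)


if __name__ == "__main__":
    for source in (
        "pr/public/dotnet/runtime/refs/pull/70111/merge",
        "ci/public/dotnet/runtime/refs/heads/main",
    ):
        print(classify_source(source))
```

Rows classified as `pr` may come from changes that were never merged, so they are the ones most likely to be false positives.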

@jkotas
Member

jkotas commented Aug 5, 2022

System.Net.WebSockets.Client.Tests crashed with the same stack trace in #73471.

@hoyosjs
Member

hoyosjs commented Aug 13, 2022

Seen in https://dev.azure.com/dnceng/public/_build/results?buildId=1941374&view=results

0:000> k
 # Child-SP          RetAddr               Call Site
00 (Inline Function) --------`--------     libcoreclr!Object::GetGCSafeMethodTable [/__w/1/s\src/coreclr/vm/object.h @ 446] 
01 00007f17`e2ffb6b0 00007f59`3385585a     libcoreclr!Object::ValidateInner+0x1ad [/__w/1/s\src/coreclr/vm/object.cpp @ 518] 
02 (Inline Function) --------`--------     libcoreclr!Object::Validate+0x8a [/__w/1/s\src/coreclr/vm/object.cpp @ 1124] 
03 00007f17`e2ffb770 00007f59`339c04e0     libcoreclr!OBJECTREF::OBJECTREF+0xfa [/__w/1/s\src/coreclr/vm/object.cpp @ 1124] 
04 (Inline Function) --------`--------     libcoreclr!ObjectFromHandle+0x2c [/__w/1/s\src/coreclr/vm/gchandleutilities.h @ 44] 
05 00007f17`e2ffb7a0 00007f58`b7c202c6     libcoreclr!MarshalNative::GCHandleInternalGet+0x50 [/__w/1/s\src/coreclr/vm/marshalnative.cpp @ 534] 

@wfurt
Member

wfurt commented Aug 22, 2022

There have been no failures in main since #73972 was merged.

@karelz
Member

karelz commented Aug 23, 2022

Fixed in main for 8.0 in PR #73972 and in 7.0 (RC1) in PR #74367.

@karelz
Member

karelz commented Aug 27, 2022

There has been another crash in the System.Net.Requests.Tests work item. @wfurt, can you please check whether it is the same problem and reopen the issue? If not, let's file a new one.

  • 8/28 PM Rolling run 1972808 - Core Dump - net7.0-Linux-Release-arm64-Mono_release-(Debian.11.Arm64.Open)Ubuntu.1804.Armarch.Open
  • 8/26 PM Rolling run 1970381 - Core Dump - net7.0-Linux-Release-arm64-CoreCLR_release-(Ubuntu.2204.Arm64.Open)Ubuntu.1804.ArmArch.Open

@wfurt
Member

wfurt commented Aug 30, 2022

Process terminated. Error while reaping child. errno = 10
   at System.Environment.FailFast(System.String)
   at System.Diagnostics.ProcessWaitState.TryReapChild(Boolean)
   at System.Diagnostics.ProcessWaitState.CheckChildren(Boolean, Boolean)
   at System.Diagnostics.Process.OnSigChild(Int32, Int32)
./RunTests.sh: line 168:    21 Aborted                 (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Net.Requests.Tests.runtimeconfig.json --depsfile System.Net.Requests.Tests.deps.json xunit.console.dll System.Net.Requests.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE

I have not figured out how to dump managed objects from Mono; cc @marek-safar for help.
But the trace looks similar...

Opened #74795 for tracking.
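
For reference, the `errno = 10` in the FailFast message can be looked up by number. A small sketch, assuming the value is a Linux errno; `lookup_errno` is a hypothetical helper.

```python
import errno
import os


def lookup_errno(code: int) -> tuple:
    """Return the symbolic name and the message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # On Linux, errno 10 is ECHILD.
    print(lookup_errno(10))
```

ECHILD from a wait call means there was no matching child process to wait for. That alone does not say why the child was missing.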

@ghost ghost locked as resolved and limited conversation to collaborators Sep 29, 2022