.NET 7 osx-arm64 single-file crashing with sigsegv #67062

Closed
am11 opened this issue Mar 23, 2022 · 18 comments · Fixed by #68845

Comments

@am11
Member

am11 commented Mar 23, 2022

Description

A single-file app published with the latest .NET 7 build crashes on execution.

Reproduction Steps

# installation
mkdir ~/.dotnet7
curl -sSL https://aka.ms/dotnet/7.0.1xx/daily/dotnet-sdk-osx-arm64.tar.gz | tar xzf - -C ~/.dotnet7

# publish a new app as a self-contained, single-file app
~/.dotnet7/dotnet new console -n testapp1
cd testapp1
cat > NuGet.config << EOF
<configuration>
  <packageSources>
    <add key="dotnet7" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet7/nuget/v3/index.json" />
  </packageSources>
</configuration>
EOF
~/.dotnet7/dotnet publish --use-current-runtime -p:PublishSingleFile=true --self-contained -c Release

# run the published app
bin/Release/net7.0/osx-arm64/publish/testapp1

Expected behavior

Displays Hello, World!

Actual behavior

zsh: segmentation fault  bin/Release/net7.0/osx-arm64/publish/testapp1

Regression?

Yes, it works with .NET 6.

Known Workarounds

Publish as self-contained, without -p:PublishSingleFile=true.

Configuration

Daily build

% strings ~/.dotnet7/dotnet | grep '@(#)'
@(#)Version 7.0.22.17106 @Commit: ce813882f4061459dc62b63acb75add040f1f603

Other information

I tried debugging it with native symbols (of the release singlefilehost). The clrstack looks like this:

% lldb bin/Release/net7.0/osx-arm64/publish/testapp1
Added Microsoft public symbol server

(lldb) target create "bin/Release/net7.0/osx-arm64/publish/testapp1"
Current executable set to '/Users/am11/projects/testapp1/bin/Release/net7.0/osx-arm64/publish/testapp1' (arm64).

(lldb) r
Process 22685 launched: '/Users/am11/projects/testapp1/bin/Release/net7.0/osx-arm64/publish/testapp1' (arm64)
Process 22685 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001000b4c78 testapp1`DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
testapp1`DictionaryLayout::FindToken:
->  0x1000b4c78 <+84>: ldr    w8, [x22]
    0x1000b4c7c <+88>: tst    w8, #0x30
    0x1000b4c80 <+92>: cset   w9, eq
    0x1000b4c84 <+96>: orr    w8, w9, w8, lsr #31
Target 0: (testapp1) stopped.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001000b4c78 testapp1`DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
    frame #1: 0x000000010010bf00 testapp1`ProcessDynamicDictionaryLookup(TransitionBlock*, Module*, Module*, unsigned char, unsigned char const*, unsigned char const*, CORINFO_RUNTIME_LOOKUP*, unsigned int*) + 932
    frame #2: 0x000000010010c290 testapp1`DynamicHelperFixup(TransitionBlock*, unsigned long*, unsigned int, Module*, CORCOMPILE_FIXUP_BLOB_KIND*, TypeHandle*, MethodDesc**, FieldDesc**) + 408
    frame #3: 0x000000010010d2d0 testapp1`DynamicHelperWorker + 232
    frame #4: 0x00000001002ed34c testapp1`DelayLoad_Helper_FakeProlog + 92
    frame #5: 0x0000000176a93760
    frame #6: 0x0000000176aa86b0
    frame #7: 0x00000001766badc4
    frame #8: 0x00000001002ed830 testapp1`CallDescrWorkerInternal + 132
    frame #9: 0x0000000100162eb4 testapp1`MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 852
    frame #10: 0x000000010008df44 testapp1`CorHost2::CreateAppDomainWithManager(char16_t const*, unsigned int, char16_t const*, char16_t const*, int, char16_t const**, char16_t const**, unsigned int*) + 620
    frame #11: 0x0000000100572334 testapp1`coreclr_initialize + 784
    frame #12: 0x000000010001fb70 testapp1`coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t> >&) + 420
    frame #13: 0x000000010002c998 testapp1`(anonymous namespace)::create_coreclr() + 432
    frame #14: 0x000000010002c46c testapp1`corehost_main + 160
    frame #15: 0x000000010000d5c8 testapp1`fx_muxer_t::handle_exec_host_command(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, host_startup_info_t const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::unordered_map<known_options, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, known_options_hash, std::__1::equal_to<known_options>, std::__1::allocator<std::__1::pair<known_options const, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > > > const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1328
    frame #16: 0x000000010000c6a4 testapp1`fx_muxer_t::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, int, char const**, host_startup_info_t const&, char*, int, int*) + 860
    frame #17: 0x00000001000091c0 testapp1`hostfxr_main_bundle_startupinfo + 196
    frame #18: 0x000000010004c818 testapp1`exe_start(int, char const**) + 1124
    frame #19: 0x000000010004caf4 testapp1`main + 152
    frame #20: 0x00000001043610f4 dyld`start + 520

(lldb) clrstack -f
OS Thread Id: 0x30e100 (1)
        Child SP               IP Call Site
000000016FDFE1A0 00000001000B4C78 testapp1!DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
000000016FDFE230 000000010010BF00 testapp1!ProcessDynamicDictionaryLookup(TransitionBlock*, Module*, Module*, unsigned char, unsigned char const*, unsigned char const*, CORINFO_RUNTIME_LOOKUP*, unsigned int*) + 932
000000016FDFE290 000000010010C290 testapp1!DynamicHelperFixup(TransitionBlock*, unsigned long*, unsigned int, Module*, CORCOMPILE_FIXUP_BLOB_KIND*, TypeHandle*, MethodDesc**, FieldDesc**) + 408
000000016FDFE610 000000010010D2D0 testapp1!DynamicHelperWorker + 232
000000016FDFE6A0                  [DynamicHelperFrame: 000000016fdfe6a0] 
000000016FDFE730 00000001002ED34C testapp1!DelayLoad_Helper_FakeProlog + 92
000000016FDFE860 0000000176AC3760 System.Private.CoreLib.dll!System.Collections.Generic.HashSet`1[[System.__Canon, System.Private.CoreLib]].CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1<System.__Canon>, Boolean) + 112 [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/HashSet.cs @ 1436]
000000016FDFE910 0000000176AD86B0 System.Private.CoreLib.dll!System.Collections.Generic.Dictionary`2[[System.__Canon, System.Private.CoreLib],[System.IntPtr, System.Private.CoreLib]].TryGetValue(System.__Canon, IntPtr ByRef) + 32 [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/Dictionary.cs @ 1108]
000000016FDFE930 00000001766EADC4 System.Private.CoreLib.dll!System.AppContext.Setup(Char**, Char**, Int32) + 84 [/_/src/libraries/System.Private.CoreLib/src/System/AppContext.cs @ 136]
FFFFFFFFFFFFFFFF 0000000176AD86B0 
FFFFFFFFFFFFFFFF 00000001766EADC4 
FFFFFFFFFFFFFFFF 00000001002ED830 testapp1!CallDescrWorkerInternal + 132
000000016FDFE9B0 0000000100162EB4 testapp1!MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 852
000000016FDFEC20 000000010008DF44 testapp1!CorHost2::CreateAppDomainWithManager(char16_t const*, unsigned int, char16_t const*, char16_t const*, int, char16_t const**, char16_t const**, unsigned int*) + 620
000000016FDFEE20 0000000100572334 testapp1!coreclr_initialize + 784
000000016FDFEEE0 000000010001FB70 testapp1!coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t> >&) + 420
000000016FDFEFF0 000000010002C998 testapp1!(anonymous namespace)::create_coreclr() + 432
000000016FDFF060 000000010002C46C testapp1!corehost_main + 160
000000016FDFF1B0 000000010000D5C8 testapp1!fx_muxer_t::handle_exec_host_command(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, host_startup_info_t const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::unordered_map<known_options, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, known_options_hash, std::__1::equal_to<known_options>, std::__1::allocator<std::__1::pair<known_options const, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > > > const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1328
000000016FDFF310 000000010000C6A4 testapp1!fx_muxer_t::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, int, char const**, host_startup_info_t const&, char*, int, int*) + 860
000000016FDFF420 00000001000091C0 testapp1!hostfxr_main_bundle_startupinfo + 196
000000016FDFF4D0 000000010004C818 testapp1!exe_start(int, char const**) + 1124
000000016FDFF600 000000010004CAF4 testapp1!main + 152
000000016FDFF660 00000001043610F4 dyld!start + 520
@ghost

ghost commented Mar 23, 2022

Tagging subscribers to this area: @agocke, @vitek-karas, @VSadov
See info in area-owners.md if you want to be subscribed.

Author: am11
Assignees: -
Labels:

arch-arm64, os-mac-os-x, area-Single-File

Milestone: -

dotnet-issue-labeler (bot) added the untriaged label ("New issue has not been triaged by the area owner") on Mar 23, 2022
VSadov self-assigned this on Mar 23, 2022
@am11
Member Author

am11 commented Apr 14, 2022

It broke in main branch on Nov 3, 2021.

@jkoritzinsky, I have bisected the commits and found that the first commit (since the .NET 6 release) that breaks single-file apps on osx-arm64 is 24e7a4a (it worked up to the previous commit, c87e932). With a debug build, it fails an assertion:

Assert failure(PID 70129 [0x000111f1], Thread: 5669656 [0x568318]): Consistency check failed: System.Environment::GetProcessorCount is not registered using DllImportentry macro in qcallentrypoints.cpp
FAILED: pvTarget != nullptr
    File: /Users/am11/projects/runtime-pr/src/coreclr/vm/dllimport.cpp Line: 5449
    Image: /Users/am11/projects/testapp1/bin/Debug/net7.0/osx-arm64/publish/testapp1

zsh: abort      bin/Debug/net7.0/osx-arm64/publish/testapp1

I have debugged a bit and noticed that after this line (which does not fail):

if (FAILED(pInternalImport->GetPinvokeMap(pMD->GetMemberDef(), (DWORD*)&mappingFlags, ppEntryPointName, &modref)))

p *ppEntryPointName in lldb prints GetProcessorCount instead of Environment_GetProcessorCount. Any thoughts (or theories) what might be the cause of invalid mapping? 🤔

am11 removed the untriaged label ("New issue has not been triaged by the area owner") on Apr 14, 2022
@am11
Member Author

am11 commented Apr 15, 2022

I have run another git bisect session, this time marking the ProcessorCount error with git bisect good (basically ignoring it). Here is a more precise summary:

  1. From the release/6.0 branch-off commit until 24e7a4a~1, everything was fine. Commit 24e7a4a started to fail the QCall consistency check.

    • Assert failure(PID 41507 [0x0000a223], Thread: 6600605 [0x64b79d]): Consistency check failed: System.Environment::GetProcessorCount is not registered using DllImportentry macro in qcallentrypoints.cpp
      FAILED: pvTarget != nullptr
            File: /Users/am11/projects/runtime-pr/src/coreclr/vm/dllimport.cpp Line: 5436
            Image: /Users/am11/projects/testapp1/bin/Debug/net7.0/osx-arm64/publish/testapp1
      
    • cc @jkoritzinsky

  2. From 24e7a4a until bcd3527~1, the same consistency check was failing. With bcd3527, a different assertion started to fail earlier in the execution. This is still the case at the tip of the main branch.

    • Assert failure(PID 26205 [0x0000665d], Thread: 6557904 [0x6410d0]): Compiler optimization assumption invalid: EE expects method to exist: System.String:Ctor  Sig pointer: 0000000105317690
      FAILED: pMD != 0
          File: /Users/am11/projects/runtime-pr/src/coreclr/vm/binder.cpp Line: 125
          Image: /Users/am11/projects/testapp1/bin/Debug/net7.0/osx-arm64/publish/testapp1
      
    • cc @jkotas

If the two are not related in root cause, then fixing 2 first will bring it back to the state of 1.
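The two-pass bisect described above can be sketched as a binary search over a linear history, where the second pass treats the already-known failure as "good" so that only the later regression is found. The commit names and failure labels below are hypothetical stand-ins; the actual investigation used git bisect on the runtime repository.

```python
from typing import Callable, Sequence

def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and once a commit is bad all later ones are too.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

# Hypothetical history: each commit maps to the failure it shows (None = passes).
history = {
    "c0": None,
    "c1": None,
    "c2": "qcall-assert",
    "c3": "qcall-assert",
    "c4": "string-ctor-assert",
    "c5": "string-ctor-assert",
}
order = list(history)

# Pass 1: any failure counts as bad.
first_regression = first_bad(order, lambda c: history[c] is not None)

# Pass 2: the known QCall failure is marked good, so only the later failure counts.
second_regression = first_bad(order, lambda c: history[c] == "string-ctor-assert")

print(first_regression, second_regression)
```

The second predicate is the equivalent of answering git bisect good whenever the build fails only with the known assertion.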

@am11
Member Author

am11 commented Apr 15, 2022

@jkotas (I can create a separate issue for 2 if needed), it looks like the issue is with the meta signature of METHOD__STRING__CTORF_CHARARRAY: its first byte is 0, but the signature computed by MethodDesc::GetSigFromMetadata has 32 in the first byte (which is probably incorrect?). Consequently, this comparison fails:

(memcmp(pSig1, pSig2, cSig1) == 0))

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 10.1
    frame #0: 0x00000001002a8a10 testapp1`MetaSig::CompareMethodSigs(pSignature1="", cSig1=5, pModule1=0x00000001764c0000, pSubst1=0x0000000000000000, pSignature2=" \U00000001\U0000000e\U0000001d\U00000003\a \U00000003\U00000001\U0000001d\U00000003\b\b2\U00000001", cSig2=5, pModule2=0x00000001764c0000, pSubst2=0x0000000000000000, skipReturnTypeSig=NO, pVisited=0x0000000000000000) at siginfo.cpp:4281:17
   4278	        (cSig1 == cSig2) &&
   4279	        (pSubst1 == NULL) &&
   4280	        (pSubst2 == NULL) &&
-> 4281	        (memcmp(pSig1, pSig2, cSig1) == 0))
   4282	    {
   4283	        return TRUE;
   4284	    }
Target 0: (testapp1) stopped.

(lldb) p (int)memcmp(pSig1, pSig2, cSig1)
(int) $300 = -32

(lldb) p cSig1
(DWORD) $301 = 5

(lldb) memory read -s1 -fu -c5 pSig1 --force
0x100e0429e: 0
0x100e0429f: 1
0x100e042a0: 14
0x100e042a1: 29
0x100e042a2: 3

(lldb) memory read -s1 -fu -c5 pSig2 --force
0x108684d18: 32
0x108684d19: 1
0x108684d1a: 14
0x108684d1b: 29
0x108684d1c: 3

If I jump the PC to line 4283 and continue, the same 32 vs. 0 issue shows up for other string methods. For the non-string methods (like METHOD__CASTHELPERS__ISINSTANCEOFANY, METHOD__CASTHELPERS__UNBOX etc.), the comparison succeeds because both pSig1 and pSig2 have 0 in the first byte.
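For reference, the five bytes dumped above can be read using the ECMA-335 method signature layout: a calling-convention byte, a parameter count, the return type, then the parameter types. In that encoding 0x20 is the HASTHIS flag, so the two blobs would differ only in whether the method is an instance method. The decoder below is a minimal sketch that covers just the element types seen here (plus void and int); it is not the runtime's parser, and the function names are made up for illustration.

```python
# Subset of ECMA-335 constants; only what the dumped bytes need.
HASTHIS = 0x20
ELEMENT_TYPES = {0x01: "void", 0x03: "char", 0x08: "int", 0x0E: "string"}
SZARRAY = 0x1D

def read_type(sig, pos):
    """Decode one type starting at sig[pos]; return (name, next position)."""
    code = sig[pos]
    if code == SZARRAY:
        inner, nxt = read_type(sig, pos + 1)
        return inner + "[]", nxt
    return ELEMENT_TYPES[code], pos + 1

def describe(sig):
    """Render a method signature blob as readable text.

    Assumes a parameter count below 0x80 (one-byte compressed integer).
    """
    kind = "instance" if sig[0] & HASTHIS else "static"
    count = sig[1]
    ret, pos = read_type(sig, 2)
    params = []
    for _ in range(count):
        name, pos = read_type(sig, pos)
        params.append(name)
    return f"{kind} {ret} ({', '.join(params)})"

p_sig1 = bytes([0, 1, 14, 29, 3])   # pSig1 from the dump above
p_sig2 = bytes([32, 1, 14, 29, 3])  # pSig2 from the dump above

print(describe(p_sig1))  # static string (char[])
print(describe(p_sig2))  # instance string (char[])
```

If this reading is right, one side describes the method as static and the other as an instance method, with identical return and parameter types.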

@jkotas
Member

jkotas commented Apr 15, 2022

Neither of the two failure modes makes sense. I think that the problem is likely bad C++ codegen or something low-level like that.

@VSadov
Member

VSadov commented Apr 15, 2022

p *ppEntryPointName in lldb prints GetProcessorCount instead of Environment_GetProcessorCount. Any thoughts (or theories) what might be the cause of invalid mapping? 🤔

Maybe mismatched bits, like a new singlefilehost and an old System.Private.CoreLib.dll.
It would be hard to mismatch them though, since we build them together.

@jkotas
Member

jkotas commented Apr 15, 2022

Yeah, I agree. This looks like mismatched bits.

@am11
Member Author

am11 commented Apr 21, 2022

@VSadov will it be fixed in the next preview?

@VSadov
Member

VSadov commented Apr 26, 2022

When I try the scenario with the latest daily build, it looks like the bits match but R2R is broken.

  • if I just compile a default app as single-file for osx-arm64, it fails to run with EXC_BAD_ACCESS (code=1, address=0x580000ead28000d1)
  • if I add -p:PublishTrimmed=true, which results in an IL-only app, it runs and prints "Hello World"
  • if I also add -p:PublishReadyToRun=true, then the app fails again with BAD_ACCESS
  • and if I do export COMPlus_ZapDisable=1, the app works

It looks like R2R is broken in singlefile on OSX.
It is also likely that we are not running host tests on osx-arm64

BTW, when targeting osx-x64, the app runs on the same machine (M1)

I will continue investigating.

@VSadov
Member

VSadov commented Apr 26, 2022

The build that I picked up is:

strings ./testapp1 | grep @Commit                                                                    

@(#)Version 7.0.22.22403 @Commit: 47d9c43ab1f10a98a348a28b3fd7ed9c4d35328b
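The strings | grep check used in this thread can also be scripted, which makes it easier to compare the build marker of several binaries when mismatched bits are suspected. The sketch below looks for the marker format shown above; the function names are illustrative, and whether every binary of interest carries this marker is an assumption.

```python
import re

# Matches the marker format shown in the thread's `strings` output.
MARKER = re.compile(rb"@\(#\)Version ([0-9.]+) @Commit: ([0-9a-f]{40})")

def find_build_markers(data: bytes):
    """Return (version, commit) pairs for every marker found in data."""
    return [(v.decode(), c.decode()) for v, c in MARKER.findall(data)]

def read_markers(path: str):
    """Read a file and extract its build markers."""
    with open(path, "rb") as f:
        return find_build_markers(f.read())

# In-memory sample built from the marker quoted above.
sample = (
    b"\x00junk@(#)Version 7.0.22.22403 "
    b"@Commit: 47d9c43ab1f10a98a348a28b3fd7ed9c4d35328b\x00more"
)
print(find_build_markers(sample))
```

Calling read_markers on two binaries and comparing the commit values would show whether they came from the same build.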

@am11
Member Author

am11 commented Apr 27, 2022

It is also likely that we are not running host tests on osx-arm64

Single-file tests were added to the outerloop test pipeline in 7677f7d, and removed in f29ba20#diff-e2e027b9777fc35f4a8243db97ce50f7dac99b3cee9465c5325d283c34d2d872L655 to save cost.

I think those are good tests for validating frequent runtime changes, and we should bring them back with osx-arm64 added. AFAIK, nothing else in any pipeline tests the single-file host (in the runtime, sdk or installer repos). Issues are usually reported after the GA release.

@vitek-karas
Member

I "think" we have an E2E test in the SDK repo (I didn't check to be sure). Unfortunately, I know that neither the SDK nor the installer repo runs tests on osx-arm64.

@VSadov
Member

VSadov commented May 2, 2022

It looks like we sometimes see PE sections overlapping in memory. This is either a loader bug or a crossgen bug. Most likely crossgen.
Either way, we should be able to lay out a PE that we ourselves produce.
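To make the overlap condition concrete: when mapped, each section occupies the range from its virtual address up to that address plus its virtual size, and two sections overlap when those ranges intersect. The check below is a minimal sketch with made-up section names and numbers; it is not the loader's or crossgen's code.

```python
from typing import List, NamedTuple, Tuple

class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section is mapped
    virtual_size: int     # bytes occupied in memory

def overlapping_sections(sections: List[Section]) -> List[Tuple[str, str]]:
    """Return pairs of section names whose mapped ranges intersect."""
    ordered = sorted(sections, key=lambda s: s.virtual_address)
    pairs = []
    for i, a in enumerate(ordered):
        end_a = a.virtual_address + a.virtual_size
        for b in ordered[i + 1:]:
            if b.virtual_address >= end_a:
                break  # sorted by start, so nothing later can overlap a
            pairs.append((a.name, b.name))
    return pairs

# Hypothetical layout: .data starts before .text ends.
layout = [
    Section(".text", 0x1000, 0x3000),
    Section(".data", 0x3800, 0x1000),
    Section(".reloc", 0x5000, 0x200),
]
print(overlapping_sections(layout))
```

A check of this kind could run against the section headers of a produced image, or against the addresses the loader actually chose, to tell which side introduces the overlap.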

@oransel

oransel commented Jun 1, 2022

Same error with dotnet 6.0 on M1

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x580000ead2800051)
    frame #0: 0x00000001000b9c58 test`DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 140
test`DictionaryLayout::FindToken:
->  0x1000b9c58 <+140>: ldr    x9, [x9, #0x8]
    0x1000b9c5c <+144>: cbz    x9, 0x1000b9c70 ; <+164>
    0x1000b9c60 <+148>: ldr    x12, [x9]
    0x1000b9c64 <+152>: ldrh   w9, [x12]
Target 0: (test) stopped.

Fix:

export COMPlus_ZapDisable=1

@am11
Member Author

am11 commented Jun 1, 2022

Pretty sure it was working fine with .NET 6 in March, without disabling zap. Is it perhaps a recent regression? I haven't tested with the latest patch version.

@oransel

oransel commented Jun 1, 2022

Here are the outputs:

→ dotnet --version
6.0.300

→ uname -a
Darwin MBProMax.local 21.5.0 Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:37 PDT 2022; root:xnu-8020.121.3~4/RELEASE_ARM64_T6000 arm64

→ cat Program.cs
// See https://aka.ms/new-console-template for more information
var log = (object msg) => Console.WriteLine((new DateTimeOffset(DateTime.UtcNow).ToUnixTimeSeconds()).ToString() + ": " + msg);

log("Hello, World!");

→ dotnet publish --use-current-runtime -p:PublishSingleFile=true --self-contained -c Release
Microsoft (R) Build Engine version 17.2.0+41abc5629 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

Determining projects to restore...
Restored /private/tmp/test/test.csproj (in 79 ms).
test -> /private/tmp/test/bin/Release/net6.0/osx-arm64/test.dll
Optimizing assemblies for size, which may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
test -> /private/tmp/test/bin/Release/net6.0/osx-arm64/publish/

→ /private/tmp/test/bin/Release/net6.0/osx-arm64/publish/test
zsh: segmentation fault /private/tmp/test/bin/Release/net6.0/osx-arm64/publish/test

@oransel

oransel commented Jun 2, 2022

@am11 can we re-open this for v6?

@VSadov
Member

VSadov commented Jun 2, 2022

There is a separate issue for 6.0 - #69923

ghost locked this as resolved and limited conversation to collaborators on Jul 2, 2022