dotnet-sos experience on macOS arm64 #2877

Closed
am11 opened this issue Feb 19, 2022 · 26 comments
Labels: arm64 (ARM64 architecture related issue), dotnet-sos, os-mac-os-x (macOS aka OSX), sos

@am11
Member

am11 commented Feb 19, 2022

On macOS arm64, dotnet-sos is installed with the native architecture only if we explicitly specify -a arm64. This is not ideal: the native architecture should be the default, and a non-native one should be requested via the --arch / -a option.

Moving on, this is what I am seeing with arm64 bits:

~ % uname -m
arm64

~ % dotnet --version
6.0.200

~ % dotnet tool install -g dotnet-sos -a arm64

# making sure that it's the right architecture:
~ % file $(command -v dotnet-sos)
/Users/am11/.dotnet/tools/dotnet-sos: Mach-O 64-bit executable arm64

~ % dotnet-sos install

# making sure that libsos has the right architecture as well:
~ % file ~/.dotnet/sos/libsos.dylib 
/Users/am11/.dotnet/sos/libsos.dylib: Mach-O 64-bit dynamically linked shared library arm64

# let's test a just-created single-file arm64 app
~ % dotnet new console -n foo
~ % dotnet publish foo/ --use-current-runtime -p:PublishSingleFile=true --self-contained -c Release

# moment of truth:
~ % lldb -- foo/bin/Release/net6.0/osx-arm64/publish/foo                                                
Error: Failed to load /usr/local/share/dotnet/x64/shared/Microsoft.NETCore.App/5.0.14/libcoreclr.dylib
SetSymbolServer -ms  failed
(lldb) target create "foo/bin/Release/net6.0/osx-arm64/publish/foo"
Current executable set to '/Users/am11/foo/bin/Release/net6.0/osx-arm64/publish/foo' (arm64).

This error suggests that something is still probing .NET 5 (x64) components:

Error: Failed to load /usr/local/share/dotnet/x64/shared/Microsoft.NETCore.App/5.0.14/libcoreclr.dylib

which is .. unexpected++.

@tommcdon tommcdon added this to the .NET 7.0 milestone Feb 19, 2022
@tommcdon tommcdon added sos arm64 ARM64 architecture related issue os-mac-os-x macOS aka OSX dotnet-sos labels Feb 19, 2022
@tommcdon
Member

@mikem8361

@mikem8361
Member

I think these are two different problems. One is that dotnet-sos needs to detect that it is being installed on macOS arm64. This should be a simple fix in the installer.

The second is that you have both an x64 and an arm64 dotnet SDK installed. SOS uses the installed runtime to host its managed code, and it looks at the config file /etc/dotnet/install_location to find that runtime. Depending on the order of the SDK installs, that file will have the wrong (x64) path, "/usr/local/share/dotnet/x64". You can fix it by editing the file and removing the /x64. There isn't really anything SOS can do here.

@am11
Member Author

am11 commented Feb 20, 2022

@mikem8361, thanks! I installed the .NET 6 SDK from https://dot.net on this new machine. Later, when I was setting up my dev environment, I realized that a few of our projects are using .NET 5, so I downloaded the .NET 5 SDK (again from the same official website). There is only an osx-x64 package available for .NET 5, so I installed that one. Over the weekend, I was experimenting with SOS and found this issue. The install_location file shows only one path:

~ % cat /etc/dotnet/install_location
/usr/local/share/dotnet/x64

I have manually added the .NET 6 arm64 path; now it looks like:

~ % cat /etc/dotnet/install_location
/usr/local/share/dotnet
/usr/local/share/dotnet/x64

and lldb is not complaining:

~ % lldb -- foo/bin/Release/net6.0/osx-arm64/publish/foo
Added Microsoft public symbol server
(lldb) target create "foo/bin/Release/net6.0/osx-arm64/publish/foo"
Current executable set to '/Users/am11/foo/bin/Release/net6.0/osx-arm64/publish/foo' (arm64).
(lldb) r
...

Alternatively, setting the environment variable DOTNET_ROOT=/usr/local/share/dotnet, without modifying install_location, also fixed the issue.
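
A minimal sketch of the DOTNET_ROOT workaround, assuming (as on this machine) that the native install is the first install_location entry that does not end in /x64. The function name pick_native_dotnet_root is hypothetical and not part of any .NET tool:

```shell
# Print the first install_location entry that is not an x64 install.
# $1: path to the install_location file (defaults to the path from this thread).
# ASSUMPTION: a path ending in /x64 is the emulated install; anything else is native.
pick_native_dotnet_root() {
    location_file="${1:-/etc/dotnet/install_location}"
    [ -r "$location_file" ] || return 1
    while IFS= read -r entry || [ -n "$entry" ]; do
        case "$entry" in
            ''|*/x64|*/x64/) continue ;;            # skip blank lines and x64 installs
            *) printf '%s\n' "$entry"; return 0 ;;  # first remaining entry wins
        esac
    done < "$location_file"
    return 1                                        # no native entry listed
}

# Usage: export DOTNET_ROOT only when a native entry was found.
# if root="$(pick_native_dotnet_root)"; then export DOTNET_ROOT="$root"; fi
```

When the file lists only the x64 path, as it did here before the manual edit, the function returns non-zero and DOTNET_ROOT has to be set by hand.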

Can the error message be improved? I.e., if there is an architecture mismatch between the machine and the arch-specific file SOS is trying to load, the "Failed to load .." message could be extended with something actionable like: "Make sure runtime path exists in /etc/dotnet/install_location. Alternatively, set DOTNET_ROOT environment variable pointing to the correct install location."
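
To illustrate the shape of such a message, here is a rough sketch. It is only a path heuristic (an /x64/ segment in the failing path on an arm64 host), not real architecture detection, and report_load_failure is a hypothetical name, not something SOS provides:

```shell
# Print the load error, plus a hint when the failing path points into an
# x64 install while the host is arm64.
# $1: path that failed to load, $2: host architecture (defaults to `uname -m`).
# ASSUMPTION: an "/x64/" path segment identifies the non-native install.
report_load_failure() {
    failed_path="$1"
    host_arch="${2:-$(uname -m)}"
    printf 'Error: Failed to load %s\n' "$failed_path"
    case "$host_arch:$failed_path" in
        arm64:*/x64/*)
            printf '%s\n' "Make sure runtime path exists in /etc/dotnet/install_location. Alternatively, set DOTNET_ROOT environment variable pointing to the correct install location."
            ;;
    esac
}
```

For the path in the original report, called with arm64 as the host architecture, this prints the error followed by the hint; for a path without an /x64/ segment it prints only the error.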

@mikem8361
Member

I don't think there is any way for SOS to know that the architecture of the runtime install is incorrect. I'll have to look into this. You are not the only one who has run into this.

@mikem8361 mikem8361 self-assigned this Feb 21, 2022
@AndyAyersMS
Member

I think I just ran into this too. I only had ARM64 .NET 7 P2 installed; ran dotnet tool install --global dotnet-sos, and then dotnet-sos install failed looking for libhostfxr.dylib on an x64 path (or something like that, don't have the exact text anymore).

@mikem8361
Member

@AndyAyersMS you need to edit your /etc/dotnet/install_location file, in case it isn't obvious from the comments.

@mikem8361
Member

And you need the --arch arm64 option on the dotnet-sos install command.

@mikem8361
Member

I added @hoyosjs because he is going to fix this as part of other hosting problems on M1.

@mikem8361 mikem8361 removed their assignment Mar 24, 2022
@hoyosjs
Member

hoyosjs commented Mar 24, 2022

To be fair, Andy's problem is something different - it has more to do with our tools being 3.1 roll forward based apps and macOS signing issues... This can both solve the /etc issue and improve the diagnostic message.

@mikem8361
Member

mikem8361 commented Mar 24, 2022 via email

@AndyAyersMS
Member

I should be able to get my machine back in this state to repro if you want the exact failure message.

Something like:

  • Uninstall dotnet-sos
  • Uninstall all versions of .NET
  • Install only .NET 7 P2 ARM64
  • Install the dotnet-sos tool
  • Try and run dotnet-sos

That should fail.

It will work if I also install x64 .NET version (which runs only because I've also enabled rosetta).

So I thought the issue was that the global tool that was installed was unexpectedly tied to x64 .NET despite me running in an arm shell and having no x64 .NET installed.

@hoyosjs
Member

hoyosjs commented Mar 24, 2022

I am able to repro this issue. It stems from dotnet tool install adding an x64 shim which forces the process to run under emulation and we don't have an x64 runtime install, so your hunch is correct. A workaround is doing dotnet ~/.dotnet/tools/.store/dotnet-sos/6.0.317201/dotnet-sos/6.0.317201/tools/netcoreapp3.1/any/dotnet-sos.dll install (adjusting the version). This is because of how global-tools and shims work.
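
A small sketch of that workaround that avoids hard-coding the version, assuming the store layout shown in the path above. The helper name find_sos_dll is hypothetical:

```shell
# Print the path of dotnet-sos.dll under a global-tools store, whatever the
# installed version is.
# $1: tools store root (defaults to the location used in this thread).
# NOTE: with several versions installed this picks the lexicographically last
# path, which is not necessarily the highest version.
find_sos_dll() {
    store_root="${1:-$HOME/.dotnet/tools/.store}"
    find "$store_root/dotnet-sos" -type f -name dotnet-sos.dll 2>/dev/null | sort | tail -n 1
}

# Usage (runs the tool on the native `dotnet` host, bypassing the shim):
# dll="$(find_sos_dll)" && [ -n "$dll" ] && dotnet "$dll" install
```

Running the .dll through dotnet directly sidesteps the x64 shim, which is the point of the workaround.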

@am11
Member Author

am11 commented Mar 24, 2022

~ % dotnet tool install -g dotnet-sos -a arm64

# making sure that it's the right architecture:
~ % file $(command -v dotnet-sos)
/Users/am11/.dotnet/tools/dotnet-sos: Mach-O 64-bit executable arm64

~ % dotnet-sos install

# making sure that libsos has the right architecture as well:
~ % file ~/.dotnet/sos/libsos.dylib 
/Users/am11/.dotnet/sos/libsos.dylib: Mach-O 64-bit dynamically linked shared library arm64

with -a arm64 it should install the correct binary.
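
The manual check in the transcript can be sketched as a function. It assumes single-architecture `file` output whose last word is the architecture, as in the lines above; universal binaries print several lines and are not handled. The name file_output_matches_arch is hypothetical:

```shell
# Succeed when a line of `file` output names the expected architecture as its
# last word.
# $1: one line of `file` output, $2: expected architecture (defaults to `uname -m`).
file_output_matches_arch() {
    file_line="$1"
    expected="${2:-$(uname -m)}"
    actual="${file_line##* }"     # last space-separated word, e.g. arm64 or x86_64
    [ "$actual" = "$expected" ]
}

# Usage:
# file_output_matches_arch "$(file "$(command -v dotnet-sos)")" \
#     || echo "dotnet-sos shim is not native"
```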

@hoyosjs
Member

hoyosjs commented Mar 24, 2022

Yup - and that's the part I don't know if I have much control over. Do you know if there's a corresponding issue in the SDK? And I don't think packing our own shim would help at all.

@hoyosjs
Member

hoyosjs commented Mar 24, 2022

Mike also reports he never needed the arch parameter using 6.0.103. Can't say I've ever been able to skip that bit.

@am11
Member Author

am11 commented Mar 24, 2022

There is a related discussion: dotnet/efcore#27787. The initial part of that thread is about -a, which now works fine, but whether a tool author has any control over the default architecture selection is unknown.

After some debugging, it turned out we are hitting this condition: https://github.com/dotnet/sdk/blob/63ab18fd09e5976a44e9f0b43c228ab19e819666/src/Cli/dotnet/ShellShim/ShellShimTemplateFinder.cs#L46. Note that the value of targetFramework.Version.Major is 3 as that comes from another fallback branch: https://github.com/dotnet/sdk/blob/63ab18fd09e5976a44e9f0b43c228ab19e819666/src/Cli/dotnet/commands/dotnet-tool/install/ToolInstallGlobalOrToolPathCommand.cs#L139.

To avoid hitting this condition on osx-arm64, we can bump the dotnet-sos project framework version to 6.0. I tested by disabling the condition in ShellShimTemplateFinder.cs (prepended false &&); after that, dotnet tool install -g dotnet-sos (without -a arm64) installs the expected arm64 binary.
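
To make the behavior concrete, here is a paraphrase of the rule as described in this comment. It is not the SDK's code, and the exact condition in ShellShimTemplateFinder.cs is not reproduced; the sketch assumes it amounts to "osx-arm64 and a target framework major version below 6". The name assumed_shim_arch is hypothetical:

```shell
# Print the architecture of the shim that would be chosen under the ASSUMED rule.
# $1: runtime identifier (e.g. osx-arm64), $2: major version of the tool's
#     target framework (3 for netcoreapp3.1, 6 for net6.0).
assumed_shim_arch() {
    rid="$1"
    tfm_major="$2"
    if [ "$rid" = "osx-arm64" ] && [ "$tfm_major" -lt 6 ]; then
        echo x64              # fallback: shim that runs under emulation
    else
        echo "${rid##*-}"     # otherwise the architecture part of the RID
    fi
}
```

Under this assumption a netcoreapp3.1 tool on osx-arm64 gets an x64 shim, while the same tool targeting net6.0 gets an arm64 one, which matches the result of bumping the framework version.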

@am11
Member Author

am11 commented Mar 24, 2022

am11@0455de5

@hoyosjs
Member

hoyosjs commented Mar 24, 2022

We've discussed the .NET 6 TFM part for a bit, for a few different reasons. This would be needed for pretty much all the tools. The reason 3.1 is a thing in our tools is that it's the lowest runtime we support, and then we roll forward. We didn't want customers who have 3.1/5.0 deployments to need to install multiple runtimes. I prefer multi-targeted tools as you have, but then we would be shipping SOS binaries for each TFM in the tool packages with the current setup. There's some cleanup needed in the packaging + lookup part of the tools.

@hoyosjs
Member

hoyosjs commented Mar 25, 2022

@hoyosjs
Member

hoyosjs commented Mar 25, 2022

https://github.com/dotnet/sdk/blob/6edc23ae7ab92121c3e50c46be759bc0607fe076/src/Cli/dotnet/commands/dotnet-tool/install/ToolInstallGlobalOrToolPathCommand.cs#L136-L149

This is why on 6.0.201 it's not empty... @sfoslund is this something that needs to be ported to 6.0.1xx? Right now the behavior is inconsistent and sort of confusing.

@hoyosjs
Member

hoyosjs commented Mar 26, 2022

Sorry - realized she's no longer on the team. Maybe @marcpopMSFT can help reroute to the appropriate folks.

@am11
Member Author

am11 commented Mar 26, 2022

Ideally, dotnet tool install should take the case of RuntimeOptions.RollForward into account here (rather than solely rely on package's targetFramework.Version.Major).

@hoyosjs
Member

hoyosjs commented Mar 27, 2022

It's a finicky argument. Half of the users want "use the real TFM, and then roll forward if it isn't there." The other half want "let it be the latest." The SDK seems to always align with the latter mentality, which is dangerous for back compat. If we do that, emulation tools will no longer work, even if they have a 3.1 version of the runtime installed. Single entry shim is sticky here.

@hoyosjs
Member

hoyosjs commented Mar 31, 2022

Closing this one for now as the tfm issue is separate. The hosting changes for runtime detection are in.

@hoyosjs hoyosjs closed this as completed Mar 31, 2022
@am11
Member Author

am11 commented Apr 1, 2022

.NET 5 will be out of support in about a month, and .NET 3.1 in December 2022. It now makes more sense to start targeting the net6.0 TFM for future versions of the packages (that will mitigate the SDK bug or by-design-strange-behavior).

@hoyosjs
Member

hoyosjs commented Apr 1, 2022

We will drop 5.0 in a bit. 3.1 will likely be there for a while nonetheless; it gets a fair amount of use for analysis in prod scenarios.

@ghost ghost locked as resolved and limited conversation to collaborators Jun 27, 2023