Supporting local SDK deployment in global.json #303

Merged
merged 17 commits into from
May 10, 2024
7 changes: 7 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,7 @@
{
"[markdown]": {
"editor.rulers": [80],
"editor.wordWrap": "bounded",
"editor.wordWrapColumn": 80
Member Author

These settings match with the linting rules in this repo. I'd prefer the linting rules be changed (why 80 characters hard wrap? this is 2023, not 1980 😉 ) but at the moment leaning into previous decisions.

},
}
1 change: 1 addition & 0 deletions INDEX.md
@@ -95,6 +95,7 @@ Use update-index to regenerate it:

|Year|Title|Owners|
|----|-----|------|
| | [Provide SDK hint paths in global.json](proposed/local-sdk-global-json.md) | |
| | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) |
| | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | |
| | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | |
235 changes: 235 additions & 0 deletions proposed/local-sdk-global-json.md
@@ -0,0 +1,235 @@
# Provide SDK hint paths in global.json

## Summary

This proposal adds two new properties to the `sdk` object in
[global.json][global-json-schema]:

```json
{
"sdk": {
"paths": [ ".dotnet", "$host$" ],
"errorMessage": "The .NET SDK could not be found, please run ./install.sh."
Member Author

@jaredpar jaredpar Feb 1, 2024

@agocke chose this name, blame him. 😉

Member

Go explicit with resolutionFailureMessage, maybe? Not a strong opinion, errorMessage is ok by me.

}
}
```

These properties will be considered by the resolver host during .NET SDK
resolution. The `paths` property lists the locations that the resolver should
consider when attempting to locate a compatible .NET SDK. The `errorMessage`
property controls what the resolver displays when it cannot find a compatible
.NET SDK.

This particular configuration would cause the local directory `.dotnet` to be
considered _in addition_ to the current set of locations. Further, if
resolution fails, the resolver would display the contents of `errorMessage`
instead of the default error message.

## Motivation

There is currently a disconnect between the ways the .NET SDK is deployed in
practice and what the host resolver can discover when searching for compatible
SDKs. By default the host resolver is only going to search for SDKs next to
the running `dotnet`. This often means machine-wide locations, since users
and tools typically rely on `dotnet` already being on the user's path when
launching, instead of specifying a full path to the executable. The .NET SDK
though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`,
`$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their
builds into a local `.dotnet` directory.

The behavior of the host resolver is incompatible with local deployments.
Without additional environment variable configuration it searches only next to
the running `dotnet` and will not find these deployments. That means tools
like Visual Studio and VS Code simply do not work with local deployment by
default. Developers must take additional steps like manipulating `%PATH%` before
launching these editors. That reduces the usefulness of items like the quick
Comment on lines +44 to +45
Member

Does manipulating PATH actually work? I thought you had to set the special SDK resolver env vars that then bypass hostfxr.

launch bar, shortcuts, etc.

This is further complicated when developers mix local and machine wide
installations. The host resolver will find the first `dotnet` according to its
lookup rules and search only there for a compatible SDK. Once developers
manipulate `%PATH%` to prefer local SDKs the resolver will stop considering
machine wide SDKs. That can lead to situations where there is a machine wide
SDK that works for a given global.json but the host resolver will not consider
it because the developer set up `%PATH%` to consider a locally installed SDK.
That can be very frustrating for end users.

This disconnect between the resolver and deployment has led to customers
introducing a number of creative workarounds:

- [scripts][example-scripts-razor] to launch VS Code while considering locally
deployed .NET SDKs.
- [docs and scripts][example-scripts-build] to set up the environment and launch
VS so it can find the deployed .NET SDKs.
- [scripts][example-scripts-dotnet] that wrap `dotnet` to find the _correct_
`dotnet` to use during build.

These scripts are not one-offs; they are increasingly common in repos in
`github.com/dotnet` as attempts to fix the disconnect. Even so, many of these
solutions are incomplete because they themselves only consider local deployment.
They don't support the full set of ways the SDK can be deployed.

This problem also manifests in how customers naturally want to use our
development tools like Visual Studio or VS Code. It's felt sharply on the .NET
team, or any external customer who wants to contribute to .NET, due to how
.NET Arcade infrastructure uses xcopy deployment into `.dotnet`. External teams
like Unity also feel this pain in their development:

- This [issue][cases-sdk-issue] from 2017 attempts
to solve this problem. It gets several hits a year from customers who are
similarly struggling with our tooling's inability to handle local deployment.
- This [internal discussion][cases-internal-discussion] from a C# team member.
They wanted to use VS as the product is shipped to customers and got blocked
when we shipped an SDK that didn't have a corresponding MSI and hence VS
couldn't load Roslyn anymore.
- [VS Code][cases-vscode] having to adjust to consider local directories for the
SDK because our resolver can't find them.

## Detailed Design

The global.json file will support two new properties under the `sdk` object:

- `"paths"`: this is a list of paths that the host resolver should
consider when looking for compatible SDKs. In the case this property is `null`
or not specified, the host resolver will behave as it does today.
- `"errorMessage"`: when the host resolver cannot find a compatible .NET SDK it
will display the contents of this property instead of the default error message.
In the case this property is `null` or not specified, the default error message
will be displayed.
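
As a rough illustration of the defaulting rules in the list above, here is a
minimal Python sketch. It is not the host resolver's implementation (the real
logic lives in native host code); `read_sdk_settings` and `DEFAULT_ERROR` are
hypothetical names, and the default message text is a placeholder rather than
the message the host actually prints.

```python
import json

# Placeholder only: NOT the real default message printed by the host.
DEFAULT_ERROR = "A compatible .NET SDK was not found."


def read_sdk_settings(global_json_text):
    """Return (paths, error_message) from the text of a global.json file.

    paths is None when the property is missing or null, which stands for
    "behave as the resolver does today". error_message falls back to
    DEFAULT_ERROR when errorMessage is missing or null.
    """
    sdk = json.loads(global_json_text).get("sdk") or {}
    paths = sdk.get("paths")
    message = sdk.get("errorMessage")
    return paths, (message if message is not None else DEFAULT_ERROR)
```

Under these assumptions, a global.json without an `sdk` object and one with
`"paths": null` both yield `None` for `paths`, so a caller can treat the two
cases identically.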

The values in the `paths` property can be a relative path, absolute path or
`$host$`. When a relative path is used it will be resolved relative to the
location of the containing global.json. The value `$host$` is a special value
that represents the machine wide installation path of .NET SDK for the
Member

Actually $host$ is the path of the current dotnet.exe that was the entry point. This exe can be anywhere -- globally installed, locally installed, current directory, etc.

Member

$host$ would mean "the .NET 8 and earlier behavior", right?

Member

From .NET 7 and later, the host only looks next to the running dotnet.exe. I don't think we want anything representing the previous (.NET 6 and below, only different on Windows) behaviour.

[current host][installation-doc].
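
The three kinds of `paths` entries could be expanded along these lines. This is
a sketch in Python, not the resolver's code: `expand_sdk_paths` and its
`host_dir` parameter are hypothetical, and `host_dir` simply stands for
whatever directory `$host$` ends up denoting.

```python
import os


def expand_sdk_paths(paths, global_json_path, host_dir):
    """Expand global.json `paths` entries into concrete directories.

    Relative entries are resolved against the directory that contains
    global.json, absolute entries are kept, and "$host$" is replaced by
    the caller-supplied host_dir (an assumption of this sketch).
    """
    base = os.path.dirname(os.path.abspath(global_json_path))
    expanded = []
    for entry in paths:
        if entry == "$host$":
            expanded.append(host_dir)
        else:
            # os.path.join discards `base` when `entry` is already absolute.
            expanded.append(os.path.normpath(os.path.join(base, entry)))
    return expanded
```

Resolving against the global.json location rather than the current working
directory means the same entry gives the same result no matter where in the
repository a command is launched from.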

The values in `paths` are considered in the order they are defined. The host
resolver will stop when it finds the first path with a compatible .NET SDK.
For example:

```json
{
  "sdk": {
    "paths": [ ".dotnet", "$host$" ]
  }
}
```

In this configuration the host resolver would find a compatible .NET SDK if one
exists in `.dotnet` or a machine wide location.

This lookup will stop on the first match which means it won't necessarily find
the best match. Consider a scenario with a global.json that has:

```json
{
  "sdk": {
    "paths": [ ".dotnet", "$host$" ],
    "version": "7.0.200",
    "rollForward": "latestFeature"
  }
}
```

In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a
machine wide install of 7.0.300 SDK, the host resolver would pick 7.0.200 out
of `.dotnet`. That location is considered first, it has a matching .NET SDK and
hence discovery stops there.
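
The scenario above can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the resolver's algorithm:
`resolve_first_match` and `compatible` are hypothetical helpers, only stable
three-part versions are handled, and `compatible` is a simplified stand-in for
`rollForward: latestFeature` (same major and minor, third component at least
the requested one).

```python
def parse(version):
    """Split a stable version such as "7.0.200" into a tuple of ints."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def compatible(installed, requested):
    """Simplified stand-in for rollForward=latestFeature (an assumption)."""
    i, r = parse(installed), parse(requested)
    return i[:2] == r[:2] and i[2] >= r[2]


def resolve_first_match(locations, requested):
    """Return (location, version) from the first location with a match.

    `locations` is a list of (name, installed_versions) pairs in the same
    order as `paths`. Later locations are never inspected once an earlier
    one has a compatible SDK, even if they hold a newer one.
    """
    for location, installed in locations:
        matches = [v for v in installed if compatible(v, requested)]
        if matches:
            return location, max(matches, key=parse)
    return None


locations = [(".dotnet", ["7.0.200"]), ("$host$", ["7.0.300"])]
print(resolve_first_match(locations, "7.0.200"))  # ('.dotnet', '7.0.200')
```

If `.dotnet` held only a 6.0 SDK, the loop would move on and return the 7.0.300
SDK from the second location.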

This design requires us to only change the host resolver. That means other
tooling like Visual Studio, VS Code, MSBuild, etc. would transparently
Member

I don't know whether or not this is true. @elinor-fung do you happen to know what entry points this change would affect and whether our dev tools would use the same entry points?

Member

I'll leave the detailed analysis to @elinor-fung , but I think it will work out that way. The changes will be in hostfxr and we've been advocating for all of the other tools to use hostfxr to locate the SDK. Specifically MSBuildLocator does use hostfxr. That said, I think it would be good to include testing of some of the obvious use cases as part of the feature work on this (we've been surprised in the past unfortunately).

Member

Additionally, the fact this lives in hostfxr will mean that its versioning behavior will be somewhat "unexpected". As long as one has the .NET 9 version with this change installed, it will work regardless of the SDK version chosen in the global.json because we always use the latest hostfxr available in a given installation. This includes previews, so just installing a preview of .NET 9 will make this work basically everywhere on the machine.

Member

MSBuildLocator using hostfxr will give this behavior for VS Code/DevKit scenarios.

For VS proper it's more complicated: VS will only ever use the copy of MSBuild that it distributes, but the .NET SDK is located by calling hostfxr, so this proposal should work fine there too--it's just a nicer way to get the same mix of components you can get today (with a global.json + global SDK install, or setting environment variables to tell the resolver to use a private SDK).

MSBuild.exe is the same as VS in this respect (you pick an MSBuild.exe to run, then it loads its own copies of things, but calls the .NET SDK resolver to find the SDK).

There are surely some build-ish tools that exist that don't call hostfxr (forks from older Locators or reimplementations), but I can't see this making anything worse for them, and I don't know of any offhand.

graph LR

subgraph entrypoints
    MSBuild.exe[<pre>MSBuild.exe</pre>]
    VS[Visual Studio]
    DevKit
    cli[<pre>dotnet</pre> CLI]
    rando[other apps]
end

hostfxr

sdk[.NET SDK Resolver]

VS --> sdk

MSBuild.exe --> msbenv[/<pre>MSBuildSDKsPath</pre> env var/]
MSBuild.exe --> sdk

sdk --> hostfxr
sdk --> env[/<pre>DOTNET_MSBUILD_SDK_RESOLVER_SDKS_DIR</pre> env var/]

cli --> hostfxr

DevKit --> MSBuildLocator --> hostfxr

rando --> MSBuildLocator

rando --> oldLocator[MSBuildLocator vOld] --> info[<pre>dotnet --info</pre>] --> hostfxr

rando --> info

Member

I will take this to mean VS will partially work, VS Code will work fully. As a VS Code user, I am fine with this 😄

Member

VS will work the way it always was - even today if you set PATH and DOTNET_ROOT which is what the various scripts in repos do, before starting VS, VS will still run the .NET Framework version of MSBuild it ships with for design builds and the other one for build. But it should still use hostfxr to find the SDK to use for the given app. So this change should make it unnecessary to use the special scripts - which is probably 80% of the value of this feature :-)
That is the one scenario we definitely need to test when developing this feature.

Member

...including the out-of-proc scenarios (e.g., the Windows Forms designers). /cc: @merriemcgaw

Member

Yes, "the way it always has" is key here. All of the friction with a VS/SDK mismatch that you can get with this proposal you can get today with environment variables or a standalone SDK install + global.json.

> VS will still run the .NET Framework version of MSBuild it ships with for design builds and the other one for build.

VS uses the .NET Framework version of MSBuild it ships with, period, for evaluation and for all builds including design time and F5 builds.

Member

> VS uses the .NET Framework version of MSBuild it ships with, period, for evaluation and for all builds including design time and F5 builds.

Thanks for the explanation!

benefit from this change. Repositories could update global.json to have
`paths` support `.dotnet` and Visual Studio would automatically find it without
any design changes.

## Considerations

### Installation Points

One item to keep in mind when considering this area is the .NET SDK can be
installed in many locations. The most common are:

- Machine wide
- User wide: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet`
on Linux/macOS.
- Repo: `.dotnet`

Our installation tooling tends to avoid redundant installations. For example, if
restoring a repository that requires 7.0.400, the tooling will not install it
locally if 7.0.400 is installed machine wide. It also will not necessarily
delete the local `.dotnet` folder or the user wide folder. That means developers
end up with .NET SDK installs in all three locations but only the machine wide
install has the correct ones.

As a result solutions like "just use .dotnet, if it exists" fall short. It will
work in a lot of cases but will fail in more complex scenarios. To completely
close the disconnect here we need to consider all the possible locations.

### Best match or first match?

This proposal is designed to give global.json more control over how SDKs are
found. If the global.json asked for a specific path to be considered and it has
a matching SDK but a different SDK was chosen, that seems counterintuitive,
even in the case where the chosen SDK was _better_. This is a motivating
scenario for CI where certainty around the SDK is often more desirable than
_better_. This is why the host discovery stops at first match vs. looking at
all locations and choosing the best match.

Best match is a valid approach though, and some customers would certainly want
it. It cuts against the proposal somewhat because it devalues `paths`. If the
resolver is switched to best match, then the need for configuration around best
versus first match is much stronger. There would certainly be a customer
segment that wanted to isolate from machine state in that case.
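
To make the trade-off concrete, the following Python sketch contrasts the two
policies on the same input. It is illustrative only: `first_match` and
`best_match` are hypothetical helpers, and `candidates` is assumed to already
contain just the compatible SDKs, listed in `paths` order.

```python
def version_key(version):
    """Turn "7.0.300" into (7, 0, 300) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def first_match(candidates):
    """Policy in this proposal: take the SDK from the earliest location."""
    return candidates[0] if candidates else None


def best_match(candidates):
    """Alternative policy: take the highest version across all locations.

    max() keeps the earliest entry on ties, so equal versions still
    resolve to the location listed first in `paths`.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: version_key(c[1]))


candidates = [(".dotnet", "7.0.200"), ("$host$", "7.0.300")]
print(first_match(candidates))  # ('.dotnet', '7.0.200')
print(best_match(candidates))   # ('$host$', '7.0.300')
```

With first match the result depends only on what the repository restored into
`.dotnet`; with best match it changes when a newer SDK is installed elsewhere
on the machine.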

### dotnet exec

This proposal only impacts how .NET SDK commands do runtime discovery. The
command `dotnet exec` is not a .NET SDK command but instead a way to invoke
the app directly using the runtime installed with `dotnet`.

It is reasonable for complex builds to build and use small tools. For example
building a tool for linting the build, running complex validation, etc ... To
Member

Suggested change
building a tool for linting the build, running complex validation, etc ... To
building a tool for linting the build, running complex validation, etc. To

work with local SDK discovery these builds need to leverage `dotnet run` to
execute such tools instead of `dotnet exec`.

```cmd
# Avoid
> dotnet exec artifacts/bin/MyTool/Release/net8.0/MyTool.dll
# Prefer
> dotnet run --no-build --configuration Release --framework net8.0 --project src/Tools/MyTool/MyTool.csproj
```

### Environment variables

Previous versions of this proposal included support for using environment
variables inside `paths`. This was removed due to lack of motivating
scenarios and potential for creating user confusion as different machines can
reasonably have different environment variables.

This could be reconsidered if motivating scenarios are found.

Member

Additional things worth mentioning:

  • This will not affect the runtime resolution when executing dotnet exec app.dll
    • Maybe not a big deal - depends on the use cases - potentially we could introduce something like dotnet exec --use-sdk-locations app.dll or something like that to ask runtime resolution to also consider global.json

Member

  • Compatibility considerations - this relies on the dotnet.exe which is on the user's path to be able to run the SDK specified in the global.json. This is not a big concern for the "Fallback to machine wide" since that will typically be the dotnet.exe used, but it is a concern for repo-local installs. This can lead to things like:
    • 8.0 dotnet.exe is used to run 9.0.0-preview... SDKs (so we have to be really strict about forward compatibility)
    • 8.0 dotnet.exe is used to run 3.1.0 SDK (so we have to be really strict about backward compatibility)
      None of this is exactly new, but specifically the forward compatibility doesn't have a common use case currently (it can happen, but it's very rare).

Member

The forward compatibility is somewhat concerning to me. Do we have a list of API dependencies that the muxer needs? And a strategy to ensure that those APIs don't change in the future?

Member

I could get you the list - it's VERY short (I think 2 APIs or maybe even just 1).
We are very aware of this requirement and we have never touched those APIs. So I'm not super concerned about this, but it does raise the importance of keeping it compatible: so far all the cases where it can happen are pretty rare, and this would be basically the first real feature which makes use of it.

Member

I'd like to see the "doesn't affect dotnet exec app.dll" part called out explicitly. Naively I would have expected it to, though I think it's ok that it doesn't.

Member

@MarcoRossignoli MarcoRossignoli Feb 5, 2024

Correct we set VSTEST_WINAPPHOST_* vars and we use it in case of apphost usage(testhost.exe) https://github.com/microsoft/vstest/blob/main/src/Microsoft.TestPlatform.TestHostProvider/Hosting/DotnetTestHostManager.cs#L488

Before the --arch support we used the default naming DOTNET_ROOT_* but architecture switch uses these and so we had to "pass" the sdk root information in a different one.

Architecture switch uses the DOTNET_ROOT_* to look for the correct muxer that can be set by users and can be different than the sdk one https://github.com/microsoft/vstest/blob/main/src/Microsoft.TestPlatform.CoreUtilities/Helpers/DotnetHostHelper.cs#L153

@nohwnd something to add?

Member Author

> I'd like to see the "doesn't affect dotnet exec app.dll" part called out explicitly. Naively I would have expected it to, though I think it's ok that it doesn't.

I agree, that tripped me up too, so I will call it out. It materially impacts the scenarios here because it is common in builds to effectively build a .NET Core based tool then launch it with dotnet exec. That is done many times in the Roslyn build.

At the same time it seems like we can continue this with a small tweak. Just need to flip to using dotnet run instead of dotnet exec in our scripts. That fixes other problems like having to hard code in the location of the built binary.

Member Author

Other SDK commands set DOTNET_HOST_PATH based on the running dotnet. Tools running as part of an SDK command look at its value.

Want to make sure that when we run dotnet build that DOTNET_HOST_PATH gets set to the resolved runtime / SDK. Example it would get set to the app local .NET SDK. Pretty sure based on reading that is the case but wanted to double check.

Member Author

@jaredpar jaredpar Feb 6, 2024

Want to elaborate on DOTNET_HOST_PATH a bit. Let's assume I have the following:

  • dotnet.exe is installed at C:\Program Files\dotnet\dotnet.exe and is on %PATH%. This location includes a 7.0.400 .NET SDK install
  • The current directory is c:\source and has the following:
    • .dotnet which is a directory that has 8.0.100 .NET SDK installed
    • global.json that requires 8.0.100 .NET SDK and has path: ["$host$", ".dotnet" ].

Now in the current directory I run dotnet build inside c:\source. That will resolve to c:\Program Files\dotnet\dotnet.exe since it's on %PATH% but it will use the 8.0.100 SDK from c:\source\.dotnet because the machine wide installation does not work.

What will the value of %DOTNET_HOST_PATH% be inside of msbuild here? That needs to be c:\source\.dotnet\dotnet.exe otherwise this proposal falls apart. The msbuild environment depends on this to dotnet exec tools inside of it so it must be a dotnet.exe that will have access to the 8.0.100 runtime.

@elinor-fung, @agocke, @rainersigwald, @baronfel

Member

> What will the value of %DOTNET_HOST_PATH% be inside of msbuild here? That needs to be c:\source\.dotnet\dotnet.exe otherwise this proposal falls apart.

As part of the implementation of this proposal, I think it should become c:\source\.dotnet\dotnet.exe. I think the SDK commands that currently set some variable to 'the running dotnet' intend that value to represent 'the dotnet for the running SDK', so they would need to be updated.

### Other Designs

[This is a proposal][designs-other] similar in nature to this one. There are a
few differences:

1. This proposal is more configurable and supports all standard local
Member

True, although those installations won't explicitly be checked -- the muxer will simply look for the host relative to itself, which will work if the correct entry point is chosen (e.g., the user's PATH is correctly chosen to prefer the installation they want).

installation points, not just the `.dotnet` variant.
2. This proposal doesn't change what SDK is chosen: the rules for global.json
on what SDKs are allowed still apply. It simply changes the locations where the
SDK is looked for.
3. No consideration for changing the command line. This is completely driven
through global.json changes.

Otherwise the proposals are very similar in nature.

[global-json-schema]: https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#globaljson-schema
[example-scripts-razor]: https://github.com/dotnet/razor/pull/9550
[example-scripts-build]: https://github.com/dotnet/sdk/blob/518c60dbe98b51193b3a9ad9fc44e055e6e10fa0/documentation/project-docs/developer-guide.md?plain=1#L38
[example-scripts-dotnet]: https://github.com/dotnet/runtime/blob/main/dotnet.cmd
[cases-sdk-issue]: https://github.com/dotnet/sdk/issues/8254
[cases-internal-discussion]: https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961
[cases-vscode]: https://github.com/dotnet/vscode-csharp/issues/6471
[designs-other]: https://github.com/dotnet/designs/blob/main/accepted/2022/version-selection.md#local-dotnet
[installation-doc]: https://github.com/dotnet/designs/blob/main/accepted/2021/install-location-per-architecture.md