Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usage of hard or symbolic linking leads to NuGet cache corruption #8273

Closed
marcin-krystianc opened this issue Dec 30, 2022 · 15 comments · Fixed by #8275 or #8685
Closed

Usage of hard or symbolic linking leads to NuGet cache corruption #8273

marcin-krystianc opened this issue Dec 30, 2022 · 15 comments · Fixed by #8275 or #8685
Labels
bug needs-triage Have yet to determine what bucket this goes in.

Comments

@marcin-krystianc
Copy link
Contributor

Issue Description

MSBuild can use hard or symbolic links to avoid excessive file copies on disk. It is a good method to speed-up builds but at the same time, it is very easy to silently corrupt the NuGet cache.
I have seen that this issue has been mentioned several times already, but it was never explained how and when exactly the NuGet cache gets corrupted:

Related issues:

Steps to Reproduce

  1. Build an executable project with hard or symbolic links enabled
  2. Update the version of referenced NuGet package
  3. Build the application again, but this time, without using hard/symbolic links.
    MSBuild will try to update dependencies in the application's output folder. Sadly, instead of replacing existing links, it replaces actual files in NuGet cache (thus corrupting it).
  • hard-links
dotnet nuget locals --clear all
dotnet new console
dotnet add package newtonsoft.json -v 13.0.1
dotnet build /p:CreateHardLinksForCopyLocalIfPossible=true
dotnet add package newtonsoft.json -v 13.0.2
dotnet build
  • symbolic-links
dotnet nuget locals --clear all
dotnet new console
dotnet add package newtonsoft.json -v 13.0.1
dotnet build /p:CreateSymbolicLinksForCopyLocalIfPossible=true
dotnet add package newtonsoft.json -v 13.0.2
dotnet build

In both cases file newtonsoft.json\13.0.1\lib\netstandard2.0\Newtonsoft.Json.dll is silently replaced with newtonsoft.json\13.0.2\lib\net6.0\Newtonsoft.Json.dll:

Expected Behavior

Files in the NuGet cache remain untouched.

Actual Behavior

Files in a NuGet package are silently replaced with files from another version.

Analysis

Both Windows and Linux systems are affected, also it doesn't matter whether hard or symbolic links are used. The problem is that File.Copy operation, instead of replacing the link, replaces the file that the link is linking to. To safely replace a link with a different file or link, File.Delete needs to be called first. Unfortunately, MSBuild calls File.Delete only when the usage of hard or symbolic links is requested. When the build doesn't use hard or symbolic links, then the File.Delete is not called.

Versions & Configurations

@marcin-krystianc marcin-krystianc added bug needs-triage Have yet to determine what bucket this goes in. labels Dec 30, 2022
JaynieBai pushed a commit that referenced this issue Mar 15, 2023
Fixes #8273

Context
Prevent overwriting the source of a hard/symbolic link.

Changes Made
Always delete the destination file (unless readonly and OverwriteReadOnlyFiles is false)

Testing
Added unit test.
JanKrivanek pushed a commit to JanKrivanek/msbuild that referenced this issue Mar 27, 2023
Fixes dotnet#8273

Context
Prevent overwriting the source of a hard/symbolic link.

Changes Made
Always delete the destination file (unless readonly and OverwriteReadOnlyFiles is false)

Testing
Added unit test.
@rainersigwald
Copy link
Member

Reopening since the change was backed out due to causing #8684.

@rainersigwald rainersigwald reopened this Apr 25, 2023
@marcin-krystianc
Copy link
Contributor Author

Hi @manfred-brands,

we are very interested in fixing the problem with NuGet cache corruption as soon as possible. Since #8275 had to be reverted, I wanted to ask whether you plan to work on a fix? If not, then I'm more than happy to do it.

@manfred-brands
Copy link
Contributor

manfred-brands commented May 4, 2023

Hi @marcin-krystianc,

This is the first time I heard it was reverted. I didn't get any GitHub notifications. I now see the item.mentioned above.

@manfred-brands
Copy link
Contributor

I could add an 'is same file' check in the copy code, but shouldn't msbuild not detect that itself before even calling Copy? The destination file exists and has the same timestamp.

@marcin-krystianc
Copy link
Contributor Author

I didn't look into details yet, but the #8684 @Forgind explains that msbuild checks whether paths are equal before running the Copy operation, but the checks are on relative paths not on the full paths.

@manfred-brands
Copy link
Contributor

manfred-brands commented May 4, 2023

I made a small test program, that emulates a copy to oneself to see how File.Copy behaves when this is done.

I expected one of three outcomes:

  1. An exception
  2. An empty destination file (as it is truncated before copying)
  3. It works as the System recognizes this.
File.Copy("Program.cs", "Program.cs", true);

It throws:

Unhandled exception. System.IO.IOException: The process cannot access the file 'C:\Users\m.brands\source\repos\CopyToSelf\Program.cs' because it is being used by another process.
   at System.IO.FileSystem.CopyFile(String sourceFullPath, String destFullPath, Boolean overwrite)
   at Program.<Main>$(String[] args) in C:\Users\m.brands\source\repos\CopyToSelf\Program.cs:line 4

This would mean that in the current msbuild, this situation already throws an exception.
In the current build this exception is handled in DoCopyWithRetries:

                            // if this was just because the source and destination files are the
                            // same file, that's not a failure.
                            // Note -- we check this exceptional case here, not before the copy, for perf.
                            if (PathsAreIdentical(sourceFileState.Name, destinationFileState.Name))
                            {
                                return true;
                            }

This states that Path.GetFullPath is too expensive to call and we wait for an exception in those few cases where it matters.
With the need to File.Delete first, do you see any other option than having to do this check before trying to delete?

@manfred-brands
Copy link
Contributor

My proposed change is:

            // If the destination file is a hard or symbolic link, File.Copy would overwrite the source.
            // To prevent this, we need to delete the existing entry before we Copy or create a link.
            // We could try to figure out if the file is a link, but I can't think of a reason to not simply delete it always.
            if (ChangeWaves.AreFeaturesEnabled(ChangeWaves.Wave17_6) && destinationFileState.FileExists && !destinationFileState.IsReadOnly)
            {
                if (PathsAreIdentical(sourceFileState.Name, destinationFileState.Name))
                {
                    return true;
                }

                FileUtilities.DeleteNoThrow(destinationFileState.Name);
            }

@Forgind
Copy link
Member

Forgind commented May 4, 2023

Great discussion here. The primary issue I've been wrestling with is how to accomplish this kind of check without regressing perf in the normal case in which you are not trying to copy a file on top of itself. As marcin-krystianc said, I noticed that MSBuild checks if the paths are the same, but it does no normalization, which means if you have an extra .\ in one of the paths, it thinks they're different. (More relevantly, it doesn't notice if you have one absolute path and one relative path.) As manfred-brands said, we don't do the full normalization there currently to avoid expensive operations when not necessary.

I have a draft PR out that roughly aligns with manfred-brands's proposal: #8685. I think it works, both as far as not corrupting the NuGet cache and as far as not deleting a file copied onto itself, and I'm currently learning towards taking it. The only case in which I believe performance is regressed is if you are creating a hard/symlink, and it should be regressed by as much time as it takes to make a path into a full path. Especially since we only recently fixed our symlink creation behavior, I think that's acceptable.

@manfred-brands
Copy link
Contributor

manfred-brands commented May 5, 2023

@Forgind The alternative I had was to replace the string.Equals on the specified paths in DoCopyIfNecessary with a PathsAreIdentical.
Is Path.GetFullPath that expensive?

I see that is what you did in #8685.

@danmoseley
Copy link
Member

@JeremyKuhne GetFullPath is still significantly expensive, right? Although perhaps better than before, it will still hit the disk in some cases, right?

@manfred-brands
Copy link
Contributor

It looks like the .net runtime on Windows calls into a system call GetFullPathW which has a remark that it does not check if the file exists. It should only do path magic: prefix with current directory and remove unnecessary directory up and down: `A/B/../C' becomes 'A/C'.

On Linux, GetFullPath does a Combine(GetCwd(), path) and then calls PathInternal.RemoveRelativeSegments.

I'm surprised at the amount of Window specific (directly into OS) calls in MSBuild for the file system access.

@KalleOlaviNiemitalo
Copy link

`A/B/../C' becomes 'A/C'.

That is not always correct if A/B is a symbolic link.

@manfred-brands
Copy link
Contributor

`A/B/../C' becomes 'A/C'.

That is not always correct if A/B is a symbolic link.

Except that the dotnet code for Unix doesn't seem to do that:

// We would ideally use realpath to do this, but it resolves symlinks and requires that the file actually exist.
string collapsedString = PathInternal.RemoveRelativeSegments(path, PathInternal.GetRootLength(path));

Nor does the GetFullPath method on Windows.
Given subdirectories "a", "a/aa", "a/aa/aaa" and symbolic link "b" pointing to "aa/aaa":
"a/b/../c" when following the links would be "a/aa/c" but GetFullPath always returns "a/c"

I also doubt that a user would expect that "a/b/../c" suddenly becomes "a/aa/c"

@JeremyKuhne
Copy link
Member

GetFullPath is still significantly expensive, right? Although perhaps better than before, it will still hit the disk in some cases, right?

@danmoseley On Windows it typically is just a string parsing/manipulation routine. It will get the current directory and possibly environment variables (for drive relative paths), but those are not disk based. There is some sort of check regarding legacy device names PRN, CON, etc. but I don't recall precisely what it does off the top of my head (check for device availability for serial ports maybe?).

On .NET Framework (and early .NET Core versions) Path.GetFullPath was significantly more expensive as we would try to parse the path and check for validity up front (before passing it to the OS). In 4.6.2 I ported back some of the changes which improves things dramatically.

@Forgind
Copy link
Member

Forgind commented May 5, 2023

@Forgind The alternative I had was to replace the string.Equals on the specified paths in DoCopyIfNecessary with a PathsAreIdentical. Is Path.GetFullPath that expensive?

I see that is what you did in #8685.

Right; I couldn't think of another way to validate that the paths really are the same, but I moved the check a little earlier and tried to unify some of the computation to minimize the impact in most cases. The only case where I think perf regresses is if you try to create a hard/symlink, and that's minor, per JeremyKuhne's comment, unless someone is using an older runtime. I undrafted the PR; from internal discussion, it sounds like we're leaning towards taking it, so that should hopefully resolve the writing-through-symlink problem without reintroducing the copy-onto-self-means-delete problem.

Forgind added a commit that referenced this issue May 5, 2023
Fixes #8684
Fixes #8273

Context
After #8275, we delete any destination file as part of the Copy task if we determine that we really should copy onto it. Unfortunately, if we try to copy a file onto itself, we delete it before we can copy onto itself, which just means it's gone. Fortunately, we have a check earlier that ensures that we skip any copy operation from a location to the same location. Unfortunately, it's a direct string comparison that doesn't evaluate to full paths first, so it misses slightly more complicated examples.

Changes Made
Take into account full paths

Testing
Unit tests + manual test that it doesn't delete the file anymore

Notes
This implementation tries to remove now-unnecessary full path derivations downstream, hence some added complexity, but it still means extra computation on the happy path if we end up creating a hard/symbolic link. An alternate direction eliminating any full path derivations on the happy path would save about 4% of Copy's execution time, per a quick perf test. (With how many samples I used, "no change" is within a standard deviation.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug needs-triage Have yet to determine what bucket this goes in.
Projects
None yet
7 participants