[Bug]: Restore corrupts NuGet cache #12047
Comments
@manfred-brands do your builds do anything like assembly signing (of 3rd party assemblies, not just the assembly being compiled), or something else that may write/overwrite files? Alternatively, are your CI machines "stateful" (don't get wiped between every single build)?
NuGet is designed around the principle of immutable packages. This means that if, at any point during development of your packages, you re-use a package version, any machines with the old contents will not refresh (unless someone explicitly deleted the old package contents off the machine). For example, while still in development, if your project changed to version 2.0.0 (rather than using 2.0.0-prerelease.* dynamic versions), and later changed APIs (from 1.0.0 APIs to 2.0.0 APIs), then any machine with the 1.0.0 APIs but the 2.0.0 package version will have the issue you described.
Something else that might help to investigate is if you can check the
I haven't seen any other customers reporting similar issues, and I'm not sure what "mechanism" could theoretically cause NuGet to write a package's contents to the wrong folder. Our code could really use a refactor to reduce duplication (I think one method is used for packages.config, the other duplicate is used by PackageReference), but picking one at random (the first one),
Even if the NuGet server provided a nupkg whose nuspec says version 1.0.0, even though NuGet asked to download version 2.0.0, I believe NuGet will use the version it thinks it downloaded (2.0.0), not the version from the nuspec. If I'm wrong about that, that points to an error on the NuGet server, not the client, although there could be an opportunity here for the client to validate and report a friendly error message, rather than silently doing something unexpected.
In short, without being able to reproduce the error, or having more information, I'm not sure what actions we can take. Perhaps something you could do is to 1. ensure that your build scripts do a restore separately from build, and then 2. write a program that runs after restore, but before build. What the program should validate depends on exactly what's going wrong on your machines. For example, if the nuspec shows that the package was extracted to entirely the wrong directory (directory is 1.0.0, but nuspec says 2.0.0), then the program could search all nuspec files under the global packages directory, parse the package id and version out of the nuspec, and compare them to the full directory path they were found in. However, if the nuspec is correct, but only the contents of a |
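The nuspec check suggested above could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: it assumes the global packages folder is laid out as `<root>/<id>/<version>/<id>.nuspec` with lowercased directory names, and `normalize_version` is a simplified, hypothetical stand-in for NuGet's real version normalization (it only lowercases and drops `+build` metadata). The function names are made up for this sketch.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip the XML namespace, since nuspec schema namespaces vary."""
    return tag.rsplit("}", 1)[-1]


def read_nuspec_identity(nuspec_path):
    """Return (id, version) from the <metadata> element of a nuspec file."""
    root = ET.parse(nuspec_path).getroot()
    for child in root:
        if local_name(child.tag) != "metadata":
            continue
        values = {local_name(e.tag): (e.text or "").strip() for e in child}
        return values.get("id"), values.get("version")
    return None, None


def normalize_version(version):
    """Simplified assumption: lowercase and drop '+build' metadata only."""
    return version.split("+", 1)[0].lower()


def find_misplaced_packages(packages_root):
    """List nuspec files whose id/version disagree with their directory."""
    problems = []
    for nuspec in sorted(Path(packages_root).glob("*/*/*.nuspec")):
        dir_version = nuspec.parent.name
        dir_id = nuspec.parent.parent.name
        pkg_id, version = read_nuspec_identity(nuspec)
        if pkg_id is None or version is None:
            problems.append((nuspec, "unreadable nuspec metadata"))
        elif pkg_id.lower() != dir_id.lower():
            problems.append((nuspec, f"id {pkg_id} found under {dir_id}"))
        elif normalize_version(version) != dir_version.lower():
            problems.append((nuspec, f"version {version} found under {dir_version}"))
    return problems
```

A build script could run `find_misplaced_packages` on the global packages directory after restore and fail the build if the returned list is not empty.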
@zivkan Thanks for your reply. Our builds are about 500 projects targeting .NET Framework 4.8. We use dotnet build, which builds several projects in parallel. All outputs go to a single Binaries folder. This unfortunately means that 3rd party dlls are copied multiple times to that directory, and occasionally we see msbuild retries on those. We also use Central Package Management, with the PackageVersion items in a single Directory.Packages.props and PackageReference in all .csproj files. We have seen this phenomenon more on developer machines. One of the projects gets changed and we either do a dotnet rebuild or a build in VS2022 (using partial loading of the solution, with only a few projects). As a total rebuild takes up to 7 minutes, developers sometimes switch to a 2nd directory and do some work there, e.g. two branches of the same repository checked out to different directories. Those builds are independent, but share the same NuGet cache.
All files except the .dll are from the correct version; this includes the .xml file in the lib\net48 folder. I also looked at the NuGet source code and couldn't find anything there. I will develop that NuGet cache verifier tool. At least when the issue occurs, it will find the offending dll straight away. |
In case you're not aware already, MSBuild (and therefore commands like |
I created the tool and found several "misplaced" files, which the tool then repaired by extracting the correct files from the .nupkg. I will keep an eye on it to see whether the problem reappears. |
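The actual tool is not shown in this thread. As a rough sketch of the idea, the following Python compares each extracted file with the entry of the same name inside the `.nupkg` (a zip archive) stored in the same package folder. It assumes the `.nupkg` sits next to the extracted files and that entry names map directly to relative paths (no URL-encoded names). Entries that were never extracted are skipped. `find_mismatched_files` is a hypothetical name, and unlike the tool described above this sketch only reports mismatches, it does not repair them.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256(data):
    """Hex digest used to compare archive content with extracted content."""
    return hashlib.sha256(data).hexdigest()


def find_mismatched_files(package_dir):
    """Return archive entry names whose extracted copy differs from the nupkg."""
    package_dir = Path(package_dir)
    mismatched = []
    for nupkg in sorted(package_dir.glob("*.nupkg")):
        with zipfile.ZipFile(nupkg) as archive:
            for name in archive.namelist():
                if name.endswith("/"):
                    continue  # directory entry, nothing to compare
                extracted = package_dir / name
                if not extracted.is_file():
                    continue  # entry was not extracted, so skip it
                if sha256(archive.read(name)) != sha256(extracted.read_bytes()):
                    mismatched.append(name)
    return mismatched
```

For a folder such as `<global packages>/<id>/<version>`, a non-empty result would point at files like the offending dll in `lib/net48`.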
It looks like the issue didn't repro for about 3 weeks, so I'm closing it for now, considering no other customer has reported the same problem. |
@manfred-brands Do you use hard links (i.e.: |
@marcin-krystianc Yes, we do use hard links. |
FYI: I discovered that it is not a problem with NuGet itself. It is a problem with MSBuild and the use of hard or symbolic links. I've opened dotnet/msbuild#8273 with a detailed description. |
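To illustrate why hard links can matter here, the Python sketch below demonstrates the general file-system property that a write through one hard-linked name is visible through every other name of the same file. It does not reproduce the MSBuild behavior described in dotnet/msbuild#8273. The directory names `cache` and `bin` and both function names are hypothetical.

```python
import os
from pathlib import Path


def demo_write_through(workdir):
    """Overwrite a hard-linked 'output' file and read back the 'cached' one."""
    cached = Path(workdir) / "cache" / "lib.dll"
    output = Path(workdir) / "bin" / "lib.dll"
    cached.parent.mkdir(parents=True)
    output.parent.mkdir(parents=True)
    cached.write_bytes(b"1.0.0 contents")
    os.link(cached, output)  # both names now refer to the same file
    # Opening for writing truncates the existing file in place, so the
    # change is also visible through the 'cached' name.
    output.write_bytes(b"2.0.0 contents")
    return cached.read_bytes()


def find_hard_linked_files(root):
    """List regular files under root that have more than one hard link."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and not p.is_symlink() and p.stat().st_nlink > 1
    )
```

Running `find_hard_linked_files` on the global packages directory would show which cached files are currently shared with another location through a hard link, and are therefore exposed to in-place overwrites.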
@marcin-krystianc Thanks for finding the real cause. |
NuGet Product Used
dotnet.exe
Product Version
dotnet 6.0.400
Worked before?
Pre dotnet 5.0.0?
Impact
It's more difficult to complete my work
Repro Steps & Context
We have seen several instances of "corrupt" NuGet caches, possibly caused by different parallel dotnet builds requiring different NuGet package versions.
Say one build requires version 1.0.0 and another 2.0.0 of a nuget package.
1.0.0 was previously restored correctly.
We then build a project requiring version 2.0.0, which requires a fresh restore of the new NuGet package version.
That package is restored correctly, but sometimes one of the 2.0.0 dlls, e.g. in lib/net48, somehow makes its way to the 1.0.0 lib/net48 NuGet folder.
If these versions are not compatible, we get compile errors the next time we compile a project that requires 1.0.0.
If they are compatible it might work by stealth and nobody will notice.
We did not see this previously, but now that we strong name our nuget packages, we see unexpected binding redirects.
We use central package management
Verbose Logs
No response