Enable NuGet Pack to be deterministic #6229
Comments
FYI @jaredpar @jinujoseph
What time should be used? For example, when pack creates a new nupkg with a new nuspec, are you saying that shouldn't use the current time? Would a time need to be passed into pack from the user? How does the compiler solve this?
@emgarten I'm not sure what the right answer is for nuget.
Re Roslyn: We decided to update the PE format specification to (a) allow the time-stamp to be an arbitrary number and (b) add a record to the debug directory that indicates the time-stamp is not actually a time, so that new tools can handle the value accordingly. We hash the content of the PE file using SHA1. That gives us 20 bytes of deterministic hash derived (indirectly) from the inputs. We use 4 bytes for the time-stamp and 16 bytes for the MVID (module version guid) that is also emitted to a managed assembly. The reason why we use 4B of the SHA1 of the content for the time-stamp (and not e.g. 0) is that some (legacy) tools use the time-stamp, the module name and the file size to "uniquely" id a PE file.
I'm not sure what values for the time stamp the zip archive spec exactly allows and what the time-stamp is actually used for in practice. If it can be set to the min allowed value and nothing breaks then we can just do that. If anything breaks that we care about then having
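The hash split described in the Roslyn comment above can be sketched roughly as follows. This is an illustration in Python, not Roslyn's implementation: the function name `split_content_hash`, the choice of which 16 bytes feed the GUID and which 4 feed the time-stamp, and the little-endian byte order are all assumptions made for the example.

```python
import hashlib
import uuid
from typing import Tuple


def split_content_hash(content: bytes) -> Tuple[int, uuid.UUID]:
    """Derive a 32-bit pseudo time-stamp and a 16-byte id from a SHA-1 of the content.

    Hypothetical helper: the byte layout is an assumption, not Roslyn's actual layout.
    """
    digest = hashlib.sha1(content).digest()  # SHA-1 always yields 20 bytes
    guid_bytes, stamp_bytes = digest[:16], digest[16:]
    timestamp = int.from_bytes(stamp_bytes, "little")  # fits in 4 bytes
    module_id = uuid.UUID(bytes=guid_bytes)
    return timestamp, module_id


if __name__ == "__main__":
    stamp, module_id = split_content_hash(b"example PE content")
    print(hex(stamp), module_id)
```

Because both values depend only on the content, building the same inputs twice yields the same time-stamp and id, while a tool that keys on time-stamp, name and size still sees a value that changes when the content changes.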
Re passing time stamp into pack task -- possible but it's just shifting the burden to the user, it doesn't really solve anything. I do not like doing that since most users do not want to spend time reading specifications and wondering what might break if they pass in whatever value they think makes sense.
I'm optimistic that 0 can be used for the timestamp here. Roslyn had to abandon it due to some internal legacy tools which were looking at timestamps specifically to make sure they weren't 0. I doubt NuPkg will have that problem.
@rohit21agrawal deterministic, reproducible builds. This is the ability to take a given source tree and build it consistently from machine to machine. More detailed info is available here: https://reproducible-builds.org/ This is a practice that is becoming standard, and soon to be required, in most Linux distros. Hence it will apply to our source build efforts at some point in the future. In addition to the correctness benefits it provides, it also allows build systems to engage in optimizations like build output caching. Note: even though it's not required at this point, all of the tools involved with our build chain today support determinism: C#, VB and F# compilers, MSBuild, etc.
@jaredpar For reproducible builds you also need the exact list of packages to restore; the deterministic build is a step after that. A common strategy in package managers (ruby bundler, js npm5/yarn, etc.) is having a real lock file https://fsprojects.github.io/Paket/lock-file.html to write down the list of packages used at compile time.
@enricosada agree. To have a reproducible build all aspects of the build must be deterministic. For this particular issue though we're focusing on the details of the
Has there been any more discussion on this? It would be useful to have reproducible builds.
Current builds could work around the zip timestamps by modifying them after running
I've been packing reproducible zip files in some of my builds by setting the timestamps to
I wouldn't be surprised if changing the zip timestamps in the .nupkg file didn't have an effect on package managers because the timestamps are independent from the PE timestamp, but I don't actually know if NuGet uses those timestamps for anything.
@Thealexbarney, how are you setting the timestamps?
Probably want to support https://reproducible-builds.org/docs/source-date-epoch/
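As a rough idea of what honoring SOURCE_DATE_EPOCH could look like, here is a small Python sketch. It is not NuGet code: the helper name `resolve_build_time` and the fallback to the earliest zip date when the variable is unset are assumptions for illustration. The clamp to 1980-01-01 reflects the MS-DOS date format used in zip entries, which cannot represent earlier dates.

```python
import os
from datetime import datetime, timezone

# Earliest date representable by the MS-DOS time-stamp stored in zip entries.
ZIP_EPOCH = datetime(1980, 1, 1, tzinfo=timezone.utc)


def resolve_build_time(environ=None):
    """Return a deterministic UTC time-stamp for archive entries.

    Hypothetical helper: reads SOURCE_DATE_EPOCH (seconds since the Unix epoch)
    and falls back to ZIP_EPOCH when it is not set (an assumed policy).
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return ZIP_EPOCH
    stamp = datetime.fromtimestamp(int(raw), tz=timezone.utc)
    return max(stamp, ZIP_EPOCH)  # clamp values the zip format cannot store


if __name__ == "__main__":
    print(resolve_build_time({"SOURCE_DATE_EPOCH": "1500000000"}))
```

The point of the variable is that the build environment, not the tool, chooses one time-stamp for everything, so the user does not have to reason about what each tool might break on.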
@ctaggart It's pretty hacky, but here's an example of making NuGet packages deterministic for simple projects https://github.com/Thealexbarney/LibHac/blob/master/build/RepackNuget.cs
Thanks @Thealexbarney & @tmat. I took a crack at a pull request: NuGet/NuGet.Client#2775
This is the first github.com link that comes up when searching Bing for "deterministic nuget packages", so I'm going to add a link here for documentation. @clairenovotny tweeted this example, which is great! https://github.com/clairernovotny/DeterministicBuilds
Same reasoning as above: this is the first github.com link when Google searching "deterministic nuget pack", so here's a link to the latest status #8601
Background
Some environments require builds to be deterministic, meaning that each tool involved in the build process must produce exactly the same output given the same inputs. Furthermore, the build should not depend on the ambient state of the environment, such as the current time, a global random number generator state, the name of the machine the build is running on, the root directory the repository is built from, etc.
For example, Roslyn compilers support deterministic builds by implementing the /deterministic switch. This is also supported by the csc/vbc msbuild tasks via the property Deterministic.
Issue
NuGet pack is not deterministic in the above sense. I identified several non-deterministic properties, but there might be more:
The ZipArchive.CreateEntry API is used for creating zip parts and the call is not followed by setting ZipArchiveEntry.LastWriteTime to a deterministic value. The CreateEntry API initializes this property to the current time. As a result the current time is written to the .nupkg for some parts:
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L556
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L770
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L797
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L839
PhysicalPackageFile uses File.GetLastWriteTimeUtc or DateTimeOffset.UtcNow to determine what value to assign to ZipArchiveEntry.LastWriteTime:
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PhysicalPackageFile.cs#L82
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PhysicalPackageFile.cs#L87
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L576
If this is a required feature there should be a switch to disable it: e.g. adding a nuget.exe command line argument /deterministic and respecting the msbuild property Deterministic in PackTask.
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L340
https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Packaging/PackageCreation/Authoring/PackageBuilder.cs#L876
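To make the zip time-stamp problem from the issue description concrete, here is a minimal Python sketch of writing an archive whose bytes do not depend on the clock or on the order files are supplied in. It is an illustration only, not NuGet's code (which uses the .NET ZipArchive API); the helper name `write_deterministic_zip`, the fixed 1980-01-01 time-stamp, the sorted entry order, and the fixed permission bits are assumptions chosen for the example.

```python
import io
import zipfile

# Fixed time-stamp for every entry (assumed choice: earliest date zip can store).
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def write_deterministic_zip(files: dict) -> bytes:
    """Build a zip in memory from {entry name: bytes} with fixed entry metadata.

    Hypothetical helper for illustration.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # stable order, independent of the caller
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permission bits
            archive.writestr(info, files[name])
    return buffer.getvalue()


if __name__ == "__main__":
    first = write_deterministic_zip({"b.txt": b"two", "a.txt": b"one"})
    second = write_deterministic_zip({"a.txt": b"one", "b.txt": b"two"})
    print(first == second)
```

The equivalent step in the .NET code paths linked above would be assigning a deterministic value to ZipArchiveEntry.LastWriteTime after each CreateEntry call instead of leaving the default current time.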