Go support on Reproducible Builds #57120
Let's be clear that I don't actually know how reproducible any binary artifacts from Go are at the moment. I'm just saying that things need to be more rigorous when dealing with binary artifacts. |
see also #24904 |
When you say "won't necessarily be enough", I assume you mean it won't produce the same bits, and that the extra attestation or SBOMs would be needed because they contain the extra build configuration information to get the same bits out. That's absolutely true today, and I don't think there's too much benefit to chasing a reproduction of releases before Go 1.21.

Right now the directory where the build is run leaks into the binaries, and we ship pre-built compiled archives that have other leakage, and we ship two binaries that are built in part using the host C compiler, which has its own leakage. Starting in Go 1.20 we are dropping the pre-compiled archives, and in Go 1.21 we will drop the use of the host C compiler, at which point Go controls all the bits that are generated, and there are just a few steps to make them truly reproducible, namely cut out the build directory root and avoid using backslashes in partial file paths on Windows. All that work is pending to land once Go 1.21 development begins.

At that point the distributions will be really, truly, reproducible from only the source commit. The bootstrap toolchain doesn't leak into the distribution and as of Go 1.21 neither will the directory where the build happens, nor which operating system ran the build. At that point, anyone should be able to check out the go1.21 tag in the repo, grab a new enough bootstrap toolchain (Go 1.21 will require Go 1.17 or later, same as Go 1.20 does), run the build, and get bit-for-bit identical results.

If there are attestations or SBOMs required to support some kind of process, I'd be happy to look into that, but it won't be necessary to reproduce the bits.

Go binaries have always been highly reproducible on a single machine environment (fixed build directory, architectures, host C compiler), because we use build input content hashes to identify up-to-date-ness. If the build is not reproducible locally, the hashes don't converge. The most common way this would happen is if some detail of the bootstrap toolchain leaked into the compiler binary, so that building itself once and building itself twice produce different results. That convergence is tested in every toolchain build, so we shake those out as soon as they creep in. It's been quite a while since the last one. What will be new in Go 1.21 is removing the "single machine environment" limitation. |
Thanks for the pointer @seankhliao. Added that issue number to my pending CL and also marked that issue for Go 1.21. |
I'll be happy to test and validate any reproducibility claims the Go binary distribution is making. I have spent quite a bit of time working on these sorts of issues. Obviously cgo and the external linker are a harder target for reproducibility. |
Indeed, cgo and the external linker are a much harder target. |
Thanks very much. We will open Go 1.21 development in February. I'll ping this issue once there is something to try. |
Sorry if this has been discussed elsewhere. When we talk about a "reproducible build", do we want to specify exactly what is considered an input and what is not? For example, the program's source code is an input, as are the source code version, the toolchain version, and some configuration such as the target GOOS and GOARCH. The current date and time are non-inputs. What about the host GOOS and GOARCH, the source code location and toolchain location, some other environment variables, etc.? Under a narrow definition, "reproducible build" could be interpreted as: building the same program twice with exactly the same configuration gives exactly the same output. In that sense the last category could be considered an input. Under a stronger definition, it probably is not. This issue is maybe about eliminating the last category as an input. It may be good to specify this more clearly. |
@cherrymui |
This issue is specifically about reproducible builds for the Go toolchain distributed on https://go.dev/dl, not for arbitrary Go binaries. For that context, the relevant inputs are a Go source tree with a VERSION file, a GOOS, and a GOARCH. The host GOOS/GOARCH does not matter - the goal is to be able to reproduce builds no matter which OS compiled them. Other environment variables like CGO_ENABLED, CC, and so on matter but are left unset by our toolchain generation, so we can ignore them for reproducing the official downloads. |
@dolmen For arbitrary binaries, (1) you need to compile them with -trimpath or else put the source in the same directory as it was built with, and (2) you need to compile with CGO_ENABLED=0 or else arrange to have exactly the same C compiler and C libraries. If you can satisfy those two conditions, and then you use the go version output to get the right toolchain and Go source files, then you should get a reproducible build. That's not what this issue is about though. (For the Go toolchain itself we compile the commands with -trimpath and CGO_ENABLED=0.) |
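As a rough illustration of the two conditions for arbitrary binaries, here is a minimal Go sketch that assembles a go build invocation with -trimpath and CGO_ENABLED=0. The helper name reproducibleBuildCmd and the output name app are hypothetical and chosen only for this example; the sketch does not cover picking the right toolchain version or source files.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// reproducibleBuildCmd (hypothetical helper) prepares "go build" with the two
// settings named in the comment above: -trimpath and CGO_ENABLED=0.
func reproducibleBuildCmd(pkg, out string) *exec.Cmd {
	cmd := exec.Command("go", "build", "-trimpath", "-o", out, pkg)
	// When a key is duplicated in Env, os/exec uses the last value,
	// so this overrides any CGO_ENABLED inherited from the caller.
	cmd.Env = append(os.Environ(), "CGO_ENABLED=0")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	pkg := "."
	if len(os.Args) > 1 {
		pkg = os.Args[1]
	}
	cmd := reproducibleBuildCmd(pkg, "app")
	fmt.Println("CGO_ENABLED=0", cmd.Args)
	// Only run the build when a package path was given explicitly.
	if len(os.Args) > 1 {
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "build failed:", err)
			os.Exit(1)
		}
	}
}
```

Run without arguments it only prints the command it would execute; pass a package path to perform the build.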
@Foxboron, I posted https://swtch.com/tmp/go1.21repro4.src.tar.gz with a source tree containing the changes for reproducible builds for the upcoming Go 1.21 release (still in development). If you build it using the standard process ( You said earlier that you'd be happy to test and validate any reproducibility claims. Can you check that you can reproduce that build? And assuming you can reproduce this specific distribution, what is the process for adding Go to Reproducible Builds once the official Go 1.21 is released? |
@rcs, It will be a couple of days before I'll look at this. Currently recovering from a fever. Adding the Go project to reproducible-builds.org is just a matter of adding it to the homepage. https://salsa.debian.org/reproducible-builds/reproducible-website |
Building the above source with
What's the plan to ensure there are no regressions between releases? |
I can probably also run this through a few |
@Foxboron, thanks for confirming that you can reproduce the build. That's great. I'm not too worried about testing lots of other distributions, especially since we can reproduce that go1.21repro4.linux-amd64.tar.gz from Windows and macOS too. Our current thinking for avoiding regressions in releases is to build releases on two fairly different machines (e.g., a Linux machine and a Windows machine) and confirm that they match before issuing a release. When I look at https://reproducible-builds.org/citests/, it appears that the top bunch are running regular tests on infrastructure run by the Reproducible Builds project. Once Go 1.21 is released (or at least go1.21rc1 is out), would it make sense for us to prepare a small repo containing a script that could be run on that infrastructure to reproduce the archives posted on https://go.dev/dl/? We could run it ourselves and be listed under "External tests" of course, but it seems like running on non-Google-owned infrastructure would be a stronger statement. What do you think? |
really great to read up on this issue and see the progress! kudos & thank you. one tiny comment from my side:
and in there one file needs to be edited: _data/projects.yml, where it just needs a YAML entry like
I'd either happily merge a MR or take the data from this issue ;) |
oh, and for testing on https://reproducible-builds.org/citests/ it's automated, and the easiest way is if you do a release which then gets updated into Debian or Arch Linux or OpenSUSE. |
You could run it on the GitHub CI/CD infra on each release? That + the Google infra would be a nice statement to begin with. I'll probably write my own monitor for this, and then it might be worth trying to host something on reproducible-builds.org in the future. |
I like the cron-based GitHub Actions idea. Thanks. |
I'd like to add that in addition to this, |
Sorry to intrude on this discussion. Does this mean that as of Go 1.19 reproducible builds are not possible while cross-compiling? I am trying to create a reproducible Android app that includes rclone, but I don't know how to make the Go output reproducible. If this is not the right place, please point me in the right direction. Thanks! |
@newhinton, reproducible builds are possible in general by building with |
Ah okay! I am using
Will this change with Go 1.21, as stated by the original post? |
No, the changes in Go 1.21 are focused on builds of the main Go toolchain not builds of other programs. I doubt very much that Android will work with internal linking any time soon. Does the Android C toolchain not support reproducible builds? |
I assume so, but to be fair, this is the first time I am creating a reproducible build. The weird thing to me is that while the env vars are the same on my two build environments (including the clang version, which is supplied by Android/Google itself and version-locked), I get two different binaries, and one contains a .hash section while the other does not. I am entirely unsure where to go from here, but I understand that this might not be the right place. Any help is appreciated though! Edit: We found the issue! I did not pass the linker options properly; now it works and is reproducible! Thanks for your help! |
@newhinton Could you share the linker options you used? Thanks |
Change https://go.dev/cl/513975 mentions this issue: |
Change https://go.dev/cl/513700 mentions this issue: |
@agambier https://github.com/newhinton/Round-Sync/blob/master/rclone/build.gradle |
This command rebuilds or verifies all the artifacts posted on go.dev/dl for the latest supported releases (the last patch of the last two major releases, plus the most recent release candidate if we're approaching a new release). It is meant to be run by the Go team to update a status page that can be linked from reproducible-builds.org, but it is also meant to be run by anyone who wants to "trust but verify" the status page themselves. For golang/go#57120. For golang/go#58884. Change-Id: I80a70275c1821a66b6219d24f29c2d11bfe464a8 Reviewed-on: https://go-review.googlesource.com/c/build/+/513975 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Russ Cox <rsc@golang.org>
Change https://go.dev/cl/515415 mentions this issue: |
Change https://go.dev/cl/515356 mentions this issue: |
Issue 61513 is resolved so this path can be turned on now. Confirmed to still pass now that go1.21rc4 is out. It was the first release built using improvements from CL 512437. For golang/go#57120. For golang/go#58884. For golang/go#61513. Change-Id: Ie39765f8c7ba514dea2bfccf7c8ef8acc5822a22 Reviewed-on: https://go-review.googlesource.com/c/build/+/515415 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
For golang/go#57120. Change-Id: Ic741fe1d856a9d853f25288ce29ad40a289653ef Reviewed-on: https://go-review.googlesource.com/c/build/+/515356 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
We now have a command to reproduce Go builds posted on go.dev/dl. Add a dashboard that people can check to see its results. We should be able to link to this page from https://reproducible-builds.org/citests/. For golang/go#57120. For golang/go#58884. Change-Id: I0bd1f9c26a9a003aa1f301125083195fdeb048b4 Reviewed-on: https://go-review.googlesource.com/c/website/+/513700 Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
Change https://go.dev/cl/515455 mentions this issue: |
The "not modified" response code is 304, not 206. Oops. Use named constants to avoid similar mistakes in the future. Also update rebuild template to show more version information. For golang/go#57120. For golang/go#58884. Change-Id: I2c3ddf25cede0b5a853fa971226463a997f168c7 Reviewed-on: https://go-review.googlesource.com/c/website/+/515455 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Update on this: https://go.dev/rebuild exists now, and I sent https://salsa.debian.org/reproducible-builds/reproducible-website/-/merge_requests/98 to add it to the Reproducible Builds web site. |
Cool work on this. Would be interesting if other compiler developers would follow up with something similar :) |
This has been merged, and Go is now listed on https://reproducible-builds.org/who/projects/ and https://reproducible-builds.org/citests/. |
Change https://go.dev/cl/517515 mentions this issue: |
When arguments are provided to gorebuild, the "@" character can be used to specify a version. Otherwise version selection happens automatically via defaultVersions. Its output is Go versions, with no need for any prefix. Fixes the error preventing gorebuild from running when versions are not explicitly provided via arguments:

    $ gorebuild
    18:05:05.812 downloaded https://go.dev/dl/?mode=json&include=all
    18:05:05.836 FAIL: unknown version "@go1.21.0"

For golang/go#57120. Change-Id: I050bd9d6d12d89b6891c845e686326c87eae5716 Reviewed-on: https://go-review.googlesource.com/c/build/+/517515 Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Change https://go.dev/cl/556075 mentions this issue: |
The "missing from posted archive" case was checking the wrong variable and could never trigger. Fortunately, it's fairly harmless, as missing files would still be caught by gorebuild thanks to check hitting a nil pointer dereference trying to compare the missing file. Check the right variable to fix the panic, and print the intended text. For golang/go#57120. For golang/go#58884. Change-Id: I4560a9cc6c53bca37283c004826d728e175a1ff1 Reviewed-on: https://go-review.googlesource.com/c/build/+/556075 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Over at #57001 (comment), @Foxboron wrote:
Quick point while I contemplate if it's worth engaging on this topic as the Arch maintainer for the go package.

It's a very big difference between downloading source files defined in the go.mod files and fetching binary files from some remote location. We are all very aware of the trusting trust attack, and moving the reproducible builds requirements from the downstream distributor (Linux distributions) to the upstream (Google) is not trivial.

So how is Google going to provide Reproducible Builds for the downloaded toolchains?
Then I wrote:
@Foxboron, regarding "Reproducible Builds", by that do you mean https://reproducible-builds.org/? And if so what is involved in "providing" one? As of Go 1.21 we expect our toolchains will be fully reproducible even when cross-compiling. (That is, if you build a Mac toolchain on Windows, Linux, and Mac, you get the same bits out in all cases.) I would be delighted to have a non-Google project reproducing our builds in some way.
Then @Foxboron replied:
Yes. I have been working on this project since 2017 for Arch Linux.
If this gets implemented we would be downloading binary toolchains, right? I want to reproduce the binaries distributed by Google.
Just checking out the source and building versions won't necessarily be enough, so there needs to be some attestation or SBOMs published to support the distribution of the binaries.
I'm not saying this can't be done. I'm just trying to point out how the bar for the "reproducible builds" Go already facilitates with source code is very different from what you would need to ensure for binary builds.
I'm not sure if "our builds" is the distributed binaries from Google? But Arch has been publishing verifiable builds of the Go compiler for 2 or 3 years now.
Moving this conversation to a new issue.