Build host architecture is written into metadata hash for compilation units, breaking reproducibility of output between host platforms #13922
Comments
Made a slight tweak to the title, as this does not prevent build reproducibility in general but specifically reproducibility between host platforms. Since builds are still reproducible within a host platform and it has been this way for a while, this falls below other items in my priority list. If someone wants to drive this forward, it would be good to research why we use the verbose version in the hash.
This is the original commit: As far as I understand, there was never an explicit decision to use the verbose version rather than the concise version; it was simply considered 'the version' at the time of the commit. Maybe it gained all that additional metadata, including the host, later? I still think that if --target is explicitly passed in, then the 'host:' line in that output is safe to ignore.
This seems like a duplicate of #8140? Is there something different here, or can it be closed?
I think the core issue is the same, although the title of #8140 makes it look like '-C metadata=hash' is the non-reproducible bit, and discussion went on all sorts of tangents around that. The issue is specifically the host in verbose_version when cross-compiling, not the computation of the hash as a whole.
So I would not want to close this, as the issue of whether to pass '-C metadata=hash' at all is actually different from the issue of putting the 'host:' into it.
Hm, I'm not quite following how it is different. We can reword the title if that helps. That issue is specifically about having I don't think there was any proposal to change whether or not to pass
If you reword the title of #8140 so it's explicitly about host: in the hash, then I'm fine with closing this.
This is fixed via #14107. Closing.
Problem
hash_rustc_version() writes rustc().verbose_version into the hash. That data has one problematic field:
host: x86_64-unknown-linux-gnu
or
host: aarch64-unknown-linux-gnu
Because of this, when a mix of x86_64 and aarch64 build hosts runs exactly the same Rust compiler and builds for the same cross-target, the output is not reproducible: it differs between the two hosts, and even the file names differ.
Steps
This requires two build hosts with different architectures, e.g. x86_64 and aarch64, each running an identical cross-compile with exactly the same host Rust compiler. The output will differ (even in the file names, if those include hashes), even though it should be the same.
Possible Solution(s)
The situation occurred in the context of Yocto project builds, where we have a cluster of build machines (a mix of x86_64 and arm64), the Rust version in use is tightly controlled, and we expect the output to be the same (and check for it). The quick and hacky solution was to patch the problematic lines out of cargo, as shown in the attached patch, but perhaps a better option would be to not write the host architecture into the hash when the build is a cross-compile.
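To make the suggestion concrete, here is a minimal sketch of the idea, not cargo's actual code. The helper names `version_hash` and `sample_verbose_version` are hypothetical, the version text is made up, and `DefaultHasher` from the standard library stands in for whatever hasher cargo really uses. It only illustrates that skipping the `host:` line when a target is passed explicitly makes the hash independent of the build host.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not part of cargo): hashes the verbose version text
/// line by line. When `explicit_target` is true, the `host:` line is skipped
/// so that the result does not depend on the build host.
fn version_hash(verbose_version: &str, explicit_target: bool) -> u64 {
    let mut hasher = DefaultHasher::new();
    for line in verbose_version.lines() {
        if explicit_target && line.starts_with("host:") {
            continue; // ignore the build host when cross-compiling
        }
        line.hash(&mut hasher);
    }
    hasher.finish()
}

/// Hypothetical, made-up verbose version text that differs only in `host:`.
fn sample_verbose_version(host: &str) -> String {
    format!(
        "rustc 0.0.0 (illustrative)\nbinary: rustc\nhost: {}\nrelease: 0.0.0\n",
        host
    )
}
```

Calling `version_hash` on the sample text for `x86_64-unknown-linux-gnu` and `aarch64-unknown-linux-gnu` gives different values with `explicit_target` set to `false`, and the same value with it set to `true`.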
Notes
No response
Version
See this piece of code:
cargo/src/cargo/core/compiler/build_runner/compilation_files.rs
Line 664 in fc13634