Reproducible outputs without build-environment-specific command lines #87325
Comments
Do you mean that rustc is invalidating its own cache? Or that you have some build system around rustc that has caching based on the CLI args? The first problem should have been fixed in #84233. The second I'm not sure should belong in rustc - couldn't you change your build system not to expand the shell command before doing the caching? |
See also rust-lang/rfcs#3127, I'm not sure if that would help or not. |
The Chromium build system consists of One of the inputs to any cache system must be the command line. If you change the command line flags in any way, the same binary can’t be returned. And the command line must be expanded to know what would be passed to the compiler. However rustc forces you to choose between:
We cannot use rustc with goma without resolving this conflict of choice that rustc currently presents. Thus, we are asking for rustc to provide a means to produce deterministic builds without a different command line for each build environment. While it's semi-possible to solve this in the caching layer, it would require the cache to understand perfectly the command line arguments of rustc so it can know if a given command line will produce the same output. This would be flaky and unreliable, so we would like to solve this in the same way that Clang has, by providing a flag to remap the current working directory specifically. rust-lang/rfcs#3127 looks to be in a similar spirit but we do not use cargo, and it’s the ultimate invocation of rustc that matters for a distributed compilation system, and that is determined by the rustc compiler flags. Thanks for asking! |
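To make the caching problem concrete, here is a minimal sketch, assuming a cache whose key is just a hash of the expanded command line. The `cache_key` function and the checkout paths are hypothetical, and a real cache such as Goma also takes the input files into account. `--debug-compilation-dir` is the flag proposed in this issue, not an existing rustc option.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: a hash of the fully expanded compiler command line.
/// Input files, which a real cache would also hash, are left out of this sketch.
fn cache_key(args: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    args.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Two machines with different (made-up) checkout paths expand $(pwd) differently.
    let alice = ["rustc", "--remap-path-prefix", "/home/alice/chromium=.", "src/lib.rs"];
    let bob = ["rustc", "--remap-path-prefix", "/b/build/chromium=.", "src/lib.rs"];

    // The command lines differ, so the keys differ and the cache misses,
    // even though the remapped output would be the same on both machines.
    assert_ne!(cache_key(&alice), cache_key(&bob));

    // With a flag that never mentions the working directory (the proposal in
    // this issue), both machines issue the same command line.
    let alice_portable = ["rustc", "--debug-compilation-dir", ".", "src/lib.rs"];
    let bob_portable = ["rustc", "--debug-compilation-dir", ".", "src/lib.rs"];
    assert_eq!(cache_key(&alice_portable), cache_key(&bob_portable));

    println!("expanded paths split the cache; a cwd-independent flag does not");
}
```

The point of the sketch is only that any path expanded into the command line becomes part of the key, which is why the request is for a flag that leaves the working directory implicit.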
What's wrong with @jsgf's suggestion?
|
Unless Goma guarantees the same cwd on every host you need to expand How does this build system handle env vars which may contain a path (eg |
A wrapper script might work on some platforms, but it won't work on our Windows builders, as they do not spawn sub-processes, and doing so would be a prohibitive performance cost. I am going to open an MCP for this to discuss further, as @michaelwoerister pointed me to, and I will add more about this there. Thanks for discussing with me. |
Thanks for bringing this to my attention. The answer will likely be that we will ban the use of env!() in first-party code, and restrict the use of third-party crates that use it unconditionally. We have bots that would probably catch simple cases of this, but it could have snuck in somehow. I've been looking for ways that non-determinism could be introduced in a Rust build, and this is a good one! |
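For readers unfamiliar with the concern, here is a small sketch of how a compile-time environment lookup can leak a machine-specific path into the output. It uses `option_env!`, the non-failing sibling of `env!()`, so that it compiles whether or not the variable is set. The choice of `HOME` is only an illustrative assumption.

```rust
/// Whatever `HOME` held in the compiler's environment is baked into the
/// binary at compile time. Two machines with different home directories
/// therefore produce different binaries from identical source and flags.
const BUILD_HOME: Option<&str> = option_env!("HOME");

fn main() {
    match BUILD_HOME {
        Some(path) => println!("compiled with HOME={}", path),
        None => println!("HOME was unset at compile time"),
    }
}
```

Path remapping on the command line does not obviously cover strings captured this way, which is presumably why restricting such macros in first-party code is the fallback.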
I'm not sure I understand what you mean by "they do not spawn sub-processes" - isn't running rustc spawning a subprocess? If the "rustc" it's invoking is a wrapper, how would it know? And why would it be a prohibitive performance cost? If the cost of invoking rustc takes time T, then surely the invocation cost of a wrapper isn't going to be much more than 2T? And given that Rust compilation is famously not all that fast, the invocation overhead is pretty small vs the total runtime.
This could be tricky, depending on how wide and deep your third-party dep graph is. It isn't common in the absolute sense, but lots of dep subgraphs contain some crate which does something like this. Speaking of which, how are you producing build rules for third-party crates? Are you hand writing them, or using some tool to compute them from Cargo info? How do you deal with build scripts? (I have a lot of practical experience in building Rust with Buck in a large codebase, so I understand a lot of the problems you're facing.) |
BTW I have a stalled draft RFC which proposes making the env entirely controllable from the command-line - that is, the "real" process env is ignored, and replaced with command-line options which set the effective env for I'd be interested to know if this solves any problems for you (or what changes would be needed to make it helpful). |
Let me come back to some of this in the MCP, which I am working on still. Sorry for the delay, I'd like it to be as robust as possible.
https://gn.googlesource.com/cargo-gnaw/ is a tool for consuming cargo and generating GN rules.
This is awesome, and might allow us to make more third-party crates possible to use. I've shared it to the relevant folks internally as well to get more eyes toward it. We're super early-days and just trying to make C++ and Rust compile together on top of our infrastructure right now, so it might take some time to come back to that. But thank you for the link! |
Regarding |
@fangism Abs paths can be hard to eliminate. One case is where someone is doing |
We also changed uses of |
Would absolutizing relative paths on the left side of |
Yep that would be another way to approach it. If all paths are absolute before remapping (I think so?) then it would be okay as it wouldn't get in the way of remapping relative paths (since they wouldn't exist at remap time). Though I should note it seems a little unclear as |
Except that it uses existing path resolution semantics instead of making some previously valid filenames reserved and magical. |
I guess that could work if you define that precisely as "map (effective) leading |
…haelwoerister Introduce -Z remap-cwd-prefix switch This switch remaps any absolute paths rooted under the current working directory to a new value. This includes remapping the debug info in `DW_AT_comp_dir` and `DW_AT_decl_file`. Importantly, this flag does not require passing the current working directory to the compiler, such that the command line can be run on any machine (with the same input files) and produce the same results. This is a critical property for debugging compiler issues that crop up on remote machines. This is based on adetaylor's rust-lang@dbc4ae7 Major Change Proposal: rust-lang/compiler-team#450 Discussed on rust-lang#38322. Would resolve issue rust-lang#87325.
Now that this is implemented, closing in favor of tracking issue #89434. |
Hello,
We in Chromium are working on integrating Rust into our distributed build system. We really appreciate the work done here, as reproducible builds are a requirement for such a plan. However, while this allows the binary output of rustc to be reproducible, we have noticed that the (fully resolved) command-line itself ends up not being so. This destroys our ability to cache build objects between bots and/or developers, as any change in the command-line requires a recompilation (such as changing optimization flags).
Clang has resolved this by providing a -fdebug-compilation-dir flag, in addition to the equivalent of rustc's --remap-path-prefix (as -fdebug-prefix-map in Clang).
This problem is discussed, in the context of Clang, in this blog post: https://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html
We would like to propose a similar flag for rustc. For simplicity we'd propose similar naming and behaviour as Clang's, in order to reuse previous work.
rustc --debug-compilation-dir . would be exactly equivalent to rustc --remap-path-prefix $(pwd)=.
In terms of ordering and precedence, it can follow the exact same rules as multiple --remap-path-prefix arguments.
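The intended equivalence can be sketched as a toy model. This is not rustc's implementation: `remap_prefix`, `debug_compilation_dir`, and the paths are hypothetical, and the only point is that the working directory is supplied by the compiler rather than by the command line.

```rust
use std::path::{Path, PathBuf};

/// Simplified model of `--remap-path-prefix FROM=TO`: if `path` starts with
/// `from`, that prefix is replaced by `to`; otherwise the path is unchanged.
fn remap_prefix(path: &Path, from: &Path, to: &Path) -> PathBuf {
    match path.strip_prefix(from) {
        Ok(rest) => to.join(rest),
        Err(_) => path.to_path_buf(),
    }
}

/// Model of the proposed flag: same remapping, but `cwd` is looked up by the
/// compiler itself (for example via std::env::current_dir) instead of being
/// spelled out in the arguments. It is a parameter here to keep the sketch testable.
fn debug_compilation_dir(path: &Path, cwd: &Path, to: &Path) -> PathBuf {
    remap_prefix(path, cwd, to)
}

fn main() {
    let to = Path::new(".");

    // Two machines with different (made-up) checkout locations.
    let on_alice = debug_compilation_dir(
        Path::new("/home/alice/chromium/src/lib.rs"),
        Path::new("/home/alice/chromium"),
        to,
    );
    let on_bob = debug_compilation_dir(
        Path::new("/b/build/chromium/src/lib.rs"),
        Path::new("/b/build/chromium"),
        to,
    );

    // Both end up recording the same relative path.
    assert_eq!(on_alice, on_bob);
    assert_eq!(on_alice, PathBuf::from("./src/lib.rs"));

    // Paths outside the working directory are left alone.
    let outside = debug_compilation_dir(Path::new("/usr/lib/foo.rs"), Path::new("/b/build/chromium"), to);
    assert_eq!(outside, PathBuf::from("/usr/lib/foo.rs"));
}
```

In this model the recorded paths match what `--remap-path-prefix $(pwd)=.` would give, while the arguments passed on each machine stay identical.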
I originally posted this in issue #38322 and am moving it to a separate issue here.