-Zremap-cwd-prefix is not applied to the sysroot rlib paths #112586
Comments
Does the linker have a way to remap the paths of object files as they will end up in the .pdb file? |
I don't think so, no. What lld-link does have is a way to specify "for relative source file path X, map it to PREFIX\X to make an absolute path for debugging". This is lld-link has no way to remove parts of paths given to it. (The paths to the object files are not used in debugging, I believe? Just the paths to source files, which are currently remapped through |
If it is not used that begs the question why it is even recorded in the first place :) On a more useful note, the crate loader uses |
I feel the same but I don't make the rules for the PDBs. Ok thank you for the pointer, I will have a look at how to do this. |
@rustbot label +A-reproducibility +O-windows-msvc |
A PR to apply path remapping to linker command line outputs: #112597 I verified that the Chromium build stops seeing any delta (from these paths) with the patch applied and building the same target in two different directories. The Windows tests for bins can't yet be enabled by this PR tho because #112587 needs to be addressed first. |
Before we jump into a fix, I would find it helpful to get a more complete picture of the issue and possible solutions. I have some questions:
If these are paths that tell the linker where to find files on disk, then I suspect we can't remap them on the rustc side. |
The paths that are appearing in the PDB are:
The paths to other arguments ( Providing them as relative paths to the linker works, the binary differences go away. The PDB and EXE are affected, but the deltas in the PDB are the paths (which is easy to see) while the deltas in the EXE are less clear. However they are derived from the paths, because when the paths in the PDB are fixed, the deltas in the PDB and EXE both go away. I will have to try link.exe to verify.
I believe we need to differentiate remap-path-prefix (can make arbitrary paths) and remap-cwd-prefix (must make a valid path). Here's the high level way I think we can do so, from the session options:

    /// Returns a type to perform mapping of file paths as specified by the
    /// user. These paths should not be used for IO as they may map to invalid
    /// locations that are designed to just be more clear in diagnostics.
    pub fn file_path_mapping(&self) -> FilePathMapping {
        FilePathMapping::new(&self.remap_path_prefix)
    }

    /// Returns a type to perform mapping of file paths as specified by the
    /// user. These paths may be used for IO and are required to produce a valid
    /// path to the same file or directory.
    pub fn file_path_mapping_for_io(&self) -> FilePathMapping {
        FilePathMapping::new_for_io(&self.remap_path_prefix)
    }

The majority of use cases would use |
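To make the distinction above concrete, here is a minimal sketch of prefix remapping. `PrefixMap` is a hypothetical stand-in written for illustration, not rustc's `FilePathMapping`, and the "last matching rule wins" order is an assumption of the sketch. A rule that maps the working directory to an empty prefix yields a relative path that still resolves from that directory, whereas a rule that maps it to an arbitrary prefix is only suitable for display.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical table of (from, to) prefix rules; illustration only.
pub struct PrefixMap {
    rules: Vec<(PathBuf, PathBuf)>,
}

impl PrefixMap {
    pub fn new(rules: Vec<(PathBuf, PathBuf)>) -> Self {
        Self { rules }
    }

    /// Rewrites `path` using the last rule whose `from` is a prefix of it.
    /// Paths that match no rule are returned unchanged.
    pub fn map(&self, path: &Path) -> PathBuf {
        for (from, to) in self.rules.iter().rev() {
            if let Ok(rest) = path.strip_prefix(from) {
                // Avoid `join("")`, which would append a trailing separator.
                if rest.as_os_str().is_empty() {
                    return to.clone();
                }
                return to.join(rest);
            }
        }
        path.to_path_buf()
    }
}
```

With the rule `("/work/build", "")`, the path `/work/build/src/lib.rs` becomes `src/lib.rs`, which is still valid for IO when the process runs in `/work/build`. With the rule `("/work/build", "/redacted")` the result is deterministic but no longer names a real file.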
Here's an example of the diffs on the EXE and PDB without any fixes:
Here is the difference once the stdlib rlibs are fixed:
And once the |
For reference, we already have something similar to deal with different versions of a filename needed in different situations: rust/compiler/rustc_span/src/lib.rs Line 282 in 3ed2a10
rust/compiler/rustc_span/src/lib.rs Line 176 in 3ed2a10
|
Interesting, it does not seem that FileName is used in
|
Could /LIBPATH be used to make paths in the sysroot relative? I.e. we pass |
We already do pass rust/compiler/rustc_codegen_ssa/src/back/link.rs Line 1959 in 3ed2a10
And if we do that and just give the sysroot libs without any path, yes the paths become deterministic in the PDB. Is there a risk of conflicting stdlibs in different search paths, or presumably that would be unsupported? |
The challenge here is the stdlib rlibs are found in CrateSource which contains the absolute path. We'd need to either:
|
Changing how linkers are invoked has often had unintended side effects in the past, so we'll want to be careful. But overall, I think |
Ok, from looking at how CrateSource is constructed, it feels fraught to me to try to change stdlib paths in there to be relative. So I will look at tracking /LIBPATH. Notably, it is added to the linker before the rlibs are. |
Yes, I think it would be a good idea to do a quick implementation just to verify that it actually solves the problem we're trying to fix. Once that's confirmed, we can think about how to implement it cleanly. |
Awkwardly, the /LIBPATH we give is relative to
Then the path to the rlib is absolute, but can't do prefix matching against /LIBPATH:
|
We can canonicalize the user-specified sysroot to make it consistent with the one we find by default. Something like this from session.rs: let sysroot =
filesearch::get_or_default_sysroot(sopts.maybe_sysroot.clone()).expect("Failed finding sysroot"); And then get_or_default can defer to the user-specified one if present, but also canonicalize it (and fix it up for gcc). |
Ah so
This was previously avoided by giving a relative |
Ok so recapping:
If I could find where the stdlib rlibs are added, as they are taking the sysroot and making something canonical, maybe that's the right answer then.. looking. |
rust/compiler/rustc_metadata/src/locator.rs Lines 621 to 642 in 3ed2a10
This looks wrong. The non-stdlib crates are not an absolute path with relative |
Staring at this code did give me an idea tho, to canonicalize the /LIBPATH (internally) and the rlib path (internally) in order to decide to drop the (canonical) matching prefix. |
That works. In MSVC Linker:

     fn link_rlib(&mut self, lib: &Path) {
    +    let mut canonical_lib_dir = try_canonicalize(lib).unwrap_or_else(|_| lib.to_path_buf());
    +    canonical_lib_dir.pop();
    +    for libpath in &self.libpaths {
    +        if canonical_lib_dir == *libpath {
    +            // If the rlib is in a directory specified by /LIBPATH, then drop the directory
    +            // from the command line. This ensures that the stdlib rlibs which are added to the
    +            // command line by rustc do not appear as absolute paths when the sysroot is a
    +            // relative path, in order to produce deterministic outputs (in this case, a linker
    +            // command line) which do not include the current working directory.
    +            self.cmd.arg(lib.file_name().expect("rlib has no file name path component"));
    +            return;
    +        }
    +    }
         self.cmd.arg(lib);

and

     fn include_path(&mut self, path: &Path) {
    +    self.libpaths.push(try_canonicalize(path).unwrap_or_else(|_| path.to_path_buf()));
         let mut arg = OsString::from("/LIBPATH:");
         arg.push(path);
         self.cmd.arg(&arg);

What do you think? Here's the delta with this approach, which is just caused by the /tmp/.../symbols.o path.
|
Taking @petrochenkov's feedback however, we can do this up in link.rs instead of in linker.rs, though it makes it more general rather than scoped to MSVC then. In link.rs:

    @@ -2693,9 +2693,24 @@ fn add_static_crate<'a>(
     ) {
         let src = &codegen_results.crate_info.used_crate_source[&cnum];
         let cratepath = &src.rlib.as_ref().unwrap().0;
    +    let canonical_sysroot_lib_path = {
    +        let lib_path = sess.target_filesearch(PathKind::All).get_lib_path();
    +        try_canonicalize(&lib_path).unwrap_or(lib_path)
    +    };
         let mut link_upstream = |path: &Path| {
    -        cmd.link_rlib(&fix_windows_verbatim_for_gcc(path));
    +        let mut canonical_path_dir = try_canonicalize(path).unwrap_or_else(|_| path.to_path_buf());
    +        canonical_path_dir.pop();
    +
    +        let rlib_path = if canonical_path_dir == canonical_sysroot_lib_path {
    +            // If the sysroot is a relative path, the sysroot libraries must also be specified as a
    +            // relative path to the linker, else the linker command line is non-deterministic and
    +            // it shows up in the PDB file generated by the MSVC linker.
    +            Path::new(path.file_name().expect("rlib has no file name path component")).to_path_buf()
    +        } else {
    +            fix_windows_verbatim_for_gcc(path)
    +        };
    +        cmd.link_rlib(&rlib_path);
         };

This also works, and only affects sysroot rlibs. Again, the binary diffs for two runs in different working dirs with the above approach:
|
When the `--sysroot` is specified as relative to the current working directory, the sysroot's rlibs should also be specified as relative paths. Otherwise, the current working directory ends up in the absolute paths, and in the linker command line. And the entire linker command line appears in the PDB file generated by the MSVC linker. When adding an rlib to the linker command line, if the rlib's canonical path is in the sysroot's canonical path, then use the current sysroot path + filename instead of the full absolute path to the rlib. This means that when `--sysroot=foo` is specified, the linker command line will contain `foo/rustlib/target/lib/lib*.rlib` instead of the full absolute path to the same. This addresses rust-lang#112586
7e07271 uses the above strategy in a (IMO) clean way. |
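The strategy in that commit message (compare canonical paths, then emit the sysroot path as the user spelled it plus the file name) can be sketched outside of rustc as follows. This is an illustration under assumptions, not the code from the commit: `rlib_arg_for_linker` is a hypothetical name, and `std::fs::canonicalize` stands in for rustc's internal `try_canonicalize`.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Returns the path to pass to the linker for `rlib`.
///
/// If `rlib` sits directly inside `sysroot_lib_dir` (compared after
/// canonicalizing both), the result is `sysroot_lib_dir` exactly as the
/// caller spelled it, joined with the rlib's file name. Any other rlib
/// is returned unchanged.
pub fn rlib_arg_for_linker(sysroot_lib_dir: &Path, rlib: &Path) -> PathBuf {
    // Fall back to the path as given if canonicalization fails.
    let canon = |p: &Path| fs::canonicalize(p).unwrap_or_else(|_| p.to_path_buf());
    let canonical_lib_dir = canon(sysroot_lib_dir);
    let canonical_rlib = canon(rlib);
    match (canonical_rlib.parent(), rlib.file_name()) {
        (Some(dir), Some(name)) if dir == canonical_lib_dir => sysroot_lib_dir.join(name),
        _ => rlib.to_path_buf(),
    }
}
```

Because the returned path reuses the caller's spelling of the sysroot, a relative `--sysroot` yields a relative rlib argument, and an absolute sysroot (for example on another drive) still yields an absolute one.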
Looks promising! Here is my recap:
I'll give the concrete fix a closer look tomorrow. I also want to give others a chance to weigh in. |
Yeah. It is unrelated because it is actually the If the sysroot is on another drive, then it will have to be an absolute path, and thus making the rlibs relative to it will also make them an absolute path. So the solution we landed on here should work. Thanks for your help with this. |
@danakj, how are you building these test binaries? When I'm using a relative path for the sysroot, I'm running into trouble because Cargo will use a different working directory for each |
I am running it through Chromium GN - the invocations all happen from the single output directory (not unlike CMake). It sounds like a relative sysroot with Cargo will just not work by design (of Cargo) then. All the folks interested in reproducible builds are using other systems though - Buck, GN, or Bazel I guess. Let me share the commands in case they help. Here's the high level thing from the chromium repo:
With these GN args in both
Here's the rustc invocation that it results in:
And the
Oh, and to verify the cause and solution originally, I had replaced This is my hacky

    #!/usr/bin/env python3
    # Copyright 2023 The Chromium Authors
    # Use of this source code is governed by a BSD-style license that can be
    # found in the LICENSE file.
    import subprocess
    import os
    import shutil
    import sys


    def drop_path(cwd, r: str):
        # Reduce any argument that points into the sysroot lib dir to its bare
        # file name. `cwd` is unused.
        stdpath = os.path.join('local_rustc_sysroot', 'lib', 'rustlib', 'x86_64-pc-windows-msvc', 'lib')
        if '/' + str(stdpath) in r:
            return os.path.basename(r)
        else:
            return r


    def main():
        cwd = os.getcwd()
        rest = [drop_path(cwd, r) for r in sys.argv[1:]]
        args = []
        for r in rest:
            # Copy objects from /tmp to a fixed local name so that the random
            # temp path does not reach the linker command line.
            if r.startswith('/tmp'):
                shutil.copy(r, 'my.o')
                args.append('my.o')
            else:
                args.append(r)
        # Forward the rewritten arguments to the real lld-link.
        lld = os.path.join('..', '..', 'third_party', 'llvm-build', 'Release+Asserts', 'bin', 'lld-link')
        return subprocess.call([lld] + args)


    if __name__ == '__main__':
        sys.exit(main()) |
We do have the tests in here which are disabled too but I don't think they print out what differed so they will just still fail until the /tmp issue is resolved. But those are some command lines you could run as well and then use other tools to diff. Like https://source.chromium.org/chromium/chromium/src/+/main:tools/determinism/compare_build_artifacts.py |
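For a quick check of where two build outputs differ, a byte-level comparison is enough to start with. The following is a minimal sketch, not the algorithm used by the Chromium script linked above; `differing_offsets` is a hypothetical helper name.

```rust
/// Returns up to `limit` byte offsets at which `a` and `b` differ.
/// If the common prefix yields fewer than `limit` differences and the
/// lengths differ, the shorter length is appended as one more offset.
pub fn differing_offsets(a: &[u8], b: &[u8], limit: usize) -> Vec<usize> {
    let mut out: Vec<usize> = a
        .iter()
        .zip(b.iter())
        .enumerate()
        .filter(|(_, (x, y))| x != y)
        .map(|(i, _)| i)
        .take(limit)
        .collect();
    if out.len() < limit && a.len() != b.len() {
        out.push(a.len().min(b.len()));
    }
    out
}
```

To compare two artifacts, read each with `std::fs::read` and pass the resulting byte vectors as slices. An empty result means the files are byte-identical.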
Thanks for the info, @danakj! Maybe we can let some of that inspire improved reproducibility testing when we also fix the object file in |
…erister Use the relative sysroot path in the linker command line to specify sysroot rlibs This addresses rust-lang#112586
Yes it does! As long as the command line is reproducible. There's an extra flag that helps compared to link.exe, which is |
This is done with a TODO while we work upstream to fix Rust-link-driven Windows exe non-determinism issues. Upstream does intend to have Rust able to produce deterministic results and there were already tests for this that were disabled on Windows due to the reasons being unclear which have now become clear thanks to our determinism bots! Upstream issues: rust-lang/rust#112586 rust-lang/rust#112587 Bug: 1453509 Change-Id: I582e46792934d44866a45d9b57b2c7b64b52a2e5 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4618949 Reviewed-by: Erik Chen <erikchen@chromium.org> Commit-Queue: danakj <danakj@chromium.org> Cr-Commit-Position: refs/heads/main@{#1158781}
This breaks deterministic builds on Windows (see #88982 (comment)), where the PDB ends up including the paths to the object files. If a relative path is given to `--sysroot`, the path is turned into an absolute path for each rlib, and `-Zremap-cwd-prefix` is not applied.