-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some issues with absolute paths in dep-info files. #7137
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
Marking as draft, as it currently doesn't work on Windows. See rust-lang/rust#61727 (comment) I'm going to propose the following rollout schedule. I think we should back out #7030 on beta, since it has known bugs I feel uncomfortable leaving it in (this is optional). We can land this PR now in Cargo. Then wait for 1.39 (the next release cycle) to land 61727. Beta-cargo (1.37) won't handle the extra paths properly, so when used in rust-lang/rust I think it will cause too many problems to land 61727 now. I'm not sure if @Mark-Simulacrum is OK with letting that PR wait another 5 weeks. The alternative is to backport this PR to beta, but I feel that this PR is very risky, so I don't feel comfortable doing that (I can be convinced otherwise, though!). |
I'm okay backing out relevant PRs for beta, especially if they have known bugs. In the meantime, I can start working on moving rust-lang/rust's rustbuild over to this new system (by creating a toolchain for at least x86_64-unknown-linux-gnu via try builders, for example). Ideally, we'd backport while gating under |
@@ -1547,6 +1550,9 @@ impl DepInfoPathType { | |||
/// The `rustc_cwd` argument is the absolute path to the cwd of the compiler | |||
/// when it was invoked. | |||
/// | |||
/// If the `allow_package` argument is false, then package-relative paths are | |||
/// skipped and ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this comment be expanded to include rationale for why we'd set it to false
?
It looked a bit odd for a second filtering here vs filtering when we parse the file, but I don't see any reason why we'd do it the other way around
let path_mtime = match paths::mtime(path) { | ||
Ok(mtime) => mtime, | ||
Err(..) => return Some(StaleFile::Missing(path.to_path_buf())), | ||
let path_mtime = match mtime_cache.entry(path.to_path_buf()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the most worrisome part here to me, using the mtime cache for all calls to paths::mtime
. My main worry is how this interacts with edits during the build itself. Can you confirm that find_stale_file
here is basically only called during "startup" of Cargo? I think it's fine to, basically during the preparation phase of a build, assume that everything happens instantaneously.
I think that's the case here already since it's the only phase which should have &mut Context
, but would be worth double-checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you were mentioning that this deduplicates sysroot dependencies, but that's the only mtime calls that are being reduced, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is all done at the beginning.
It can also deduplicate other paths as well. For example, when doing cargo build --tests
then main.rs
will show up twice (once as a dep of the bin executable, and once as the bin unittest). There are a variety of other situations (using the same modules from different targets, include
shenanigans, etc.). I'm fairly confident they should all be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good to me!
One step we could also do is to store Option<HashMap>
and unwrap it here, and then after the build preparation is finished we set it to None
to basically internally assert we don't access it again. I don't think that's necessary at this time, though.
} else { | ||
Ok(Some(paths)) | ||
} | ||
Ok(Some(paths)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you comment on how come this change was necessary? Is it because package files were filtered out for registry deps?
I'm trying to remember why this case was here but I'm coming up with a blank...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the paths.is_empty()
logic was added in #4788 with no comment as to why. The previous logic looks like what's added in this PR. So long as all tests pass seems reasonable to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I couldn't figure out why this check was added in 4788. The main gist is that when building a registry dependency with current rustc, the list would end up empty (everything gets filtered out), and cargo would treat it as "dirty" when it should be fine. By instead returning an empty list, it treats it correctly.
I think that ignoring source-relative files for registry deps for |
Oh, I didn't think about that. I'm probably still not comfortable back porting things since this looks like it has a high risk to me regardless. If you want to add a -Z flag to 61727, then I guess that PR could land right away. We could then make it the default in 1.39. It's up to you, whichever is easiest. |
FWIW forgot to mention but I personally agree with the strategy of "revert beta, land this now, and get rustbuild working ASAP via what's necessary but don't backport this PR" |
Perhaps I misunderstood -- I thought this PR was necessary to get rust-lang/rust#61727 working properly, even with modifications to rustbuild? In that case it seems we need to backport to beta to get rustbuild working. I'd be happy to do the legwork adding unstable flags or similar wherever necessary! |
It is required, but from my perspective I feel it is too risky to backport this to beta. I would prefer to wait for the next release cycle before turning 61727 on. If you add a It's unfortunate to delay 4 more weeks, but I don't want to chase potential new issues on beta. |
Ah, sure, that makes sense. I'll work on making that happen -- I think from a rustc perspective a |
I added a hack to deal with Windows extended-length paths. I'm open to other approaches, but I think any other approach is going to be more complex. I haven't tried to confuse it using symlinks, but I don't think that is a new problem. |
} else if let Ok(stripped) = file.strip_prefix(target_root) { | ||
let file = if cfg!(windows) && file.starts_with("\\\\?\\") { | ||
// Remove Windows extended-length prefix, since functions like | ||
// strip_prefix won't work if you mix with traditional dos paths. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this isn't quite right I think because if you strip \\?\C:\foo
from \\?\C:\foo\bar
you'll get bar
. If you strip C:\foo
from \\?\C:\foo\bar
, however, it doesn't work. I won't pretend to know enough about paths to know whether that's a bug or not though.
I think a better stategy may be to have canonicalized and non-canonicalized versions of the various root paths we're considering here? I think this would definitely be a problem on Unix as well if the project dir or something above it uses a symlink, but stripping both the current absolute dir as well as the canonicalized current absolute dir should solve that as well (and handle this \\?\
business transparently)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds fair, I'll give it a shot. It will just be a bit more complex, but hopefully not too much. I'll also try to create a test to see if symlink canonicalization actually is a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with a new approach which always compares canonical paths.
It did reveal that it was not handling symlinks properly, so it should work better on all platforms now.
df1ea31
to
b737b83
Compare
Once rust-lang/rust#61727 is in nightly, I'll update the tests here. |
☔ The latest upstream changes (presumably #7175) made this pull request unmergeable. Please resolve the merge conflicts. |
This adds dep-info tracking for non-path dependencies. This avoids tracking package-relative paths (like source files) in non-path dependencies, since we assume they are static. It also adds an mtime cache to improve performance since when rustc starts tracking sysroot files (in the future), it can cause a large number of stat calls.
Instead of treating Windows differently, this just always uses canonical paths on all platforms. This fixes a problem where symlinks were not treated correctly on all platforms. Switching rm_rf to the remove_dir_all crate because deleting symbolic links on Windows is difficult.
b737b83
to
51a8206
Compare
I added a |
📌 Commit c6e626b has been approved by |
Fix some issues with absolute paths in dep-info files. There were some issues with how #7030 was handling translating paths in dep-info files. The main consequence is that when coupled with rust-lang/rust#61727 (which has not yet merged), the fingerprint would fail and be considered dirty when it should be fresh. It was joining [`target_root`](https://github.com/rust-lang/cargo/blame/1140c527c4c3b3e2533b9771d67f88509ef7fc16/src/cargo/core/compiler/fingerprint.rs#L1352-L1360) which had 3 different values, but stripping [only one](https://github.com/rust-lang/cargo/blame/1140c527c4c3b3e2533b9771d67f88509ef7fc16/src/cargo/core/compiler/mod.rs#L323). This means for different scenarios (like using `--target`), it was creating incorrect paths in some cases. For example a target having a proc-macro dependency which would be in the host directory. The solution here is to always use CARGO_TARGET_DIR as the base that all relative paths are checked against. This should handle all host/target differences. The tests are marked with `#[ignore]` because 61727 has not yet merged. This includes a second commit (which I can open as a separate PR if needed) which is an alternate solution to #7034. It adds dep-info tracking for registry dependencies. However, it tries to limit which kinds of paths it tracks. It will not track package-relative paths (like source files). It also adds an mtime cache to significantly reduce the number of stat calls (because every dependency was stating every file in sysroot). Finally, I've run some tests using this PR with 61727 in rust-lang/rust. I can confirm that a second build is fresh (it doesn't erroneously rebuild). I can also confirm that the problem in rust-lang/rust#59105 has *finally* been fixed! My confidence in all this is pretty low, but I've been unable to think of any specific ways to make it fail. If anyone has ideas on how to better test this, let me know. Also, feel free to ask questions since I've been thinking about this a lot for the past few weeks, and there is quite a bit of subtle stuff going on.
☀️ Test successful - checks-azure |
Update cargo, rls ## cargo 12 commits in d0f828419d6ce6be21a90866964f58eb2c07cd56..26092da337b948719549cd5ed3d1051fd847afd7 2019-07-23 21:58:59 +0000 to 2019-07-31 23:24:32 +0000 - tests: Enable features to fix unstabilized `#[bench]` (rust-lang/cargo#7198) - Fix excluding target dirs from backups on OSX (rust-lang/cargo#7192) - Handle symlinks to directories (rust-lang/cargo#6817) - Enable pipelined compilation by default (rust-lang/cargo#7143) - Refactor resolve `Method` (rust-lang/cargo#7186) - Update `cargo_compile` module doc. (rust-lang/cargo#7187) - Clean up TargetInfo (rust-lang/cargo#7185) - Fix some issues with absolute paths in dep-info files. (rust-lang/cargo#7137) - Update the `url` crate to 2.0 (rust-lang/cargo#7175) - Tighten requirements for git2 crates (rust-lang/cargo#7176) - Fix a deadlocking test with master libgit2 (rust-lang/cargo#7179) - Fix detection of cyclic dependencies through `[patch]` (rust-lang/cargo#7174) ## rls 1 commits in 70347b5d4dfe78eeb9e6f6db85f773c8d43cd22b..93d9538c6000fcf6c8da763ef4ce7a8d407b7d24 2019-07-30 12:56:38 +0200 to 2019-07-31 21:42:49 +0200 - Update cargo (rust-lang/rls#1529)
Remove remove_dir_all This removes the `remove_dir_all` dependency. It is no longer needed, as there has been significant improvements to std's implementation over the years (particularly recently). This improves the performance of running cargo's testsuite on my machine from 8m to 2m 30s (!!). This was added in #7137 to deal with some issues with symbolic links. I believe those seem to be resolved now. Note that std's implementation is still not bullet proof. In particular, I think there are still some cases where a file can be locked by another process which prevents it from being removed. There are some more notes about this at #7042 (comment) (and various other places linked there). Closes #9585
There were some issues with how #7030 was handling translating paths in dep-info files. The main consequence is that when coupled with rust-lang/rust#61727 (which has not yet merged), the fingerprint would fail and be considered dirty when it should be fresh.
It was joining
target_root
which had 3 different values, but stripping only one. This means for different scenarios (like using--target
), it was creating incorrect paths in some cases. For example a target having a proc-macro dependency which would be in the host directory.The solution here is to always use CARGO_TARGET_DIR as the base that all relative paths are checked against. This should handle all host/target differences.
The tests are marked with
#[ignore]
because 61727 has not yet merged.This includes a second commit (which I can open as a separate PR if needed) which is an alternate solution to #7034. It adds dep-info tracking for registry dependencies. However, it tries to limit which kinds of paths it tracks. It will not track package-relative paths (like source files). It also adds an mtime cache to significantly reduce the number of stat calls (because every dependency was stating every file in sysroot).
Finally, I've run some tests using this PR with 61727 in rust-lang/rust. I can confirm that a second build is fresh (it doesn't erroneously rebuild). I can also confirm that the problem in rust-lang/rust#59105 has finally been fixed!
My confidence in all this is pretty low, but I've been unable to think of any specific ways to make it fail. If anyone has ideas on how to better test this, let me know. Also, feel free to ask questions since I've been thinking about this a lot for the past few weeks, and there is quite a bit of subtle stuff going on.