
-Zmtime-on-use causes spurious rebuilds in workspace #6972

Closed

Nemo157 opened this issue May 20, 2019 · 11 comments · Fixed by #7050
Labels
C-bug Category: bug

Comments

@Nemo157
Member

Nemo157 commented May 20, 2019

Problem

Running a build using -p or --manifest-path to build individual crates from a workspace and -Zmtime-on-use to allow using cargo-sweep causes some of the dependencies to get rebuilt when they shouldn't.

I have tried to reproduce this in one other workspace I have and it didn't happen; it appears to be somehow specific to the workspace setup in Tide.

Steps

> git clone https://github.com/rustasync/tide && cd tide
Cloning into 'tide'...
[...]
> # current commit: 011d0db078d74e78ccf53f03605271b1bba60290

> cargo build -p tide -Zmtime-on-use
   [...far too many crates...]
   Compiling tide v0.2.0 (/private/var/folders/0p/5yvmrvhj5w3_vy1y8x7dvk7m0000gn/T/tmp.TiMcd4ms/tide/tide)
    Finished dev [unoptimized + debuginfo] target(s) in 1m 42s
> cargo build -p tide-core -Zmtime-on-use
    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
> cargo build -p tide -Zmtime-on-use
   Compiling num-traits v0.2.7
   Compiling num-integer v0.1.40
   Compiling thread_local v0.3.6
   [...20 other crates.io dependencies...]
   Compiling tide-cookies v0.2.0 (/private/var/folders/0p/5yvmrvhj5w3_vy1y8x7dvk7m0000gn/T/tmp.TiMcd4ms/tide/tide-cookies)
   Compiling tide v0.2.0 (/private/var/folders/0p/5yvmrvhj5w3_vy1y8x7dvk7m0000gn/T/tmp.TiMcd4ms/tide/tide)
    Finished dev [unoptimized + debuginfo] target(s) in 16.87s

> # If -Zmtime-on-use isn't used then nothing gets rebuilt:
> cargo build -p tide
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s
> cargo build -p tide-core
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
> cargo build -p tide
    Finished dev [unoptimized + debuginfo] target(s) in 0.09s

Notes

Output of cargo version: cargo 1.36.0-nightly (759b6161a 2019-05-06)

cc @Eh2406

@Eh2406
Contributor

Eh2406 commented May 23, 2019

Sorry I have not had a chance to look into this. Conceptually, mtime-on-use touches files that Cargo uses but whose mtime Cargo does not read. (Where Cargo does use the mtime, it is not optional.)
Cargo reads and writes mtimes using the filetime crate, so I would investigate by logging each file whose mtime is read or written.
We are looking for the file that:

  • Is read.
  • Is written when using mtime-on-use.
  • Is not written when doing normal invocation.

Then, once we know the offending file, the solution is either to stop reading it or to stop writing it. That depends on where and why it is read.
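The investigation described above can be sketched with standard-library helpers (the helper names here are illustrative, not Cargo's; real Cargo goes through the filetime crate and paths::mtime):

```rust
use std::fs::{self, File, FileTimes};
use std::path::Path;
use std::time::SystemTime;

// Hypothetical instrumented read: log every path whose mtime we inspect.
fn logged_mtime(path: &Path) -> std::io::Result<SystemTime> {
    let mtime = fs::metadata(path)?.modified()?;
    eprintln!("mtime READ  {}", path.display());
    Ok(mtime)
}

// Hypothetical instrumented write: log every path whose mtime we bump,
// mirroring what -Zmtime-on-use does via the filetime crate.
fn logged_touch(path: &Path) -> std::io::Result<()> {
    let now = SystemTime::now();
    File::options()
        .write(true)
        .open(path)?
        .set_times(FileTimes::new().set_accessed(now).set_modified(now))?;
    eprintln!("mtime WRITE {}", path.display());
    Ok(())
}
```

Diffing the READ and WRITE logs between a -Zmtime-on-use build and a normal build should point at the offending file.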

@mbilker

mbilker commented Jun 3, 2019

I am experiencing something like this with chrono which depends on num-traits and num-integer. If I use a patch declaration in my Cargo.toml to point num-traits and num-integer to locally cloned versions of their Git repositories, then I have no spurious rebuilds of those two packages (and any package that depends on them and so forth).

Rebuilding without changing any file results in:

   Compiling num-traits v0.2.8
   Compiling num-integer v0.1.41
   Compiling num-traits v0.1.43
   Compiling rmp v0.8.7
   Compiling rmp-serde v0.13.7
   Compiling chrono v0.4.6
   Compiling diesel v1.4.2

@schomatis
Contributor

Hey, thanks for the detailed description. I'm jumping in to participate since I've seen the E-help-wanted and E-mentor tags. Reading through the code (but without a clear understanding of the intended purpose of the mtime-on-use flag), I fail to see the issue here, that is, what the expected behavior in these cases would be. Since the two packages live in the same workspace and share the same dependencies (the same .rmeta files, whose mtimes get updated every time they are checked), but not necessarily the same exact graph, it seems there will always be a discrepancy when rebuilding the other package.

Another example,

git clone https://github.com/rust-lang/rls
cargo build -p rls
cargo build -p rls-rustc -Zmtime-on-use
cargo build -p rls -Zmtime-on-use
cargo build -p rls-rustc -Zmtime-on-use
[...]

both packages will keep stepping on each other and both will need to be rebuilt. More explicitly, rls_rustc (a dependency of rls) will complain, when being built, that it's already outdated (from the previous rls build):

CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build -p rls-rustc -Zmtime-on-use
[...]
cargo::core::compiler::fingerprint] dependency on `rls_rustc` is newer than we are
cargo::core::compiler::fingerprint] fingerprint error for rls-rustc v0.6.0
cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated

(Not saying there's no problem here, just trying to better understand the issue to try to participate. Any clarifications would be very helpful, thanks.)

@Eh2406
Contributor

Eh2406 commented Jun 14, 2019

Great! Thanks for jumping in!

(but without a clear understanding of the intended purpose of the mtime-on-use flag)

So let me dig up some history. The /target directory grows without bound with normal use of Cargo; this provides the best caching we can get. The only way to clean it up is cargo clean, which basically blows the entire thing away and means a full rebuild on the next use of Cargo. There are many better GC/invalidation strategies than all-or-nothing, but it is not clear which will be useful enough for Cargo to support in perpetuity (#6229). The thought is to let outside subcommands try different strategies, and we will talk about bringing the best one in when we see what works.

It turns out that most better strategies want to know when Cargo last used an artifact. CI often needs to record a timestamp, build all the things, then GC artifacts older than the timestamp. Crater needs, when the drive fills up, to GC the oldest things until there is enough room to continue. On the 99% of systems where accessed time is not available from the file system, Cargo needs to provide that data. This was added in #6477, with lots of good discussion. Unfortunately it did not work well on some systems, specifically it made the playground time out, and it was put behind the mtime-on-use flag in #6573.
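The CI strategy above (record a timestamp, build, then GC anything older) can be sketched as a recursive sweep over a directory. This is an illustrative sketch under the assumption that last use is tracked via mtime, not cargo-sweep's actual implementation:

```rust
use std::fs;
use std::path::Path;
use std::time::SystemTime;

// Delete every file under `dir` whose mtime is older than `cutoff`,
// returning how many files were removed. A real GC would also need to
// understand Cargo's layout of /target rather than sweeping blindly.
fn gc_older_than(dir: &Path, cutoff: SystemTime) -> std::io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            removed += gc_older_than(&path, cutoff)?;
        } else if fs::metadata(&path)?.modified()? < cutoff {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```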

what would be the expected behavior in these cases

The intent of mtime-on-use is that it only changes the mtimes of files where Cargo does not use the mtime of the file. If Cargo reads the mtime of a file that was set by mtime-on-use, then there is a bug. That bug may be, from least profound to most:

  1. mtime-on-use is setting it on some small category of files that it should not be.
  2. Cargo is reading mtimes of some small category of files that it should not be.
  3. There is no way to tell whether Cargo will need to read the mtime of any particular file, so mtime-on-use needs to be redesigned. A redesign may be interesting for other reasons, but this would be a nail in the coffin of this design.

To start investigating I would start by comparing the set of files:

  • That are set by mtime-on-use, here or here
  • That are read by paths::mtime
  • That flow into the log message you found. Thank you for finding that!

The first two sets can be found by adding some log::info! calls. Hopefully the overlap is small and we can figure it out from there. If not, we know, thanks to you, that it is related to rls_rustc, so we can look at the subset related to that.
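Once the logged paths have been collected, the suspicious files are simply the intersection of the touched set and the read set. A sketch, assuming the logs have already been parsed into path sets:

```rust
use std::collections::HashSet;
use std::path::PathBuf;

// Files whose mtime -Zmtime-on-use sets AND whose mtime Cargo also reads:
// any path in this intersection violates the intended invariant.
fn suspicious_overlap(
    touched: &HashSet<PathBuf>,
    read: &HashSet<PathBuf>,
) -> HashSet<PathBuf> {
    touched.intersection(read).cloned().collect()
}
```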

Everything else is unknown unknowns.

@schomatis
Contributor

Thanks for this excellent write-up. I'll make a more thorough investigation but my impression (from some logs I set up when first looking at this) is that we're closer to number 3 (without necessarily being the last nail).

The intent of mtime-on-use is that it only changes the mtimes of files where Cargo does not use the mtime of the file.

I didn't see anything in the code that reflects this intent. It seems to me that check_filesystem indiscriminately sets the mtime of every output. This is fine when building the same package many times, because the .rmeta files (seen both as outputs and dependencies) are processed in the correct order, but when we try to build a package that shares only part of the dependency graph (and not necessarily in the same order), mtime discrepancies are bound to happen. I haven't seen anything in the design that could account for that: Cargo thinks about a single dependency graph at a time, which seems reasonable, and it does not anticipate that the same Units may be part of a different graph for a different target.
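The discrepancy boils down to a simplified version of the mtime comparison that fingerprinting performs. This sketch is illustrative, not Cargo's actual check_filesystem logic:

```rust
use std::time::SystemTime;

// Simplified staleness rule: an output is stale if any of its
// dependencies was modified after it. If -Zmtime-on-use bumps a shared
// .rmeta while building package A, then package B's outputs (built
// earlier) now compare as older than that dependency and look outdated.
fn is_stale(output_mtime: SystemTime, dep_mtimes: &[SystemTime]) -> bool {
    dep_mtimes.iter().any(|dep| *dep > output_mtime)
}
```

Under this rule, two workspace packages that share dependencies but are built by separate invocations will keep marking each other stale, which matches the rls / rls-rustc ping-pong above.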

Anyway, that is just my first impression and I'm probably missing parts of the puzzle, I'll try to dig deeper systematically scraping the log output as you suggest.

@Eh2406
Contributor

Eh2406 commented Jun 14, 2019

No need for a deeper dig, you are completely correct. The touching of intermediate artifacts has not preserved the "only changes the mtimes of files where Cargo does not use the mtime of the file" invariant since #6883 or #6832.

@alexcrichton that was your suggestion and your PRs. How do you think we should fix this?
My 2c is that we just remove the touching of intermediate artifacts. This does mean that all GC strategies will need to be aware of (and updated with) Cargo's layout of /target. But the only one in use so far is cargo-sweep, which already uses the fingerprint file. @ehuss, if an outside tool needs to be aware of a designated side channel (currently the fingerprint file), then your suggestion of having a different side channel seems very tempting.

@alexcrichton
Member

Sorry, I'm not really up to speed on the issues here, what broke, and what it's all used for. That being said, experimental features are experimental for a reason: we're not willing to commit to keeping them working over time. So I'm sorry if those PRs broke the feature, but we also can't just freeze Cargo as-is, tack on experimental features, and then never break anything.

@schomatis
Contributor

@Eh2406 With some guidance I could probably start some work towards your proposed solution (if you think that may be a way forward in the GC effort).

@Eh2406
Contributor

Eh2406 commented Jun 14, 2019

@alexcrichton I am sorry I came across as blaming you. You are absolutely correct, unstable features can and will break. I do not have a problem with your PRs breaking the unstable mtime-on-use flag. The question I have is how to proceed. My feeling is to rip out the conflicting part of the unstable mtime-on-use feature; mtime-on-use was always intended to be a small change that could be removed if it gave us trouble. My only hesitation is that you specifically asked me to add that part (link), so I did not want to ask @schomatis to remove it without asking you.

@schomatis Help with a PR would be appreciated! The first step is making a new test, like the ones added here, asserting that using mtime-on-use does not trigger a rebuild in a workspace. (This test will fail now.)

If we agree that removing the touching of intermediate artifacts is the solution, then you can remove

if mtime_on_use {
    let t = FileTime::from_system_time(SystemTime::now());
    filetime::set_file_times(output, t, t)?;
}

and the mtime_on_use argument to that function (and any other code this makes dead). This should make your new test pass. It should also make the simple_deps_cleaner_does_not_rebuild test fail, so you'll need to remove that test.

@alexcrichton
Member

Oh I'm fine architecting this however works best, so removing something I thought was a good idea because it's no longer a good idea seems fine by me.

bors added a commit that referenced this issue Jun 21, 2019
…ate-artifacts-mtime, r=Eh2406

fix(fingerpring): do not touch intermediate artifacts

Fixes #6972.

The newly introduced [test](9aa7a4d) is [failing](https://travis-ci.com/rust-lang/cargo/jobs/209849725#L2569-L2580) as discussed in #6972 (comment) (replicating the issue). Removing the `touching of intermediate artifacts` as suggested fixes the issue, but makes the old test `simple_deps_cleaner_does_not_rebuild` fail. The `simple_deps_cleaner_does_not_rebuild` test is not needed anymore so it's removed.

r? @Eh2406
@mbilker

mbilker commented Oct 30, 2019

Thank you @schomatis for the CARGO_LOG environment variable to print fingerprinting messages. I discovered that my spurious rebuilds were not because of the mtime-on-use flag, but rather because of my use of remap-path-prefix in my Cargo.toml. After removing that from the custom rustflags, I no longer get spurious rebuilds.

rustflags = ["--remap-path-prefix", "/home/user/.cargo=/"]
[2019-10-30T17:52:38Z INFO  cargo::core::compiler::fingerprint] stale: missing "/registry/src/github.com-1ecc6299db9ec823/indexmap-1.3.0/build.rs"
[2019-10-30T17:52:38Z INFO  cargo::core::compiler::fingerprint] stale: missing "/registry/src/github.com-1ecc6299db9ec823/num-integer-0.1.41/build.rs"
[2019-10-30T17:52:38Z INFO  cargo::core::compiler::fingerprint] stale: missing "/registry/src/github.com-1ecc6299db9ec823/num-traits-0.2.8/build.rs"
