Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustbuild: Use copies instead of hard links #39518

Merged
merged 2 commits into from
Mar 10, 2017

Conversation

alexcrichton
Copy link
Member

The original motivation for hard links was to speed up the various stages of
rustbuild, but in the end this is causing problems on Windows (#39504).

This commit tweaks the build system to use copies instead of hard links
unconditionally to ensure that the files accessed by Windows are always
disjoint.

Locally this added .3s to a noop build, so it shouldn't be too much of a
regression hopefully!

Closes #39504

@alexcrichton
Copy link
Member Author

r? @brson

@rust-highfive rust-highfive assigned brson and unassigned nikomatsakis Feb 4, 2017
@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@nagisa
Copy link
Member

nagisa commented Feb 4, 2017

I always said we should be using CoW copies when they are supported :)

@nikomatsakis
Copy link
Contributor

@alexcrichton I think we copied this "hard-link-or-copy" pattern in incremental compilation (cc @michaelwoerister, still true?) -- should we revert that code? Also, maybe in the test runner?

@michaelwoerister
Copy link
Member

We are using hard-links when possible for incremental compilation and I haven't seen any reports of problems with that lately. The usage pattern in incr. comp. is rather simple. CoW would be ideal, but it's not available on most file systems.

@alexcrichton
Copy link
Member Author

Oh so to be clear I think this is a relatively obscure bug, having to do with the interaction between Cargo, rustbuild, and the compiler. The compiler's only considering the interaction with itself with incremental compilation, so I'd be confident that the strategy for the compiler is still going to work quite well.

I would have solved this problem very differently if it were just a problem within Cargo itself, but seeing how it spanned across multiple systems this was the best solution I could think of.

@brson
Copy link
Contributor

brson commented Feb 9, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Feb 9, 2017

📌 Commit 32e6c9f has been approved by brson

@bors
Copy link
Contributor

bors commented Feb 10, 2017

⌛ Testing commit 32e6c9f with merge 9658b3b...

@bors
Copy link
Contributor

bors commented Feb 10, 2017

💔 Test failed - status-appveyor

@alexcrichton alexcrichton added this to the 1.16.0 milestone Feb 11, 2017
@nagisa
Copy link
Member

nagisa commented Feb 11, 2017

CoW would be ideal, but it's not available on most file systems.

Since copy is happening anyway, it is worthwhile to just try CoW copy first.

@alexcrichton
Copy link
Member Author

Ok a new hypothesis for what's going on here.

  • In one process:
    • Cargo is running rustc --test for librand
    • The compiler has -L path/do/deps
    • The libtest.rlib file is found in the sysroot
    • This crate depends on alloc_system (transitively)
    • The compiler searches for alloc_system
    • The compiler finds two candidates, one in 'deps' one in 'sysroot'
    • They have the same SVH, the compiler pikcs the first, the one in 'deps'
    • The compiler mmaps the rlib to learn of the metadata
  • In another process
    • Cargo is running normally, learns that alloc_system is fresh
    • Upon learning alloc_system is fresh Cargo runs the freshness logic
    • This logic will re-link an rlib with hashes to a location of an rlib without hashes
    • This involes deleting a previous hard link if it's there

This means that we're deleting a hard link where the compiler in another process has an open file handle to a different path pointing to the hard link. Windows then rejects this.

Some questions I've run into:

So... what now? I'm not sure.

@alexcrichton
Copy link
Member Author

Oh right so I should also mention that I can't think of a way to trigger this in a "normal" cargo project. The problem is the circular dependency sorta back through the sysroot. Typically the compiler never finds sysroot crates in the deps/ dir so the only mmap'd files are ones that Cargo's already finished with or ones in the sysroot.

So it's not a heinous bug for everyone, just for rustbuild ...

@alexcrichton
Copy link
Member Author

Oh and of course I can't reproduce this locally, as with all other "flaky" failures nowadays.

@alexcrichton
Copy link
Member Author

Ok I'm not even sure if that's true. I can't reproduce an error by deleting a file when the compiler has it open. I have no idea what's going on.

@brson
Copy link
Contributor

brson commented Mar 4, 2017

We could try verifying rust-lang/cargo#3478 is the one that causes it. That bors commit is 4b351eadce9330fa2d17219b48bbcdc8cef255e7 and the preceding bors commit, that should pass, is d12cc03cf3b144bc7f13707f40d29edb79dc259d. Here's a patch for the latter, though if we tried to test it against bors and it passed, then we would merge the commit we don't want.

If that's the offending commit then we could revert it in cargo.

Can we adjust the flags so that we're not asking rustc to arbitrarily search the deps folder, by being more specific about --extern and not passing -L?

Since this is believed to be a concurrency bug, can we verify that by running this build serially?

Can we trigger this windows build outside of a PR so we don't accidentally land something bogus? Perhaps add an intentional fail late in the build and disable all the other configurations for a while. It seems like we're going to have to throw experiments at the bots.

@brson
Copy link
Contributor

brson commented Mar 4, 2017

I pushed a patch to my branch that I think tells the linux builders to fail immediately, the idea being to run experiments without wasting CPU or accidentally landing.

@brson
Copy link
Contributor

brson commented Mar 4, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Mar 4, 2017

📌 Commit 4b9ecba has been approved by brson

@brson
Copy link
Contributor

brson commented Mar 4, 2017

@bors r+ p=1

@bors
Copy link
Contributor

bors commented Mar 4, 2017

💡 This pull request was already approved, no need to approve it again.

@bors
Copy link
Contributor

bors commented Mar 4, 2017

📌 Commit 4b9ecba has been approved by brson

@brson
Copy link
Contributor

brson commented Mar 4, 2017

@alexcrichton I went ahead and tried a build using the commit you think is immediately before the one that fails, just for the sake of doing something. Feel free to kill it if you have a better use of cpu.

@alexcrichton
Copy link
Member Author

I've opened a homu issue for that failure.

@bors
Copy link
Contributor

bors commented Mar 10, 2017

⌛ Testing commit 6f43149 with merge 2485800...

@bors
Copy link
Contributor

bors commented Mar 10, 2017

💔 Test failed - status-travis

@arielb1
Copy link
Contributor

arielb1 commented Mar 10, 2017

ccache error:

caused by: error reading compile response from server

caused by: IoError(Error { repr: Custom(Custom { kind: Other, error: StringError("unexpected EOF") }) })

@bors retry

@bors
Copy link
Contributor

bors commented Mar 10, 2017

⌛ Testing commit 6f43149 with merge d538fcf...

@bors
Copy link
Contributor

bors commented Mar 10, 2017

💔 Test failed - status-travis

@arielb1
Copy link
Contributor

arielb1 commented Mar 10, 2017

@bors
Copy link
Contributor

bors commented Mar 10, 2017

⌛ Testing commit 6f43149 with merge f573db4...

bors added a commit that referenced this pull request Mar 10, 2017
rustbuild: Use copies instead of hard links

The original motivation for hard links was to speed up the various stages of
rustbuild, but in the end this is causing problems on Windows (#39504).

This commit tweaks the build system to use copies instead of hard links
unconditionally to ensure that the files accessed by Windows are always
disjoint.

Locally this added .3s to a noop build, so it shouldn't be too much of a
regression hopefully!

Closes #39504
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Mar 10, 2017
rustbuild: Use copies instead of hard links

The original motivation for hard links was to speed up the various stages of
rustbuild, but in the end this is causing problems on Windows (rust-lang#39504).

This commit tweaks the build system to use copies instead of hard links
unconditionally to ensure that the files accessed by Windows are always
disjoint.

Locally this added .3s to a noop build, so it shouldn't be too much of a
regression hopefully!

Closes rust-lang#39504
@bors
Copy link
Contributor

bors commented Mar 10, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing f573db4 to master...

@bors bors merged commit 6f43149 into rust-lang:master Mar 10, 2017
@alexcrichton alexcrichton deleted the update-cargo branch March 10, 2017 19:08
alexcrichton added a commit to alexcrichton/cargo that referenced this pull request Aug 10, 2017
This is targeted at removing the need for a workaround in rust-lang/rust#39518,
allowing the main rust build system to move back to hard links which should be
much more efficient.
bors added a commit to rust-lang/cargo that referenced this pull request Aug 10, 2017
Use `same-file` to avoid unnecessary hard links

This is targeted at removing the need for a workaround in rust-lang/rust#39518,
allowing the main rust build system to move back to hard links which should be
much more efficient.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Sep 10, 2017
The `copy` function historically in rustbuild used hard links to speed up the
copy operations that it does. This logic was backed out, however, in rust-lang#39518 due
to a bug that only showed up on Windows, described in rust-lang#39504. The cause
described in rust-lang#39504 happened because Cargo, on a fresh build, would overwrite
the previous artifacts with new hard links that Cargo itself manages.

This behavior in Cargo was fixed in rust-lang/cargo#4390 where it no longer
should overwrite files on fresh builds, opportunistically leaving the filesystem
intact and not touching it.

Hopefully this can help speed up local builds by doing fewer copies all over the
place!
bors added a commit that referenced this pull request Sep 10, 2017
rustbuild: Switch back to using hard links

The `copy` function historically in rustbuild used hard links to speed up the
copy operations that it does. This logic was backed out, however, in #39518 due
to a bug that only showed up on Windows, described in #39504. The cause
described in #39504 happened because Cargo, on a fresh build, would overwrite
the previous artifacts with new hard links that Cargo itself manages.

This behavior in Cargo was fixed in rust-lang/cargo#4390 where it no longer
should overwrite files on fresh builds, opportunistically leaving the filesystem
intact and not touching it.

Hopefully this can help speed up local builds by doing fewer copies all over the
place!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants