-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate jobserver support to parallel codegen #42682
Conversation
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override) |
c765aad
to
4e8e13a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! I'm excited about this :)
Regarding the implementation, I'm not quite clear on how token handling works there. Wouldn't it be easier to just move one token into each spawn_work
and let it go out of scope there?
@@ -82,16 +84,11 @@ pub fn run(sess: &session::Session, | |||
// For each of our upstream dependencies, find the corresponding rlib and | |||
// load the bitcode from the archive. Then merge it into the current LLVM | |||
// module that we've got. | |||
link::each_linked_rlib(sess, &mut |cnum, path| { | |||
// `#![no_builtins]` crates don't participate in LTO. | |||
if sess.cstore.is_no_builtins(cnum) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you remove this on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it's added in later again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this query just ended up having a lot of dependencies on sess
so I figured it'd be best to move it way up to the beginning instead of only running it back here.
src/librustc_trans/back/write.rs
Outdated
execute_work_item(&cgcx, work); | ||
let mut tokens = Vec::new(); | ||
let mut running = 0; | ||
while work_items.len() > 0 || running > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment here saying something to the effect of "This is our 'main loop', taking care of spawning worker threads and communicating with live ones via message passing -- so we have to keep it running as long as there's still work that hasn't been doled out to a worker (work_items > 0) or if there are still live workers to be communicated with (running > 0)."
src/librustc_trans/back/write.rs
Outdated
scope, | ||
tx.clone(), | ||
work_items.pop().unwrap(), | ||
work_items.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very fond this: mutating work_items
via pop
and then taking its len
. I assume that we have a defined evaluation order of function arguments, but I don't like relying on it.
src/librustc_trans/back/write.rs
Outdated
// possible. Remember that we have an ambient token available to us | ||
// hence the `+1` here. | ||
// | ||
// Also note that we may actually acquire more tokens than we need, so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does that happen? If we abort early because of an error?
src/librustc_trans/back/write.rs
Outdated
// | ||
// Also note that we may actually acquire more tokens than we need, so | ||
// in that case just truncate the `tokens` list every time we pass | ||
// through here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add that truncating implies dropping and thus releasing tokens?
src/librustc_trans/back/write.rs
Outdated
work_items.len()); | ||
running += 1; | ||
} | ||
tokens.truncate(running.saturating_sub(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure how this works. Can't this cause tokens to be lost without a spawn_work
having been called for them?
|
||
// Set up a destructor which will fire off a message that we're done as | ||
// we exit. | ||
struct Bomb { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have something like this in libstd.
src/librustc_trans/back/write.rs
Outdated
if sess.cstore.is_no_builtins(cnum) { | ||
return | ||
} | ||
each_linked_rlib.push((cnum, path.to_path_buf())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the each_linked_rlib
field is LTO-specific, we should probably change to the name to reflect this.
src/librustc_trans/back/write.rs
Outdated
// Execute the work itself, and if it finishes successfully then flag | ||
// ourselves as a success as well. | ||
if execute_work_item(&cgcx, work).is_err() { | ||
drop(cgcx.tx.send(Message::AbortIfErrors)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One could argue that it would be cleaner to also mem::forget
the bomb
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I wasn't quite sure how this should be handled, I think that if you see a FatalError
then a diagnostic has already been sent off, which in turn already sent AbortIfErrors
. In that sense it may be fruitless to send another message here, so I'll just ignore the result.
src/librustc_trans/scope.rs
Outdated
// option. This file may not be copied, modified, or distributed | ||
// except according to those terms. | ||
|
||
//! Scoped threads, copied from `crossbeam` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also use crossbeam
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm upon further inspection, I don't see why not!
Ok, updated! @michaelwoerister I added a large comment above the "main loop" which I believe should answer your questions about the token management, but if you'd like me to clarify anything please just let me know! |
☔ The latest upstream changes (presumably #42676) made this pull request unmergeable. Please resolve the merge conflicts. |
// manner we can ensure that the maximum number of parallel workers is | ||
// capped at any one point in time. | ||
// | ||
// The jobserver protocol is a little unique, however. We, as a running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because concurrent programming isn't complicated enough by itself already 😛
Thanks for the clarifying comment about the jobserver protocol! r=me once the merge conflict is fixed. |
@bors: r=michaelwoerister |
📌 Commit 5d00e5e has been approved by |
⌛ Testing commit 5d00e5ea2892d376aa18f8db2e4db435e097a81a with merge 6cb3b992db1aa5c9952c187dca9b30ec0c3d98d4... |
💔 Test failed - status-appveyor |
@bors: r=michaelwoerister |
📌 Commit a014634 has been approved by |
⌛ Testing commit a014634c1a0ee939207dcd1e8d64dfcd0ebec586 with merge 016496955016d5d75d27180c10a158aad0083c8d... |
💔 Test failed - status-appveyor |
@bors: r=michaelwoerister |
📌 Commit 451d392 has been approved by |
This commit integrates the `jobserver` crate into the compiler. The crate was previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose here is to two-fold: * Primarily the compiler can cooperate with Cargo on parallelism. When you run `cargo build -j4` then this'll make sure that the entire build process between Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc instances which may all try to spawn lots of threads. * Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver. This means that if you call cargo/rustc from `make` or another jobserver-compatible implementation it'll use foreign parallelism settings instead of creating new ones locally. As the number of parallel codegen instances in the compiler continues to grow over time with the advent of incremental compilation it's expected that this'll become more of a problem, so this is intended to nip concurrent concerns in the bud by having all the tools to cooperate! Note that while rustc has support for itself creating a jobserver it's far more likely that rustc will always use the jobserver configured by Cargo. Cargo today will now set a jobserver unconditionally for rustc to use.
@bors: r=michaelwoerister |
📌 Commit 201f069 has been approved by |
Integrate jobserver support to parallel codegen This commit integrates the `jobserver` crate into the compiler. The crate was previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose here is to two-fold: * Primarily the compiler can cooperate with Cargo on parallelism. When you run `cargo build -j4` then this'll make sure that the entire build process between Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc instances which may all try to spawn lots of threads. * Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver. This means that if you call cargo/rustc from `make` or another jobserver-compatible implementation it'll use foreign parallelism settings instead of creating new ones locally. As the number of parallel codegen instances in the compiler continues to grow over time with the advent of incremental compilation it's expected that this'll become more of a problem, so this is intended to nip concurrent concerns in the bud by having all the tools to cooperate! Note that while rustc has support for itself creating a jobserver it's far more likely that rustc will always use the jobserver configured by Cargo. Cargo today will now set a jobserver unconditionally for rustc to use.
💔 Test failed - status-travis |
@bors: retry
* osx timed out
…On Wed, Jun 21, 2017 at 4:24 PM, bors ***@***.***> wrote:
💔 Test failed - status-travis
<https://travis-ci.org/rust-lang/rust/builds/245482773?utm_source=github_status&utm_medium=notification>
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#42682 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAD95FEbgYHMXulmMHTszBVJR6hpkE9Lks5sGYoOgaJpZM4N7l7L>
.
|
Integrate jobserver support to parallel codegen This commit integrates the `jobserver` crate into the compiler. The crate was previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose here is to two-fold: * Primarily the compiler can cooperate with Cargo on parallelism. When you run `cargo build -j4` then this'll make sure that the entire build process between Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc instances which may all try to spawn lots of threads. * Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver. This means that if you call cargo/rustc from `make` or another jobserver-compatible implementation it'll use foreign parallelism settings instead of creating new ones locally. As the number of parallel codegen instances in the compiler continues to grow over time with the advent of incremental compilation it's expected that this'll become more of a problem, so this is intended to nip concurrent concerns in the bud by having all the tools to cooperate! Note that while rustc has support for itself creating a jobserver it's far more likely that rustc will always use the jobserver configured by Cargo. Cargo today will now set a jobserver unconditionally for rustc to use.
☀️ Test successful - status-appveyor, status-travis |
So does this only support invoking |
@jdm it's a little more nuanced than that. Cargo also creates a jobserver in addition to consuming one, meaning that rustc will basically always use that jobserver now. If Cargo inherits a jobserver though then rustc likely will too. You need to tweak makefiles calling rustc/cargo though to actually let them inherit the jobserver, notably adding a For build scripts invoking make the |
This should significantly speed up debug and test builds + cargo check. With rust-lang/rust#42682, cargo/rustc no longer spawns lots and lots of workers even when called recursively. Still not enabled by default in release mode: https://internals.rust-lang.org/t/help-test-out-thinlto/6017
…atsakis Remove some `ignore-stage1` annotations. These tests appear to no longer need the `ignore-stage1` marker. - `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility. - `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed. - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
…atsakis Remove some `ignore-stage1` annotations. These tests appear to no longer need the `ignore-stage1` marker. - `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility. - `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed. - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
…atsakis Remove some `ignore-stage1` annotations. These tests appear to no longer need the `ignore-stage1` marker. - `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility. - `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed. - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
…atsakis Remove some `ignore-stage1` annotations. These tests appear to no longer need the `ignore-stage1` marker. - `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility. - `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed. - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
This commit integrates the
jobserver
crate into the compiler. The crate waspreviously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:
Primarily the compiler can cooperate with Cargo on parallelism. When you run
cargo build -j4
then this'll make sure that the entire build process betweenCargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
instances which may all try to spawn lots of threads.
Secondarily rustc/Cargo can now integrate with a foreign GNU
make
jobserver.This means that if you call cargo/rustc from
make
or anotherjobserver-compatible implementation it'll use foreign parallelism settings
instead of creating new ones locally.
As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!
Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.