-
Notifications
You must be signed in to change notification settings - Fork 552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experiments with distributed compilation #249
Conversation
8b0bb57
to
963947c
Compare
22355ca
to
ef230a7
Compare
605f57f
to
997c6e8
Compare
Cargo.toml
Outdated
@@ -45,7 +46,10 @@ rust-crypto = { version = "0.2.36", optional = true } | |||
serde = "1.0" | |||
serde_derive = "1.0" | |||
serde_json = "1.0" | |||
# TODO: publish to crates.io | |||
sccache-dist = { path = "sccache-dist" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how you'd prefer to deal with this - I can publish and update this PR? Or you can do it next time you release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason you're adding this code as a separate crate? It's not really usable standalone, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention is to split things out into two separate binaries - the sccache-dist binary has a number of things (e.g. rouille) that sccache itself doesn't need to have compiled in.
It also makes the compile a bit more incremental. Are you thinking I could just have it be a separate binary within the same crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a separate binary within the same crate is what I had envisioned. I get the compile argument (although incremental compilation should help with that nowadays), but managing crates.io releases for two intertwined crates like that is a PITA. (I've dealt with this with another project of mine.)
src/compiler/gcc.rs
Outdated
// With -fprofile-generate line number information is important, so don't use -P. | ||
if !parsed_args.profile_generate { | ||
cmd.arg("-P"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With distributed compilation, line number information is always important. More generally, it's important if you want to use a preprocessed file as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. This will regress the change in #208. Would it be possible to change things so that when distributed compilation is not enabled we can continue to use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is disappointing, I should've done a bit more digging.
I'm not sure if it's better to say "distributed compilation may happen, your cache is completely invalid (because it's based on non-line-number-annotated source files)", or just preprocess a second time on-demand when sending for remote compilation. Looking at a reasonably large file of C++ code I have easily to hand, it's spending 10-20% of wall-clock time in preprocessing so running twice isn't ideal.
What about: if we might be distributing, do the full preprocess with line numbers but strip out the line numbers with a regexp when considering the cache key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds fine. Since users have to explicitly opt-in to using distributed compilation, I don't think it's unreasonable for them to get this behavior in that case. Running the preprocessor twice seems like too much overhead just for this purpose. We can make a note somewhere if necessary that cache hits using different source paths are not possible with distributed compilation enabled.
@@ -286,20 +309,110 @@ pub trait CompilerHasher<T>: fmt::Debug + Send + 'static | |||
fn box_clone(&self) -> Box<CompilerHasher<T>>; | |||
} | |||
|
|||
#[cfg(not(feature = "dist"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature is disabled by default, therefore so is distributed compilation
@luser this is ready for review - it adds the basic structure of two binaries for distributed compilation, the way they communicate and the initial support for C, C++ and Linux. Although this unconditionally shuffles some things around (e.g. the generation of compilation command rather than executing them), distributed compilation is hidden behind a feature flag. Followups coming soon. |
sccache-dist/src/cache.rs
Outdated
|
||
use errors::*; | ||
|
||
// TODO: possibly shouldn't be public |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with TODO comments, but I'd prefer if they either get removed or if you file follow-up issues and put the issue URL in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of the TODOs will be gone by the final PR.
sccache-dist/src/cache.rs
Outdated
Ok(hex(m.finish().as_ref())) | ||
} | ||
|
||
fn hex(bytes: &[u8]) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like something we could pull in from a crate, like the hex
crate or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just stole it from src/util.rs
, I can pull in the crate if you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hah! If you move this code into the existing crate then you can just reuse that code, I guess. I'm certainly not going to hold you to a higher standard than the existing code. :)
sccache-dist/src/cache.rs
Outdated
.truncate(true) | ||
.open("/tmp/toolchain_cache.tar")?; | ||
create(file)?; | ||
// TODO: after, if still exists, remove it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like something that should definitely be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refined caching and scheduling are a slightly later PR.
sccache-dist/src/cache.rs
Outdated
return Ok(strong_key) | ||
} | ||
debug!("Weak key {} appears to be new", weak_key); | ||
// TODO: don't use this as a poor exclusive lock on this global file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be an exclusive lock? You've already got a mutex around the LRUDiskCache
here. If not, using a tempfile would be much nicer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a tempfile (for now)
sccache-dist/src/cache.rs
Outdated
pub struct ClientToolchainCache { | ||
cache_dir: PathBuf, | ||
cache: Mutex<TcCache>, | ||
// Local machine mapping from 'weak' hashes to strong toolchain hashes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some exposition on what weak and strong hashes are would be helpful. I gather that the strong hash is a SHA512 of the toolchain tar file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the explanation that 'weak' is just what sccache uses as part of the cache key.
81785e5
to
191a45b
Compare
7f36f9c
to
0f799cb
Compare
Some(_) => { | ||
warn!("Scheduler address configured but dist feature disabled, disabling distributed sccache"); | ||
Arc::new(dist::NoopClient) | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit has added in light of the new preprocessing rules, to use -P
in more cases when it's possible.
0f799cb
to
8fb37aa
Compare
// With -fprofile-generate line number information is important, so don't use -P. | ||
if !parsed_args.profile_generate { | ||
if !may_dist && !parsed_args.profile_generate { | ||
cmd.arg("-P"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this back in and explained why it's important for a distributed compile
Are you ok to wait for the rest of my PRs for the TODOs to be resolved? |
Yes. The separate crate part was the only thing I was really hung up on. I'll make sure I've given the rest of the code here a once-over today and we should be good to merge this. |
Cargo.toml
Outdated
[[bin]] | ||
name = "sccache-dist" | ||
required-features = ["dist-server"] | ||
path = "sccache-dist/main.rs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unconventional to have source in a path that's not under src/
, isn't it?
@@ -202,6 +202,23 @@ impl LruDiskCache { | |||
}) | |||
} | |||
|
|||
/// Add a file by calling `with` with the open `File` corresponding to the cache at path `key`. | |||
pub fn insert_with<K: AsRef<OsStr>, F: FnOnce(File) -> io::Result<()>>(&mut self, key: K, with: F) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure on the precise differences of this vs. insert_by
. Can this just call insert_by
, passing a closure that opens the file and calls with(f)
?
Added two new commits, one to put sccache-dist in a conventional location (and remove explicit paths to the main.rs files), one to simplify insert_with (this had some minor knock-on effects in insert_by as we don't have the size immediately available). I also took the opportunity to clean up a couple of things in add_file, as well as add a |
An initial version that works and can actually be used to at least try it out.