-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --build-plan for 'cargo build' #5301
Conversation
cc @aidanhs |
Awesome @mshal! I really like how this looks now! I think we need to be really careful not to expose internal implementation details of Cargo, and this approach achieves this almost perfectly. I especially like how
Regarding the implementation itself, it looks great! We use slightly different code style in Cargo though, so you might want to It would also be great to have some tests for this as well! I think we should create a cargo/tests/testsuite/build.rs Line 4024 in b0d65af
[..] and {..} are used as wildcards to match arbitrary strings/objects to get around platform specific variations.
|
add a rustfmt detail to CONTRIBUTING.md stealing from the info in #5301 (comment)
☔ The latest upstream changes (presumably #5348) made this pull request unmergeable. Please resolve the merge conflicts. |
Sorry, I'm afraid my refactoring in #5348 has bitrotted this quite a bit. :( What would be a nice side goal for this (though I hope that doesn't complicate it too much) is if cargo itself is refactored such that it goes through (a) generating the build plan, and then (b) executing from the build plan. That would make it more robust in the sense that we know the cargo build itself is not inadvertently relying on anything but the build plan. That's not in this PR, as far as I can see, right? |
I was trying to make the buildkey to be something human-readable, so for example you would be able to see that a "app_units v0.6.0 Target(lib) Profile(build) Target" depends on "num-traits v0.1.43 Target(lib) Profile(build) Target" and "serde v1.0.37 Target(lib) Profile(build) Target" by looking at the generated JSON output. This also gives the external build system some indication of what it is building, so it can use the key when displaying what it's building. Eg: "Now building app_units v0.6.0 Target(lib) Profile(build) Target". Note this doesn't need to be consistent among cargo versions, as long as the buildkey is unique for different modules (including host vs. target vs. build-script iterations of a particular cargo module) during a single 'cargo --build-plan' invocation. I'm happy to use another naming scheme though if there's a better way to get this information.
Sorry I don't quite follow this. What target are you referring to? Is this the --target flag in the json output? Or the self.target in the buildkey definition? I don't think it matters if the output location changes, since the "outputs" field of the build plan explicitly lists where the file will be created. So if cargo decides things need to be in a different directory, a new 'cargo --build-plan' invocation will list a different output for a module, where the new file can be created.
I'm pretty sure this is necessary, otherwise the external build system won't know where to create the symlink for the "installed" version of the file.
Ahh, thanks for the heads up! I've used this on the files I've changed, though I had to use '--toolchain stable' instead of '+stable'.
Good idea, I'll add some tests to get at least some basic coverage. |
I like that idea, but unfortunately that's a bit beyond my level of expertise at the moment. You are correct that it is not in this PR. Maybe we can file a followup issue? I'm not really sure what would be involved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to use another naming scheme though if there's a better way to get this information.
Would it be possible perhaps to do rules name foo-h145135135
? The stringified version may not be entirely lossless in terms of capturing all the nuances that Cargo has, but hashing everything will for sure catch everything. I'm not sure if that's too unreadable of a target though? We could certainly work to make the prefix string pretty human readable.
What target are you referring to? Is this the --target flag in the json output? Or the self.target in the buildkey definition?
Ah I think @matklad means things like CARGO_TARGET_DIR
and such variables to tweak the output. I'd imagine, though, that the current implementation is fine. I see build plans as a sort of "build me a plan for this exact one invocation" rather than "build me a plan which is configurable at build-time in the same way Cargo is". For the latter case I think you'd want to generate multiple plans or just use Cargo.
I'm pretty sure this is necessary, otherwise the external build system won't know where to create the symlink for the "installed" version of the file.
Yeah this is something we'll definitely want, otherwise Cargo I believe will only generate files with unreadable hashes!
I think we'll probably eventually need some more treatment to handle build scripts but I'm fine to defer that to later!
program: String, | ||
args: Vec<String>, | ||
env: BTreeMap<String, String>, | ||
cwd: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think PathBuf
implements Serialize
, so perhaps that could be used here instead of String
to avoid some unwraps and to_str
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like PathBuf works fine for cwd- thanks!
Hm, interesting! I think it makes sense to also make it possible to connect build plan with other pieces of JSON information:
I have a slight suspicion that we are not on the same page here, so let me ask a clarifying question: Am I correct that The external build system would probably want to put final artifacts elsewhere, so the |
@matklad hm a good question. So I know a lot of artifacts are |
Hm, so I've made an experiment by disabling all linking, and looks like it is required at least for build scripts:
|
Heh interesting! We could probably fix Cargo to not actually require |
In order to get the test cases working, I had to change the key to be something other than the buildkey, so I've tried to implement @matklad's HashMap suggestion. (The with_json() function doesn't seem to let me wildcard keys with things like "foo [..]", only values). I couldn't get it to work with Unit since we don't have a Unit available during the drain_the_queue() function where I call plan.update(), so I've left it as a buildkey String for now. If there's a good way to fix get_id() to use a Unit everywhere, I could use some suggestions. The previous key is available as "display_name" in the Module structure so it is more clear what it is used for.
Yes, exactly. If the cargo configuration changes (via Cargo.toml changes, environment variable changes, etc), we'd run cargo --build-plan to generate a new plan.
Perhaps? It does look similar, though I don't have a full grasp of the overall architecture to know if they would easily be combined. |
The new push includes:
|
I forgot to ask:
"0": { ... } And inside the module, the deps: Vec show up like this:
It would be helpful if they were either both quoted, or both not quoted, if possible. |
In Cargo, we guarantee that JSON does not have internal linebreaks. We need it because we use JSON object per line format
JSON requires keys to be always quoted. If you use numeric keys, perhaps you can use an array instead of a |
|
||
pub fn add_output(&mut self, path: &PathBuf, link: &Option<PathBuf>) { | ||
self.outputs.push(path.clone()); | ||
if link.is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be if let Some(ref link) = *link
to avoid unwrap
down the line.
.expect("unicode program string required") | ||
.to_string() | ||
.clone(); | ||
self.cwd = cmd.get_cwd().unwrap().to_path_buf(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely sure we set cwd
in call cases. Let's change cwd: PathBuf,
to cwd: Option<PathBuf>,
?
var.clone(), | ||
value | ||
.as_ref() | ||
.expect("environment value required") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this expect
might panic sometimes. We use None
to signify that the environment variable should be cleared, and by default we inherit env vars.
#[derive(Debug, Serialize)] | ||
pub struct BuildPlan { | ||
module_map: BTreeMap<String, usize>, | ||
modules: BTreeMap<usize, Module>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make this modules: Vec<Module>
, and then, in get_id
, we could push a Module::default
to this Vec
. This will require derive(Default)
as well.
"{} {} {} {:?}", | ||
self.pkg, self.target, self.profile, self.kind | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's switch to @alexcrichton suggestions with hashes? It could look like this I think:
impl<'a> Unit<'a> {
pub fn buildkey(&self) -> String {
#![allow(deprecated)] // for SipHash
use std::hash::{Hash, Hasher, SipHasher};
let mut hasher = SipHasher::new_with_keys(0, 0);
self.hash(&mut hasher);
format!("{}-{}", self.pkg.name(), hasher.finish())
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with @matklad on this, getting all the info into the key through the hash I think would be best here to ensure we don't forget to include anything
|
||
#[derive(Debug, Serialize)] | ||
struct Module { | ||
display_name: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of display_name
, I suggest including
pub package_id: PackageId,
pub target: Target,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this conflicts with the derive(Default) suggestion above, since PackageId and Target both don't have defaults. Or is there an easy way to set those somehow?
Separately, I don't think just PackageId and Target are sufficient for differentiating between a build script and its corresponding run action, right? Don't we need the Profile for that as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true they won't work with Default
, yeah, but could Default
be removed? Using PackageId
, Target
, and Profile
more closely match what Cargo is internally doing and using to represent data structures which is probably the best way to go for now to ensure it's all lossless
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not sure - the Default attribute came from the feedback above where we push a default Module in get_id(). Is there a better way to re-work that to get PackageId and such in there?
☔ The latest upstream changes (presumably #5384) made this pull request unmergeable. Please resolve the merge conflicts. |
I've addressed most of the review feedback. The remaining issues involve the display_name hashing and storing elements of Unit in the Module. I'm a bit stumped on how to resolve it with Default, and also how to get a Unit into the update() method so that we can call get_id() and update a Module's outputs. I also updated the BuildPlan to include a list of inputs, so an external build system will know which files cargo accessed so it can determine if the plan is out of date with respect to the input Cargo.toml files. This seemed to be fairly straightforward, but this section should also be reviewed. After rebasing to the latest master, I noticed that cargo can no longer build mozilla-central. It fails with:
I bisected it down to commit 4966f53. Ted poked around with it - it looks like it now finds the last dependency instead of the first, and serde lists serde_derive twice (once in dependencies and once in dev-dependencies), so that commit changes which one it tries to use. |
Is there a better way to handle the tests in a cross-platform way than what I have? Right now they are listed as [cfg(unix)] due to the paths in the JSON output and '/' not matching '' on Windows. I also had to wildcard out some of the outputs because of the additional dSYM file in OSX, but I'd prefer if I could list the expected outputs for the platform somewhere and feed that into the expected JSON. |
☔ The latest upstream changes (presumably #5432) made this pull request unmergeable. Please resolve the merge conflicts. |
tests/testsuite/build_plan.rs
Outdated
.build(); | ||
|
||
assert_that( | ||
p.cargo("build").arg("--build-plan"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah one thing I've forgotten when reviewing is that we'll want the --build-plan
argument to initially be unstable. This can be done by either renaming it to -Z build-plan
or leaving it as this argument but also requiring -Z unstable-options
to be passed to enable the argument
use serde_json; | ||
|
||
#[derive(Debug, Default, Serialize)] | ||
struct Module { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a better name for this perhaps be something like Compilation
or Invocation
? A "module" in Rust is pretty different from what this is I believe
📌 Commit 4016bac has been approved by |
@bors: r=matklad |
📌 Commit a071c3a has been approved by |
Add --build-plan for 'cargo build' With 'cargo build --build-plan', cargo does not actually run any commands, but instead prints out what it would have done in the form of a JSON data structure. Fixes #3815
☀️ Test successful - status-appveyor, status-travis |
Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - (rust-lang#50640) shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md
Update Cargo Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - (rust-lang#50640) shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md
Unblocking PRs: - rust-lang/cargo#5535 - Ignore <tab> in libtest output. (unblocks rust-lang#50387) - rust-lang/cargo#5537 - Remove -Zno-trans test. (unblocks rust-lang#50615) - rust-lang/cargo#5540 - Fix tests when CARGO_TARGET_DIR is set. (unblocks self) Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - (rust-lang#50640) shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5527 - Point Source Replacement to the Overriding Dependencies section - rust-lang/cargo#5533 - Detail how to run locally-built nightly cargo - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md
Update Cargo Unblocking PRs: - rust-lang/cargo#5535 - Ignore <tab> in libtest output. (unblocks rust-lang#50387) - rust-lang/cargo#5537 - Remove -Zno-trans test. (unblocks rust-lang#50615) - rust-lang/cargo#5540 - Fix tests when CARGO_TARGET_DIR is set. (unblocks self) Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - (rust-lang#50640) shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5527 - Point Source Replacement to the Overriding Dependencies section - rust-lang/cargo#5533 - Detail how to run locally-built nightly cargo - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md
Unblocking PRs: - rust-lang/cargo#5535 - Ignore <tab> in libtest output. (unblocks rust-lang#50387) - rust-lang/cargo#5537 - Remove -Zno-trans test. (unblocks rust-lang#50615) - rust-lang/cargo#5540 - Fix tests when CARGO_TARGET_DIR is set. (unblocks self) Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - (rust-lang#50640) shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5527 - Point Source Replacement to the Overriding Dependencies section - rust-lang/cargo#5533 - Detail how to run locally-built nightly cargo - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md
Update Cargo Unblocking PRs: - rust-lang/cargo#5535 - Ignore tab in libtest output. (unblocks #50387) - rust-lang/cargo#5537 - Remove -Zno-trans test. (unblocks #50615) - rust-lang/cargo#5540 - Fix tests when CARGO_TARGET_DIR is set. (unblocks self) Regression fixes: - rust-lang/cargo#5503 - cargo rustc broken for tests in project with bins - rust-lang/cargo#5520 - shared proc-macro dependency built incorrectly Changes: - rust-lang/cargo#5527 - Point Source Replacement to the Overriding Dependencies section - rust-lang/cargo#5533 - Detail how to run locally-built nightly cargo - rust-lang/cargo#5522 - Add option to set user-agent - rust-lang/cargo#5519 - NFC: fix a couple of typos, found by codespell. - rust-lang/cargo#5513 - Fix `panic` for binaries built during tests. - rust-lang/cargo#5512 - simplify build_requirements - rust-lang/cargo#5301 - Add --build-plan for 'cargo build' - rust-lang/cargo#5460 - Be more conservative about which files are linked to the output dir. - rust-lang/cargo#5509 - Use the new stable - rust-lang/cargo#5507 - Does not print seconds fraction with minutes - rust-lang/cargo#5498 - Bump to 0.29.0 - rust-lang/cargo#5497 - Mention +nightly in ARCHITECTURE.md The PR fixes #50640.
Does this need a tracking issue? |
Sure yeah! @jonas-schievink want to write one up? |
Sure, opened #5579 to track this |
Awesome, thanks! |
With 'cargo build --build-plan', cargo does not actually run any
commands, but instead prints out what it would have done in the form of
a JSON data structure.
Fixes #3815