Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cargo:rerun-if-env-changed=FOO #4125

Merged
merged 1 commit into from
Jun 15, 2017

Conversation

alexcrichton
Copy link
Member

This commit implements a new method of rerunning a build script if an
environment variable changes. Environment variables are one of the primary
methods of giving inputs to a build script today, and this'll help situations
where if you change an env var you don't have to remember to clean out an old
build directory to ensure fresh results.

Closes #2776

@rust-highfive
Copy link

r? @brson

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

r? @matklad

}

struct MtimeSlot(Mutex<Option<FileTime>>);
struct VarSlot(Mutex<Option<String>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, why do we need need to compute env-based fingerprints lazily? That is, why do we need Mutex here?

For MtimeSlot is needed because filesystem state will change after cargo is run, so we need to fetch mtime late.

But env::var(var) does not change, so we can eagerly calculate it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh now I'm not even sure how I ended up having it implemented this way... In any case I believe you're 100% correct!

local.push(LocalFingerprint::MtimeBased(mtime, output.clone()));
}

for var in deps.rerun_if_env_changed.iter() {
Copy link
Member

@matklad matklad Jun 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's sort env vars here to avoid dependency on the order of environment variables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I don't believe this should be necessary, the rerun_if_env_changed has a deterministic ordering of whatever order the build script printed out, so the order here shouldn't change over time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But there's a slim chance that buildscript itself prints them non-deterministically (like iterating via HashMap). But yeah, that probably doesn't matter much.

@matklad
Copy link
Member

matklad commented Jun 7, 2017

This is so much better than than just manually cargo cleaning! However, there's still a rather nasty failure mode: if one reads an env variable in build.rs or some child build process, but does not report it via env_var_changed, then one gets stale artifacts and non deterministic builds. And we can't detect this situation and issue a warning either

Would it be feasible to take a more proactive stance towards env vars? Could we perhaps require to white-list all environment variables and execute build.rs in a clean env?

@alexcrichton
Copy link
Member Author

It's true yeah that there's a footgun that you forget an env var, but that's no different from today, no? All our builds basically can't be guaranteed reproducible b/c you still have access to things like the network. In that sense I see this as a feature for proactive authors but not bulletproof.

Now I'd totally be on board with some sort of whitelisting solution, but that's got backwards compatibility hazards with it. It's just something I haven't thought about and wouldn't know how to do immediately at least :(

@matklad
Copy link
Member

matklad commented Jun 7, 2017

Now I'd totally be on board with some sort of whitelisting solution, but that's got backwards compatibility hazards with it. It's just something I haven't thought about and wouldn't know how to do immediately at least :(

A strawman proposal would be to add a build-env field to Cargo.toml which lists env vars names. If cargo runs a build script and build-env is defined, it cleans the env, leaving only cargo specific variables and those listed in build-env, calculating the fingerprint along the way. If build-env is not defined, than Cargo works exactly as today.

@retep998
Copy link
Member

retep998 commented Jun 8, 2017

@matklad Note that wiping all environment variables causes things to break on Windows. rust-lang/rust#31259

@matklad
Copy link
Member

matklad commented Jun 8, 2017

@alexcrichton raised an interesting point that even if we whitelist env in Cargo.toml, we might still need rerun-if-env-changed to specify on which subset of vars we actually depend. For example, a build.rs for retrieving a native libfoo dependency might use both a LIB_FOO var to get compiled libfoo and CC to compile libfoo by itself, but only if LIB_FOO is not provided.

So in some sense, rerun-if-env-changed and whitelisting of env vars is orthogonal. I think we should merge this as is then, the implementation and tests look great!

@alexcrichton
Copy link
Member Author

Ah yes sorry I meant to write up a comment as well, but looks like @matklad beat me to it!

r? @matklad

if !outputs[&key].rerun_if_changed.is_empty() {
let slot = MtimeSlot(Mutex::new(None));
fingerprint.local = LocalFingerprint::MtimeBased(slot,
output_path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand what happens here, but looks like the logic before was

if we need a costly fingerprint update operation:
   update_local()
       do costly operation

and now it is like

if we need a costly fingerprint update operation:
   update_local()
       if we need a costly fingerprint update operation:
           do costly operation

That is, looks like this if and hash_busted variable from the update_local function are trying to do the same thing in two different places. Perhaps we can get rid of this outer if?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm so due to the way that this works I think we can actually avoid calling update_local entirely, the call in local_fingerprints_deps should already load the mtime and new env vars which should be all that's needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er actually I had to keep update_local()?. I don't know why and don't quite have time to investigate right now. I'm also not sure I understand your comment though, can you elaborate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure I understand your comment though, can you elaborate?

It was just a gut feeling that something's wrong about this if. However today I have a more concrete question :)

This test currently fails, and I believe that it should pass (according to the current docs):

#[test]
fn rerun_if_only_file_changes() {
    let p = project("foo")
        .file("Cargo.toml", r#"
            [package]
            name = "foo"
            version = "0.5.0"
            authors = []
        "#)
        .file("src/main.rs", r#"
            fn main() {
                println!("Hello, World");
            }
        "#)
        .file("build.rs", r#"
            fn main() {
                println!("cargo:rerun-if-env-changed=FOO");
            }
        "#)
        .file("foo", "");
    p.build();

    assert_that(p.cargo("build"),
                execs().with_status(0)
                       .with_stderr("\
[COMPILING] foo v0.5.0 ([..])
[FINISHED] [..]
"));
    sleep_ms(1000);
    File::create(&p.root().join("some-new-file")).unwrap();

    File::create(p.root().join("foo")).unwrap();
    assert_that(p.cargo("build"),
                execs().with_status(0)
                       .with_stderr("\
[COMPILING] foo v0.5.0 ([..])
[FINISHED] [..]
"));
}

If build.rs does not produce any rerun-if-changed then we promise that it will be rerun on any change inside the build directory, which is different from the case when build.rs specifies an empty set of files in rerun-if-changed.

Looks like currently specifying only rerun-if-env-changed implies empty rerun-if-changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, if you comment out println!("cargo:rerun-if-env-changed=FOO"); in the test, it passes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah it was originally my intention that the test you gisted here should fail. In other words specifying any rerun-if-*-changed is sufficient for telling Cargo "I've told you about all of my dependencies"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable! It's probably worth mentioning in the docs more explicitly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! I've added a clause to the docs.

@alexcrichton
Copy link
Member Author

Updated!

This commit implements a new method of rerunning a build script if an
environment variable changes. Environment variables are one of the primary
methods of giving inputs to a build script today, and this'll help situations
where if you change an env var you don't have to remember to clean out an old
build directory to ensure fresh results.

Closes rust-lang#2776
@matklad
Copy link
Member

matklad commented Jun 15, 2017

#bors r+

@matklad
Copy link
Member

matklad commented Jun 15, 2017

@bors r+

@bors
Copy link
Collaborator

bors commented Jun 15, 2017

📌 Commit fe8bbb7 has been approved by matklad

@bors
Copy link
Collaborator

bors commented Jun 15, 2017

⌛ Testing commit fe8bbb7 with merge b4b7ed5...

bors added a commit that referenced this pull request Jun 15, 2017
Implement `cargo:rerun-if-env-changed=FOO`

This commit implements a new method of rerunning a build script if an
environment variable changes. Environment variables are one of the primary
methods of giving inputs to a build script today, and this'll help situations
where if you change an env var you don't have to remember to clean out an old
build directory to ensure fresh results.

Closes #2776
@bors
Copy link
Collaborator

bors commented Jun 15, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: matklad
Pushing b4b7ed5 to master...

@bors bors merged commit fe8bbb7 into rust-lang:master Jun 15, 2017
@alexcrichton alexcrichton deleted the rerun-if-env-changed branch June 15, 2017 18:42
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 30, 2017
…ulacrum

rustc_llvm: re-run build script when env var LLVM_CONFIG changes

This removes the changes done in rust-lang#42429 and use the newly introduced `cargo:rerun-if-env-changed` in rust-lang/cargo#4125.
As `LLVM_CONFIG` env var points to the `llvm-config` and changes when it gets configured in `config.toml` or removed from it, we can re-run the build script if this env var changes.

closes rust-lang#42444

r? @alexcrichton
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 30, 2017
…ulacrum

rustc_llvm: re-run build script when env var LLVM_CONFIG changes

This removes the changes done in rust-lang#42429 and use the newly introduced `cargo:rerun-if-env-changed` in rust-lang/cargo#4125.
As `LLVM_CONFIG` env var points to the `llvm-config` and changes when it gets configured in `config.toml` or removed from it, we can re-run the build script if this env var changes.

closes rust-lang#42444

r? @alexcrichton
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 30, 2017
…ulacrum

rustc_llvm: re-run build script when env var LLVM_CONFIG changes

This removes the changes done in rust-lang#42429 and use the newly introduced `cargo:rerun-if-env-changed` in rust-lang/cargo#4125.
As `LLVM_CONFIG` env var points to the `llvm-config` and changes when it gets configured in `config.toml` or removed from it, we can re-run the build script if this env var changes.

closes rust-lang#42444

r? @alexcrichton
bors added a commit that referenced this pull request Jan 2, 2019
Fix fingerprint calculation for patched deps.

If you have A→B→C where B and C are in a registry, and you `[patch]` C, the fingerprint calculation wasn't working correctly when C changes. The following sequence illustrates the problem:

1. Do a build from scratch.
2. Touch a file in C.
3. Build again. Everything rebuilds as expected.
4. Build again. You would expect this to be all fresh, but it rebuilds A.

The problem is the hash-busting doesn't propagate up to parents from dependencies. Normal targets normally aren't a problem because they have a `LocalFingerprint::MtimeBased` style local value which always recomputes the hash. However, registry dependencies have a `Precalculated` style local value which never recomputes the hash.

The solution here is to always recompute the hash. This shouldn't be too expensive, and is only done when writing the fingerprint, which should only happen when the target is dirty. I'm not entirely certain why the caching logic was added in #4125.

Fixes rust-lang/rust#57142
@ehuss ehuss added this to the 1.20.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a way for build scripts to be re-run if specific environment variables change
7 participants