Rewrite miri script in rust #2909

osiewicz · 2023-05-31T19:17:36Z

This is a sketch of a rewrite of miri script in Rust. It does not include changes made in #2908 yet. Environment variables are not properly propagated yet, which is something I plan to address.

This PR is mostly a heads-up about the ongoing effort and it's state.
It's definitely not the cleanest code I've seen in my life, but my first goal was feature/interface parity. I will iterate on it a bit before marking it as ready.

I wonder though how this should be integrated/tested. Are you aware of anyone using ./miri in their scripts?
I guess we should keep existing miri script in place and let it run miri-script package directly?

CI should probably cargo check this package as well.

Fixes #2883

RalfJung · 2023-06-08T09:39:11Z

Thanks for the PR! Someone will take a look and review this eventually, we're just all busy currently -- sorry for the delay.

osiewicz · 2023-06-08T09:42:32Z

Hey,
No problem - that's totally understandable. I intend to polish this before marking this PR as ready for review. Sadly I couldn't find time for that (or anything, really) recently.

Miri script works as is, we don't need to rush this in.

saethlin · 2023-06-09T22:22:58Z

I wonder though how this should be integrated/tested.

FWIW I'm not particularly worried about testing this code. Obviously it would be nice if it had good tests, but a lot of this is mucking about with the state of "the system", and the parts of this that really matter will get exercised regularly by us. Write tests if they help you build confidence that this works. But I think it would be easy to go overboard and spend more effort on the test suite than the tool.

Are you aware of anyone using ./miri in their scripts?

I don't think this is particularly worth worrying about.

I guess we should keep existing miri script in place and let it run miri-script package directly?

Yes. ./miri is much shorter than cargo run --manifest-path=miri-script/Cargo.toml -- . It would be neat if the script were turned into the thinnest wrapper around that. User workflows could be unaffected, the CI setup could stay as-is, and so on.

saethlin · 2023-06-09T22:25:01Z

miri-script/Cargo.toml

+authors = ["Miri Team"]
+description = "Helpers for miri maintenance"
+license = "MIT OR Apache-2.0"
+name = "miri"


If someone cargo installs this, unless they have a creative setup, this will replace the rustup proxy miri executable and break their install (this situation is not trivial to recover from and really frustrates people). The bin target here should be called something else, or otherwise prevent that situation.

Great point, I'll rename binary to miri-script then.

saethlin · 2023-06-09T22:31:11Z

miri-script/src/arg.rs

+    Install {
+        /// Flags that are passed through to `cargo install`.
+        #[arg(trailing_var_arg = true)]
+        flags: Vec<String>,


Is String the right type to use for a passed-through CLI argument? If the arguments are literally just passed through, I think OsString is better.

miri-script/src/commands.rs

saethlin · 2023-06-09T23:04:45Z

miri-script/src/commands.rs

+            .filter(|val| !val.is_empty())
+            .unwrap_or_else(|| {
+                let target_dir = path!(ret.miri_dir / "target");
+                target_dir.to_string_lossy().to_string()


Same comment on all the use of to_string_lossy as I had on pass-through CLI arguments. Perhaps that means we also need env::var_os?

saethlin · 2023-06-09T23:06:08Z

I'm done leaving comments. I think this is an amazing start, and I'd love to try it out when the problem with unset CARGO_EXTRA_FLAGS is addressed.

RalfJung

This is a great start indeed, thanks a lot. :) I left a bunch of notes from a first pass. My main concern is verbosity: this is a lot longer than the shell script it is porting. I think this can be reduced be avoiding all the conversion from paths to strings and whatnot -- no such conversion is needed when working with Command directly, I hope we can also make that work here.

Also command.rs could use some splitting up into multiple files: all the cargo-wrapping stuff, the pull/push stuff, and the other stuff (fmt, bench, many-seeds, toolchain)

As usual please don't force-push this branch, but add new commits, so that I can see more easily what changed compared to last time.

RalfJung · 2023-06-10T11:16:01Z

miri-script/Cargo.toml

+which = "4.4"
+walkdir = "2.3"
+itertools = "0.10"
+path_macro = "1.0"


missing newline at end of file

RalfJung · 2023-06-10T11:17:06Z

miri-script/src/arg.rs

+    Bless {
+        /// Flags that are passed through to `cargo test`.
+        #[arg(trailing_var_arg = true)]
+        flags: Vec<String>,
+    },


FWIW, what we really want is ./miri test --bless; that just was too annoying to implement with the shell script.

Feel free to just leave a FIXME to do this later.

Nah, it's fine - I've thought that --bless would be better than having a separate command for it, but I wanted to stay as close to script as possible. If we're fine with such "breaking" changes, that's fine with me.

RalfJung · 2023-06-10T11:17:26Z

miri-script/src/arg.rs

+    },
+    /// Build miri, set up a sysroot and then run the driver with the given <flags>.
+    /// (Also respects MIRIFLAGS environment variable.)
+    RunDep {


This should probably also eventually become ./miri run --dep or so.

RalfJung · 2023-06-10T11:19:19Z

miri-script/src/main.rs

+    }
+    fn get_auto_config(&self, miri_dir: &Path) -> Option<AutoConfig> {
+        use std::env::VarError;
+        if let Err(VarError::NotPresent) = std::env::var("MIRI_AUTO_OPS") {


Just use std::env::var_os(...).is_none()

RalfJung · 2023-06-10T11:20:22Z

miri-script/src/main.rs

+    fn get_auto_config(&self, miri_dir: &Path) -> Option<AutoConfig> {
+        use std::env::VarError;
+        if let Err(VarError::NotPresent) = std::env::var("MIRI_AUTO_OPS") {
+            std::env::set_var("MIRI_AUTO_OPS", "42");


This shouldn't be needed any more; the shell script did this because it recursively invoked itself to run the auto-ops, but we do no such nonsense here.

miri-script/src/commands.rs

RalfJung · 2023-06-10T11:47:54Z

miri-script/src/commands.rs

+            let toolchain = self.get_toolchain_override();
+            let command =
+                format!("cargo {} miri run --manifest-path {current_bench_dir}", toolchain);
+            cmd("hyperfine", ["-w", "1", "-m", "5", "--shell=none", &command]).run()?;


Note that this changed in the shell script recently, so the rust version needs updating

RalfJung · 2023-06-10T11:50:01Z

miri-script/src/commands.rs

+            let commit_hash_line = rustc_info
+                .lines()
+                .find(|line| line.starts_with("commit-hash: "))
+                .expect("rustc output does not contain commit-hash line");
+            commit_hash_line
+                .split_whitespace()
+                .nth(1)
+                .expect("rustc commit-hash line is malformed")
+                .trim()
+                .to_owned()


I think rustc_version can replace this ad-hoc parser

miri-script/src/commands.rs

miri-script/src/main.rs

miri-script/src/commands.rs

RalfJung · 2023-06-16T14:11:26Z

miri-script/src/commands.rs

-                self.many_seeds(&command, *seed_start, *seeds),
+            Subcommands::ManySeeds { command, seed_start, seeds } => {
+                self.many_seeds(&command, *seed_start, *seeds)
+            }


You keep changing back and forth between having and not having braces here in every other commit. That makes the commit-by-commit review unnecessarily noisy. What is going on? Almost looks like you are running rustfmt without applying Miri's rustfmt.toml sometimes, and sometimes with the toml?

Yup, I use multiple setups so it might be the case. I am sorry, I'll try to keep it consistent.

Whatever you are using to format your checkout should be able to pull in Miri's rustfmt.toml. Sometimes editors have difficulty with Miri or rustc because they aren't just standard Cargo projects. Can you take a skim over the CONTRIBUTING.md and see if anything there looks helpful? If there isn't, can you maybe open an issue or start a topic on the Zulip so we can figure out what Miri can do to improve this?

miri-script/src/commands.rs

RalfJung

xshell was a great choice, this looks a lot better. :)

miri

miri-script/src/commands.rs

RalfJung · 2023-06-28T15:53:04Z

miri-script/src/commands.rs

+}
+
+#[derive(Clone, Debug, Default)]
+struct Extras {


This name is not very descriptive. Why is this a separate type anyway?

I did it to essentially ensure that these variables are always available in env - and not have to deal with environment lookup in code paths that do use it. My stream of thought was "if we have a valid Extras object, sys root and co were obtained successfully".

And yeah, the name sucks.

OTOH, MiriRunner uses Extras::default in its new function, which sucks even more. Some of the member functions might see an empty sys root/libdir.

I think this is a problem that we had in the shell script as well: these variables need to be determined after the auto-toolchain update happened.

I certainly wouldn't want them to be in env, I was expecting them to be fields in MiriRunner. But we don't have the values always available.

libdir is only needed exactly once. I think we can just remove the field, and instead have a fn libdir that computes the libdir.

For sysroot I propose a cache: have a field sysroot: Option<PathBuf> in MiriRunner, and a fn sysroot() that computes the sysroot and caches the result in self.sysroot. Then we just need to make the toolchain command reset that cache and we should be good, I think.

Yup, these are good ideas. And yeah, I agree that neither sysroot nor libdir should be in env.

RalfJung · 2023-06-28T15:54:09Z

miri-script/src/commands.rs

+        let config = command.get_auto_config(&miri_dir).filter(|_| run_auto_things);
+        // CARGO_EXTRA_FLAGS do not have to be a valid UTF-8, but that's what shell_words' expects.
+        let cargo_extra_flags = std::env::var("CARGO_EXTRA_FLAGS").unwrap_or_default();
+        let cargo_extra_flags = shell_words::split(&cargo_extra_flags)?;


Please check how cargo treats RUSTFLAGS... we should use the exact same logic. I don't think it uses shell_words for that.

https://github.com/rust-lang/cargo/blob/master/src/cargo/core/compiler/build_context/target_info.rs#L761
Yeah, it doesn't. This should simplify things a bit.

RalfJung · 2023-06-28T15:55:50Z

miri-script/src/commands.rs

+                // Run this first, so that the toolchain doesn't change after
+                // other code has run.
+                let command = Subcommands::Toolchain { flags: vec![] };
+                Self::exec_inner(&command, false)?;


Why can't this just be self.toolchain? (similar for the other uses of exec_inner)

The control flow right now is rather hard to follow here

I've figured that exec_inner resembles old script a bit better (we start off a fresh MiriRunner instance), but I'd be happy to change it.
E.g: in fn bench we can just call exec_inner without worrying about auto things being executed recursively or environment not being set up proper for install.

The control flow in the script was terrible though.^^ We should initialize things once and then in bench we know things are initialized enough for install.

RalfJung · 2023-06-28T16:12:16Z

miri-script/src/commands.rs

+            println!("This will pull a copy of the rust-lang/rust history into this Miri checkout, growing it by about 1GB.");
+            println!("To avoid that, abort now and set the RUSTC_GIT environment variable to an existing rustc checkout. Proceed? [y/N] ");
+            let mut answer = String::new();
+            let _ = std::io::stdin().read_line(&mut answer);


Why ignore errors here?

Uh, I guess I did it out of laziness (not having to deal with type conversions to satisfy unwrap_or_else et al was nice).
I've moved this snippet to if let now.

RalfJung · 2023-06-28T16:13:34Z

miri-script/src/commands.rs

+        ) -> Result<()> {
+            let all_targets: Option<&OsStr> = all_targets.then_some("--all-targets".as_ref());
+            let sh = Shell::new()?;
+            cmd!(sh, "cargo {toolchain} check {extra_flags...} --manifest-path {path} {all_targets...} {args...}").run()?;


Suggested change

cmd!(sh, "cargo {toolchain} check {extra_flags...} --manifest-path {path} {all_targets...} {args...}").run()?;

cmd!(sh, "cargo +{toolchain} check {extra_flags...} --manifest-path {path} {all_targets...} {args...}").run()?;

Missing +?

RalfJung · 2023-06-28T16:14:30Z

miri-script/src/commands.rs

+        Self::build_package(
+            miri_manifest.as_ref(),
+            &toolchain,
+            &self.cargo_extra_flags,


If you make this a method (self.build_package), you don't have to pass toolchain and cargo_extra_flags.

RalfJung · 2023-06-28T16:15:01Z

miri-script/src/commands.rs

+        )?;
+        self.find_sysroot()?;
+        if bless {
+            self.set_env("MIRI_BLESS", "Gesundheit");


Seems better to just use sh.set_env here, and keep this local to this test invocation.

RalfJung · 2023-06-28T16:17:04Z

miri-script/src/commands.rs

+
+#[cfg(test)]
+mod tests {
+    use shell_words::split;


Why are you testing another crate's function here...?

I've left that in while looking for a way to split the shell words. I'll remove that, I am sorry. :(

RalfJung · 2023-06-30T12:29:22Z

miri-script/src/commands.rs

+
+    /// Prepare and set MIRI_SYSROOT. Respects `MIRI_TEST_TARGET` and takes into account
+    /// locally built vs. distributed rustc.
+    fn find_sysroot(&mut self) -> Result<()> {


Should probably be find_miri_sysroot to avoid confusion with extras.sysroot.

Same for build_sysroot → build_miri_sysroot

bors · 2023-07-02T11:24:30Z

☔ The latest upstream changes (presumably #2948) made this pull request unmergeable. Please resolve the merge conflicts.

osiewicz · 2023-07-02T16:07:02Z

@RalfJung it looks like there's an issue with ./miri toolchain where on Windows setting an override for a directory doesn't affect subsequent call to rustc --print sysroot. Using rustup run rustc --print sysroot does respect the override.
Obviously Miri-script does something incorrectly, as the master version works just fine.
I'm not sure what I'm doing wrong; I've pulled so much hair on this that at this point I might have to see a barber. :/

RalfJung · 2023-07-07T13:03:25Z

it looks like there's an issue with ./miri toolchain where on Windows setting an override for a directory doesn't affect subsequent call to rustc --print sysroot. Using rustup run rustc --print sysroot does respect the override.

I assume the difference is that miri-script itself is run inside a cargo run which runs in rustup. So some rustup env vars are set which take precedence over the toolchain file. We should probably unset them / not forward them to programs we invoke.

osiewicz · 2023-07-07T16:46:58Z

Hm, that sounds plausible. What do you say to having something akin to:
cargo build --manifest-path "$(dirname "$0")"/miri-script/Cargo.toml && "$(dirname "$0")"/miri-script/target/debug/miri-script $@
instead of:
cargo run --manifest-path "$(dirname "$0")"/miri-script/Cargo.toml -- $@
The former should not go through rustup, so it won't set its environment variables - but it will still be possible to set them just in case someone wants to do that.

RalfJung · 2023-07-07T18:10:36Z

Hm, then we have to make assumptions about where cargo puts the build... But that might still be the better option than fiddling with the environment.

bors · 2023-07-18T10:31:55Z

☔ The latest upstream changes (presumably #2986) made this pull request unmergeable. Please resolve the merge conflicts.

RalfJung · 2023-07-30T17:59:52Z

I'm going to take over this PR -- at this point I think it is easier if I just implement things the way I prefer, rather than trying to explain to you how I would like this to be changed. :) Your groundwork has been extremely helpful for this, thanks a ton!

RalfJung · 2023-07-30T18:11:37Z

Looks like there is a problem (which the old code work around in an incorrect way)... for things like ./miri clippy -- -D warnings, we want all of -- -D warnings to be forwarded to clippy. That is, we don't want the clap parser to intercept this -- at all.

Is that possible?

osiewicz · 2023-07-30T18:24:16Z

Hi,
I am sorry for not carrying this further - I've had less time recently for OS work. :/
As for the "--" forwarding issue, I believe I've annotated clippy and fmt subcommands with allow_leading_hyphen. I recall that there were still issues with it tho.
Another problem with what I've had up to this point was that new fmt was noticeably slower than the old one. I suppose it came down to formatting files sequentially Vs using xargs in a shell script - which could've parallelized different rustfmt instances under the hood.

RalfJung · 2023-07-30T18:30:51Z

Another problem with what I've had up to this point was that new fmt was noticeably slower than the old one.

I fixed that by running rustfmt a single time on all the files, instead of having a new rustfmt invocation for each file. (That matches what the old script does.)

As for the "--" forwarding issue, I believe I've annotated clippy and fmt subcommands with allow_leading_hyphen. I recall that there were still issues with it tho.

That's not sufficient -- as the clap docs for trailing_var_arg say, "users still have the option to explicitly escape ambiguous arguments with --". I don't know if clap has a way to to disable this. If not, we won't be able to use clap.

All the CI failures currently are caused by this.

RalfJung · 2023-07-30T18:41:43Z

miri-script/src/commands.rs

+        let toolchain: &OsStr = self.active_toolchain.as_ref();
+        cmd!(
+            sh,
+            "cargo +{toolchain} test {extra_flags...} --manifest-path {miri_manifest} -- {flags...}"


The -- {flags..} here is what got CI to work despite the parser issue, but it breaks other things -- for instance, ./miri test --all-features should pass that flag to cargo (before the --), not to the test runner. (Same for clippy.)

RalfJung · 2023-07-31T07:54:21Z

@bors r+

bors · 2023-07-31T07:54:24Z

📌 Commit e5b358a has been approved by RalfJung

It is now in the queue for this repository.

bors · 2023-07-31T07:54:30Z

⌛ Testing commit e5b358a with merge 273cdab...

bors · 2023-07-31T08:29:30Z

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 273cdab to master...

osiewicz marked this pull request as draft May 31, 2023 19:18

saethlin reviewed Jun 9, 2023

View reviewed changes