Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change cargo-miri.rs to fix issue #978 #980

Merged
merged 4 commits into from
Oct 15, 2019
Merged

change cargo-miri.rs to fix issue #978 #980

merged 4 commits into from
Oct 15, 2019

Conversation

JOE1994
Copy link
Contributor

@JOE1994 JOE1994 commented Oct 5, 2019

In Windows 10, there was an issue with building MIRI locally and getting it running,
due to unpredictable backslash escaping issues in paths.
I added a code snippet that would only be compiled in Windows OS, which replaces all backslashes in paths to slashes.
This fix should only affect Windows users.
Building and testing MIRI locally now works fine after the fix.
miri_result_after_fix0

Fixes #978

@RalfJung
Copy link
Member

RalfJung commented Oct 5, 2019 via email

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 6, 2019

@RalfJung Thank you for your feedback 👍

From my current understanding, the environment variable MIRI_SYSROOT is set inside the setup function (Line 337, cargo-miri.rs).

miri/src/bin/cargo-miri.rs

Lines 336 to 342 in de4a846

let sysroot = if is_host { dir.join("HOST") } else { PathBuf::from(dir) };
std::env::set_var("MIRI_SYSROOT", &sysroot); // pass the env var to the processes we spawn, which will turn it into "--sysroot" flags
if print_env {
println!("MIRI_SYSROOT={}", sysroot.display());
} else if !ask_user {
println!("A libstd for Miri is now available in `{}`.", sysroot.display());
}

May I ask why the setup function is called twice inside the build_sysroot function (from the miri bash script) ? It seems to me that calling the setup function once should be enough to set MIRI_SYSROOT.

miri/miri

Lines 52 to 58 in de4a846

# Build a sysroot and set MIRI_SYSROOT to use it. Arguments are passed to `cargo miri setup`.
build_sysroot() {
# Build once, for the user to see.
cargo run $CARGO_BUILD_FLAGS --bin cargo-miri -- miri setup "$@"
# Call again, to just set env var.
eval $(cargo run $CARGO_BUILD_FLAGS -q --bin cargo-miri -- miri setup --env "$@")
}

From my understanding, std::env::set_var("MIRI_SYSROOT", content) is stripping all backslashes from content before storing it to the environment variable "MIRI_SYSROOT".
ex. std::env::set_var("MIRI_SYSROOT", "C:\\A\\B\\C") -> MIRI_SYSROOT = C:ABC

No matter how many numbers of backslashes are in the second argument, it just seems that all the backslashes are simply dropped before storing the string to MIRI_SYSROOT. Currently, I don't have a good idea to fix this other than the proposed work-around in this pull-request. I can try looking into it with more time to come up with a better solution.
Please correct me if there is anything wrong with my current understanding. Thank you!

@pvdrz
Copy link
Contributor

pvdrz commented Oct 8, 2019

From my understanding, std::env::set_var("MIRI_SYSROOT", content) is stripping all backslashes from content before storing it to the environment variable "MIRI_SYSROOT".

Is it possible that this is windows specific? (sorry I don't have a Windows machine to try :P). If you run the following code, does it strip the \?

fn main() {
    std::env::set_var("TEST_VAR", "C:\\A\\B\\C");
    println!("{:?}", std::env::var("TEST_VAR"));
}

On linux the string is unchanged.

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2019

From my understanding, std::env::set_var("MIRI_SYSROOT", content) is stripping all backslashes from content before storing it to the environment variable "MIRI_SYSROOT".
ex. std::env::set_var("MIRI_SYSROOT", "C:\A\B\C") -> MIRI_SYSROOT = C:ABC

That is definitely not desired behavior. Can you reproduce this in a stand-alone program? If yes, please report a bug for rustc. It seems surprising that such a critical bug would go unnoticed for so long though.

May I ask why the setup function is called twice inside the build_sysroot function (from the miri bash script) ? It seems to me that calling the setup function once should be enough to set MIRI_SYSROOT.

There are comments there explaining this. Basically, the first time runs the setup with output so the user sees what happens; the second time runs setup again which should be a NOP, but this time we capture the output and use it to determine the sysroot location.

Indeed the following build_sysroot would also work:

build_sysroot() {
    eval $(cargo run $CARGO_BUILD_FLAGS -q --bin cargo-miri -- miri setup --env "$@")
}

But now the user would not see any build log from the libstd build, and if there are errors during that build they are silently discarded. That's not great behavior.

But maybe that eval thing is the part that does not work on Windows? Does the following help?

build_sysroot() {
    # Build once, for the user to see.
    cargo run $CARGO_BUILD_FLAGS --bin cargo-miri -- miri setup "$@"
    # Call again, to just set env var.
    eval "$(cargo run $CARGO_BUILD_FLAGS -q --bin cargo-miri -- miri setup --env "$@")"
}

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

@christianpoveda , @RalfJung
Thank you so much for your feedback 👍
I've now fully identified the problem.

The below screenshot shows a minimal reproduction of what was happening to me.
image

  1. Line 339 (of the code below) prints out a path string whose backslashes are not escaped.
    ex) MIRI_SYSROOT=C:\Rust\Language\Cool

    miri/src/bin/cargo-miri.rs

    Lines 336 to 342 in 3899dce

    let sysroot = if is_host { dir.join("HOST") } else { PathBuf::from(dir) };
    std::env::set_var("MIRI_SYSROOT", &sysroot); // pass the env var to the processes we spawn, which will turn it into "--sysroot" flags
    if print_env {
    println!("MIRI_SYSROOT={}", sysroot.display());
    } else if !ask_user {
    println!("A libstd for Miri is now available in `{}`.", sysroot.display());
    }

  2. Line 57 (of the code below) calls 'eval' to the string output from the Rust code.
    ex) eval MIRI_SYSROOT=C:\Rust\Language\Cool

    miri/miri

    Lines 53 to 58 in 3899dce

    build_sysroot() {
    # Build once, for the user to see.
    cargo run $CARGO_BUILD_FLAGS --bin cargo-miri -- miri setup "$@"
    # Call again, to just set env var.
    eval $(cargo run $CARGO_BUILD_FLAGS -q --bin cargo-miri -- miri setup --env "$@")
    }

  3. 'eval' called in the bash script drops all single backslashes that don't seem to be proper escape tokens. The string stripped of its backslashes is stored to MIRI_SYSROOT
    ex) MIRI_SYSROOT contains string "C:RustLanguageCool", which is no longer a valid sysroot.

I committed another fix to change line 339(cargo-miri.rs) to not use sysroot.display() which outputs a path string with its backslashes not escaped. Line 339 is now changed as below
println!("MIRI_SYSROOT={:?}", &sysroot); // for Windows users, prints path with backslashes escaped.

I've tested that the fixed version works as expected both on Windows10 and Ubuntu(Windows Subsystem for Linux)

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

It turns out that there is nothing wrong with the current implementation of std::env::set_var ☺️

@pvdrz
Copy link
Contributor

pvdrz commented Oct 10, 2019

Hmm... that's suspicious. I think that calling Path::display() should return a valid path with \ and everything, quoting the docs:

Returns an object that implements Display for safely printing paths that may contain non-Unicode data.

So even if the path is not a valid unicode string it should be able to print it, idk why is it messing a perfectly valid unicode path in windows (on linux: https://play.integer32.com/?version=stable&mode=debug&edition=2018&gist=33c62dac4f10f4f524e9ce11ed343acf)

Edit: Is it possible that for some reason, when doing MIRI_SYSROOT=C:\Rust\Language\Cool, windows not escaping the \s?

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

Hmm... that's suspicious. I think that calling Path::display() should return a valid path with \ and everything, quoting the docs:

Returns an object that implements Display for safely printing paths that may contain non-Unicode data.
So even if the path is not a valid unicode string it should be able to print it, idk why is it messing a perfectly valid unicode path in windows (on linux: https://play.integer32.com/?version=stable&mode=debug&edition=2018&gist=33c62dac4f10f4f524e9ce11ed343acf)

I'm sorry if my explanation up there was unclear.. I checked the rustplayground link you shared, and I see that Path.display() outputs a string with a valid Windows path with backslashes.

The real problem is that, the eval command in bash also requires escaping of backslashes in its input.
C:\Foo\Bar\Baz is a perfectly valid path, but in our case it is directly fed to the bash eval command, and the eval command strips off all backslashes which don't seem to be valid escaping tokens in the string.

What our Rust code needs to print out to the console in this particular case is C:\\Foo\\Bar\\Baz.
The eval command will then properly interpret double backslashes as an escaped single backslash,
and MIRI_SYSROOT will be set properly.

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

The confusion comes from the fact that eval command in bash also escapes backslashes in its input string. Path::display() is doing its job well, but it's just that it was not suitable for our need in this particular case.

@pvdrz
Copy link
Contributor

pvdrz commented Oct 10, 2019

Uhh ok i got it now. So is it there any way to not escape those \ when assigning MIRI_SYSROOT?

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

@christianpoveda
If sysroot contains a valid Windows path "C:\Rust\Great",
println!("MIRI_SYSROOT={:?}", &sysroot); prints out "MIRI_SYSROOT=C:\\Rust\\Great".
Feeding the output from the above println!() to bash eval solved the problem.

@bjorn3
Copy link
Member

bjorn3 commented Oct 10, 2019

Maybe use sysroot.display.to_string().replace('\\', "\\\\")?

src/bin/cargo-miri.rs Outdated Show resolved Hide resolved
@pvdrz
Copy link
Contributor

pvdrz commented Oct 10, 2019

After playing with bash for a while and smashing my head against the keyboard... What happens if instead of doing

println!("MIRI_SYSROOT={:?}", &sysroot);

you do

println!("MIRI_SYSROOT=\"{}\"", sysroot.display());

I think this is just the shell messing up the escaping because the string is not quoted. Like in bash:
X=A\nB; echo $X prints AnB

PD: I feel like a conspiracy theory guy with this escaping stuff, sorry for my sloppy comments :P

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

@christianpoveda
image

Unfortunately, it seems like quoting the path doesn't solve the problem in bash 😭

@pvdrz
Copy link
Contributor

pvdrz commented Oct 10, 2019

That's odd, in my bash it works. Do you remember when stuff was platform independent :P (me neither). What about @bjorn3's idea?

@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 10, 2019

@bjorn3 's idea also seems to be a valid way to solve the problem, and is a very similar approach to my initial commit for this issue.

@RalfJung
Copy link
Member

Line 57 (of the code below) calls 'eval' to the string output from the Rust code.
ex) eval MIRI_SYSROOT=C:\Rust\Language\Cool

Ah yes that makes a lot of sense. Bash interprets this as shell so it applies one level of escaping.
And great idea using the debug printing facility! I think that might just do the right thing even for the corner cases, but I am going to play with it on the playground a bit.

std::env::set_var("MIRI_SYSROOT", &sysroot); // pass the env var to the processes we spawn, which will turn it into "--sysroot" flags
if print_env {
println!("MIRI_SYSROOT={}", sysroot.display());
println!("MIRI_SYSROOT={:?}", &sysroot); // for Windows users, prints path with backslashes escaped.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not just for Windows users. It just as much affects Linux users with ', " or \ in the path (all of which are valid filename characters on Linux).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Do you think I should add a macro for conditional compilation for Windows users??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, what I am saying is the bug you are fixing also affects other platforms. With conditional compilation you'd only fix it on Windows, why do you want that? :D

@RalfJung
Copy link
Member

println!("MIRI_SYSROOT=\"{}\"", sysroot.display());

That'll go horribly wrong if there are " inside the string.

However, this made me realize that {:?} probably escapes too much. OTOH, sysroot.display.to_string().replace('\\', "\\\\") does not replace enough: the shell will also treat a bunch of other things as special, things like $ or ". Dang, why is this so annoying. :/

The strongest escape in shell is ', so I think we should use that around the entire string. That means the only character we have to escape is ' itself, and that has to become the funny '"'"'. So, I think we should do

println!("MIRI_SYSROOT='{}'", sysroot.display().to_string().replace('\'', r#"'"'"'"#));

Yeah this looks like a bad joke, I know... it's, in this order:

  • end the '-enclosed string
  • start a "-enclosed string
  • character '
  • end the "-enclosed string
  • start a '-enclosed string

Does this make sense?

@RalfJung RalfJung added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Oct 14, 2019
@JOE1994
Copy link
Contributor Author

JOE1994 commented Oct 15, 2019

@RalfJung Thanks for your suggestion!
It looks good to me! It runs as expected in my Windows machine 😄
I made a commit to reflect your suggestion and @bjorn3 's review.

@RalfJung
Copy link
Member

Thanks for the PR!

@bors r+

@bors
Copy link
Contributor

bors commented Oct 15, 2019

📌 Commit 65fd006 has been approved by RalfJung

@RalfJung RalfJung removed the S-waiting-on-author Status: Waiting for the PR author to address review comments label Oct 15, 2019
bors added a commit that referenced this pull request Oct 15, 2019
change cargo-miri.rs to fix issue #978

In Windows 10, there was an issue with building MIRI locally and getting it running,
due to unpredictable backslash escaping issues in paths.
I added a code snippet that would only be compiled in Windows OS, which replaces all backslashes in paths to slashes.
This fix should only affect Windows users.
Building and testing MIRI locally now works fine after the fix.
![miri_result_after_fix0](https://user-images.githubusercontent.com/10286488/66260998-344abc80-e794-11e9-9d7c-b4ef098443de.PNG)

Fixes #978
@bors
Copy link
Contributor

bors commented Oct 15, 2019

⌛ Testing commit 65fd006 with merge 42c1e77...

@bors
Copy link
Contributor

bors commented Oct 15, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: RalfJung
Pushing 42c1e77 to master...

@bors bors merged commit 65fd006 into rust-lang:master Oct 15, 2019
bors added a commit that referenced this pull request Oct 15, 2019
explain our shell encoding

Follow-up to #980
bors added a commit that referenced this pull request Oct 15, 2019
explain our shell encoding

Follow-up to #980
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need to fix 'miri' bash script for Windows powershell users
5 participants