Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent compiler stack overflow for deeply recursive code #55617

Merged
merged 6 commits into from
May 7, 2020

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Nov 2, 2018

I was unable to write a test that

  1. runs in under 1s
  2. overflows on my machine without this patch

The following reproduces the issue, but I don't think it's sensible to include a test that takes 30s to compile. We can now easily squash newly appearing overflows by the strategic insertion of calls to ensure_sufficient_stack.

// compile-pass

#![recursion_limit="1000000"]

macro_rules! chain {
    (EE $e:expr) => {$e.sin()};
    (RECURSE $i:ident $e:expr) => {chain!($i chain!($i chain!($i chain!($i $e))))};
    (Z $e:expr) => {chain!(RECURSE EE $e)};
    (Y $e:expr) => {chain!(RECURSE Z $e)};
    (X $e:expr) => {chain!(RECURSE Y $e)};
    (A $e:expr) => {chain!(RECURSE X $e)};
    (B $e:expr) => {chain!(RECURSE A $e)};
    (C $e:expr) => {chain!(RECURSE B $e)};
    // causes overflow on x86_64 linux
    // less than 1 second until overflow on test machine
    // after overflow has been fixed, takes 30s to compile :/
    (D $e:expr) => {chain!(RECURSE C $e)};
    (E $e:expr) => {chain!(RECURSE D $e)};
    (F $e:expr) => {chain!(RECURSE E $e)};
    // more than 10 seconds
    (G $e:expr) => {chain!(RECURSE F $e)};
    (H $e:expr) => {chain!(RECURSE G $e)};
    (I $e:expr) => {chain!(RECURSE H $e)};
    (J $e:expr) => {chain!(RECURSE I $e)};
    (K $e:expr) => {chain!(RECURSE J $e)};
    (L $e:expr) => {chain!(RECURSE L $e)};
}


fn main() {
    let x = chain!(D 42.0_f32);
}

fixes #55471
fixes #41884
fixes #40161
fixes #34844
fixes #32594

cc @alexcrichton @rust-lang/compiler

I looked at all code that checks the recursion limit and inserted stack growth calls where appropriate.

@rust-highfive
Copy link
Collaborator

r? @michaelwoerister

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 2, 2018
@nagisa

This comment has been minimized.

@rust-highfive

This comment has been minimized.

@@ -34,6 +34,7 @@ byteorder = { version = "1.1", features = ["i128"]}
chalk-engine = { version = "0.8.0", default-features=false }
rustc_fs_util = { path = "../librustc_fs_util" }
smallvec = { version = "0.6.5", features = ["union"] }
stacker = "0.1.3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure stacker is necessarily rustc-ready right now, but that doesn't mean that it can't be! Some issues I can think of are:

  • It only has support for x86 platforms basically, and only Windows/Mac/Linux. It should be easy enough to "add support" for other platforms by basically doing nothing. Full support could be added over time as necessary
  • I don't think stacker does anything with guard pages, but ideally it'd also be sure to allocate guard pages for larger segments to protect agains accidental stack overflow
  • I'm not entirely sure how well panics and such work? It should be relatively easy to catch_unwind and resume_unwind though when necessary (just needs to be done)

These are all pretty minor, but I'd want to be sure to handle them before merging if possible!

@oli-obk oli-obk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 2, 2018
@@ -1460,6 +1460,8 @@ fn parse_crate_attrs<'a>(sess: &'a Session, input: &Input) -> PResult<'a, Vec<as
}
}

const STACK_SIZE: usize = 4 * 1024 * 1024; // 4MB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make this dependent on whether stacker actually works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's that relevant? When stacker doesn't work, these values don't matter, because they don't do anything

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was the default thread stack size, unrelated to stacker. My bad if that's not the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna rename the constant to be clearer about this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you're right, I was looking at the wrong value. But this change is just for crater as noted by @nagisa in #55617 (comment)

We'll bump it back after crater succeeds

@michaelwoerister
Copy link
Member

Let's do a perf run.
@bors try

@bors
Copy link
Contributor

bors commented Nov 5, 2018

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout stacker (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
  5. git push self stacker --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
warning: Cannot merge binary files: src/Cargo.lock (HEAD vs. heads/homu-tmp)
Auto-merging src/librustc_typeck/check/mod.rs
Auto-merging src/librustc_traits/dropck_outlives.rs
Auto-merging src/librustc_mir/monomorphize/collector.rs
Auto-merging src/librustc_driver/lib.rs
Auto-merging src/librustc/traits/select.rs
Auto-merging src/librustc/traits/query/normalize.rs
Auto-merging src/librustc/traits/project.rs
Auto-merging src/librustc/hir/lowering.rs
Auto-merging src/Cargo.lock
CONFLICT (content): Merge conflict in src/Cargo.lock
Automatic merge failed; fix conflicts and then commit the result.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 6, 2018

@bors try

@bors
Copy link
Contributor

bors commented Nov 6, 2018

⌛ Trying commit 9f61d00 with merge 2b10b3d...

bors added a commit that referenced this pull request Nov 6, 2018
Prevent compiler stack overflow for deeply recursive code

I was unable to write a test that

1. runs in under 1s
2. overflows on my machine without this patch

The following reproduces the issue, but I don't think it's sensible to include a test that takes 30s to compile. We can now easily squash newly appearing overflows by the strategic insertion of calls to `ensure_sufficient_stack`.

```rust
// compile-pass

#![recursion_limit="1000000"]

macro_rules! chain {
    (EE $e:expr) => {$e.sin()};
    (RECURSE $i:ident $e:expr) => {chain!($i chain!($i chain!($i chain!($i $e))))};
    (Z $e:expr) => {chain!(RECURSE EE $e)};
    (Y $e:expr) => {chain!(RECURSE Z $e)};
    (X $e:expr) => {chain!(RECURSE Y $e)};
    (A $e:expr) => {chain!(RECURSE X $e)};
    (B $e:expr) => {chain!(RECURSE A $e)};
    (C $e:expr) => {chain!(RECURSE B $e)};
    // causes overflow on x86_64 linux
    // less than 1 second until overflow on test machine
    // after overflow has been fixed, takes 30s to compile :/
    (D $e:expr) => {chain!(RECURSE C $e)};
    (E $e:expr) => {chain!(RECURSE D $e)};
    (F $e:expr) => {chain!(RECURSE E $e)};
    // more than 10 seconds
    (G $e:expr) => {chain!(RECURSE F $e)};
    (H $e:expr) => {chain!(RECURSE G $e)};
    (I $e:expr) => {chain!(RECURSE H $e)};
    (J $e:expr) => {chain!(RECURSE I $e)};
    (K $e:expr) => {chain!(RECURSE J $e)};
    (L $e:expr) => {chain!(RECURSE L $e)};
}

fn main() {
    let x = chain!(D 42.0_f32);
}
```

fixes #55471
fixes #41884
fixes #40161
fixes #34844
fixes #32594

cc @alexcrichton @rust-lang/compiler

I looked at all code that checks the recursion limit and inserted stack growth calls where appropriate.
@bors
Copy link
Contributor

bors commented Nov 6, 2018

☀️ Test successful - status-travis
State: approved= try=True

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 6, 2018

@rust-timer build 2b10b3d

@rust-timer
Copy link
Collaborator

Success: Queued 2b10b3d with parent f90aab7, comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 2b10b3d

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 6, 2018

Improvements for ctfe stress tests (spurious?), regressions up to 3% for everything else except the clean-incremental part

@michaelwoerister
Copy link
Member

Makes sense that the clean-incremental cases don't see a difference since they hardly execute any of the changed code. The other regressions are rather unfortunate though. Can we optimize this?

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 7, 2018

Well, this is an operation we now run on every single query (not every call, just every evaluation). There are loads of queries. It seems logical that this introduces some regression that we can't get rid of.

@michaelwoerister
Copy link
Member

But

  • do we need to run the operation really for every query invocation?
  • can we make the operation less expensive, especially in the case where the stack doesn't have to be grown?

@eddyb
Copy link
Member

eddyb commented Nov 7, 2018

I feel like the actual stack check shouldn't be noticeable compared to hashing and looking up a key in a hashmap. Maybe we're growing the stack more often than we need to?

@nikic
Copy link
Contributor

nikic commented Nov 7, 2018

Looking at the stacker implementation, a possible issue might be that we're sitting somewhere close to the stack limit and regularly go over it and below it again. This will allocate and deallocate a new stack every time. Maybe retaining the last allocation would help to reduce the performance impact?

@nagisa
Copy link
Member

nagisa commented Nov 8, 2018 via email

@Dylan-DPC-zz
Copy link

@bors retry (yield)

@bors
Copy link
Contributor

bors commented May 6, 2020

⌛ Testing commit 935a05f with merge 1e4ad8ae1ae7c74ced296fd2e6380a81a5f6357c...

@Dylan-DPC-zz
Copy link

@bors retry yield

@bors
Copy link
Contributor

bors commented May 6, 2020

⌛ Testing commit 935a05f with merge 698a0e16dbbb8fa45c5f0dd8c298ad8e98d14f80...

@Dylan-DPC-zz
Copy link

@bors retry yield

@bors
Copy link
Contributor

bors commented May 6, 2020

⌛ Testing commit 935a05f with merge 18e1bdf136f4f0a269ac64e89c73020ad8b882a6...

@Dylan-DPC-zz
Copy link

@bors retry yield

@bors
Copy link
Contributor

bors commented May 7, 2020

⌛ Testing commit 935a05f with merge 97f3eee...

@bors bors mentioned this pull request May 7, 2020
@bors
Copy link
Contributor

bors commented May 7, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: nagisa,oli-obk
Pushing 97f3eee to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 7, 2020
@bors bors merged commit 97f3eee into rust-lang:master May 7, 2020
@oli-obk oli-obk deleted the stacker branch May 7, 2020 10:51
@Mark-Simulacrum Mark-Simulacrum added the relnotes-perf Performance improvements that should be mentioned in the release notes. label May 8, 2020
@Mark-Simulacrum
Copy link
Member

cc @XAMPPRocky I tagged this with relnotes-perf but it's not really perf so much as "hey this fixes a longstanding problem, would be cool to mention"

nnethercote added a commit to nnethercote/rust that referenced this pull request Aug 5, 2020
rustdoc's `main()` immediately spawns a thread, M, with a large stack
(16MiB or 32MiB) on which it runs `main_args()`. `main_args()` does a
small amount of options processing and then calls
`setup_callbacks_and_run_in_default_thread_pool_with_globals()`, which
spawns it own thread, and M is not used further.

So, thread M seems unnecessary. However, it does serve a purpose: if the
options processing in `main_args()` panics, that panic is caught when M
is joined. So M can't simply be removed.

However, `main_options()`, which is called by `main_args()`, has a
`catch_fatal_errors()` call within it. We can move that call to `main()`
and change it to the very similar `catch_with_exit_code()`. With that in
place, M can be removed, and panics from options processing will still
be caught appropriately.

Even better, this makes rustdoc's `main()` match rustc's `main()`, which
also uses `catch_with_exit_code()`.

(Also note that the use of a 16MiB/32MiB stack was eliminated from rustc
in rust-lang#55617.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. relnotes-perf Performance improvements that should be mentioned in the release notes. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet