Don't store locals that have been moved from in generators #61922

Merged
merged 7 commits into from
Jul 2, 2019

Conversation

tmandry
Member

@tmandry tmandry commented Jun 18, 2019

This avoids reserving storage in generators for locals that are moved
out of (and not re-initialized) prior to yield points. Fixes #59123.

This adds a new dataflow analysis, RequiresStorage, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are:

  1. StorageLive(x) => mark x live
  2. StorageDead(x) => mark x dead
  3. If a local is moved from, and has never had its address taken, mark it dead
  4. If (any part of) a local is initialized, mark it live

This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout.
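
As a rough illustration of the four rules, here is a minimal sketch that applies them to a straight-line list of toy statements. The `Stmt` enum and the `requires_storage` function are hypothetical stand-ins made up for this example; they are not the rustc types or the actual `RequiresStorage` implementation, and the sketch ignores control flow entirely.

```rust
use std::collections::BTreeSet;

/// Toy stand-ins for the MIR events the rules mention (hypothetical, not rustc types).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Stmt {
    StorageLive(usize),
    StorageDead(usize),
    /// The address of the local is taken, e.g. `&x`.
    Borrow(usize),
    /// The local is moved out of.
    Move(usize),
    /// (Any part of) the local is initialized.
    Init(usize),
}

/// Returns the locals that still require storage after running `stmts` in order.
fn requires_storage(stmts: &[Stmt]) -> BTreeSet<usize> {
    let mut live = BTreeSet::new();
    let mut ever_borrowed = BTreeSet::new();
    for &stmt in stmts {
        match stmt {
            // Rule 1: StorageLive(x) => mark x live.
            Stmt::StorageLive(l) => {
                live.insert(l);
            }
            // Rule 2: StorageDead(x) => mark x dead.
            Stmt::StorageDead(l) => {
                live.remove(&l);
            }
            // Remember that the address was taken; rule 3 no longer applies to `l`.
            Stmt::Borrow(l) => {
                ever_borrowed.insert(l);
            }
            // Rule 3: a move kills the local only if its address was never taken.
            Stmt::Move(l) => {
                if !ever_borrowed.contains(&l) {
                    live.remove(&l);
                }
            }
            // Rule 4: initializing (any part of) a local marks it live.
            Stmt::Init(l) => {
                live.insert(l);
            }
        }
    }
    live
}

fn main() {
    use Stmt::*;
    // Roughly `let x = ...; let y = x;` with locals 0 (x) and 1 (y).
    let moved = [StorageLive(0), Init(0), StorageLive(1), Move(0), Init(1)];
    // The same sequence, but `x` has its address taken before the move.
    let borrowed = [StorageLive(0), Init(0), Borrow(0), StorageLive(1), Move(0), Init(1)];
    println!("moved:    {:?}", requires_storage(&moved));
    println!("borrowed: {:?}", requires_storage(&borrowed));
}
```

In the first sequence only local 1 still needs storage at the end, so a yield at that point would not have to save local 0. In the second, the earlier borrow keeps local 0 in the set.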

Here's the size in bytes of all testcases included in the change, before and after the change:

async fn test    |Size before |Size after
-----------------|------------|----------
single           | 1028       | 1028
single_with_noop | 2056       | 1032
joined           | 5132       | 3084
joined_with_noop | 8208       | 3084

generator test              |Size before |Size after
----------------------------|------------|----------
move_before_yield           | 1028       | 1028
move_before_yield_with_noop | 2056       | 1032
overlap_move_points         | 3080       | 2056

Future work

Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, and either has never had its address taken, or is Freeze and has never been mutably borrowed, mark it dead." This was discussed at length in #59123 and then #61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully.

A more immediate priority for me is inlining std::mem::size_of_val(&x) so it becomes apparent that the address of x is not taken. This way, using size_of_val to look at the size of your inner futures does not affect the size of your outer future.
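
To make the `size_of_val` concern concrete, here is a small sketch that measures two async fns, one of which inspects its inner future with `size_of_val(&fut)` before awaiting it. `BigFut`, `plain`, and `inspected` are names invented for this example (only loosely modeled on the test names above), and the code needs the 2018 edition or later. The exact sizes printed depend on the compiler version, so none are asserted here.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A hypothetical 1 KiB future that completes immediately.
#[allow(dead_code)]
struct BigFut([u8; 1024]);

impl Future for BigFut {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Ready(())
    }
}

/// Moves the inner future straight into the await.
async fn plain() {
    let fut = BigFut([0; 1024]);
    fut.await;
}

/// Takes the address of the inner future before awaiting it.
async fn inspected() {
    let fut = BigFut([0; 1024]);
    // This borrow is the kind of address-taking that rule 3 is conservative about.
    let _size = size_of_val(&fut);
    fut.await;
}

fn main() {
    // Creating an async fn's future does not run it, so no executor is needed
    // just to measure its size.
    println!("plain:     {} bytes", size_of_val(&plain()));
    println!("inspected: {} bytes", size_of_val(&inspected()));
}
```

If the second number is noticeably larger than the first on a given toolchain, the borrow introduced by `size_of_val` is the likely cause.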

cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc

@rust-highfive
Collaborator

r? @matthewjasper

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 18, 2019
@bstrie
Contributor

bstrie commented Jun 18, 2019

Are there any size comparisons so we can see the magnitude of the effect of this patch?

@RalfJung
Member

So what are the assumptions about MIR semantics that are made here? Would be great to have that summarized in the code somewhere and in the PR description, so that one does not have to reverse engineer that from the dataflow analysis.

@RalfJung
Member

Oh also, what would be really awesome (both for this PR and #60187) is if you could write some code that would "contradict" this optimization and thus should have UB. This would be helpful as a Miri testcase and also to better document the assumption that is being made.

That might not always be possible though.

@Zoxc
Contributor

Zoxc commented Jun 18, 2019

It would be nice to see a minimal MIR example of a case it's supposed to help with too.

@tmandry tmandry force-pushed the moar-generator-optimization branch 2 times, most recently from 92f55b7 to e5214a9 Compare June 21, 2019 01:28
@tmandry tmandry changed the title [WIP] Don't store locals that have been moved from in generators Don't store locals that have been moved from in generators Jun 21, 2019
@tmandry
Member Author

tmandry commented Jun 21, 2019

Oh also, what would be really awesome (both for this PR and #60187) is if you could write some code that would "contradict" this optimization and thus should have UB. This would be helpful as a Miri testcase and also to better document the assumption that is being made.

@RalfJung The minimal testcase that produces UB would be something like

fn main() {
  static || {
    let x = String::from("42");
    let y = x;
    yield;
    assert!(&x == "42");
  }
}

i.e., you have to depend on the value of a var after moving it. (And this has to be done after a yield point to trigger the optimization and thus the UB.)

AFAIK there's no way to write this testcase in surface Rust today. If you borrow x before moving from it, the optimization is defeated (as I mentioned over in the #59123 discussion, I'd like to remove this restriction after we do something like #61849). But if the borrow occurs after the move in MIR then the optimization is still enabled, so it's possible the testcase could be written in MIR.

@tmandry
Member Author

tmandry commented Jun 21, 2019

It would be nice to see a minimal MIR example of a case it's supposed to help with too.

@Zoxc I included some testcases, do you think MIR examples would be more clear?

@Zoxc
Contributor

Zoxc commented Jun 21, 2019

@tmandry Yes. It's easier to see the problematic unwind edges with MIR.

@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

travis_time:end:28cc7da5:start=1561080575161169292,finish=1561080664852383195,duration=89691213903
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55
---
Check compiletest suite=run-pass mode=run-pass (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:56:40] 
[00:56:40] running 2922 tests
[00:56:51] .................................................................................................... 100/2922
[00:57:03] .................................................F..........................i....................... 200/2922
[00:57:23] .................................................................................................... 400/2922
[00:57:32] .................................................................................................... 500/2922
[00:57:43] .................................................................................................... 600/2922
[00:57:58] .................................................................................................... 700/2922
---
[01:03:02] failures:
[01:03:02] 
[01:03:02] ---- [run-pass] run-pass/async-fn-size-moved-locals.rs stdout ----
[01:03:02] 
[01:03:02] error: test compilation failed although it shouldn't!
[01:03:02] status: exit code: 1
[01:03:02] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/run-pass/async-fn-size-moved-locals.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/async-fn-size-moved-locals/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/async-fn-size-moved-locals/auxiliary"
[01:03:02] ------------------------------------------
[01:03:02] 
[01:03:02] ------------------------------------------
[01:03:02] stderr:
[01:03:02] stderr:
[01:03:02] ------------------------------------------
[01:03:02] error[E0670]: `async fn` is not permitted in the 2015 edition
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:52:1
[01:03:02]    |
[01:03:02] LL | async fn single() {
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0670]: `async fn` is not permitted in the 2015 edition
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:57:1
[01:03:02]    |
[01:03:02] LL | async fn single_with_noop() {
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0670]: `async fn` is not permitted in the 2015 edition
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:63:1
[01:03:02]    |
[01:03:02] LL | async fn joined() {
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0670]: `async fn` is not permitted in the 2015 edition
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:76:1
[01:03:02]    |
[01:03:02] LL | async fn joined_with_noop() {
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0609]: no field `await` on type `BigFut`
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:54:7
[01:03:02]    |
[01:03:02] LL |     x.await;
[01:03:02]    |
[01:03:02]    = note: available fields are: `0`
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0609]: no field `await` on type `BigFut`
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:60:7
[01:03:02]    |
[01:03:02] LL |     x.await;
[01:03:02]    |
[01:03:02]    = note: available fields are: `0`
[01:03:02] 
[01:03:02] 
[01:03:02] error[E0609]: no field `await` on type `Joiner`
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:73:12
[01:03:02]    |
[01:03:02] LL |     joiner.await
[01:03:02]    |
[01:03:02]    |
[01:03:02]    = note: available fields are: `a`, `b`, `c`
[01:03:02] 
[01:03:02] error[E0609]: no field `await` on type `Joiner`
[01:03:02]   --> /checkout/src/test/run-pass/async-fn-size-moved-locals.rs:87:12
[01:03:02]    |
[01:03:02] LL |     joiner.await
[01:03:02]    |
[01:03:02]    |
[01:03:02]    = note: available fields are: `a`, `b`, `c`
[01:03:02] error: aborting due to 8 previous errors
[01:03:02] 
[01:03:02] Some errors have detailed explanations: E0609, E0670.
[01:03:02] For more information about an error, try `rustc --explain E0609`.
---
[01:03:02] thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:521:22
[01:03:02] note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
[01:03:02] 
[01:03:02] 
[01:03:02] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/run-pass" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "run-pass" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-6.0/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "6.0.0\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
[01:03:02] 
[01:03:02] 
[01:03:02] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:03:02] Build completed unsuccessfully in 0:59:03
---
156504 ./src/llvm-project/clang
150156 ./obj/build/bootstrap/debug/incremental
145036 ./obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools
134688 ./obj/build/bootstrap/debug/incremental/bootstrap-1llt3ypt1ftzv
134684 ./obj/build/bootstrap/debug/incremental/bootstrap-1llt3ypt1ftzv/s-fdcum8p00l-azhk3x-2adl2f17sobif
118076 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc
108532 ./src/llvm-project/lldb
98116 ./obj/build/x86_64-unknown-linux-gnu/stage0-std
97592 ./src/llvm-project/clang/test
---
travis_time:end:04356d38:start=1561084457058468223,finish=1561084457064196197,duration=5727974
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:0f827862
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:0b72d434
travis_time:start:0b72d434
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:1d8c4e00
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@tmandry tmandry force-pushed the moar-generator-optimization branch from e5214a9 to 11631e9 Compare June 21, 2019 21:12
@tmandry
Copy link
Member Author

tmandry commented Jun 22, 2019

Sorry, I should have pointed out that the problem is described quite well by @Matthias247, at the level of a CFG diagram, over at #59123. In brief, from the MIR it looks like we can reach a drop of a variable after it has already been moved out of. In reality, no execution will ever run the drop after the move, because of the state of the drop flag.

If we didn't have drop flags, we might not need this change. However, there's another problem I think this pass could solve, but doesn't yet. Consider:

|| {
  let first = [0; 1024];
  yield;
  let second = first;
  yield;
  let _third = second;
  yield;
}

We can definitely overlap first and _third in the generator layout, as their storage is never needed at the same time. The original generator optimization cannot optimize this, as it only looks at StorageLive and StorageDead, and the generation of those is tied to variable scopes right now. With this change, in theory it could see that first is done by the time _third becomes StorageLive. I need to dig into why this isn't happening already.

(EDIT: It's because I was only using RequiresStorage to decide whether to store locals at all, not whether they can be overlapped. I might change this.)

Contributor

@matthewjasper matthewjasper left a comment

cc @pnkfelix @ecstatic-morse for the dataflow changes

src/librustc_mir/dataflow/impls/storage_liveness.rs Outdated Show resolved Hide resolved
src/librustc_mir/dataflow/mod.rs Outdated Show resolved Hide resolved
src/librustc_mir/dataflow/mod.rs Outdated Show resolved Hide resolved
@RalfJung
Member

AFAIK there's no way to write this testcase in surface Rust today. If you borrow x before moving from it, the optimization is defeated

To be clear, this includes any way to take x's address? Taking the address is the relevant part here, not whatever the borrow checker does.

So basically soundness of optimizations relies only on the fact that if the address of a local has not been taken, after moving out there is just no way to access that storage again. It can't be accessed through existing pointers because none have been taken, and new pointers cannot be created because the move checker would prevent that.

I may have asked this before, but you only do this for "whole" locals right, not the individual fields of a struct? There should probably be a comment somewhere saying that this is important. (Taking the address of one field "leaks" information about other fields, so we'd have to commit to a stricter semantics if we wanted to have UB from those fields being accessed.)

But if the borrow occurs after the move in MIR then the optimization is still enabled, so it's possible the testcase could be written in MIR.

Unfortunately without a MIR parser or so, we can't feed MIR to Miri. :/ Cc rust-lang/miri#196

StatementKind::StorageDead(l) => sets.kill(l),
StatementKind::Assign(ref place, _)
| StatementKind::SetDiscriminant { ref place, .. } => {
place.base_local().map(|l| sets.gen(l));
Contributor

Is this correct if place is a deref projection? In other words, do you want *(_1) = ... to gen _1 and not do anything else?

Member Author

I believe so, because we don't kill the local if it has ever had its address taken. So there's no need to track pointers and derefs so we can gen it again.

@tmandry
Member Author

tmandry commented Jun 24, 2019

To be clear, this includes any way to take x's address? Taking the address is the relevant part here, not whatever the borrow checker does.

Yes, and specifically I'm checking for Rvalue::Ref.

So basically soundness of optimizations relies only on the fact that if the address of a local has not been taken, after moving out there is just no way to access that storage again. It can't be accessed through existing pointers because none have been taken, and new pointers cannot be created because the move checker would prevent that.

Yep!

I may have asked this before, but you only do this for "whole" locals right, not the individual fields of a struct? There should probably be a comment somewhere saying that this is important. (Taking the address of one field "leaks" information about other fields, so we'd have to commit to a stricter semantics if we wanted to have UB from those fields being accessed.)

That's right; I'll add a comment to make this clear.

@tmandry tmandry force-pushed the moar-generator-optimization branch 2 times, most recently from c6ea824 to e39e3cd Compare June 24, 2019 19:28
@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

travis_time:end:035ea7e5:start=1561404569756646119,finish=1561404647815978248,duration=78059332129
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55
---
[00:39:50]    Compiling rustc_typeck v0.0.0 (/checkout/src/librustc_typeck)
[00:40:24] error: outlives requirements can be inferred
[00:40:24]   --> src/librustc_mir/dataflow/impls/storage_liveness.rs:89:38
[00:40:24]    |
[00:40:24] 89 | pub struct RequiresStorage<'mir, 'tcx: 'mir, 'b> {
[00:40:24]    |
[00:40:24] note: lint level defined here
[00:40:24]   --> src/librustc_mir/lib.rs:30:9
[00:40:24]    |
[00:40:24]    |
[00:40:24] 30 | #![deny(rust_2018_idioms)]
[00:40:24]    |         ^^^^^^^^^^^^^^^^
[00:40:24]    = note: #[deny(explicit_outlives_requirements)] implied by #[deny(rust_2018_idioms)]
[00:40:24] error: outlives requirements can be inferred
[00:40:24]    --> src/librustc_mir/dataflow/impls/storage_liveness.rs:196:36
[00:40:24]     |
[00:40:24]     |
[00:40:24] 196 | struct MoveVisitor<'a, 'b, 'c, 'mir: 'a, 'tcx> {
[00:40:24] 
[00:40:24] error: aborting due to 2 previous errors
[00:40:24] 
[00:40:24] error: Could not compile `rustc_mir`.
---
travis_time:end:02b0fa00:start=1561407273504890228,finish=1561407273509872935,duration=4982707
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:1d4b0c62
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:2a978828
travis_time:start:2a978828
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:1c180491
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors
Contributor

bors commented Jun 24, 2019

☔ The latest upstream changes (presumably #61787) made this pull request unmergeable. Please resolve the merge conflicts.

} else {
self.flow_state.reconstruct_statement_effect(loc);
self.flow_state.apply_local_effect(loc);
}
Contributor

@ecstatic-morse ecstatic-morse Jun 25, 2019

This has slightly different semantics than DataflowResultsConsumer. Your code applies all of {before_,}{statement,terminator}_effect for the statement at loc to the dataflow state, whereas DataflowResultsConsumer only applies before_{statement,terminator}_effect. Both your code and DataflowResultsConsumer pick up the effects on {statement,terminator}_effect on the transfer function, however.

I think you should not call apply_local_effect for that last statement (similar to what state_for_location does) and document precisely what state the underlying DataflowResults will be in after calling seek.

Member Author

Done, PTAL. I wish the implementation were more elegant.

@tmandry tmandry force-pushed the moar-generator-optimization branch from e39e3cd to 453fbde Compare June 25, 2019 21:28
@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

travis_time:end:0b6476e3:start=1561498206946919488,finish=1561498210505857504,duration=3558938016
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55
---
[00:38:29]    Compiling rustc_mir v0.0.0 (/checkout/src/librustc_mir)
[00:39:03] error: outlives requirements can be inferred
[00:39:03]   --> src/librustc_mir/dataflow/impls/storage_liveness.rs:80:38
[00:39:03]    |
[00:39:03] 80 | pub struct RequiresStorage<'mir, 'tcx: 'mir, 'b> {
[00:39:03]    |
[00:39:03] note: lint level defined here
[00:39:03]   --> src/librustc_mir/lib.rs:30:9
[00:39:03]    |
[00:39:03]    |
[00:39:03] 30 | #![deny(rust_2018_idioms)]
[00:39:03]    |         ^^^^^^^^^^^^^^^^
[00:39:03]    = note: #[deny(explicit_outlives_requirements)] implied by #[deny(rust_2018_idioms)]
[00:39:03] error: outlives requirements can be inferred
[00:39:03]    --> src/librustc_mir/dataflow/impls/storage_liveness.rs:178:32
[00:39:03]     |
[00:39:03]     |
[00:39:03] 178 | struct MoveVisitor<'a, 'b, 'mir: 'a, 'tcx> {
[00:39:03] 
[00:39:03] error: aborting due to 2 previous errors
[00:39:03] 
[00:39:04] error: Could not compile `rustc_mir`.
---
travis_time:end:1700ae9a:start=1561500734100287484,finish=1561500734105280759,duration=4993275
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:03d0adc9
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:0698e263
travis_time:start:0698e263
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:02630400
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@tmandry
Member Author

tmandry commented Jun 29, 2019

Oh, I forgot to use RequiresStorage in determining which locals can overlap. I changed that and added a test.

Contributor

@matthewjasper matthewjasper left a comment

r=me with comments addressed

src/librustc_mir/dataflow/impls/storage_liveness.rs Outdated Show resolved Hide resolved
src/librustc_mir/dataflow/mod.rs Outdated Show resolved Hide resolved
@tmandry
Member Author

tmandry commented Jul 1, 2019

@bors r=matthewjasper

@bors
Contributor

bors commented Jul 1, 2019

📌 Commit a68e2c7 has been approved by matthewjasper

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 1, 2019
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 2, 2019
…, r=matthewjasper

@bors
Contributor

bors commented Jul 2, 2019

⌛ Testing commit a68e2c7 with merge 848e0a2...

bors added a commit that referenced this pull request Jul 2, 2019
…jasper

@bors
Contributor

bors commented Jul 2, 2019

☀️ Test successful - checks-azure, checks-travis, status-appveyor
Approved by: matthewjasper
Pushing 848e0a2 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 2, 2019
@bors bors merged commit a68e2c7 into rust-lang:master Jul 2, 2019
@tmandry tmandry deleted the moar-generator-optimization branch July 2, 2019 20:08
@Matthias247
Contributor

Just for reference: The size of the future in my program dropped from 32kB to 12kB with this change - while the previous change already brought it down from 64kB to 32kB. We are getting there! Thanks @tmandry !

Unfortunately some doubling of future sizes still seems to happen, as initially reported in #59087. I have seen that @tmandry has now also opened #62321 for this issue (I used size_of_val, and removing it definitely shrinks the future).

@tmandry
Member Author

tmandry commented Jul 8, 2019

@Matthias247 Just to make sure, does the doubling all seem to come from your use of size_of_val, or are there other causes?

I know that any borrowing of the future before await will cause doubling (#59087), but I'm wondering if you've noticed any cases of that in your code other than size_of_val (#62321).

And if there is doubling without any borrowing, I'd definitely like to know about that.

tmandry added a commit to tmandry/rust that referenced this pull request Aug 6, 2019
I tested the generator optimizations in rust-lang#60187 and rust-lang#61922 on the Fuchsia
build, and noticed that some small generators (about 8% of the async fns
in our build) increased in size slightly.

This is because in rust-lang#60187 we split the fields into two groups, a
"prefix" non-overlap region and an overlap region, and lay them out
separately. This can introduce unnecessary padding bytes between the two
groups.

In every single case in the Fuchsia build, it was due to there being
only a single variant being used in the overlap region. This means that
we aren't doing any overlapping, period. So it's better to combine the
two regions into one and lay out all the fields at once, which is what
this change does.
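
The padding effect described above can be imitated with ordinary `#[repr(C)]` structs. This is only an analogy under stated assumptions: the struct names (`Prefix`, `Overlap`, `Split`, `Flat`) and field types are invented for illustration, and the compiler's generator layout code does not literally work this way. It shows how laying out two groups of fields separately can cost more bytes than laying out all the fields at once.

```rust
use std::mem::size_of;

/// Hypothetical "prefix" group: 5 bytes of data, padded to 8 for u32 alignment.
#[repr(C)]
#[allow(dead_code)]
struct Prefix {
    a: u32,
    b: u8,
}

/// Hypothetical "overlap" group with a single one-byte field.
#[repr(C)]
#[allow(dead_code)]
struct Overlap {
    c: u8,
}

/// The two groups laid out separately, one after the other.
#[repr(C)]
#[allow(dead_code)]
struct Split {
    prefix: Prefix,
    overlap: Overlap,
}

/// The same three fields laid out together.
#[repr(C)]
#[allow(dead_code)]
struct Flat {
    a: u32,
    b: u8,
    c: u8,
}

fn main() {
    // `Split` cannot reuse the tail padding of `Prefix`, so it is larger.
    println!("split: {} bytes", size_of::<Split>());
    println!("flat:  {} bytes", size_of::<Flat>());
}
```

With `#[repr(C)]`, `c` in `Split` has to start after the padded end of `Prefix`, while in `Flat` it fits into what would otherwise be padding.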
Centril added a commit to Centril/rust that referenced this pull request Aug 6, 2019
…ssions, r=cramertj

Fix generator size regressions due to optimization

r? @cramertj
cc @eddyb @Zoxc
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generator size: unwinding and drops force extra generator state allocation