Implement the `loop_break_value` feature. #37487

goffrie · 2016-10-30T19:33:56Z

This implements RFC 1624, tracking issue #37339.

FnCtxt (in typeck) gets a stack of LoopCtxts, which store the
currently deduced type of that loop, the desired type, and a list of
break expressions currently seen. loop loops get a fresh type
variable as their initial type (this logic is stolen from that for
arrays). while loops get ().
break {expr} looks up the broken loop, and unifies the type of
expr with the type of the loop.
break with no expr unifies the loop's type with ().
When building MIR, loops no longer construct a () value at
termination of the loop; rather, the break expression assigns the
result of the loop.
I have also changed the loop scoping in MIR-building so that the test
of a while loop is not considered to be part of that loop. This makes
the rules consistent with resolve: fix label scopes #37360. The new loop scopes in typeck also
follow this rule. That means that loop { while (break) {} } now
terminates instead of looping forever. This is technically a breaking
change.
On that note, expressions like while break {} and if break {} no
longer parse because {} is interpreted as an expression argument to
break. But no code except compiler test cases should do that anyway
because it makes no sense.
The RFC did not make it clear, but I chose to make break () inside
of a while loop illegal, just in case we wanted to do anything with
that design space in the future.

This is my first time dealing with this part of rustc so I'm sure
there's plenty of problems to pick on here ^_^

rust-highfive · 2016-10-30T19:34:11Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Aatch (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

nagisa · 2016-10-30T22:21:20Z

src/librustc_mir/build/expr/into.rs

+                    body_block = loop_block;
+                }
+
+                let might_break = this.in_loop_scope(


NIT: formatting is extra-weird here. extracting destination into variable here might be better.

I also feel that you might not need the destination to be conditional (i.e. Option<T>) here either. Instead, just assign unit to the destination if there’s break without EXPR.

nagisa · 2016-10-30T22:35:16Z

src/librustc_mir/build/expr/into.rs

                    }
-
-                    // The “return” value of the loop body must always be an unit, but we cannot


I remember this well-formed-ness business being quite non-trivial, so I feel like this comment should stay. That being said its entirely possible that introducing break EXPR made this well-formedness story obsolete, in which case this whole code might be ready to be cleaned up.

AFAICT the comment is about reusing the value of the loop's block (which is always ()) as the value of the loop itself. When you consider that break EXPR exists, though, it becomes apparent (IMO) that this is nonsensical because the value of the loop is not always (). That's why I removed most of the comment.

nagisa · 2016-10-30T22:45:27Z

src/librustc_mir/build/expr/stmt.rs

-        let (exit_block, extent) = {
-            let loop_scope = self.find_loop_scope(span, label);
-            (exit_selector(loop_scope), loop_scope.extent)
-        };
        self.exit_scope(span, extent, block, exit_block);
        self.cfg.start_new_block().unit()


Could you please just inline this? Having a helper method with just 2 lines with 2 callers doesn’t seem to justify it anymore.

nagisa · 2016-10-30T22:45:40Z

src/librustc_mir/build/expr/stmt.rs

+                    (break_block, extent, destination.clone())
+                };
+                if let Some(dest) = destination {
+                    if let Some(value) = value.take() {


nit: value.take() is superfluous and this is ignoring the continuation block returned by either branch. This whole function could instead look like this:

let scope = this.find_loop_scope(expr_span, label); scope.might_break = true; if let Some(ref dst) = scope.destination { block /* this is important! */ = match value { Some(value) => unpack!(block = this.into(dst, block, value)), None => this.cfg.push_assign_unit(block, source_info, dst), }; } this.exit_scope(expr_span, scope.extent, scope.break_block, block); this.cfg.start_new_block().unit()

The reason why this code is so awkward is that we can't keep this borrowed when we get to further calls like this.into. (find_loop_scope borrows &mut self and that's tied to the returned &mut LoopScope).
Also, the first branch already takes care of the returned block (it uses unpack!; into returns BlockAnd<()>) while the second branch doesn't return a block (push_assign_unit returns ()).

.take() is a remnant though, my bad.

Also, the first branch already takes care of the returned block (it uses unpack!

Ah, I missed the block = inside the unpack, my bad.

The reason why this code is so awkward is that we can't keep this borrowed

You can manually relinquish the borrow by doing something along the lines of:

let (extent, break_block) = { let scope = this.find_loop_scope(expr_span, label); scope.might_break = true; if let Some(ref dst) = scope.destination { block /* this is important! */ = match value { Some(value) => unpack!(block = this.into(dst, block, value)), None => this.cfg.push_assign_unit(block, source_info, dst), }; } (scope.extent, scope.break_block) }; this.exit_scope(expr_span, extent, break_block, block); this.cfg.start_new_block().unit()

This should work because both Extent and BasicBlock are Copy AFAIR.

error[E0499]: cannot borrow `*this` as mutable more than once at a time --> src/librustc_mir/build/expr/stmt.rs:91:45 | 87 | let scope = this.find_loop_scope(expr_span, label); | ---- first mutable borrow occurs here ... 91 | unpack!(block = this.into(dest, block, value)) | ----------------^^^^-------------------------- | | | | | second mutable borrow occurs here | in this macro invocation ... 97 | }; | - first borrow ends here

I don't think we can avoid cloning destination without unsafe code because the borrow checker doesn't know that this.into will never pop the loop scope (among other reasons).

I don't think we can avoid cloning destination without unsafe code because the borrow checker doesn't know that this.into will never pop the loop scope (among other reasons).

Right, for this reason the clone is not unnecessary either, so you ought to remove the comment.

nagisa · 2016-10-30T22:50:48Z

src/test/run-pass/loop-break-value.rs

+
+#![feature(loop_break_value)]
+
+enum Void {}


use the ! type instead of defining Void.

nagisa · 2016-10-30T23:13:19Z

src/librustc_passes/diagnostics.rs

@@ -228,4 +228,5 @@ pub impl Foo for Bar {
 register_diagnostics! {
    E0472, // asm! is unsupported on this target
    E0561, // patterns aren't allowed in function pointer types
+    E0571, // `break` with an argument in a `while` loop


What about break in for loops? Can we investigate desugaring while loops into regular loop instead of adding this special case?

nagisa · 2016-10-30T23:14:00Z

This is technically a breaking change.

A silent and very breaking change. Might need discussion and crater runs.

goffrie · 2016-10-30T23:55:20Z

While you are right, I just want to mention that the rules were inconsistent before and MIR building was in the minority, since both the "check loops" pass and label resolution (as of #37360) believed that a break in the test of a while could not refer to the while itself. Though, the construct itself is quite confusing and perhaps we should just ban it (users can write an explicit label if they want).

goffrie · 2016-10-31T00:12:21Z

Hm, I just found out that while let and for are desugared into ExprLoop loops, not ExprWhile (which makes perfect sense in retrospect). I guess that checking pass needs to change regardless. Also my typeck code panics on break () in for loops --- but not while let, go figure.

goffrie · 2016-10-31T01:15:05Z

Alright, I removed the conditional loop destination - break now always writes its argument (or ()). However, in the case of a while loop, a break will lead to two writes of () since we need to do the write in case the test evaluates to false. Do you think that's too ugly?

eddyb · 2016-11-01T02:26:36Z

On that note, expressions like while break {} and if break {} no longer parse because {} is interpreted as an expression argument to
break.
But no code except compiler test cases should do that anyway because it makes no sense.

We try hard to avoid syntax breaking changes, this needs to be put under discussion.

cc @rust-lang/lang @rust-lang/compiler

r? @nikomatsakis

nikomatsakis · 2016-11-01T16:45:26Z

I would have assumed that while break { would parse as while (break) {, pursuant to our usual policy of excluding expressions that end in blocks from within a while. That said, I think that while return { () } has a similar problem, and we appear not to accept that. But I consider this a bug -- and I think there is even an open bug about it?

nikomatsakis · 2016-11-01T21:14:28Z

@nagisa

A silent and very breaking change.

In what sense "silent"? I interpret that to mean "silently changes the semantics of your code, without causing a compilation error" -- is that the case here, or did you have another meaning in mind?

nagisa · 2016-11-01T21:30:34Z

@nikomatsakis the exact example was provided (I guess I ought’ve’d quoted more context):

means that loop { while (break) {} } now terminates instead of looping forever.

This changes semantics of a certain code snippet from something to something else.

Though I wonder what the behaviour was before MIR, with the old trans.

nikomatsakis · 2016-11-02T19:18:30Z

@nagisa Ah, I see! You were not referring to the parsing change, but rather this, which I overlooked:

I have also changed the loop scoping in MIR-building so that the test
of a while loop is not considered to be part of that loop. This makes
the rules consistent with #37360. The new loop scopes in typeck also
follow this rule. That means that loop { while (break) {} } now
terminates instead of looping forever. This is technically a breaking
change.

Interesting. This doesn't actually match my intuition -- but my intuition is weak here, and I suspect that it may change from day to day. But it does seem like the code is (or was) inconsistent in this respect. I think we should make a firm decision here and document it; we could also make some effort to detect code that may change semantics.

cc @jseyfried -- in preparing #37360, did you survey different parts of the compiler to get a feeling for what the "majority" of them tried to do with while break { }?

bors · 2016-11-03T00:50:02Z

☔ The latest upstream changes (presumably #37540) made this pull request unmergeable. Please resolve the merge conflicts.

nikomatsakis · 2016-11-04T00:24:19Z

We discussed some of the corner cases in the @rust-lang/lang meeting and came to these conclusions:

while break { } should parse as while (break) { }.
- This follows typical Rust practice.
- The fact that return does not behave in this way is also considered a bug (Parsing inconsistencies (lambda, proc, return) #28784) -- though I think I neglected to raise this point in the meeting.
As for whether while break { } should break out of the while loop (as it does today) or out of the containing loop (as it does under this PR), it seems like neither is particularly intuitive. Our preference would be to simply make it an error to put break in the condition of a while loop.
- We can make it a hard error and do a test, but it may be that we want to go through a warning period here.

jseyfried · 2016-11-04T03:56:09Z

@nikomatsakis
I am aware of three sources of semantics for while break {}.

After tcx creation, I believe the semantics of while break {} are determined entirely by middle::liveness, which always considers the break to apply to the while loop as opposed to its enclosing loop.
The loops pass checks for breaks outside of loops and considers the break to apply to the while's enclosing loop.
resolve (well, its authors) used to consider the break to apply to the while loop (i.e. the break's label scope includes the while's label). However, due to an apparently unintended consequence of the old hygiene algorithm, the break's label scope happened to not include the while's label. When I removed the old hygiene algorithm in Simplify the macro hygiene algorithm #34570, the while's label got included. However, using the added label caused middle::liveness to mysteriously ICE, so I reverted my unintended change to the already unintended scope. I don't understand why using the label caused an ICE -- I'll investigate.

My intuition is that the break should apply to the while loop, not its enclosing loop. Since a while loop's condition is executed in each iteration of the loop, I think it should be treated like the loop body w.r.t. breaks. That is, I think of while cond { expr } as sugar for loop { if !cond { break }; expr }.

eddyb · 2016-11-04T05:36:52Z

Interesting desugaring, I suppose I didn't consider inverting the condition before. Should we desugar while loops like that in HIR?

jseyfried · 2016-11-04T06:24:18Z

@eddyb Yeah, I think that would be a good idea.

nikomatsakis · 2016-11-04T18:24:07Z

@eddyb heh, for a while we were going to remove while altogether in favor of loop, but decided not to weird people out too much.

@jseyfried I agree that the logical place is that break should target the while loop. However, I think that this is natural to compiler authors -- and maybe people who stop to think very carefully -- but is not obvious at first glance when you just look over the code.

nikomatsakis · 2016-11-08T21:57:35Z

@goffrie did you see my previous comment? how do you feel about trying to update the implementation as described?

goffrie · 2016-11-08T22:03:51Z

Will do. I'll have some time to work on it tomorrow. As far as this PR goes

should we keep the existing inconsistent behaviour for break in the
condition, and fix it later?

On Nov 8, 2016 1:57 PM, "Niko Matsakis" notifications@github.com wrote:

@goffrie https://github.com/goffrie did you see my previous comment?
how do you feel about trying to update the implementation as described?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#37487 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABLtTuiPZOXi6BWx35xPHUziuxA1n3Dwks5q8PBdgaJpZM4KkeCn
.

nikomatsakis · 2016-11-09T22:24:52Z

@goffrie

should we keep the existing inconsistent behaviour for break in the condition, and fix it later?

seems good, maybe open an issue on this topic?

goffrie · 2016-11-10T07:09:37Z

Alright, I'm now treating the break in while break {} to refer to the while. Also, while break {} parses correctly.

seems good, maybe open an issue on this topic?

There's #37576.

nikomatsakis

This basically looks good -- happy to r+ if you address nits plus answer that one question? (Maybe I'm missing something there...). Nice job!

nikomatsakis · 2016-11-14T23:49:34Z

src/librustc_passes/loops.rs

+                            Some(match self.hir_map.expect_expr(loop_id).node {
+                                hir::ExprWhile(..) => LoopKind::WhileLoop,
+                                hir::ExprLoop(_, _, source) => LoopKind::Loop(source),
+                                _ => span_bug!(e.span, "break label resolved to a non-loop"),


Nit: I usually find it helpful to include some more debugging info in cases like these. e.g.,

ref r => span_bug!(e.span, "break label resolved to a non-loop: {:?}", r),

that said, having the span is probably enough here

nikomatsakis · 2016-11-14T23:50:24Z

src/librustc_passes/loops.rs

+                    } else if let Loop(kind) = self.cx {
+                        Some(kind)
+                    } else {
+                        None


is this case an error that will be reported elsewhere? if so, can you leave a comment saying so (and maybe hinting at where?)

I'm adding a comment. This case is a break outside of a loop, which will be caught just below.

nikomatsakis · 2016-11-14T23:55:02Z

src/librustc_typeck/check/mod.rs

+                    } else {
+                        // `break` without argument acts like `break ()`.
+                        e_ty = tcx.mk_nil();
+                        origin = TypeOrigin::Misc(expr.span);


We could probably give better errors here if we used a more specific origin, sort of like we do for match arms.

nikomatsakis · 2016-11-14T23:59:29Z

src/librustc_typeck/check/mod.rs

                // No way to know whether it's diverging because
                // of a `break` or an outer `break` or `return.
                self.diverges.set(Diverges::Maybe);

-                tcx.mk_nil()
+                ctxt.unified
            } else {
                tcx.types.never


hmm, do we have any tests around this case? I'm a bit surprised we don't get an "unconstrained type variable" for unified, at least in some cases. What happens if you do something like this, for example?

fn foo() { loop { }; }

ctxt.unified is only used if ctxt.may_break is true, i.e. there was at least one break, which indicates that the type variable has been unified with something (either the type of a break argument, or ()).

bors · 2016-11-16T19:49:38Z

☔ The latest upstream changes (presumably #37545) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2016-11-17T18:57:58Z

☔ The latest upstream changes (presumably #37732) made this pull request unmergeable. Please resolve the merge conflicts.

nikomatsakis · 2016-11-17T19:33:17Z

Good enough. @goffrie after you rebase, ping me and I will r+.

KalitaAlexey · 2016-11-18T04:43:13Z

Dear @nikomatsakis,
I can't find in the changes any changes in the Rust Book.
Is information about this feature going to be added in an another pull request?

goffrie · 2016-11-18T06:14:15Z

@nikomatsakis alright, rebased!

nikomatsakis · 2016-11-22T01:40:56Z

@bors r+

bors · 2016-11-22T01:40:57Z

📌 Commit 712082e has been approved by nikomatsakis

nikomatsakis · 2016-11-22T01:41:44Z

@KalitaAlexey

Is information about this feature going to be added in an another pull request?

before it is stabilized, yes -- but in the meantime, you can checkout RFC 1624

bors · 2016-11-22T03:15:47Z

🔒 Merge conflict

bors · 2016-11-22T03:17:57Z

☔ The latest upstream changes (presumably #37642) made this pull request unmergeable. Please resolve the merge conflicts.

This implements RFC 1624, tracking issue rust-lang#37339. - `FnCtxt` (in typeck) gets a stack of `LoopCtxt`s, which store the currently deduced type of that loop, the desired type, and a list of break expressions currently seen. `loop` loops get a fresh type variable as their initial type (this logic is stolen from that for arrays). `while` loops get `()`. - `break {expr}` looks up the broken loop, and unifies the type of `expr` with the type of the loop. - `break` with no expr unifies the loop's type with `()`. - When building MIR, `loop` loops no longer construct a `()` value at termination of the loop; rather, the `break` expression assigns the result of the loop. `while` loops are unchanged. - `break` respects contexts in which expressions may not end with braced blocks. That is, `while break { break-value } { while-body }` is illegal; this preserves backwards compatibility. - The RFC did not make it clear, but I chose to make `break ()` inside of a `while` loop illegal, just in case we wanted to do anything with that design space in the future. This is my first time dealing with this part of rustc so I'm sure there's plenty of problems to pick on here ^_^

goffrie · 2016-11-22T04:28:10Z

rebased :/

aturon · 2016-11-22T16:40:50Z

@bors r=nikomatsakis

aturon · 2016-11-22T18:46:23Z

Since the r= notation doesn't seem to be working...

@bors r+

nikomatsakis · 2016-11-22T20:11:18Z

@bors r-

nikomatsakis · 2016-11-22T20:11:23Z

@bors r+

bors · 2016-11-22T20:11:24Z

📌 Commit 9d42549 has been approved by nikomatsakis

bors · 2016-11-22T23:52:00Z

⌛ Testing commit 9d42549 with merge 1cabe21...

Implement the `loop_break_value` feature. This implements RFC 1624, tracking issue #37339. - `FnCtxt` (in typeck) gets a stack of `LoopCtxt`s, which store the currently deduced type of that loop, the desired type, and a list of break expressions currently seen. `loop` loops get a fresh type variable as their initial type (this logic is stolen from that for arrays). `while` loops get `()`. - `break {expr}` looks up the broken loop, and unifies the type of `expr` with the type of the loop. - `break` with no expr unifies the loop's type with `()`. - When building MIR, loops no longer construct a `()` value at termination of the loop; rather, the `break` expression assigns the result of the loop. - ~~I have also changed the loop scoping in MIR-building so that the test of a while loop is not considered to be part of that loop. This makes the rules consistent with #37360. The new loop scopes in typeck also follow this rule. That means that `loop { while (break) {} }` now terminates instead of looping forever. This is technically a breaking change.~~ - ~~On that note, expressions like `while break {}` and `if break {}` no longer parse because `{}` is interpreted as an expression argument to `break`. But no code except compiler test cases should do that anyway because it makes no sense.~~ - The RFC did not make it clear, but I chose to make `break ()` inside of a `while` loop illegal, just in case we wanted to do anything with that design space in the future. This is my first time dealing with this part of rustc so I'm sure there's plenty of problems to pick on here ^_^

bors · 2016-11-23T03:54:08Z

rust-highfive assigned Aatch Oct 30, 2016

nagisa requested changes Oct 30, 2016

View reviewed changes

rust-highfive assigned nikomatsakis and unassigned Aatch Nov 1, 2016

eddyb added I-nominated T-lang Relevant to the language team labels Nov 1, 2016

nagisa mentioned this pull request Nov 4, 2016

It is not intuitive which loop is broken out from in loop { while break {} } #37576

Closed

goffrie force-pushed the break branch from 93fa646 to d38b5c9 Compare November 10, 2016 07:08

nikomatsakis removed the I-nominated label Nov 10, 2016

nikomatsakis requested changes Nov 15, 2016

View reviewed changes

goffrie force-pushed the break branch from 2d08c0c to 9c3fa4c Compare November 16, 2016 06:41

goffrie force-pushed the break branch from 9c3fa4c to bd02ceb Compare November 17, 2016 05:03

nikomatsakis approved these changes Nov 17, 2016

View reviewed changes

goffrie force-pushed the break branch from bd02ceb to 712082e Compare November 18, 2016 06:13

goffrie force-pushed the break branch from 712082e to 9d42549 Compare November 22, 2016 04:26

bors merged commit 9d42549 into rust-lang:master Nov 23, 2016

bors mentioned this pull request Nov 23, 2016

i128 and u128 support #37900

Closed

2 tasks

goffrie deleted the break branch November 23, 2016 06:04

withoutboats mentioned this pull request Jan 8, 2017

Add 'else match' blocks to if expressions. rust-lang/rfcs#1712

Closed

nikomatsakis mentioned this pull request Feb 16, 2017

ICE in while break {} #39867

Closed

		}

		// The “return” value of the loop body must always be an unit, but we cannot


		#![feature(loop_break_value)]

		enum Void {}

Implement the loop_break_value feature. #37487

Implement the loop_break_value feature. #37487

Uh oh!

Conversation

goffrie commented Oct 30, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rust-highfive commented Oct 30, 2016

Uh oh!

nagisa Oct 30, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagisa Oct 30, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagisa commented Oct 30, 2016

Uh oh!

goffrie commented Oct 30, 2016

Uh oh!

goffrie commented Oct 31, 2016

Uh oh!

goffrie commented Oct 31, 2016

Uh oh!

eddyb commented Nov 1, 2016

Uh oh!

nikomatsakis commented Nov 1, 2016

Uh oh!

nikomatsakis commented Nov 1, 2016

Uh oh!

nagisa commented Nov 1, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikomatsakis commented Nov 2, 2016

Uh oh!

bors commented Nov 3, 2016

Uh oh!

nikomatsakis commented Nov 4, 2016

Uh oh!

jseyfried commented Nov 4, 2016

Uh oh!

eddyb commented Nov 4, 2016

Uh oh!

jseyfried commented Nov 4, 2016

Uh oh!

nikomatsakis commented Nov 4, 2016

Uh oh!

nikomatsakis commented Nov 8, 2016

Uh oh!

goffrie commented Nov 8, 2016

Uh oh!

nikomatsakis commented Nov 9, 2016

Uh oh!

goffrie commented Nov 10, 2016

Uh oh!

nikomatsakis left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Implement the `loop_break_value` feature. #37487

Implement the `loop_break_value` feature. #37487

goffrie commented Oct 30, 2016 •

edited

Loading

nagisa Oct 30, 2016 •

edited

Loading

nagisa Oct 30, 2016 •

edited

Loading

nagisa commented Nov 1, 2016 •

edited

Loading