Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new scoping rules for safe dtors can yield spurious semi-colon or trailing unit expr #21114

Closed
pnkfelix opened this issue Jan 13, 2015 · 43 comments
Labels
A-destructors Area: Destructors (`Drop`, …) A-lifetimes Area: Lifetimes / regions C-bug Category: This is a bug. fixed-by-NLL Bugs fixed, but only when NLL is enabled. metabug Issues about issues themselves ("bugs about bugs") T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Jan 13, 2015

Spawned off of #21022, #21972

There are a number of situations where a trailing expression like a for-loop ends up being treated by the region lifetime inferencer as requiring a much longer lifetime assignment than what you would intuitively expect.

A lot of the relevant cases have been addressed. In fact, its gotten to the point that I want to have a summary comment up here that specifies the status for each example. Each demo should link to a playpen (using AST-borrowck when that has a checkmark, and NLL when AST-borrowck does not have a checkmark). Some rows also include a link to the comment where it was pointed out.

  • A checkmark (✅) is used for cases where either 1. code is now accepted, or 2. code is (still) rejected but has a clear diagnostic explaining why.
  • The column marked 2018 is testing here the 2018 edition, where migration mode gets most of the benefits from NLL. They will also eventually trickle down to the 2015 edition when we switch that to use NLL as well.
  • The magnifying glass (🔍) is for cases where we issue a diagnostic suggesting that one add a local binding or a semi-colon (depending on whether the value is needed or not), explaining that the temporaries (under the current language semantics) are being dropped too late without it. This sort of case is, to my eye, acceptable.
  • The column marked NLL is for cases where you need to opt into #![feature(nll)] (or use the synonymous command line options -Z borrowck=mir -Z two-phase-borrows) to get the nicest output. In most cases the 2018 output is the same, so I didn't fill in the column in those cases.
Example on playpen 2015 2018 NLL source
io::stdin demo 1
io::stdin demo 2 🤢
alloc::rc demo
ebfull drain demo 🤢 #21114 (comment)
felispere stdio demo 🤢 #21114 (comment)
drbawb demo #21114 (comment)
niconii demo 🤢 🔍 #21114 (comment)
kixunil demo 🤢 #21114 (comment)
thiez demo 1 🤢 #21114 (comment)
thiez demo 2 🤢 🔍 #21114 (comment)
shepmaster demo 🤢 #21114 (comment)
jonas-schievink demo 🤢 🤢 #42574
stephaneyfx demo 1 🤢 🔍 #46413 (comment)
stephaneyfx demo 2 🤢 🔍 #46413 (comment)

More details for issue follow.

A simple example of this (from examples embedded in the docs for io::stdio):

    use std::io;

    let mut stdin = io::stdin();
    for line in stdin.lock().lines() {
        println!("{}", line.unwrap());
    };

Another example, this time from an example for alloc::rc (where this time I took the time to add a comment explaining what I encountered):

fn main() {
     let gadget_owner : Rc<Owner> = Rc::new(
            Owner {
                name: "Gadget Man".to_string(),
                gadgets: RefCell::new(Vec::new())
            }
    );

    let gadget1 = Rc::new(Gadget{id: 1, owner: gadget_owner.clone()});
    let gadget2 = Rc::new(Gadget{id: 2, owner: gadget_owner.clone()});

    gadget_owner.gadgets.borrow_mut().push(gadget1.clone().downgrade());
    gadget_owner.gadgets.borrow_mut().push(gadget2.clone().downgrade());

    for gadget_opt in gadget_owner.gadgets.borrow().iter() {
        let gadget = gadget_opt.upgrade().unwrap();
        println!("Gadget {} owned by {}", gadget.id, gadget.owner.name);
    }

    // This is an unfortunate wart that is a side-effect of the implmentation
    // of new destructor semantics: if the above for-loop is the final expression
    // in the function, the borrow-checker treats the gadget_owner as needing to
    // live past the destruction scope of the function (which of course it does not).
    // To work around this, for now I am inserting a dummy value just so the above
    // for-loop is no longer the final expression in the block.
    ()
}

Luckily for #21022, the coincidence of factors here is not very frequent, which is why I'm not planning on blocking #21022 on resolving this, but instead leaving it as something to address after issues for the alpha have been addressed.

(I'm filing this bug before landing the PR so that I can annotation each of the corresponding cases with a FIXME so that I can go back and address them after I get a chance to investigate this properly.)

@pnkfelix pnkfelix changed the title new scoping rules for safe dtors often yield spurious semi-colon or trailing unit expr new scoping rules for safe dtors can yield spurious semi-colon or trailing unit expr Jan 13, 2015
@kmcallister kmcallister added A-lifetimes Area: Lifetimes / regions A-destructors Area: Destructors (`Drop`, …) labels Jan 14, 2015
@pnkfelix
Copy link
Member Author

The solution here seems to be to adjust the infer lifetimes for temporaries generated from the <iter> in for <pat> in <iter> { ... }.

I have a patch to do this on my branch, and so far it seems to work, but @nikomatsakis has pointed out to me that I may be better off waiting until after #20790 lands.

@pnkfelix
Copy link
Member Author

(Nonetheless, there may be similar or related issues in other syntactic forms, such as if let <pat> = <expr> { ... }. At the very least, I will be able to use this bug as a reference point in FIXME notes.)

pnkfelix referenced this issue in pnkfelix/rust Feb 2, 2015
note that some of these cases may actually be fixes to latent bugs, in
some sense (depending on what our guarantees are about e.g. what a
hashmap should be allowed to access in its own destructor).
@pnkfelix
Copy link
Member Author

(This was largely addressed by #21984. I am not sure whether there is much more we can actually do here. You can search for FIXME's in this commit bdb9f3e for most of the fallout that was caused by this in the end (its mostly with if let, along with one semicolon I had to add to a test; it was not that bad.)

@ebfull
Copy link
Contributor

ebfull commented Feb 13, 2015

I think I'm having a similar issue:

use std::collections::HashMap;
use std::sync::Mutex;

fn i_used_to_be_able_to(foo: &Mutex<HashMap<usize, usize>>) -> Vec<(usize, usize)> {
    let mut foo = foo.lock().unwrap();

    foo.drain().collect()
}

fn but_after_nightly_update_now_i_gotta(foo: &Mutex<HashMap<usize, usize>>) -> Vec<(usize, usize)> {
    let mut foo = foo.lock().unwrap();

    return foo.drain().collect();
}

fn main() {}

@felipesere
Copy link

I get the above mentioned error when running rustc rustc 1.0.0-nightly (3ef8ff1f8 2015-02-12 00:38:24 +0000)
With the following code:

    std::old_io::stdio::stdin().lock().lines().map( |line| {
        line.unwrap().trim().to_string()
    }).collect()

When using rustc 1.0.0-nightly (00df3251f 2015-02-08 23:24:33 +0000) I don't get the error.

felipesere added a commit to felipesere/icepick that referenced this issue Feb 13, 2015
Due to the following bug in rustc lifetime inferrer:
rust-lang/rust#21114
@pnkfelix
Copy link
Member Author

@felipesere Your example, I think, is running afoul of the failure of the region system's model to precisely track the destruction order for r-value temporaries; see #22323.

After rewriting your example to accommodate that short-coming, we have this:

fn lines() -> Vec<String> {
    let mut stdin = std::old_io::stdio::stdin();
    let mut locked = stdin.lock();
    locked.lines().map( |line| {
        line.unwrap().trim().to_string()
    }).collect()
}

which compiles for me. So I do not think this is an instance of this ticket, but rather of #22323

@pnkfelix
Copy link
Member Author

@ebfull your example is very interesting. (I am not yet sure whether the underlying cause is the same as the one that led to this ticket, but I definitely would like to fix it.)

cc #22321

@drbawb
Copy link

drbawb commented Feb 15, 2015

I think I'm bumping up against this, I have a function that acquires a RwLock in a trailing expression. Though it's a match expression, not a for loop.

The code looks similar to this:

type Context = Arc<RwLock<HashMap<Uuid, User>>>;

fn disconnect_user(ctx: Context, uid: Uuid) -> Result<String, String> {
    // ....
   match ctx.write().unwrap().remove(&uid) {
         Some(user) => { Ok(format!("user removed")) },
         None => { Err(format!("user not removed")) },
    }
}

Which yields this error on the latest nightly (b63cee4a1 2015-02-14):

src/supervisor/handlers/connections.rs:162:8: 162:11 error: `ctx` does not live long enough
src/supervisor/handlers/connections.rs:162      match ctx.users.write().ok().unwrap().remove(&uid) {
                                                      ^~~
src/supervisor/handlers/connections.rs:153:78: 186:2 note: reference must be valid for the destruction scope surrounding block at 153:77...
src/supervisor/handlers/connections.rs:153 pub fn disconnect_user(req: &Request, env: &mut Env) -> Result<String,String>{
src/supervisor/handlers/connections.rs:154      let ctx = env.get::<Arc<Context>>().unwrap().clone();
src/supervisor/handlers/connections.rs:155      let io  = env.get::<Sender<Envelope>>().unwrap().clone();
src/supervisor/handlers/connections.rs:156      let uid = req.uid.clone();
src/supervisor/handlers/connections.rs:157 
src/supervisor/handlers/connections.rs:158      let reason = req.payload.as_ref().and_then(|pl| {
                                           ...
src/supervisor/handlers/connections.rs:154:54: 186:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 154:53
src/supervisor/handlers/connections.rs:154      let ctx = env.get::<Arc<Context>>().unwrap().clone();
src/supervisor/handlers/connections.rs:155      let io  = env.get::<Sender<Envelope>>().unwrap().clone();
src/supervisor/handlers/connections.rs:156      let uid = req.uid.clone();
src/supervisor/handlers/connections.rs:157 
src/supervisor/handlers/connections.rs:158      let reason = req.payload.as_ref().and_then(|pl| {
src/supervisor/handlers/connections.rs:159              json::decode(&pl[]).ok()
                                           ...
error: aborting due to previous error

I seem to be able to work around it in two ways. I can either bind the result of the match exprsession to a variable and return that instead, or I can bind context's lock guard to a variable and perform the match on that.

@pnkfelix pnkfelix self-assigned this Feb 16, 2015
@pnkfelix
Copy link
Member Author

@drbawb yes, I suspect this is the same issue. I doubt I will be fixing this any time soon (i.e., I do not think it is a 1.0 blocker issue), so while I do hope to fix it eventually, I think you will need to live with the workarounds for now.

@steveklabnik
Copy link
Member

steveklabnik commented May 24, 2016

A full reproduction of @pnkfelix 's early example compiles today without the ():

use std::rc::Rc;
use std::rc::Weak;
use std::cell::RefCell;

struct Owner {
    name: String,
    gadgets: RefCell<Vec<Weak<Gadget>>>,
}

struct Gadget {
    id: i32,
    owner: Rc<Owner>,
}

fn main() {
    let gadget_owner : Rc<Owner> = Rc::new(
        Owner { 
            name: "Gadget Man".to_string(),
            gadgets: RefCell::new(Vec::new())
        }
        );

    let gadget1 = Rc::new(Gadget{id: 1, owner: gadget_owner.clone()});
    let gadget2 = Rc::new(Gadget{id: 2, owner: gadget_owner.clone()});

    gadget_owner.gadgets.borrow_mut().push(Rc::downgrade(&gadget1.clone()));
    gadget_owner.gadgets.borrow_mut().push(Rc::downgrade(&gadget2.clone()));

    for gadget_opt in gadget_owner.gadgets.borrow().iter() {
        let gadget = gadget_opt.upgrade().unwrap();
        println!("Gadget {} owned by {}", gadget.id, gadget.owner.name);
    }
}

That said, obviously this issue is about a bit more than that, and as he said:

(I'm filing this bug before landing the PR so that I can annotation each of the corresponding cases with a FIXME so that I can go back and address them after I get a chance to investigate this properly.)

@pnkfelix where does this stand today?

@pnkfelix
Copy link
Member Author

@steveklabnik it seems like the cases of interest here got fixed via independent changes elsewhere in rustc

It might be interesting to track down which PR did this, but given that we have made a number of refinements to the region system since this was filed, I'm not astounded that it was fixed .... merely mildly surprised

@niconii
Copy link
Contributor

niconii commented Jun 24, 2016

This problem seems to still crop up with code like this, which doesn't compile unless ; or an expression is added after the if-let:

use std::sync::Mutex;

fn main() {
    let counter = Mutex::new(0);

    if let Ok(_) = counter.lock() { }
}

@pnkfelix pnkfelix reopened this Sep 20, 2016
@pnkfelix
Copy link
Member Author

reopening based on discussion at #22252

@pnkfelix
Copy link
Member Author

(I will try to accumulate a representative set of currently failing examples in the description for ease of reference)

@pnkfelix
Copy link
Member Author

(Shifting effort into making a diagnostic to address this pain point; see #54556.)

WIth that shift, this bug is dedicated to the hypothetical attempt to "fix" this problem in some way that isn't just changing diagnostics.

But also with that shift, this bug is no longer part of the NLL work, nor on the Rust 2018 release milestone.

@pnkfelix
Copy link
Member Author

pnkfelix commented Oct 4, 2018

Some months ago @Thiez wrote an interesting note that pointed out an oddity where a tail expression with if _ { EXPR } else { EXPR } in the block tail position was compiling, while just EXPR in the tail expression (which you would think has the same dynamic semantics) was not compiling.

I investigated, and discovered something interesting: Apparently the two forms do not have the same dynamic semantics, because the temporaries for the EXPR in if _ { EXPR } else { EXPR } are dropped sooner than for EXPR alone.

Here is a concrete demonstration illustrating this (play:

#![feature(nll)]

struct T(&'static str, u8);
impl Drop for T {
    fn drop(&mut self){ println!("Dropping T {:?}", self.0); }
}

fn ends_with_if_else() -> u8 {
    println!("entered `ends_with_if_else`");
    let _tmp = T("temp", 0);
    if true { T("expr", 1).1 } else { T("expr", 2).1 }
}

fn ends_with_tail() -> u8 {
    println!("entered `ends_with_tail`");
    let _tmp = T("temp", 3);
    T("expr", 4).1
}

fn main() {
    {
        let _outer1 = T("outer1", 5);
        println!("invoke method `ends_with_if_else`");
        ends_with_if_else();
        println!("returned from `ends_with_if_else`");
    }

    println!("");

    {
        let _outer2 = T("outer2", 6);
        println!("invoke method `ends_with_tail`");
        ends_with_tail();
        println!("returned from `ends_with_tail`");
    }
}

This prints:

invoke method `ends_with_if_else`
entered `ends_with_if_else`
Dropping T "expr"
Dropping T "temp"
returned from `ends_with_if_else`
Dropping T "outer1"

invoke method `ends_with_tail`
entered `ends_with_tail`
Dropping T "temp"
Dropping T "expr"
returned from `ends_with_tail`
Dropping T "outer2"

The implication is that our current temporary r-value rules do not let temporaries from the tails of if/else expressions bubble out to the if/else itself.

This discrepancy is perhaps unfortunate, but I don't know if it would make sense to try to change it. If we were to try to change it, we should probably attach such a change to a particular Rust edition shift (and we're too close to the release of the 2018 edition to attempt it there.)

@kwanyyoss
Copy link

The implication is that our current temporary r-value rules do not let temporaries from the tails of if/else expressions bubble out to the if/else itself.

I am confused. I thought the behavior of "ends_with_if_else" is the good one.

This discrepancy is perhaps unfortunate, but I don't know if it would make sense to try to change it. If we were to try to change it, we should probably attach such a change to a particular Rust edition shift (and we're too close to the release of the 2018 edition to attempt it there.)

I added "let ret = ..." to the last statements and ended with a naked "ret" as returned value from both "ends_with_tail" and "ends_with_if_else". The behavior of the new forms looks completely correct to me: Playground

invoke method `ends_with_if_else`
entered `ends_with_if_else`
Dropping T "expr"
result obtained
Dropping T "temp"
returned from `ends_with_if_else`
Dropping T "outer1"

invoke method `ends_with_tail`
entered `ends_with_tail`
Dropping T "expr"
result obtained
Dropping T "temp"
returned from `ends_with_tail`
Dropping T "outer2"

"expr" was dropped even before "result obtained" as it should be. I suppose this was also what @Thiez wanted.

Why is let ret = T("expr", 4).1; ret not the same as T("expr", 4).1? I even tried let ret = &T("expr", 4).1; *ret and let ret = &T("expr", 4); ret.1. "expr" was dropped before "temp" in all 3 cases with "let ret =".

@pnkfelix
Copy link
Member Author

pnkfelix commented Oct 9, 2018

@kwanyyoss wrote:

Why is let ret = T("expr", 4).1; ret not the same as T("expr", 4).1?

Because the former is assigning its result into a let-bound variable of the block, while the latter is passing its result back outside of the block directly, without going through a let-statement.

For better or for worse, the temporary r-value lifetime rules we have chosen dictate that the temporaries are dropped at different times for these two scenarios.

There is much background discussion of this, as well as proposals for revising our semantics here. See e.g.

@pnkfelix pnkfelix closed this as completed Oct 9, 2018
@pnkfelix pnkfelix reopened this Oct 9, 2018
@kwanyyoss
Copy link

Thanks @pnkfelix for the reply and references!

I think the problem with this case is that the return value is just a "u8". As long as the type is "Copy", it is hard to see why the lifetime of any temporary where the final value came from needs to be extended. Along this line, I wonder if extending the lifetime of the entire tail expression may be too broad. Does it make sense to extended (the lifetime of) only the last object created by the tail expression rather than the whole expression?

I can find descriptions of how C++ deals with return objects and temporaries using copy constructors. The way I understand it is that it is basically doing a "placement new" into an ancestor stack frame/scope. Is there a similar treatise of how Rust returns objects or bubbles up the value of a {} block? I doubt whether "Clone" is involved.

I think there is still a common belief that return <expr>; at the end of a function is the same as <expr>. I hope this issue will bring forth a section in Rust documentation on why they are not the same.

@Havvy
Copy link
Contributor

Havvy commented Oct 16, 2018

There's discussion about temporary lifetimes in the Reference but having an explicit section on this would make sense.

Edit: Filed rust-lang/reference#452

@pnkfelix
Copy link
Member Author

@kwanyyoss wrote:

I think the problem with this case is that the return value is just a "u8". As long as the type is "Copy", it is hard to see why the lifetime of any temporary where the final value came from needs to be extended.

Part of the goal here is to have temporary r-value rules that are uniform, in the sense that they come directly from the syntactic structure from the program, and not from e.g. the types that have been inferred nor the results of the borrow-check analysis.

Here's further elaboration of this point:

Does it make sense to extended (the lifetime of) only the last object created by the tail expression rather than the whole expression?

I don't know. My most immediate reaction to that is that it would really complicate encoding the cases where you need those intermediate values to survive long enough.

(And of course, the interpretation of your question depends very much on how one interprets the phrase "the last object created" ...)

@pnkfelix
Copy link
Member Author

pnkfelix commented Oct 16, 2018

I wrote up above:

My most immediate reaction to that is that it would really complicate encoding the cases where you need those intermediate values to survive long enough.

As an attempt to illustrate this: people already complain about the fact that they are forced to add extra let-bindings of intermediate results in cases like this:

fn does_not_work() {
    let x = &std::env::args().collect::<Vec<_>>().as_slice()[1..];
    println!("{:?}", x);
}

fn works() {
    println!("{:?}", &std::env::args().collect::<Vec<_>>().as_slice()[1..]);
}

(See #15023 )

I suspect your proposal would further exacerbate this already present problem. But I have not done any experiment to validate that suspicion.

@pnkfelix pnkfelix added the metabug Issues about issues themselves ("bugs about bugs") label Nov 9, 2018
@pnkfelix
Copy link
Member Author

pnkfelix commented Nov 9, 2018

I actually think this issue as described is effectively resolved now, under NLL at least.

There are related topics in terms of temporary lifetimes being unintuitive (#37612, #46413) and proposals to improve them (#15023).

But this specific issue was about how the dropck rules caused a number of cases to appear to regress, and as far as I can tell, NLL fixes those instances.

So I am going to tag this issue as NLL-fixed-by-NLL to reflect that.

@pnkfelix pnkfelix added the fixed-by-NLL Bugs fixed, but only when NLL is enabled. label Nov 9, 2018
@pnkfelix pnkfelix removed their assignment Nov 9, 2018
@pnkfelix
Copy link
Member Author

NLL (migrate mode) is enabled in all editions as of PR #59114.

I went through all the cases listed in the table in the description; 2015 and 2018 editions indeed now match in their behaviors.

The only case in the table that was not addressed by NLL migrate mode, is:

| jonas-schievink demo | 🤢| 🤢 | ✅ | #42574

(Note that you need to comment out the #![feature(nll)] to observe the undesirable behavior.)

Since #42574 remains open as a bug, we can safely close this ticket as fixed by PR #59114.

@ZhangHanDong
Copy link

@pnkfelix #49616 has not been resolved yet

@pnkfelix
Copy link
Member Author

pnkfelix commented Sep 20, 2019

@ZhangHanDong : The issue in #49616 is that you have a RefMut created as a temporary, and it is dropped after the RefCell it refers to.

The reason the destructor runs at that time is because the temporaries in the tail expression of a block are specified as being dropped after the locals of that block are dropped.

I don't know if there's a reasonable way to resolve that, apart from changing the language to allow us to run such destructors earlier (which is the effect one gets in that case by putting a semi-colon after the tail expression).

But in any case, the error message (play) does describe what is happening, and does tell you how to resolve it. This is not something that NLL was ever going to address; NLL is about changes to the static semantics of the language, but the problem you are describing in #49616 is an issue with the dynamic semantics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-destructors Area: Destructors (`Drop`, …) A-lifetimes Area: Lifetimes / regions C-bug Category: This is a bug. fixed-by-NLL Bugs fixed, but only when NLL is enabled. metabug Issues about issues themselves ("bugs about bugs") T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests