Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested return expressions can cause drop to run on incorrect values. #15763

Closed
luqmana opened this issue Jul 18, 2014 · 8 comments · Fixed by #16081
Closed

Nested return expressions can cause drop to run on incorrect values. #15763

luqmana opened this issue Jul 18, 2014 · 8 comments · Fixed by #16081
Labels
A-codegen Area: Code generation A-destructors Area: Destructors (`Drop`, …) P-medium Medium priority
Milestone

Comments

@luqmana
Copy link
Member

luqmana commented Jul 18, 2014

Came across this while working on inlining enum constructors but I could reproduce on master as well:

#![allow(dead_code)]

struct Foo {
    a: int
}

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Dropping Foo: {}", self.a);
    }
}

struct Bar {
    x: Foo,
    y: int
}

fn foo() -> Bar {
    return Bar {
        x: Foo { a: 23 },
        y: return Bar {
            x: Foo { a: 32 },
            y: 0
        }
    };
}

fn main() {
    unsafe {
        std::mem::forget(foo());
    }
}

When compiled and run this will print out:

Dropping Foo: 32

This is wrong since the Foo we're actually dropping at that point (before returning in foo) is the one with it's field set to 23. The problem is that we translate return expressions to write directly into the return pointer (either a hidden outpointer or a temporary alloca which we'll return from). So with the outer return we start writing out the struct but then we encounter another return in one of the fields and overwrite the previous result so now when we pass a pointer (which is just some offset into the return pointer) to the drop glue for what should be the initial Foo we're actually calling drop on the second one.

apoelstra added a commit to rust-bitcoin/rust-bitcoin that referenced this issue Jul 22, 2014
@apoelstra
Copy link
Contributor

Hi, I have encountered this bug in the wild. Here it is causing a segfault:

use std::io::IoResult;
use std::io::{standard_error, InvalidInput};

fn build_vec_fail() -> IoResult<Vec<uint>> {
  Err(standard_error(InvalidInput))
}

fn build_vec() -> IoResult<Vec<uint>> {
  Ok(vec![])
}

struct MyStruct {
  field1: Vec<uint>,
  field2: Vec<uint>
}

fn real_main() -> IoResult<MyStruct> {
  // This is fine if we assign it to a variable then return the
  // variable. As written this segfaults.
  Ok(MyStruct {
    field1: try!(build_vec()),
    field2: try!(build_vec_fail()),
  })
}

fn main() { real_main(); }

A simple workaround is to assign the return value of real_main to a variable, which you return in the next line.

Edit: Oops, I posted the wrong source code! Fixed.

@huonw
Copy link
Member

huonw commented Jul 22, 2014

This bug causes code like

extern crate serialize;
use serialize::json;

#[deriving(Decodable)]
struct Foo {
    field1: String,
    field2: String
}

fn main() {
    let data = r##"
    {
        "field1": "something",
        "non_existing_field": "something_else"
    }
    "##;
    let _ = json::decode::<Foo>(data.as_slice());
}

to segfault. It expands to the following, which contains a return inside a struct initialiser (causing the String destructor of field1 to misbehave):

#![feature(phase)]
#![no_std]
#![feature(globs)]
#[phase(plugin, link)]
extern crate std;
extern crate native;
extern crate serialize;
use std::prelude::*;
use serialize::json;

struct Foo {
    field1: String,
    field2: String,
}
#[automatically_derived]
impl <__D: ::serialize::Decoder<__E>, __E> ::serialize::Decodable<__D, __E>
     for Foo {
    fn decode(__arg_0: &mut __D) -> ::std::result::Result<Foo, __E> {
        __arg_0.read_struct("Foo", 2u,
                            |_d|
                                ::std::result::Ok(Foo{field1:
                                                          match _d.read_struct_field("field1",
                                                                                     0u,
                                                                                     |_d|
                                                                                         ::serialize::Decodable::decode(_d))
                                                              {
                                                              Ok(__try_var) =>
                                                              __try_var,
                                                              Err(__try_var)
                                                              =>
                                                              return Err(__try_var)
                                                          },
                                                      field2:
                                                          match _d.read_struct_field("field2",
                                                                                     1u,
                                                                                     |_d|
                                                                                         ::serialize::Decodable::decode(_d))
                                                              {
                                                              Ok(__try_var) =>
                                                              __try_var,
                                                              Err(__try_var)
                                                              =>
                                                              return Err(__try_var)
                                                          },}))
    }
}

fn main() {
    let data =
        r##"
    {
        "field1": "something",
        "non_existing_field": "something_else"
    }
    "##;
    let _ = json::decode::<Foo>(data.as_slice());
}

@alexcrichton
Copy link
Member

Nominating, decoders are pretty important!

@pcwalton
Copy link
Contributor

Note that this is not P-backcompat-lang.

@pnkfelix
Copy link
Member

Assigning P-high, 1.0 milestone.

@apoelstra
Copy link
Contributor

Same story with tuples:

use std::io::{IoResult, InvalidInput, standard_error};

fn deserialize() -> IoResult<(Box<uint>, ())> {
  Ok((try!(Ok(box 0)),
      try!(Err(standard_error(InvalidInput)))))
}

fn main() {
  println!("{}", deserialize());
}

also segfaults.

Edit: Here is a maybe more enlightening test case:

struct MyStruct(Box<int>);

impl Drop for MyStruct {
  fn drop(&mut self) {
    let &MyStruct(ref n) = self;
    println!("dropping {}", n);
  }
}

fn foo() -> Result<(MyStruct, MyStruct), MyStruct> {
  Ok((MyStruct(box 0), try!(Err(MyStruct(box 1)))))
}

fn main() {
  foo();
}

This outputs dropping 1 twice and never dropping 0. Valgrind shows a double-free.

apoelstra added a commit to rust-bitcoin/rust-bitcoin that referenced this issue Jul 25, 2014
bors added a commit that referenced this issue Aug 1, 2014
apoelstra added a commit to apoelstra/rust that referenced this issue Aug 3, 2014
The `repr_c_implicit_padding` lint checks structures marked
a u8 followed by a u64, which would require 7 bytes of padding
to get the u64 aligned correctly.

The reason for this is that in the BSD socket code, there is
some shifty casting between structs which results in data being
stored in padding bytes. Since the LLVM Load instruction does
not copy padding, this can result in data loss. (This is also a
problem in C, but since structs are rarely copied by value,
instead using pointers and memcpy(), it is not as serious a
problem.) We therefore need explicit padding for the socket
structures, and this lint will make sure we get it.

This is necessary because the recent pull

  rust-lang#16081

which is a fix for rust-lang#15763, changes the return value semantics
to involve a Load. Without padding fixes, this breaks the Tcp
code.
apoelstra added a commit to apoelstra/rust that referenced this issue Aug 4, 2014
…ut-ptr instead

The BSD socket code does some cast tricks with the `libc::sockaddr*`
structs, which causes useful data to be stored in struct padding.
Since Load/Store instructions do not copy struct padding, this makes
these structures dangerous to pass or return by value.

In particular, rust-lang#15763 changes
return semantics so that a Load instruction is used, breaking the TCP
code. Once this PR is merged, that one should merge without error.
bors added a commit that referenced this issue Aug 4, 2014
Replacement for PR #16208 (make sockaddr* struct padding explicit) to make PR #15763 (fix nested returns) merge without breaking TCP code.
@apoelstra
Copy link
Contributor

@luqmana can you rebase and retry this one? It should pass the tests with the recent TCP changes.

@luqmana
Copy link
Member Author

luqmana commented Aug 5, 2014

@apoelstra yup! works now!

bors added a commit that referenced this issue Aug 12, 2014
alexcrichton added a commit to alexcrichton/rust that referenced this issue Dec 21, 2014
rust-lang#16081 fixed an issue where a nested return statement would cause incorrect behaviour due to the inner return writing over the return stack slot that had already been written too. However, the check was very broad and picked many cases that wouldn't ever be affected by this issue.

As a result, the number of allocas increased dramatically and therefore stack-size increased. LLVM is not able to remove all of the extraneous allocas. Any code that had multiple return values in a compound expression at the end of a function (including loops) would be hit by the issue.

The check now uses a control-flow graph to only consider the case when the inner return is executed conditionally. By itself, this narrowed definition causes rust-lang#15763 to return, so the control-flow graph is also used to avoid passing the return slot as a destination when the result won't be used.

This change allows the stack-size of the main rustc task to be reduced to 8MB from 32MB.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-destructors Area: Destructors (`Drop`, …) P-medium Medium priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants