Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NLL iterating and updating &mut fails for non-local Place. #62007

Closed
pnkfelix opened this issue Jun 20, 2019 · 7 comments · Fixed by #62010
Closed

NLL iterating and updating &mut fails for non-local Place. #62007

pnkfelix opened this issue Jun 20, 2019 · 7 comments · Fixed by #62010
Assignees
Labels
A-NLL Area: Non Lexical Lifetimes (NLL) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Jun 20, 2019

Spawned off of discussion with @ecstatic-morse and their work on PR #61787

There is a dataflow subroutine for Borrows, kill_borrows_on_place, added in PR #56649 to resolve #46589. kill_borrows_on_place reads from the on_entry field of BlockSets (even though I am pretty sure no one was ever supposed to do so; its inclusion in this part of the API was a wart):

// Otherwise, look at all borrows that are live and if they conflict with the assignment
// into our place then we can kill them.
let mut borrows = sets.on_entry.clone();
let _ = borrows.union(sets.gen_set);
for borrow_index in borrows.iter() {
let borrow_data = &self.borrows()[borrow_index];
debug!(
"kill_borrows_on_place: borrow_index={:?} borrow_data={:?}",
borrow_index, borrow_data,
);

I think this was written with the assumption that sets.on_entry.clone().union(sets.gen_set) (aka on-entry U gen-set) would represent an upper-bound on the things that should be in the kill set. (And then the subsequent loop determines which of those things to actually put into the kill set.)

The problem with that is that when we are computing the transfer-function initially, on_entry is going to be the empty-set, at least for almost all basic blocks. That all you have to work with in dataflow. So you end up with a kill-set here that is an under-approximation to what it should actually be, if you are assuming that we're supposed to start with the complete (post dataflow) on-entry and combine that with gen-set to get the upper bound on what to use for the kill-set.

Why hasn't this been a problem for us so far? Well, here's my theory:

  1. With respect to soundness: if the kill-set for Borrows is a subset of what it should "truly be", then I think the compiler ends up treating borrows as living longer than they should, and so it ends up detecting conflicts that are not actually there, and so the compiler rejects code that you might have other expected it to accept.

  2. With respect to expressiveness: We don't actually get into the problematic part of kill_borrows_on_place as often as you might think, because there's a special dodge at the start of the function:

// Handle the `Place::Local(..)` case first and exit early.
if let Place::Base(PlaceBase::Local(local)) = place {
if let Some(borrow_indices) = self.borrow_set.local_map.get(&local) {
debug!("kill_borrows_on_place: borrow_indices={:?}", borrow_indices);
sets.kill_all(borrow_indices);
return;
}
}


You combine the two points made above, and you end up with things like the following example: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3d29327853729ec280799760d5e58003

Click to see the example Rust code
struct List<T> {
    value: T,
    next: Option<Box<List<T>>>,
}


fn to_refs1<T>(mut list: &mut List<T>) -> Vec<&mut T> {
    let mut result = vec![];
    loop {
        result.push(&mut list.value);
        if let Some(n) = list.next.as_mut() {
            list = n;
        } else {
            return result;
        }
    }
}

fn to_refs2<T>(mut list: (&mut List<T>,)) -> Vec<&mut T> {
    let mut result = vec![];
    loop {
        result.push(&mut (list.0).value);
        if let Some(n) = (list.0).next.as_mut() {
            list.0 = n;
        } else {
            return result;
        }
    }
}

(In that example, to_refs2 is basically the same as to_refs1, except that its wrapped in a unary tuple (T,). I think the intention is that if to_refs1 is accepted, then so should to_refs2. But NLL currently accepts to_refs1 and )

@pnkfelix pnkfelix self-assigned this Jun 20, 2019
@pnkfelix pnkfelix added A-NLL Area: Non Lexical Lifetimes (NLL) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 20, 2019
@pnkfelix
Copy link
Member Author

pnkfelix commented Jun 20, 2019

I'm not 100% sure what action is best here. I think my game plan is roughly:

  • Double-check the hypothesis that on_entry is always the null-set for this part of the code.
  • Experiment with computing the "correct" kill-set. My assumption is that doing so will increase the number of programs we accept. Hopefully it will do so without exposing soundness bugs elsewhere.
    • But even if its correct, the question will be whether we can afford, computation time-wise,. to do it in the compiler. I am currently assuming that the reason for this loop over on-entry U gen-set is to avoid trying to loop over the whole universe of bits here.
    • (Update: This is covered in PR Kill conflicting borrows of places with projections. #62010; it avoids looping over the whole universe of bits by mapping the place to its base and then feeding that into the local-to-borrows map.)
  • (Also it would be good to try to evaluate whether this is even worth doing. Maybe no one cares about this oversight.)

@pnkfelix pnkfelix changed the title NLL iterating and updating &mut ref fails when in non-local Place. NLL iterating and updating &mut ref fails for non-local Place. Jun 20, 2019
@pnkfelix pnkfelix changed the title NLL iterating and updating &mut ref fails for non-local Place. NLL iterating and updating &mut ref fails for non-local Place. Jun 20, 2019
@pnkfelix pnkfelix changed the title NLL iterating and updating &mut ref fails for non-local Place. NLL iterating and updating &mut fails for non-local Place. Jun 20, 2019
@pnkfelix
Copy link
Member Author

cc @davidtwco

@ecstatic-morse
Copy link
Contributor

Currently, in the "special dodge", we loop over all borrows of that same local. Could we simply filter that list using places_conflict when we have a projection to a field?

@pnkfelix
Copy link
Member Author

@ecstatic-morse I'm not sure if I understood the question, but: to support the "special dodge", we specifically build up a local_map that maps each local to all of the borrows of that local.

We don't have an analogous map for all places in a given MIR. (and I suspect we do not want to pay the time+memory expense of building up such a map.)

@pnkfelix
Copy link
Member Author

or are you saying we should map each place to its base local, and then feed that into the local_map for the basis for the filter here? I hadn't considered whether that could work.

@ecstatic-morse
Copy link
Contributor

Ah, so borrow_set.local_map only contains the borrows of exactly local, not local.field?

@pnkfelix
Copy link
Member Author

(I think the local_map[local] does contain borrows for local.field as well as local, so this may be a reasonable way forward to computing a more complete kill-set and thus get a more accurate transfer-function for the dataflow here.)

ecstatic-morse added a commit to ecstatic-morse/rust that referenced this issue Jun 20, 2019
Resolves rust-lang#62007.

Due to a bug, the previous version of this check did not actually kill
any conflicting borrows unless the borrowed place had no projections.
Specifically, `entry_set` will always be empty when `statement_effect`
is called. It does not contain the set of borrows which are live at this
point in the program.
pietroalbini added a commit to pietroalbini/rust that referenced this issue Jun 21, 2019
…, r=pnkfelix

Kill conflicting borrows of places with projections.

Resolves rust-lang#62007.

Due to a bug, the previous version of this check did not actually kill all conflicting borrows unless the borrowed place had no projections. Specifically, `sets.on_entry` will always be empty when `statement_effect` is called. It does not contain the set of borrows which are live at this point in the program.

@pnkfelix describes why this was not caught before in rust-lang#62007, and created an example where the current borrow checker failed unnecessarily. This PR adds their example as a test, but they will likely want to add some additional ones.

r? @pnkfelix
Centril added a commit to Centril/rust that referenced this issue Jun 21, 2019
…, r=pnkfelix

Kill conflicting borrows of places with projections.

Resolves rust-lang#62007.

Due to a bug, the previous version of this check did not actually kill all conflicting borrows unless the borrowed place had no projections. Specifically, `sets.on_entry` will always be empty when `statement_effect` is called. It does not contain the set of borrows which are live at this point in the program.

@pnkfelix describes why this was not caught before in rust-lang#62007, and created an example where the current borrow checker failed unnecessarily. This PR adds their example as a test, but they will likely want to add some additional ones.

r? @pnkfelix
bors added a commit that referenced this issue Jun 22, 2019
Kill conflicting borrows of places with projections.

Resolves #62007.

Due to a bug, the previous version of this check did not actually kill all conflicting borrows unless the borrowed place had no projections. Specifically, `sets.on_entry` will always be empty when `statement_effect` is called. It does not contain the set of borrows which are live at this point in the program.

@pnkfelix describes why this was not caught before in #62007, and created an example where the current borrow checker failed unnecessarily. This PR adds their example as a test, but they will likely want to add some additional ones.

r? @pnkfelix
ecstatic-morse added a commit to ecstatic-morse/rust that referenced this issue Jun 22, 2019
This commit makes `sets.on_entry` inaccessible in
`{before_,}{statement,terminator}_effect`. This field was meant to allow
implementors of `BitDenotation` to access the initial state for each
block (optionally with the effect of all previous statements applied via
`accumulates_intrablock_state`) while defining transfer functions.
However, the ability to set the initial value for the entry set of each
basic block (except for START_BLOCK) no longer exists. As a result, this
functionality is mostly useless, and when it *was* used it was used
erroneously (see rust-lang#62007).

Since `on_entry` is now useless, we can also remove `BlockSets`, which
held the `gen`, `kill`, and `on_entry` bitvectors and replace it with a
`GenKill` struct. Variables of this type are called `trans` since they
represent a transfer function. `GenKill`s are stored contiguously in
`AllSets`, which reduces the number of bounds checks and may improve
cache performance: one is almost never accessed without the other.

Replacing `BlockSets` with `GenKill` allows us to define some new helper
functions which streamline dataflow iteration and the
dataflow-at-location APIs. Notably, `state_for_location` used a subtle
side-effect of the `kill`/`kill_all` setters to apply the transfer
function, and could be incorrect if a transfer function depended on
effects of previous statements in the block on `gen_set`.
bors added a commit that referenced this issue Jun 24, 2019
…kfelix

rustc_mir: Hide initial block state when defining transfer functions

This PR addresses [this FIXME](https://github.com/rust-lang/rust/blob/2887008e0ce0824be4e0e9562c22ea397b165c97/src/librustc_mir/dataflow/mod.rs#L594-L596).

This makes `sets.on_entry` inaccessible in `{before_,}{statement,terminator}_effect`. This field was meant to allow implementors of `BitDenotation` to access the initial state for each block (optionally with the effect of all previous statements applied via `accumulates_intrablock_state`) while defining transfer functions.  However, the ability to set the initial value for the entry set of each basic block (except for START_BLOCK) no longer exists. As a result, this functionality is mostly useless, and when it *was* used it was used erroneously (see #62007).

Since `on_entry` is now useless, we can also remove `BlockSets`, which held the `gen`, `kill`, and `on_entry` bitvectors and replace it with a `GenKill` struct. Variables of this type are called `trans` since they represent a transfer function. `GenKill`s are stored contiguously in `AllSets`, which reduces the number of bounds checks and may improve cache performance: one is almost never accessed without the other.

Replacing `BlockSets` with `GenKill` allows us to define some new helper functions which streamline dataflow iteration and the dataflow-at-location APIs. Notably, `state_for_location` used a subtle side-effect of the `kill`/`kill_all` setters to apply the transfer function, and could be incorrect if a transfer function depended on effects of previous statements in the block on `gen_set`.

Additionally, this PR merges `BitSetOperator` and `InitialFlow` into one trait. Since the value of `InitialFlow` defines the semantics of the `join` operation, there's no reason to have seperate traits for each. We can add a default impl of `join` which branches based on `BOTTOM_VALUE`.  This should get optimized away.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-NLL Area: Non Lexical Lifetimes (NLL) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants