-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a Use-Definition Chain implementation for the MIR. #62547
Conversation
r? @varkor (rust_highfive has picked a reviewer for you, use r? to override) |
r? @eddyb cc @pcwalton, @wesleywiser (for MIR-opt) |
2409e5c
to
88921e6
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
76673aa
to
1c02ee7
Compare
This comment has been minimized.
This comment has been minimized.
cc @pnkfelix @davidtwco @matthewjasper (just in case) I also wanted to note here that I think "use-def" is not ideal given our far-from-SSA nature (@nikomatsakis might want to comment on this), and "reaching def" is closer, but IMO "reaching writes" for the analysis and "write sets" for the output seem better. |
4fcbd4f
to
fa2b191
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
0632c47
to
6794aff
Compare
Okay, I think this is ready to be reviewed now. I'll stop rebasing for the time being. I don't agree that A use-def chain is a graph-like data structure that can be built on top of a reaching definitions analysis. It is not a data-flow analysis itself. In the following example, definitions (2), (3), and (4) reach the final use of let mut x = 0; // (1)
let p = &mut x; // (2)
*p = 1; // (3)
x = 2; // (4)
x I'm not aware of any link between use-def chains and SSA form besides the fact that SSA makes a use-def chain trivial to compute since each variable has exactly one definition. It seems perfectly reasonable to have a use-def chain on a non-SSA IR, although the only resources I found supporting this are a wikipedia page and some lecture slides. |
A use-def chain can be used for constant propagation (I mistakenly said the opposite in a zulip chat). let x = 5;
let mut y = x;
if condition {
y = 5;
} The use-def chain for
We can traverse the def-use chain (the transpose graph of the use-def chain) beginning from each definition, remembering the right-hand side of that definition as a prospective value for each variable. If we encounter a variable which already has a prospective value which is indeterminate (e.g. a function parameter or non-const function call) or a constant with a different value, we know that we cannot do constant propagation to that variable. Once we have iterated the def-use chain from every definition, constant propagation can be performed on all variables which have exactly one prospective definition. This seems less efficient than doing a separate dataflow analysis with a lattice specifically intended for constant propagation such as the following (although I'm not sure if that particular algorithm is the best way to achieve this on use-def chains). However there are performance benefits to using bit vectors to do dataflow, and the use-def chain can be shared across multiple compiler passes, whereas a custom dataflow analysis needs to be run from scratch each time.
|
|
||
pub struct DefVisitor<F>(pub F); | ||
|
||
impl<'tcx, F> Visitor<'tcx> for DefVisitor<F> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the correctness and efficiency of DefVisitor
and UseVisitor
. Is this a good approach? For DefVisitor
at least, we only have to visit the left-hand side of any StatementKind::Assign
.
type LocationMap<T> = FxHashMap<Location, T>; | ||
type DefsForLocalDense = IndexVec<Local, BitSet<DefIndex>>; | ||
type DefsForLocalSparse = BTreeMap<Local, BitSet<DefIndex>>; | ||
type DefsForUseAtLocation = LocationMap<DefsForLocalSparse>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once there's a pass that uses a UseDefChain
, these data structures should be tweaked for efficiency. I think that DefsForLocalDense
should be a BitMatrix
and that some uses of BitSet
can be changed to HybridBitSet
s.
/// It's expensive to clone the entry set at the start of each block for every local and apply | ||
/// the appropriate mask. However, we only need to keep `curr_defs_for_local` up-to-date for | ||
/// the `Local`s which are actually used in a given `BasicBlock`. | ||
locals_used_in_block: LocalsUsedInBlock, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a holdover from a previous version of this PR when all_defs_for_local
was not cached. It may be faster not to iterate over the mir::Body
to initialize locals_used_in_block
.
I have some concerns about performance and test coverage, but I think this in a decent place overall. I'm removing the draft status. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
☔ The latest upstream changes (presumably #59369) made this pull request unmergeable. Please resolve the merge conflicts. |
3e1d1fb
to
fe93ca4
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Its functionality is duplicated in `borrowed_locals.rs` and `path_utils.rs`
This informs callers that the lifetime of the reference to the underlying `PlaceBase` is the same as that of the `Place` being iterated over. This lets callers return a reference to the underlying `PlaceBase` from the closure passed to `Place::iterate`.
We now use `dataflow::state_for_location` to get the dataflow state before calls to `rustc_peek`. The original version used a custom implementation that only looked at assignment statements.
fe93ca4
to
5d5681a
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The previous code ignored that `Drop` terminators could execute custom `Drop` implementations which might mutate the environment. Now we correctly implement `has_side_effects` and tests that custom `Drop` impls are recorded as an indirect definition.
This means that calls to `rustc_peek` are no longer indirect definitions of any locals in the body. Other pure intrinsics (e.g. `transmute`) could also be added to the list of exceptions.
5d5681a
to
976ba13
Compare
Ping from triage. @ecstatic-morse any updates on this? Thanks. |
☔ The latest upstream changes (presumably #63640) made this pull request unmergeable. Please resolve the merge conflicts. |
Hello from triage! Thank you! |
This won't get used in the final version of dataflow-based const qualification, and no one from the MIR optimization team has a use case for it, so I'll close this for now. |
…oli-obk A more generic interface for dataflow analysis rust-lang#64470 requires a transfer function that is slightly more complex than the typical `gen`/`kill` one. Namely, it must copy state bits between locals when assignments occur (see rust-lang#62547 for an attempt to make this fit into the existing framework). This PR contains a dataflow interface that allows for arbitrary transfer functions. The trade-off is efficiency: we can no longer coalesce transfer functions for blocks and must visit each statement individually while iterating to fixpoint. Another issue is that poorly behaved transfer functions can result in an analysis that fails to converge. `gen`/`kill` sets do not have this problem. I believe that, in order to guarantee convergence, flipping a bit from `false` to `true` in the entry set cannot cause an output bit to go from `true` to `false` (negate all preceding booleans when `true` is the bottom value). Perhaps someone with a more formal background can confirm and we can add a section to the docs? This approach is not maximally generic: it still requires that the lattice used for analysis is the powerset of values of `Analysis::Idx` for the `mir::Body` of interest. This can be done at a later date. Also, this is the bare minimum to get rust-lang#64470 working. I've not adapted the existing debug framework to work with the new analysis, so there are no `rustc_peek` tests either. I'm planning to do this after rust-lang#64470 is merged. Finally, my ultimate plan is to make the existing, `gen`/`kill`-based `BitDenotation` a special case of `generic::Analysis`. Currently they share a ton of code. I should be able to do this without changing any implementers of `BitDenotation`. Something like: ```rust struct GenKillAnalysis<A: BitDenotation> { trans_for_block: IndexVec<BasicBlock, GenKillSet<A::Idx>>, analysis: A, } impl<A> generic::Analysis for GenKillAnalysis<A> { // specializations of `apply_{partial,whole}_block_effect`... } ``` r? @pnkfelix
…oli-obk A more generic interface for dataflow analysis rust-lang#64470 requires a transfer function that is slightly more complex than the typical `gen`/`kill` one. Namely, it must copy state bits between locals when assignments occur (see rust-lang#62547 for an attempt to make this fit into the existing framework). This PR contains a dataflow interface that allows for arbitrary transfer functions. The trade-off is efficiency: we can no longer coalesce transfer functions for blocks and must visit each statement individually while iterating to fixpoint. Another issue is that poorly behaved transfer functions can result in an analysis that fails to converge. `gen`/`kill` sets do not have this problem. I believe that, in order to guarantee convergence, flipping a bit from `false` to `true` in the entry set cannot cause an output bit to go from `true` to `false` (negate all preceding booleans when `true` is the bottom value). Perhaps someone with a more formal background can confirm and we can add a section to the docs? This approach is not maximally generic: it still requires that the lattice used for analysis is the powerset of values of `Analysis::Idx` for the `mir::Body` of interest. This can be done at a later date. Also, this is the bare minimum to get rust-lang#64470 working. I've not adapted the existing debug framework to work with the new analysis, so there are no `rustc_peek` tests either. I'm planning to do this after rust-lang#64470 is merged. Finally, my ultimate plan is to make the existing, `gen`/`kill`-based `BitDenotation` a special case of `generic::Analysis`. Currently they share a ton of code. I should be able to do this without changing any implementers of `BitDenotation`. Something like: ```rust struct GenKillAnalysis<A: BitDenotation> { trans_for_block: IndexVec<BasicBlock, GenKillSet<A::Idx>>, analysis: A, } impl<A> generic::Analysis for GenKillAnalysis<A> { // specializations of `apply_{partial,whole}_block_effect`... } ``` r? @pnkfelix
…oli-obk A more generic interface for dataflow analysis rust-lang#64470 requires a transfer function that is slightly more complex than the typical `gen`/`kill` one. Namely, it must copy state bits between locals when assignments occur (see rust-lang#62547 for an attempt to make this fit into the existing framework). This PR contains a dataflow interface that allows for arbitrary transfer functions. The trade-off is efficiency: we can no longer coalesce transfer functions for blocks and must visit each statement individually while iterating to fixpoint. Another issue is that poorly behaved transfer functions can result in an analysis that fails to converge. `gen`/`kill` sets do not have this problem. I believe that, in order to guarantee convergence, flipping a bit from `false` to `true` in the entry set cannot cause an output bit to go from `true` to `false` (negate all preceding booleans when `true` is the bottom value). Perhaps someone with a more formal background can confirm and we can add a section to the docs? This approach is not maximally generic: it still requires that the lattice used for analysis is the powerset of values of `Analysis::Idx` for the `mir::Body` of interest. This can be done at a later date. Also, this is the bare minimum to get rust-lang#64470 working. I've not adapted the existing debug framework to work with the new analysis, so there are no `rustc_peek` tests either. I'm planning to do this after rust-lang#64470 is merged. Finally, my ultimate plan is to make the existing, `gen`/`kill`-based `BitDenotation` a special case of `generic::Analysis`. Currently they share a ton of code. I should be able to do this without changing any implementers of `BitDenotation`. Something like: ```rust struct GenKillAnalysis<A: BitDenotation> { trans_for_block: IndexVec<BasicBlock, GenKillSet<A::Idx>>, analysis: A, } impl<A> generic::Analysis for GenKillAnalysis<A> { // specializations of `apply_{partial,whole}_block_effect`... } ``` r? @pnkfelix
This work began from a discussion about extending const-qualification to work on arbitrary CFGs (currently it does not support conditional jumps). @eddyb was not convinced that this was the right approach, so I stopped working on it for a while. I picked it up again since it was almost complete, and it might also be useful for MIR optimization. I'm opening this PR so there's a place to discuss possible use cases, but it's very much a work in progress.
You can see an example of this analysis here. The
ERROR
lines contain the possible definitions of the given variable at that point in the program prefixed by their line number.Note the presence ofThe fact thatrustc_peek(&x)
in the possible definitions on line 25. This is because function calls may mutate locals higher up the call stack via a pointer, so any local which has had its address taken at that point in the CFG could be modified when calling an arbitrary function (obviouslyrustc_peek
does not actually mutate its environment).rustc_peek
does not mutate its environment has been special-cased in the analysis (this could be extended to other intrinsics).