Auto merge of #76044 - ecstatic-morse:dataflow-lattice, r=oli-obk

Support dataflow problems on arbitrary lattices This PR implements last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just `BitSet`s. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. `size_of::<T>()`) and users have a natural tendency to hand-optimize the rest. It's probably worth exprimenting with, however, and others have shown interest cc `@rust-lang/wg-mir-opt.` The `Idx` associated type is moved from `AnalysisDomain` to `GenKillAnalysis` and replaced with an associated `Domain` type that must implement `JoinSemiLattice`. Like before, each `Analysis` defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the `lattice::Dual` newtype. `GenKillAnalysis` impls have an additional requirement that `Self::Domain: BorrowMut<BitSet<Self::Idx>>`, which effectively means that they must use `BitSet<Self::Idx>` or `lattice::Dual<BitSet<Self::Idx>>` as their domain. Most of these changes were mechanical. However, because a `Domain` is no longer always a powerset of some index type, we can no longer use an `IndexVec<BasicBlock, GenKillSet<A::Idx>>>` to store cached block transfer functions. Instead, we use a boxed `dyn Fn` trait object. I discuss a few alternatives to the current approach in a commit message. The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of `MovePathIndex`s, which are used in many dataflow analyses. This required a parallel `fmt` trait used only for printing dataflow domains, as well as a refactoring of the `graphviz` module now that we cannot expect the domain to be a `BitSet`. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping. Since I had to rewrite much of it anyway, I took the opportunity to switch to a `Visitor` for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses. r? `@pnkfelix`
rust-lang · Sep 7, 2020 · 0e2c128 · 0e2c128
2 parents 9fe551a + b015109
commit 0e2c128
Show file tree

Hide file tree

Showing 23 changed files with 947 additions and 678 deletions.
diff --git a/Cargo.lock b/Cargo.lock
@@ -3756,6 +3756,7 @@ dependencies = [
  "itertools 0.8.2",
  "log_settings",
  "polonius-engine",
+ "regex",
  "rustc_apfloat",
  "rustc_ast",
  "rustc_attr",

diff --git a/compiler/rustc_index/src/bit_set.rs b/compiler/rustc_index/src/bit_set.rs
@@ -28,13 +28,20 @@ pub const WORD_BITS: usize = WORD_BYTES * 8;
 /// will panic if the bitsets have differing domain sizes.
 ///
 /// [`GrowableBitSet`]: struct.GrowableBitSet.html
-#[derive(Clone, Eq, PartialEq, Decodable, Encodable)]
-pub struct BitSet<T: Idx> {
+#[derive(Eq, PartialEq, Decodable, Encodable)]
+pub struct BitSet<T> {
     domain_size: usize,
     words: Vec<Word>,
     marker: PhantomData<T>,
 }
 
+impl<T> BitSet<T> {
+    /// Gets the domain size.
+    pub fn domain_size(&self) -> usize {
+        self.domain_size
+    }
+}
+
 impl<T: Idx> BitSet<T> {
     /// Creates a new, empty bitset with a given `domain_size`.
     #[inline]
@@ -52,11 +59,6 @@ impl<T: Idx> BitSet<T> {
         result
     }
 
-    /// Gets the domain size.
-    pub fn domain_size(&self) -> usize {
-        self.domain_size
-    }
-
     /// Clear all elements.
     #[inline]
     pub fn clear(&mut self) {
@@ -75,12 +77,6 @@ impl<T: Idx> BitSet<T> {
         }
     }
 
-    /// Efficiently overwrite `self` with `other`.
-    pub fn overwrite(&mut self, other: &BitSet<T>) {
-        assert!(self.domain_size == other.domain_size);
-        self.words.clone_from_slice(&other.words);
-    }
-
     /// Count the number of set bits in the set.
     pub fn count(&self) -> usize {
         self.words.iter().map(|e| e.count_ones() as usize).sum()
@@ -243,6 +239,21 @@ impl<T: Idx> SubtractFromBitSet<T> for BitSet<T> {
     }
 }
 
+impl<T> Clone for BitSet<T> {
+    fn clone(&self) -> Self {
+        BitSet { domain_size: self.domain_size, words: self.words.clone(), marker: PhantomData }
+    }
+
+    fn clone_from(&mut self, from: &Self) {
+        if self.domain_size != from.domain_size {
+            self.words.resize(from.domain_size, 0);
+            self.domain_size = from.domain_size;
+        }
+
+        self.words.copy_from_slice(&from.words);
+    }
+}
+
 impl<T: Idx> fmt::Debug for BitSet<T> {
     fn fmt(&self, w: &mut fmt::Formatter<'_>) -> fmt::Result {
         w.debug_list().entries(self.iter()).finish()
@@ -363,7 +374,7 @@ const SPARSE_MAX: usize = 8;
 ///
 /// This type is used by `HybridBitSet`; do not use directly.
 #[derive(Clone, Debug)]
-pub struct SparseBitSet<T: Idx> {
+pub struct SparseBitSet<T> {
     domain_size: usize,
     elems: ArrayVec<[T; SPARSE_MAX]>,
 }
@@ -464,18 +475,27 @@ impl<T: Idx> SubtractFromBitSet<T> for SparseBitSet<T> {
 /// All operations that involve an element will panic if the element is equal
 /// to or greater than the domain size. All operations that involve two bitsets
 /// will panic if the bitsets have differing domain sizes.
-#[derive(Clone, Debug)]
-pub enum HybridBitSet<T: Idx> {
+#[derive(Clone)]
+pub enum HybridBitSet<T> {
     Sparse(SparseBitSet<T>),
     Dense(BitSet<T>),
 }
 
+impl<T: Idx> fmt::Debug for HybridBitSet<T> {
+    fn fmt(&self, w: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            Self::Sparse(b) => b.fmt(w),
+            Self::Dense(b) => b.fmt(w),
+        }
+    }
+}
+
 impl<T: Idx> HybridBitSet<T> {
     pub fn new_empty(domain_size: usize) -> Self {
         HybridBitSet::Sparse(SparseBitSet::new_empty(domain_size))
     }
 
-    fn domain_size(&self) -> usize {
+    pub fn domain_size(&self) -> usize {
         match self {
             HybridBitSet::Sparse(sparse) => sparse.domain_size,
             HybridBitSet::Dense(dense) => dense.domain_size,

diff --git a/compiler/rustc_mir/Cargo.toml b/compiler/rustc_mir/Cargo.toml
@@ -14,6 +14,7 @@ itertools = "0.8"
 tracing = "0.1"
 log_settings = "0.1.1"
 polonius-engine = "0.12.0"
+regex = "1"
 rustc_middle = { path = "../rustc_middle" }
 rustc_attr = { path = "../rustc_attr" }
 rustc_data_structures = { path = "../rustc_data_structures" }

diff --git a/compiler/rustc_mir/src/dataflow/framework/cursor.rs b/compiler/rustc_mir/src/dataflow/framework/cursor.rs
@@ -4,6 +4,7 @@ use std::borrow::Borrow;
 use std::cmp::Ordering;
 
 use rustc_index::bit_set::BitSet;
+use rustc_index::vec::Idx;
 use rustc_middle::mir::{self, BasicBlock, Location};
 
 use super::{Analysis, Direction, Effect, EffectIndex, Results};
@@ -26,7 +27,7 @@ where
 {
     body: &'mir mir::Body<'tcx>,
     results: R,
-    state: BitSet<A::Idx>,
+    state: A::Domain,
 
     pos: CursorPosition,
 
@@ -46,17 +47,16 @@ where
 {
     /// Returns a new cursor that can inspect `results`.
     pub fn new(body: &'mir mir::Body<'tcx>, results: R) -> Self {
-        let bits_per_block = results.borrow().entry_set_for_block(mir::START_BLOCK).domain_size();
-
+        let bottom_value = results.borrow().analysis.bottom_value(body);
         ResultsCursor {
             body,
             results,
 
-            // Initialize to an empty `BitSet` and set `state_needs_reset` to tell the cursor that
+            // Initialize to the `bottom_value` and set `state_needs_reset` to tell the cursor that
             // it needs to reset to block entry before the first seek. The cursor position is
             // immaterial.
             state_needs_reset: true,
-            state: BitSet::new_empty(bits_per_block),
+            state: bottom_value,
             pos: CursorPosition::block_entry(mir::START_BLOCK),
 
             #[cfg(debug_assertions)]
@@ -68,23 +68,21 @@ where
         self.body
     }
 
-    /// Returns the `Analysis` used to generate the underlying results.
+    /// Returns the underlying `Results`.
+    pub fn results(&self) -> &Results<'tcx, A> {
+        &self.results.borrow()
+    }
+
+    /// Returns the `Analysis` used to generate the underlying `Results`.
     pub fn analysis(&self) -> &A {
         &self.results.borrow().analysis
     }
 
     /// Returns the dataflow state at the current location.
-    pub fn get(&self) -> &BitSet<A::Idx> {
+    pub fn get(&self) -> &A::Domain {
         &self.state
     }
 
-    /// Returns `true` if the dataflow state at the current location contains the given element.
-    ///
-    /// Shorthand for `self.get().contains(elem)`
-    pub fn contains(&self, elem: A::Idx) -> bool {
-        self.state.contains(elem)
-    }
-
     /// Resets the cursor to hold the entry set for the given basic block.
     ///
     /// For forward dataflow analyses, this is the dataflow state prior to the first statement.
@@ -94,7 +92,7 @@ where
         #[cfg(debug_assertions)]
         assert!(self.reachable_blocks.contains(block));
 
-        self.state.overwrite(&self.results.borrow().entry_set_for_block(block));
+        self.state.clone_from(&self.results.borrow().entry_set_for_block(block));
         self.pos = CursorPosition::block_entry(block);
         self.state_needs_reset = false;
     }
@@ -202,12 +200,23 @@ where
     ///
     /// This can be used, e.g., to apply the call return effect directly to the cursor without
     /// creating an extra copy of the dataflow state.
-    pub fn apply_custom_effect(&mut self, f: impl FnOnce(&A, &mut BitSet<A::Idx>)) {
+    pub fn apply_custom_effect(&mut self, f: impl FnOnce(&A, &mut A::Domain)) {
         f(&self.results.borrow().analysis, &mut self.state);
         self.state_needs_reset = true;
     }
 }
 
+impl<'mir, 'tcx, A, R, T> ResultsCursor<'mir, 'tcx, A, R>
+where
+    A: Analysis<'tcx, Domain = BitSet<T>>,
+    T: Idx,
+    R: Borrow<Results<'tcx, A>>,
+{
+    pub fn contains(&self, elem: T) -> bool {
+        self.get().contains(elem)
+    }
+}
+
 #[derive(Clone, Copy, Debug)]
 struct CursorPosition {
     block: BasicBlock,

diff --git a/compiler/rustc_mir/src/dataflow/framework/direction.rs b/compiler/rustc_mir/src/dataflow/framework/direction.rs
@@ -18,7 +18,7 @@ pub trait Direction {
     /// `effects.start()` must precede or equal `effects.end()` in this direction.
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -27,7 +27,7 @@ pub trait Direction {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -55,9 +55,9 @@ pub trait Direction {
         tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         block: (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>;
 }
@@ -72,7 +72,7 @@ impl Direction for Backward {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -112,7 +112,7 @@ impl Direction for Backward {
 
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -224,9 +224,9 @@ impl Direction for Backward {
         _tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         (bb, _bb_data): (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        mut propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        mut propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>,
     {
@@ -281,7 +281,7 @@ impl Direction for Forward {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -321,7 +321,7 @@ impl Direction for Forward {
 
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -428,9 +428,9 @@ impl Direction for Forward {
         tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         (bb, bb_data): (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        mut propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        mut propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>,
     {
@@ -499,7 +499,7 @@ impl Direction for Forward {
                         // MIR building adds discriminants to the `values` array in the same order as they
                         // are yielded by `AdtDef::discriminants`. We rely on this to match each
                         // discriminant in `values` to its corresponding variant in linear time.
-                        let mut tmp = BitSet::new_empty(exit_state.domain_size());
+                        let mut tmp = analysis.bottom_value(body);
                         let mut discriminants = enum_def.discriminants(tcx);
                         for (value, target) in values.iter().zip(targets.iter().copied()) {
                             let (variant_idx, _) =
@@ -508,7 +508,7 @@ impl Direction for Forward {
                                          from that of `SwitchInt::values`",
                                 );
 
-                            tmp.overwrite(exit_state);
+                            tmp.clone_from(exit_state);
                             analysis.apply_discriminant_switch_effect(
                                 &mut tmp,
                                 bb,