[mlir][scf] Track replacements using a listener in TileAndFuse #120999

Merged
merged 1 commit into from
Dec 24, 2024

Conversation

Groverkss

This PR makes TileAndFuse explicitly track replacements using a listener instead of assuming that the results always come from the outermost tiling loop. scf::tileUsingInterface can introduce merge operations whose results are the actual replacements to use, instead of the outermost loop results.
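
To illustrate the idea outside of MLIR, here is a minimal sketch of listener-driven replacement tracking. Everything in it is a toy stand-in: `ToyValue`, `ToyReplacementTracker`, and `runToyScenario` are hypothetical names, values are plain strings rather than `mlir::Value`, and the scenario is simplified. The point it tries to show is that a map entry pointing at a merge result is left alone when the loop is rebuilt, while an entry pointing at a loop result is redirected to the new loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for mlir::Value: a value is identified by its name.
using ToyValue = std::string;

// Hypothetical analogue of a replacement-tracking listener: whenever values
// are replaced, any map entry still pointing at an old value is redirected.
class ToyReplacementTracker {
public:
  explicit ToyReplacementTracker(
      std::unordered_map<ToyValue, ToyValue> &replacements)
      : replacements(replacements) {}

  // Called when oldValues[i] is replaced by newValues[i].
  void notifyReplaced(const std::vector<ToyValue> &oldValues,
                      const std::vector<ToyValue> &newValues) {
    assert(oldValues.size() == newValues.size() && "ranges must match");
    for (auto &entry : replacements) {
      for (std::size_t i = 0; i < oldValues.size(); ++i) {
        if (entry.second == oldValues[i]) {
          entry.second = newValues[i];
          break; // Each entry is redirected at most once per notification.
        }
      }
    }
  }

private:
  std::unordered_map<ToyValue, ToyValue> &replacements;
};

// Walks through one tiling-then-fusion scenario and returns the final map.
std::unordered_map<ToyValue, ToyValue> runToyScenario() {
  // After tiling, the consumer result is replaced by a merge op result,
  // not by a result of the outermost loop.
  std::unordered_map<ToyValue, ToyValue> replacements = {
      {"consumer#0", "merge#0"}};
  ToyReplacementTracker tracker(replacements);

  // Fusing a producer yields an extra loop result; record it directly.
  replacements["producer#0"] = "loop#1";

  // The loop is rebuilt, and its results are replaced one-for-one.
  tracker.notifyReplaced({"loop#0", "loop#1"}, {"newloop#0", "newloop#1"});
  return replacements;
}
```

In this toy run, `consumer#0` still maps to `merge#0` and `producer#0` ends up mapped to `newloop#1`. Recomputing both entries from the outermost loop's results at the end would have pointed the consumer at a loop result instead of the merge result.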

@llvmbot

llvmbot commented Dec 23, 2024

@llvm/pr-subscribers-mlir

Author: Kunwar Grover (Groverkss)

Changes

This PR makes TileAndFuse explicitly track replacements using a listener instead of assuming that the results always come from the outermost tiling loop. scf::tileUsingInterface can introduce merge operations whose results are the actual replacements to use, instead of the outermost loop results.


Full diff: https://github.com/llvm/llvm-project/pull/120999.diff

1 Files Affected:

  • (modified) mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp (+59-21)
diff --git a/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp b/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
index 90db42d479a193..2277989bf8411b 100644
--- a/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
@@ -28,6 +28,7 @@
 #include "mlir/Interfaces/TilingInterface.h"
 #include "mlir/Rewrite/FrozenRewritePatternSet.h"
 #include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/TypeSwitch.h"
 #include "llvm/Support/Debug.h"
 #include <optional>
@@ -1467,6 +1468,47 @@ void SliceTrackingListener::notifyOperationReplaced(Operation *op,
                                                     ValueRange replacement) {
   removeOp(op);
 }
+
+//===----------------------------------------------------------------------===//
+// ReplacementListener
+//===----------------------------------------------------------------------===//
+
+/// Listener that tracks updates replacements for values which can be mutated.
+/// This listener runs on top of the existing listener for the rewriter,
+/// to make sure external users can still run listeners.
+class ReplacementListener : public RewriterBase::ForwardingListener {
+public:
+  ReplacementListener(DenseMap<Value, Value> &replacements,
+                      OpBuilder::Listener *listener)
+      : ForwardingListener(listener), replacements(replacements) {}
+
+  void updateReplacementValues(ValueRange origValues,
+                               ValueRange replaceValues) {
+    // This can probably be written better, but just iterates over the map
+    // and the new replacements for now.
+    for (auto &[key, val] : replacements) {
+      for (auto [orig, replace] : llvm::zip_equal(origValues, replaceValues)) {
+        if (val == orig) {
+          val = replace;
+        }
+      }
+    }
+  }
+
+  void notifyOperationReplaced(Operation *op, Operation *newOp) override {
+    ForwardingListener::notifyOperationReplaced(op, newOp);
+    updateReplacementValues(op->getResults(), newOp->getResults());
+  }
+
+  void notifyOperationReplaced(Operation *op, ValueRange values) override {
+    ForwardingListener::notifyOperationReplaced(op, values);
+    updateReplacementValues(op->getResults(), values);
+  }
+
+private:
+  DenseMap<Value, Value> &replacements;
+};
+
 } // namespace
 
 /// Implementation of tile consumer and fuse producer greedily.
@@ -1493,26 +1535,27 @@ mlir::scf::tileConsumerAndFuseProducersUsingSCF(
   for (auto *tiledOp : tilingResult->tiledOps)
     tiledAndFusedOps.insert(tiledOp);
 
+  DenseMap<Value, Value> replacements;
+  for (auto [origVal, replacement] : llvm::zip_equal(
+           consumer->getResults(), tilingResult->mergeResult.replacements)) {
+    replacements[origVal] = replacement;
+  }
+
   // If there are no loops generated, fusion is immaterial.
   auto &loops = tilingResult->loops;
   if (loops.empty()) {
-    DenseMap<Value, Value> replacements;
-    for (auto [origVal, replacement] : llvm::zip_equal(
-             consumer->getResults(), tilingResult->mergeResult.replacements)) {
-      replacements[origVal] = replacement;
-    }
     return scf::SCFTileAndFuseResult{fusedProducers, tiledAndFusedOps, loops,
                                      replacements};
   }
 
-  // To keep track of replacements for now just record the map from the
-  // original untiled value to the result number of the for loop. Since the
-  // loop gets potentially replaced during fusion, keeping the value directly
-  // wont work.
-  DenseMap<Value, size_t> origValToResultNumber;
-  for (auto [index, result] : llvm::enumerate(consumer->getResults())) {
-    origValToResultNumber[result] = index;
-  }
+  // Since the loop gets potentially replaced during fusion, we need to track
+  // the mutation of replacement values. To do this, we attach a listener to
+  // update the replacements as they happen.
+  OpBuilder::Listener *previousListener = rewriter.getListener();
+  auto resetListener =
+      llvm::make_scope_exit([&]() { rewriter.setListener(previousListener); });
+  ReplacementListener replaceListener(replacements, previousListener);
+  rewriter.setListener(&replaceListener);
 
   // 2. Typically, the operands of the tiled operation are slices of the
   //    operands of the untiled operation. These are expressed in IR using
@@ -1581,9 +1624,9 @@ mlir::scf::tileConsumerAndFuseProducersUsingSCF(
       worklistCandidates.append(newSlices.value());
       for (auto [index, result] :
            llvm::enumerate(fusableProducerOp->getResults())) {
-        origValToResultNumber[result] = loops.front()->getNumResults() -
-                                        fusableProducerOp->getNumResults() +
-                                        index;
+        replacements[result] = loops.front()->getResult(
+            loops.front()->getNumResults() -
+            fusableProducerOp->getNumResults() + index);
       }
     }
     if (Operation *tiledAndFusedOp =
@@ -1597,11 +1640,6 @@ mlir::scf::tileConsumerAndFuseProducersUsingSCF(
     }
   }
 
-  DenseMap<Value, Value> replacements;
-  for (auto [origVal, resultNumber] : origValToResultNumber) {
-    replacements[origVal] = loops.front()->getResult(resultNumber);
-  }
-
   return scf::SCFTileAndFuseResult{fusedProducers, tiledAndFusedOps, loops,
                                    replacements};
 }
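
The diff above layers a temporary listener on top of whatever listener the caller installed, and restores the previous one on scope exit. The following is a minimal sketch of that pattern in plain C++17, under stated assumptions: `ToyListener`, `RecordingListener`, `ForwardingCounter`, `ToyRewriter`, and `rewriteWithTemporaryListener` are hypothetical names, and the hand-rolled `ScopeExit` guard only plays the role that `llvm::make_scope_exit` plays in the diff.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal listener interface; not the MLIR one.
struct ToyListener {
  virtual ~ToyListener() = default;
  virtual void notifyReplaced(const std::string &oldName,
                              const std::string &newName) = 0;
};

// Records every event it sees; stands in for a listener set by the caller.
struct RecordingListener : ToyListener {
  std::vector<std::string> events;
  void notifyReplaced(const std::string &oldName,
                      const std::string &newName) override {
    events.push_back(oldName + "->" + newName);
  }
};

// Forwards each event to the previously installed listener (if any) before
// doing its own bookkeeping, so the outer listener is never bypassed.
struct ForwardingCounter : ToyListener {
  explicit ForwardingCounter(ToyListener *previous) : previous(previous) {}
  void notifyReplaced(const std::string &oldName,
                      const std::string &newName) override {
    if (previous)
      previous->notifyReplaced(oldName, newName);
    ++seen;
  }
  ToyListener *previous;
  int seen = 0;
};

// Toy rewriter holding a single, swappable listener pointer.
struct ToyRewriter {
  ToyListener *listener = nullptr;
  void replace(const std::string &oldName, const std::string &newName) {
    if (listener)
      listener->notifyReplaced(oldName, newName);
  }
};

// Hand-rolled scope guard: runs the stored callable when it goes out of scope.
template <typename Fn> class ScopeExit {
public:
  explicit ScopeExit(Fn fn) : fn(std::move(fn)) {}
  ~ScopeExit() { fn(); }
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;

private:
  Fn fn;
};

// Installs a temporary listener, performs one replacement, and returns how
// many events the temporary listener saw. The previous listener is restored
// on every exit path, after the temporary listener has been destroyed.
int rewriteWithTemporaryListener(ToyRewriter &rewriter) {
  ToyListener *previous = rewriter.listener;
  ScopeExit restore([&] { rewriter.listener = previous; });
  ForwardingCounter counter(previous);
  rewriter.listener = &counter;
  rewriter.replace("loop", "newloop");
  return counter.seen;
}
```

Declaring the guard before the temporary listener mirrors the order used in the diff: locals are destroyed in reverse order, so the rewriter stops pointing at the temporary listener only as the function unwinds, and no early return can leave a dangling listener pointer behind.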

@llvmbot

llvmbot commented Dec 23, 2024

@llvm/pr-subscribers-mlir-scf


@MaheshRavishankar MaheshRavishankar left a comment

Awesome! Thanks.

@Groverkss Groverkss merged commit 6e3631d into llvm:main Dec 24, 2024
11 checks passed
raikonenfnu added a commit to iree-org/llvm-project that referenced this pull request Dec 26, 2024
raikonenfnu added a commit to raikonenfnu/iree that referenced this pull request Dec 27, 2024
Update LLVM to llvm/llvm-project@ac8bb735. The C++ changes are related to a
change in the behavior of TypeConverter. It used to generate an
UnrealizedConversionCastOp during applySignatureConversion in
GenericOpTypePropagation of TypePropagationPass.cpp, but it no longer
does. This causes unrealized_conversion_cast to be generated later and
hence survive the pass. To reproduce the above behavior, undo the C++ change
in this PR and then run:

```
wget https://gist.githubusercontent.com/raikonenfnu/dfb3b274007df8c4be87daf9ee67a5f4/raw/e48cc07e5fa558cd2c450b0e3ae46568136e1be6/type_propagate_repro.mlir
iree-opt --pass-pipeline='builtin.module(func.func(iree-codegen-type-propagation))' propagate_test.mlir -o /dev/null

error: failed to legalize unresolved materialization from ('i8') to ('i1') that remained live after conversion
  ^bb0(%in: i1, %in_0: f32, %in_1: f32, %out: f32):
       ^
propagate_test.mlir:5:8: note: see current operation: %10 = "builtin.unrealized_conversion_cast"(%arg0) : (i8) -> i1
propagate_test.mlir:6:11: note: see existing live user here: %10 = arith.select %9, %in_0, %in_1 : f32
```

This PR also carries the following reverts:

llvm/llvm-project#120999
llvm/llvm-project#120115
llvm/llvm-project#119461

The main issue with PRs 120999 and 120115 is that they break matvec codegen, generating scf.if instead of scf.for(s). An issue will be filed with a repro.

The main issue with PR 119461 is that it breaks an e2e RISC-V test by making it get stuck in an infinite loop.
```
/path/to/iree-build/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo --iree-input-demote-f64-to-f32 --iree-llvmcpu-target-cpu=generic /path/to/iree/tests/e2e/stablehlo_ops/three_fry.mlir -o three_fly_exec_target.mlir --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+d,+zvl512b,+v --mlir-disable-threading
> infinite loop
```

Signed-off-by: Stanley Winata <stanley.winata@amd.com>
raikonenfnu added a commit to iree-org/llvm-project that referenced this pull request Dec 27, 2024
raikonenfnu added a commit to raikonenfnu/iree that referenced this pull request Dec 28, 2024