Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in omp.target ops #116576

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

ergawy
Copy link
Member

@ergawy ergawy commented Nov 18, 2024

This PR adds support to translate the private clause from MLIR to LLVMIR when used on allocatables in the context of an omp.target op.

This replaces #113208.

Parent PR: #116770. Only the latest commit is relevant to the PR.

@llvmbot llvmbot added mlir:llvm mlir flang Flang issues not falling into any other category mlir:openmp flang:fir-hlfir flang:openmp labels Nov 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2024

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Kareem Ergawy (ergawy)

Changes

This PR adds support to translate the private clause from MLIR to LLVMIR when used on allocatables in the context of an omp.target op.

This replaces #113208.


Patch is 37.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116576.diff

7 Files Affected:

  • (modified) flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp (+2-9)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+11)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+132-9)
  • (added) mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir (+114)
  • (added) mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir (+207)
  • (modified) mlir/test/Target/LLVMIR/openmp-target-private.mlir (+89)
  • (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-18)
diff --git a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
index 289e648eed8546..19bb94973e09aa 100644
--- a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
@@ -49,13 +49,6 @@ class MapsForPrivatizedSymbolsPass
     : public flangomp::impl::MapsForPrivatizedSymbolsPassBase<
           MapsForPrivatizedSymbolsPass> {
 
-  bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
-    Region &allocRegion = privatizer.getAllocRegion();
-    Value blockArg0 = allocRegion.getArgument(0);
-    if (blockArg0.use_empty())
-      return false;
-    return true;
-  }
   omp::MapInfoOp createMapInfo(Location loc, Value var,
                                fir::FirOpBuilder &builder) {
     uint64_t mapTypeTo = static_cast<
@@ -132,9 +125,9 @@ class MapsForPrivatizedSymbolsPass
         omp::PrivateClauseOp privatizer =
             SymbolTable::lookupNearestSymbolFrom<omp::PrivateClauseOp>(
                 targetOp, privatizerName);
-        if (!privatizerNeedsMap(privatizer)) {
+        if (!privatizer.needsMap())
           continue;
-        }
+
         builder.setInsertionPoint(targetOp);
         Location loc = targetOp.getLoc();
         omp::MapInfoOp mapInfoOp = createMapInfo(loc, privVar, builder);
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a4cd81ac226b84..94fe900ee22404 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -135,6 +135,17 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
       auto &region = getDeallocRegion();
       return region.empty() ? nullptr : region.getArgument(0);
     }
+    /// privatizer is a PrivateClauseOp that privatizes an MLIR value.
+    /// privatizerNeedsMap returns true if the value being privatized in an
+    /// omp.target p should additionally be mapped to the target region
+    /// using a MapInfoOp. This is most common when an allocatable is privatized.
+    /// In such cases, the descriptor is use in privatization and needs to be
+    /// mapped on to the device.
+    bool needsMap() {
+      Value blockArg0 = getAllocRegion().getArgument(0);
+      return !blockArg0.use_empty();
+    }
+
   }];
 
   let hasRegionVerifier = 1;
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index a39b27aa9e12dc..2f0e4eb0b0637e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -305,10 +305,6 @@ static LogicalResult checkImplementationStatus(Operation &op) {
             if (privatizer.getDataSharingType() ==
                 omp::DataSharingClauseType::FirstPrivate)
               result = todo("firstprivate");
-
-            if (!privatizer.getDeallocRegion().empty())
-              result = op.emitError("not yet implemented: privatization of "
-                                    "structures in omp.target operation");
           }
         }
         checkThreadLimit(op, result);
@@ -3819,7 +3815,57 @@ createDeviceArgumentAccessor(MapInfoData &mapData, llvm::Argument &arg,
 
   return builder.saveIP();
 }
+/// privatizer is a PrivateClauseOp that privatizes an MLIR value.
+/// privatizerNeedsMap returns true if the value being privatized in an
+/// omp.target p should additionally be mapped to the target region
+/// using a MapInfoOp. This is most common when an allocatable is privatized.
+/// In such cases, the descriptor is use in privatization and needs to be
+/// mapped on to the device.
+// static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
+//   Region &allocRegion = privatizer.getAllocRegion();
+//   Value blockArg0 = allocRegion.getArgument(0);
+//   return !blockArg0.use_empty();
+// }
+
+/// Return the llvm::Value * corresponding to the privateVar that
+/// is being privatized. It isn't always as simple as looking up
+/// moduleTranslation with privateVar. For instance, in case of
+/// an allocatable, the descriptor for the allocatable is privatized.
+/// This descriptor is mapped using an MapInfoOp. So, this function
+/// will return a pointer to the llvm::Value corresponding to the
+/// block argument for the mapped descriptor.
+static llvm::Value *
+findHostAssociatedValue(Value privateVar, omp::TargetOp targetOp,
+                        llvm::DenseMap<Value, int> &mappedPrivateVars,
+                        llvm::IRBuilderBase &builder,
+                        LLVM::ModuleTranslation &moduleTranslation) {
+  if (mappedPrivateVars.contains(privateVar)) {
+    int blockArgIndex = mappedPrivateVars[privateVar];
+    Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
+    mlir::Type privVarType = privateVar.getType();
+    mlir::Type blockArgType = blockArg.getType();
+    assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
+           "A block argument corresponding to a mapped var should have "
+           "!llvm.ptr type");
+
+    if (privVarType == blockArg.getType()) {
+      llvm::Value *v = moduleTranslation.lookupValue(blockArg);
+      return v;
+    }
 
+    if (!isa<LLVM::LLVMPointerType>(privVarType)) {
+      // This typically happens when the privatized type is lowered from
+      // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
+      // struct/pair is passed by value. But, mapped values are passed only as
+      // pointers, so before we privatize, we must load the pointer.
+      llvm::Value *load =
+          builder.CreateLoad(moduleTranslation.convertType(privVarType),
+                             moduleTranslation.lookupValue(blockArg));
+      return load;
+    }
+  }
+  return moduleTranslation.lookupValue(privateVar);
+}
 static LogicalResult
 convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
                  LLVM::ModuleTranslation &moduleTranslation) {
@@ -3831,6 +3877,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
   bool isTargetDevice = ompBuilder->Config.isTargetDevice();
   auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
   auto &targetRegion = targetOp.getRegion();
+  // Holds the private vars that have been mapped along with
+  // the block argument that corresponds to the MapInfoOp
+  // corresponding to the private var in question.
+  // So, for instance
+  //
+  // %10 = omp.map.info var_ptr(%6#0 : !fir.ref<!fir.box<!fir.heap<i32>>>, ..)
+  // omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
+  //
+  // Then, %10 has been created so that the descriptor can be used by the
+  // privatizer
+  // @box.privatizer on the device side. Here we'd record {%6#0, 0} in the
+  // mappedPrivateVars map.
+  llvm::DenseMap<Value, int> mappedPrivateVars;
   DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
   SmallVector<Value> mapVars = targetOp.getMapVars();
   ArrayRef<BlockArgument> mapBlockArgs =
@@ -3842,6 +3901,49 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
   bool isOffloadEntry =
       isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
 
+  // For some private variables, the MapsForPrivatizedVariablesPass
+  // creates MapInfoOp instances. Go through the private variables and
+  // the mapped variables so that during codegeneration we are able
+  // to quickly look up the corresponding map variable, if any for each
+  // private variable.
+  if (!targetOp.getPrivateVars().empty() && !targetOp.getMapVars().empty()) {
+    auto argIface = llvm::cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
+    unsigned lastMapBlockArgsIdx =
+        argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs() - 1;
+    OperandRange privateVars = targetOp.getPrivateVars();
+    std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
+    auto reverseIt = mapVars.rbegin();
+    for (auto [privVar, privSym] :
+         llvm::reverse(llvm::zip_equal(privateVars, *privateSyms))) {
+      SymbolRefAttr privatizerName = llvm::cast<SymbolRefAttr>(privSym);
+      omp::PrivateClauseOp privatizer =
+          findPrivatizer(targetOp, privatizerName);
+      if (!privatizer.needsMap())
+        continue;
+
+      // The MapInfoOp defining the map var isn't really needed later.
+      // So, we don't store it in any datastructure. Instead, we just
+      // do some sanity checks on it right now.
+      omp::MapInfoOp mapInfoOp =
+          llvm::cast<omp::MapInfoOp>((*reverseIt).getDefiningOp());
+      Type varType = mapInfoOp.getVarType();
+
+      // Check #1: Check that the type of the private variable matches
+      // the type of the variable being mapped.
+      if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
+        assert(
+            varType == privVar.getType() &&
+            "Type of private var doesn't match the type of the mapped value");
+
+      // Ok, only 1 sanity check for now.
+      // Record the index of the block argument corresponding to this
+      // mapvar.
+      mappedPrivateVars.insert({privVar, lastMapBlockArgsIdx});
+      lastMapBlockArgsIdx--;
+      reverseIt++;
+    }
+  }
+
   using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
   auto bodyCB = [&](InsertPointTy allocaIP, InsertPointTy codeGenIP)
       -> llvm::OpenMPIRBuilder::InsertPointOrErrorTy {
@@ -3868,9 +3970,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
           moduleTranslation.lookupValue(mapInfoOp.getVarPtr());
       moduleTranslation.mapValue(arg, mapOpValue);
     }
-
     // Do privatization after moduleTranslation has already recorded
     // mapped values.
+    SmallVector<llvm::Value *> llvmPrivateVars;
+    SmallVector<Region *> privateCleanupRegions;
     if (!targetOp.getPrivateVars().empty()) {
       builder.restoreIP(allocaIP);
 
@@ -3886,11 +3989,13 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
         omp::PrivateClauseOp privatizer = findPrivatizer(&opInst, privSym);
         assert(privatizer.getDataSharingType() !=
                    omp::DataSharingClauseType::FirstPrivate &&
-               privatizer.getDeallocRegion().empty() &&
                "unsupported privatizer");
-        moduleTranslation.mapValue(privatizer.getAllocMoldArg(),
-                                   moduleTranslation.lookupValue(privVar));
         Region &allocRegion = privatizer.getAllocRegion();
+          BlockArgument allocRegionArg = allocRegion.getArgument(0);
+          moduleTranslation.mapValue(
+              allocRegionArg,
+              findHostAssociatedValue(privVar, targetOp, mappedPrivateVars,
+                                      builder, moduleTranslation));
         SmallVector<llvm::Value *, 1> yieldedValues;
         if (failed(inlineConvertOmpRegions(
                 allocRegion, "omp.targetop.privatizer", builder,
@@ -3899,7 +4004,12 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
               "failed to inline `alloc` region of `omp.private`");
         }
         assert(yieldedValues.size() == 1);
-        moduleTranslation.mapValue(privBlockArg, yieldedValues.front());
+        llvm::Value *llvmReplacementValue = yieldedValues.front();
+        moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
+        if (!privatizer.getDeallocRegion().empty()) {
+          llvmPrivateVars.push_back(llvmReplacementValue);
+          privateCleanupRegions.push_back(&privatizer.getDeallocRegion());
+        }
         moduleTranslation.forgetMapping(allocRegion);
         builder.restoreIP(builder.saveIP());
       }
@@ -3911,6 +4021,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
       return exitBlock.takeError();
 
     builder.SetInsertPoint(*exitBlock);
+    if (!llvmPrivateVars.empty()) {
+      assert(llvmPrivateVars.size() == privateCleanupRegions.size() &&
+             "Number of private variables needing cleanup not equal to number"
+             "of privatizers with dealloc regions");
+      if (failed(inlineOmpRegionCleanup(
+              privateCleanupRegions, llvmPrivateVars, moduleTranslation,
+              builder, "omp.targetop.private.cleanup",
+              /*shouldLoadCleanupRegionArg=*/false))) {
+        return llvm::createStringError(
+            "failed to inline `dealloc` region of `omp.private` "
+            "op in the target region");
+      }
+    }
     return builder.saveIP();
   };
 
diff --git a/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
new file mode 100644
index 00000000000000..f8da37f0eeb09e
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
@@ -0,0 +1,114 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer0 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %1 = llvm.load %10 : !llvm.ptr -> i64
+  %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>  : (i32) -> !llvm.ptr
+  %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+  %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+  omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+  %6 = llvm.mlir.constant(0 : i64) : i64
+  %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+  llvm.call @free(%9) : (!llvm.ptr) -> ()
+  omp.yield
+}
+omp.private {type = private} @box.heap_privatizer1 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %1 = llvm.load %10 : !llvm.ptr -> i64
+  %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>  : (i32) -> !llvm.ptr
+  %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+  %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+  omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+  %6 = llvm.mlir.constant(0 : i64) : i64
+  %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+  llvm.call @free(%9) : (!llvm.ptr) -> ()
+  omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+  %6 = llvm.mlir.constant(1 : i64) : i64
+  %7 = llvm.alloca %6 x i32 {bindc_name = "mapped_var"} : (i64) -> !llvm.ptr
+  %13 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var0"} : (i64) -> !llvm.ptr
+  %14 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var1"} : (i64) -> !llvm.ptr
+  %53 = omp.map.info var_ptr(%7 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "mapped_var"}
+  %54 = omp.map.info var_ptr(%13 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  %55 = omp.map.info var_ptr(%14 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  omp.target map_entries(%53 -> %arg3, %54 -> %arg4, %55 ->%arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr) private(@box.heap_privatizer0 %13 -> %arg6, @box.heap_privatizer1 %14 -> %arg7 : !llvm.ptr, !llvm.ptr) {
+    %64 = llvm.mlir.constant(1 : i32) : i32
+    %65 = llvm.alloca %64 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+    %67 = llvm.alloca %64 x i32 : (i32) -> !llvm.ptr
+    %66 = llvm.mlir.constant(19 : i32) : i32
+    %69 = llvm.mlir.constant(10 : i32) : i32
+    llvm.store %64, %arg3 : i32, !llvm.ptr
+    llvm.store %69, %67 : i32, !llvm.ptr
+    %75 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+    %90 = llvm.insertvalue %67, %75[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+    llvm.store %90, %65 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+    %91 = llvm.mlir.zero : !llvm.ptr
+    %92 = llvm.call @_FortranAAssign(%arg6, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+    %93 = llvm.call @_FortranAAssign(%arg7, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+    omp.terminator
+  }
+  llvm.return
+}
+
+
+llvm.func @_FortranAAssign(!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()> attributes {fir.runtime, sym_visibility = "private"}
+
+// The first set of checks ensure that we are calling the offloaded function
+// with the right arguments, especially the second argument which needs to
+// be a memory reference to the descriptor for the privatized allocatable
+// CHECK: define void @target_allocatable_
+// CHECK-NOT: define internal void
+// CHECK: %[[DESC_ALLOC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: %[[DESC_ALLOC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: call void @__omp_offloading_[[OFFLOADED_FUNCTION:.*]](ptr {{[^,]+}},
+// CHECK-SAME: ptr %[[DESC_ALLOC0]], ptr %[[DESC_ALLOC1]])
+
+// CHECK: define internal void @__omp_offloading_[[OFFLOADED_FUNCTION]]
+// CHECK-SAME: (ptr {{[^,]+}}, ptr %[[DESCRIPTOR_ARG0:[^,]+]],
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1:.*]]) {
+// CHECK: %[[I0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG0]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG0:.*]] = load i64, ptr %[[I0]]
+// CHECK: %[[PRIV_DESC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR0:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG0]])
+// CHECK:  %[[TMP0:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR0]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP0]], ptr %[[PRIV_DESC0]]
+
+// CHECK: %[[I1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG1:.*]] = load i64, ptr %[[I1]]
+// CHECK: %[[PRIV_DESC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR1:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG1]])
+// CHECK:  %[[TMP1:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR1]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP1]], ptr %[[PRIV_DESC1]]
+
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC0]]
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC1]]
+
+// CHECK:  %[[PTR0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC0]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF0:.*]] = load ptr, ptr %[[PTR0]]
+// CHECK:  call void @free(ptr %[[HEAP_MEMREF0]])
+// CHECK:  %[[PTR1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC1]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF1:.*]] = load ptr, ptr %[[PTR1]]
+// CHECK:  call void @free(ptr %[[HEAP_MEMREF1]])
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
new file mode 100644
index 00000000000000..d94f9d3601f9ce
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
@@ -0,0 +1,207 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %7 = l...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2024

@llvm/pr-subscribers-mlir-llvm

Author: Kareem Ergawy (ergawy)

Changes

This PR adds support to translate the private clause from MLIR to LLVMIR when used on allocatables in the context of an omp.target op.

This replaces #113208.


Patch is 37.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116576.diff

7 Files Affected:

  • (modified) flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp (+2-9)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+11)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+132-9)
  • (added) mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir (+114)
  • (added) mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir (+207)
  • (modified) mlir/test/Target/LLVMIR/openmp-target-private.mlir (+89)
  • (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-18)
diff --git a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
index 289e648eed8546..19bb94973e09aa 100644
--- a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
@@ -49,13 +49,6 @@ class MapsForPrivatizedSymbolsPass
     : public flangomp::impl::MapsForPrivatizedSymbolsPassBase<
           MapsForPrivatizedSymbolsPass> {
 
-  bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
-    Region &allocRegion = privatizer.getAllocRegion();
-    Value blockArg0 = allocRegion.getArgument(0);
-    if (blockArg0.use_empty())
-      return false;
-    return true;
-  }
   omp::MapInfoOp createMapInfo(Location loc, Value var,
                                fir::FirOpBuilder &builder) {
     uint64_t mapTypeTo = static_cast<
@@ -132,9 +125,9 @@ class MapsForPrivatizedSymbolsPass
         omp::PrivateClauseOp privatizer =
             SymbolTable::lookupNearestSymbolFrom<omp::PrivateClauseOp>(
                 targetOp, privatizerName);
-        if (!privatizerNeedsMap(privatizer)) {
+        if (!privatizer.needsMap())
           continue;
-        }
+
         builder.setInsertionPoint(targetOp);
         Location loc = targetOp.getLoc();
         omp::MapInfoOp mapInfoOp = createMapInfo(loc, privVar, builder);
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a4cd81ac226b84..94fe900ee22404 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -135,6 +135,17 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
       auto &region = getDeallocRegion();
       return region.empty() ? nullptr : region.getArgument(0);
     }
+    /// privatizer is a PrivateClauseOp that privatizes an MLIR value.
+    /// privatizerNeedsMap returns true if the value being privatized in an
+    /// omp.target p should additionally be mapped to the target region
+    /// using a MapInfoOp. This is most common when an allocatable is privatized.
+    /// In such cases, the descriptor is use in privatization and needs to be
+    /// mapped on to the device.
+    bool needsMap() {
+      Value blockArg0 = getAllocRegion().getArgument(0);
+      return !blockArg0.use_empty();
+    }
+
   }];
 
   let hasRegionVerifier = 1;
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index a39b27aa9e12dc..2f0e4eb0b0637e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -305,10 +305,6 @@ static LogicalResult checkImplementationStatus(Operation &op) {
             if (privatizer.getDataSharingType() ==
                 omp::DataSharingClauseType::FirstPrivate)
               result = todo("firstprivate");
-
-            if (!privatizer.getDeallocRegion().empty())
-              result = op.emitError("not yet implemented: privatization of "
-                                    "structures in omp.target operation");
           }
         }
         checkThreadLimit(op, result);
@@ -3819,7 +3815,57 @@ createDeviceArgumentAccessor(MapInfoData &mapData, llvm::Argument &arg,
 
   return builder.saveIP();
 }
+/// privatizer is a PrivateClauseOp that privatizes an MLIR value.
+/// privatizerNeedsMap returns true if the value being privatized in an
+/// omp.target p should additionally be mapped to the target region
+/// using a MapInfoOp. This is most common when an allocatable is privatized.
+/// In such cases, the descriptor is use in privatization and needs to be
+/// mapped on to the device.
+// static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
+//   Region &allocRegion = privatizer.getAllocRegion();
+//   Value blockArg0 = allocRegion.getArgument(0);
+//   return !blockArg0.use_empty();
+// }
+
+/// Return the llvm::Value * corresponding to the privateVar that
+/// is being privatized. It isn't always as simple as looking up
+/// moduleTranslation with privateVar. For instance, in case of
+/// an allocatable, the descriptor for the allocatable is privatized.
+/// This descriptor is mapped using an MapInfoOp. So, this function
+/// will return a pointer to the llvm::Value corresponding to the
+/// block argument for the mapped descriptor.
+static llvm::Value *
+findHostAssociatedValue(Value privateVar, omp::TargetOp targetOp,
+                        llvm::DenseMap<Value, int> &mappedPrivateVars,
+                        llvm::IRBuilderBase &builder,
+                        LLVM::ModuleTranslation &moduleTranslation) {
+  if (mappedPrivateVars.contains(privateVar)) {
+    int blockArgIndex = mappedPrivateVars[privateVar];
+    Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
+    mlir::Type privVarType = privateVar.getType();
+    mlir::Type blockArgType = blockArg.getType();
+    assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
+           "A block argument corresponding to a mapped var should have "
+           "!llvm.ptr type");
+
+    if (privVarType == blockArg.getType()) {
+      llvm::Value *v = moduleTranslation.lookupValue(blockArg);
+      return v;
+    }
 
+    if (!isa<LLVM::LLVMPointerType>(privVarType)) {
+      // This typically happens when the privatized type is lowered from
+      // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
+      // struct/pair is passed by value. But, mapped values are passed only as
+      // pointers, so before we privatize, we must load the pointer.
+      llvm::Value *load =
+          builder.CreateLoad(moduleTranslation.convertType(privVarType),
+                             moduleTranslation.lookupValue(blockArg));
+      return load;
+    }
+  }
+  return moduleTranslation.lookupValue(privateVar);
+}
 static LogicalResult
 convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
                  LLVM::ModuleTranslation &moduleTranslation) {
@@ -3831,6 +3877,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
   bool isTargetDevice = ompBuilder->Config.isTargetDevice();
   auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
   auto &targetRegion = targetOp.getRegion();
+  // Holds the private vars that have been mapped along with
+  // the block argument that corresponds to the MapInfoOp
+  // corresponding to the private var in question.
+  // So, for instance
+  //
+  // %10 = omp.map.info var_ptr(%6#0 : !fir.ref<!fir.box<!fir.heap<i32>>>, ..)
+  // omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
+  //
+  // Then, %10 has been created so that the descriptor can be used by the
+  // privatizer
+  // @box.privatizer on the device side. Here we'd record {%6#0, 0} in the
+  // mappedPrivateVars map.
+  llvm::DenseMap<Value, int> mappedPrivateVars;
   DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
   SmallVector<Value> mapVars = targetOp.getMapVars();
   ArrayRef<BlockArgument> mapBlockArgs =
@@ -3842,6 +3901,49 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
   bool isOffloadEntry =
       isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
 
+  // For some private variables, the MapsForPrivatizedVariablesPass
+  // creates MapInfoOp instances. Go through the private variables and
+  // the mapped variables so that during codegeneration we are able
+  // to quickly look up the corresponding map variable, if any for each
+  // private variable.
+  if (!targetOp.getPrivateVars().empty() && !targetOp.getMapVars().empty()) {
+    auto argIface = llvm::cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
+    unsigned lastMapBlockArgsIdx =
+        argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs() - 1;
+    OperandRange privateVars = targetOp.getPrivateVars();
+    std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
+    auto reverseIt = mapVars.rbegin();
+    for (auto [privVar, privSym] :
+         llvm::reverse(llvm::zip_equal(privateVars, *privateSyms))) {
+      SymbolRefAttr privatizerName = llvm::cast<SymbolRefAttr>(privSym);
+      omp::PrivateClauseOp privatizer =
+          findPrivatizer(targetOp, privatizerName);
+      if (!privatizer.needsMap())
+        continue;
+
+      // The MapInfoOp defining the map var isn't really needed later.
+      // So, we don't store it in any datastructure. Instead, we just
+      // do some sanity checks on it right now.
+      omp::MapInfoOp mapInfoOp =
+          llvm::cast<omp::MapInfoOp>((*reverseIt).getDefiningOp());
+      Type varType = mapInfoOp.getVarType();
+
+      // Check #1: Check that the type of the private variable matches
+      // the type of the variable being mapped.
+      if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
+        assert(
+            varType == privVar.getType() &&
+            "Type of private var doesn't match the type of the mapped value");
+
+      // Ok, only 1 sanity check for now.
+      // Record the index of the block argument corresponding to this
+      // mapvar.
+      mappedPrivateVars.insert({privVar, lastMapBlockArgsIdx});
+      lastMapBlockArgsIdx--;
+      reverseIt++;
+    }
+  }
+
   using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
   auto bodyCB = [&](InsertPointTy allocaIP, InsertPointTy codeGenIP)
       -> llvm::OpenMPIRBuilder::InsertPointOrErrorTy {
@@ -3868,9 +3970,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
           moduleTranslation.lookupValue(mapInfoOp.getVarPtr());
       moduleTranslation.mapValue(arg, mapOpValue);
     }
-
     // Do privatization after moduleTranslation has already recorded
     // mapped values.
+    SmallVector<llvm::Value *> llvmPrivateVars;
+    SmallVector<Region *> privateCleanupRegions;
     if (!targetOp.getPrivateVars().empty()) {
       builder.restoreIP(allocaIP);
 
@@ -3886,11 +3989,13 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
         omp::PrivateClauseOp privatizer = findPrivatizer(&opInst, privSym);
         assert(privatizer.getDataSharingType() !=
                    omp::DataSharingClauseType::FirstPrivate &&
-               privatizer.getDeallocRegion().empty() &&
                "unsupported privatizer");
-        moduleTranslation.mapValue(privatizer.getAllocMoldArg(),
-                                   moduleTranslation.lookupValue(privVar));
         Region &allocRegion = privatizer.getAllocRegion();
+          BlockArgument allocRegionArg = allocRegion.getArgument(0);
+          moduleTranslation.mapValue(
+              allocRegionArg,
+              findHostAssociatedValue(privVar, targetOp, mappedPrivateVars,
+                                      builder, moduleTranslation));
         SmallVector<llvm::Value *, 1> yieldedValues;
         if (failed(inlineConvertOmpRegions(
                 allocRegion, "omp.targetop.privatizer", builder,
@@ -3899,7 +4004,12 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
               "failed to inline `alloc` region of `omp.private`");
         }
         assert(yieldedValues.size() == 1);
-        moduleTranslation.mapValue(privBlockArg, yieldedValues.front());
+        llvm::Value *llvmReplacementValue = yieldedValues.front();
+        moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
+        if (!privatizer.getDeallocRegion().empty()) {
+          llvmPrivateVars.push_back(llvmReplacementValue);
+          privateCleanupRegions.push_back(&privatizer.getDeallocRegion());
+        }
         moduleTranslation.forgetMapping(allocRegion);
         builder.restoreIP(builder.saveIP());
       }
@@ -3911,6 +4021,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
       return exitBlock.takeError();
 
     builder.SetInsertPoint(*exitBlock);
+    if (!llvmPrivateVars.empty()) {
+      assert(llvmPrivateVars.size() == privateCleanupRegions.size() &&
+             "Number of private variables needing cleanup not equal to number"
+             "of privatizers with dealloc regions");
+      if (failed(inlineOmpRegionCleanup(
+              privateCleanupRegions, llvmPrivateVars, moduleTranslation,
+              builder, "omp.targetop.private.cleanup",
+              /*shouldLoadCleanupRegionArg=*/false))) {
+        return llvm::createStringError(
+            "failed to inline `dealloc` region of `omp.private` "
+            "op in the target region");
+      }
+    }
     return builder.saveIP();
   };
 
diff --git a/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
new file mode 100644
index 00000000000000..f8da37f0eeb09e
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
@@ -0,0 +1,114 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer0 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %1 = llvm.load %10 : !llvm.ptr -> i64
+  %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>  : (i32) -> !llvm.ptr
+  %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+  %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+  omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+  %6 = llvm.mlir.constant(0 : i64) : i64
+  %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+  llvm.call @free(%9) : (!llvm.ptr) -> ()
+  omp.yield
+}
+omp.private {type = private} @box.heap_privatizer1 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %1 = llvm.load %10 : !llvm.ptr -> i64
+  %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>  : (i32) -> !llvm.ptr
+  %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+  %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+  omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+  %6 = llvm.mlir.constant(0 : i64) : i64
+  %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+  %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+  llvm.call @free(%9) : (!llvm.ptr) -> ()
+  omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+  %6 = llvm.mlir.constant(1 : i64) : i64
+  %7 = llvm.alloca %6 x i32 {bindc_name = "mapped_var"} : (i64) -> !llvm.ptr
+  %13 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var0"} : (i64) -> !llvm.ptr
+  %14 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var1"} : (i64) -> !llvm.ptr
+  %53 = omp.map.info var_ptr(%7 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "mapped_var"}
+  %54 = omp.map.info var_ptr(%13 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  %55 = omp.map.info var_ptr(%14 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  omp.target map_entries(%53 -> %arg3, %54 -> %arg4, %55 ->%arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr) private(@box.heap_privatizer0 %13 -> %arg6, @box.heap_privatizer1 %14 -> %arg7 : !llvm.ptr, !llvm.ptr) {
+    %64 = llvm.mlir.constant(1 : i32) : i32
+    %65 = llvm.alloca %64 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+    %67 = llvm.alloca %64 x i32 : (i32) -> !llvm.ptr
+    %66 = llvm.mlir.constant(19 : i32) : i32
+    %69 = llvm.mlir.constant(10 : i32) : i32
+    llvm.store %64, %arg3 : i32, !llvm.ptr
+    llvm.store %69, %67 : i32, !llvm.ptr
+    %75 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+    %90 = llvm.insertvalue %67, %75[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+    llvm.store %90, %65 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+    %91 = llvm.mlir.zero : !llvm.ptr
+    %92 = llvm.call @_FortranAAssign(%arg6, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+    %93 = llvm.call @_FortranAAssign(%arg7, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+    omp.terminator
+  }
+  llvm.return
+}
+
+
+llvm.func @_FortranAAssign(!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()> attributes {fir.runtime, sym_visibility = "private"}
+
+// The first set of checks ensure that we are calling the offloaded function
+// with the right arguments, especially the second argument which needs to
+// be a memory reference to the descriptor for the privatized allocatable
+// CHECK: define void @target_allocatable_
+// CHECK-NOT: define internal void
+// CHECK: %[[DESC_ALLOC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: %[[DESC_ALLOC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: call void @__omp_offloading_[[OFFLOADED_FUNCTION:.*]](ptr {{[^,]+}},
+// CHECK-SAME: ptr %[[DESC_ALLOC0]], ptr %[[DESC_ALLOC1]])
+
+// CHECK: define internal void @__omp_offloading_[[OFFLOADED_FUNCTION]]
+// CHECK-SAME: (ptr {{[^,]+}}, ptr %[[DESCRIPTOR_ARG0:[^,]+]],
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1:.*]]) {
+// CHECK: %[[I0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG0]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG0:.*]] = load i64, ptr %[[I0]]
+// CHECK: %[[PRIV_DESC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR0:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG0]])
+// CHECK:  %[[TMP0:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR0]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP0]], ptr %[[PRIV_DESC0]]
+
+// CHECK: %[[I1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG1:.*]] = load i64, ptr %[[I1]]
+// CHECK: %[[PRIV_DESC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR1:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG1]])
+// CHECK:  %[[TMP1:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR1]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP1]], ptr %[[PRIV_DESC1]]
+
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC0]]
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC1]]
+
+// CHECK:  %[[PTR0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC0]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF0:.*]] = load ptr, ptr %[[PTR0]]
+// CHECK:  call void @free(ptr %[[HEAP_MEMREF0]])
+// CHECK:  %[[PTR1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC1]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF1:.*]] = load ptr, ptr %[[PTR1]]
+// CHECK:  call void @free(ptr %[[HEAP_MEMREF1]])
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
new file mode 100644
index 00000000000000..d94f9d3601f9ce
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
@@ -0,0 +1,207 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+  %0 = llvm.mlir.constant(1 : i32) : i32
+  %7 = l...
[truncated]

Copy link

github-actions bot commented Nov 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 9e83dcf to 59642eb Compare November 18, 2024 06:06
Copy link
Contributor

@Dinistro Dinistro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped a bunch of nit comments but I cannot do a deep review as I'm not really involved in flang or the OpenMP codegen.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td Outdated Show resolved Hide resolved
Comment on lines 3842 to 3822
if (mappedPrivateVars.contains(privateVar)) {
int blockArgIndex = mappedPrivateVars[privateVar];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It might be slightly more efficient to use find here, as this ensure you only have on e lookup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subscript notation is used in similar situations. I will stick to it if you don't mind.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 59642eb to 2250a7a Compare November 19, 2024 08:51
@ergawy ergawy force-pushed the target_private_allocatables_3 branch 2 times, most recently from 0e52dc7 to 63c3ff8 Compare November 19, 2024 09:02
@ergawy
Copy link
Member Author

ergawy commented Nov 20, 2024

I have opened #116770 as precursor for this one to make matching private variables with their map info ops easier. Please take a look at #116770 first.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch 5 times, most recently from 51cfb93 to 6e2f068 Compare November 21, 2024 11:01
ergawy added a commit that referenced this pull request Nov 28, 2024
…16770)

This PR extends the MLIR representation for `omp.target` ops by adding a
`map_idx` to `private` vars. This annotation stores the index of the map
info operand corresponding to the private var. If the variable does not
have a map operand, the `map_idx` attribute is either not present at all
or its value is `-1`.

This makes matching the private variable to its map info op easier (see
#116576 for usage).
@ergawy ergawy requested a review from agozillon November 28, 2024 04:24
@ergawy ergawy force-pushed the target_private_allocatables_3 branch 2 times, most recently from 0b02db4 to 071ca05 Compare November 28, 2024 04:27
@ergawy
Copy link
Member Author

ergawy commented Nov 28, 2024

With #116770 merged, we can go back to this PR. Please take a look when you have time 🙏.

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ergawy
Copy link
Member Author

ergawy commented Dec 2, 2024

Thanks @tblah for the approval. @other reviewers, please let me know if you have any objections to merging this PR.

@Dinistro
Copy link
Contributor

Dinistro commented Dec 2, 2024

Thanks @tblah for the approval. @other reviewers, please let me know if you have any objections to merging this PR.

I'm fine with this, but I'm not too familiar with these parts to provide an in-depth review.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch 2 times, most recently from 6549daf to 9062b45 Compare December 4, 2024 09:22
@llvmbot llvmbot added the clang:openmp OpenMP related changes to Clang label Dec 4, 2024
@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 9062b45 to de5e776 Compare December 4, 2024 10:17
Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the extra code sharing here. +1 from me but please consider if you need review from other users of OMPIRBuilder.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp Show resolved Hide resolved
@ergawy ergawy requested a review from ddpagan December 4, 2024 10:29
@ergawy ergawy force-pushed the target_private_allocatables_3 branch from de5e776 to 097a869 Compare December 5, 2024 05:06
Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Kareem, it generally looks good to me. Just a couple of comments.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 097a869 to 905d343 Compare December 10, 2024 04:41
@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 905d343 to fb7d67f Compare December 12, 2024 07:37
Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, LGTM!

// omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
//
// Then, %10 has been created so that the descriptor can be used by the
// privatizer @box.privatizer on the device side. Here we'd record {%6#0, 0}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// privatizer @box.privatizer on the device side. Here we'd record {%6#0, 0}
// privatizer @box.privatizer on the device side. Here we'd record {%6#0, %arg0}

Comment on lines 3931 to 3932
// Record the index of the block argument corresponding to this
// mapvar.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Record the index of the block argument corresponding to this
// mapvar.
// Record the block argument corresponding to this mapvar.

@ergawy ergawy force-pushed the target_private_allocatables_3 branch from fb7d67f to 16e1879 Compare December 12, 2024 11:44
…tization of allocatables in `omp.target` ops

This PR adds support to translate the `private` clause from MLIR to
LLVMIR when used on allocatables in the context of an `omp.target` op.
@ergawy ergawy force-pushed the target_private_allocatables_3 branch from 16e1879 to 1718bd6 Compare December 12, 2024 11:49
@ergawy ergawy merged commit f9734b9 into llvm:main Dec 12, 2024
8 checks passed
tblah added a commit to tblah/llvm-project that referenced this pull request Dec 23, 2024
…ed privatization of allocatables in `omp.target` ops (llvm#116576)"

This reverts commit f9734b9.
tblah added a commit to tblah/llvm-project that referenced this pull request Jan 9, 2025
…ed privatization of allocatables in `omp.target` ops (llvm#116576)"

This reverts commit f9734b9.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:openmp OpenMP related changes to Clang flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category mlir:llvm mlir:openmp mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants