[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in omp.target ops #113208
base: main
Conversation
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-mlir-openmp

Author: Pranav Bhandarkar (bhandarkar-pranav)

Changes

This PR adds support to translate the […]

Patch is 34.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113208.diff

4 Files Affected:
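For orientation, the sketch below condenses the pattern exercised by the tests added in this patch: an allocatable's descriptor is both mapped and privatized, so the privatizer's alloc region can read the host descriptor on the device. It is a simplified, illustrative excerpt rather than a complete module; the enclosing llvm.func, the malloc/free declarations, and most of the descriptor bookkeeping are omitted, and the names @box.privatizer, %desc, and %map are chosen here for illustration only.

// Condensed from the new tests (openmp-target-private-allocatable.mlir and
// openmp-target-multiple-private.mlir); not a runnable test by itself.
omp.private {type = private} @box.privatizer : !llvm.ptr alloc {
^bb0(%arg0: !llvm.ptr):
  // %arg0 is the mapped host descriptor. Because the alloc region uses it,
  // privatizerNeedsMap() is true and the descriptor must be mapped.
  %c1 = llvm.mlir.constant(1 : i32) : i32
  %size_gep = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
  %size = llvm.load %size_gep : !llvm.ptr -> i64
  %priv_desc = llvm.alloca %c1 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
  // ... malloc(%size) and store the resulting pointer into %priv_desc ...
  omp.yield(%priv_desc : !llvm.ptr)
} dealloc {
^bb0(%arg0: !llvm.ptr):
  // Free the heap payload owned by the private copy of the descriptor.
  %payload_gep = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
  %payload = llvm.load %payload_gep : !llvm.ptr -> !llvm.ptr
  llvm.call @free(%payload) : (!llvm.ptr) -> ()
  omp.yield
}

// The MapsForPrivatizedVariablesPass has already created a map entry for the
// descriptor (%desc stands for the host-side descriptor alloca).
%map = omp.map.info var_ptr(%desc : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
omp.target map_entries(%map -> %arg0 : !llvm.ptr) private(@box.privatizer %desc -> %arg1 : !llvm.ptr) {
  // Inside the region, %arg1 is bound to the value yielded by the alloc
  // region; the dealloc region is inlined when the target region exits.
  omp.terminator
}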
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 7c45e89cd8ac4b..2e013053a7d1c0 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -3316,7 +3316,51 @@ createDeviceArgumentAccessor(MapInfoData &mapData, llvm::Argument &arg,
return builder.saveIP();
}
+static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
+ Region &allocRegion = privatizer.getAllocRegion();
+ Value blockArg0 = allocRegion.getArgument(0);
+ return !blockArg0.use_empty();
+}
+
+// Return the llvm::Value * corresponding to the privateVar that
+// is being privatized. It isn't always as simple as looking up
+// moduleTranslation with privateVar. For instance, in case of
+// an allocatable, the descriptor for the allocatable is privatized.
+// This descriptor is mapped using an MapInfoOp. So, this function
+// will return a pointer to the llvm::Value corresponding to the
+// block argument for the mapped descriptor.
+static llvm::Value *
+findHostAssociatedValue(Value privateVar, omp::TargetOp targetOp,
+ llvm::DenseMap<Value, int> &mappedPrivateVars,
+ llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation) {
+ if (mappedPrivateVars.contains(privateVar)) {
+ int blockArgIndex = mappedPrivateVars[privateVar];
+ Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
+ mlir::Type privVarType = privateVar.getType();
+ mlir::Type blockArgType = blockArg.getType();
+ assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
+ "A block argument corresponding to a mapped var should have "
+ "!llvm.ptr type");
+
+ if (privVarType == blockArg.getType()) {
+ llvm::Value *v = moduleTranslation.lookupValue(blockArg);
+ return v;
+ }
+ if (!isa<LLVM::LLVMPointerType>(privVarType)) {
+ // This typically happens when the privatized type is lowered from
+ // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
+ // struct/pair is passed by value. But, mapped values are passed only as
+ // pointers, so before we privatize, we must load the pointer.
+ llvm::Value *load =
+ builder.CreateLoad(moduleTranslation.convertType(privVarType),
+ moduleTranslation.lookupValue(blockArg));
+ return load;
+ }
+ }
+ return moduleTranslation.lookupValue(privateVar);
+}
static LogicalResult
convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
LLVM::ModuleTranslation &moduleTranslation) {
@@ -3329,6 +3373,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
auto targetOp = cast<omp::TargetOp>(opInst);
auto &targetRegion = targetOp.getRegion();
+ // Holds the private vars that have been mapped along with
+ // the block argument that corresponds to the MapInfoOp
+ // corresponding to the private var in question.
+ // So, for instance
+ //
+ // %10 = omp.map.info var_ptr(%6#0 : !fir.ref<!fir.box<!fir.heap<i32>>>, ..)
+ // omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
+ //
+ // Then, %10 has been created so that the descriptor can be used by the
+ // privatizer
+ // @box.privatizer on the device side. Here we'd record {%6#0, 0} in the
+ // mappedPrivateVars map.
+ llvm::DenseMap<Value, int> mappedPrivateVars;
DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
SmallVector<Value> mapVars = targetOp.getMapVars();
ArrayRef<BlockArgument> mapBlockArgs =
@@ -3340,6 +3397,48 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bool isOffloadEntry =
isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
+ // For some private variables, the MapsForPrivatizedVariablesPass
+ // creates MapInfoOp instances. Go through the private variables and
+ // the mapped variables so that during codegeneration we are able
+ // to quickly look up the corresponding map variable, if any for each
+ // private variable.
+ if (!targetOp.getPrivateVars().empty() && !targetOp.getMapVars().empty()) {
+ auto argIface = llvm::cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
+ unsigned lastMapBlockArgsIdx =
+ argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs() - 1;
+ OperandRange privateVars = targetOp.getPrivateVars();
+ std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
+ auto reverseIt = mapVars.rbegin();
+ for (auto [privVar, privSym] :
+ llvm::reverse(llvm::zip_equal(privateVars, *privateSyms))) {
+ SymbolRefAttr privatizerName = llvm::cast<SymbolRefAttr>(privSym);
+ omp::PrivateClauseOp privatizer =
+ findPrivatizer(targetOp, privatizerName);
+ if (!privatizerNeedsMap(privatizer))
+ continue;
+
+ // The MapInfoOp defining the map var isn't really needed later.
+ // So, we don't store it in any datastructure. Instead, we just
+ // do some sanity checks on it right now.
+ omp::MapInfoOp mapInfoOp =
+ llvm::cast<omp::MapInfoOp>((*reverseIt).getDefiningOp());
+ Type varType = mapInfoOp.getVarType();
+
+ // Check #1: Check that the type of the private variable matches
+ // the type of the variable being mapped.
+ if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
+ assert(
+ varType == privVar.getType() &&
+ "Type of private var doesn't match the type of the mapped value");
+
+ // Ok, only 1 sanity check for now.
+ // Record the index of the block argument corresponding to this
+ // mapvar.
+ mappedPrivateVars.insert({privVar, lastMapBlockArgsIdx});
+ lastMapBlockArgsIdx--;
+ reverseIt++;
+ }
+ }
LogicalResult bodyGenStatus = success();
using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
auto bodyCB = [&](InsertPointTy allocaIP,
@@ -3367,9 +3466,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
moduleTranslation.lookupValue(mapInfoOp.getVarPtr());
moduleTranslation.mapValue(arg, mapOpValue);
}
-
// Do privatization after moduleTranslation has already recorded
// mapped values.
+ SmallVector<llvm::Value *> llvmPrivateVars;
+ SmallVector<Region *> privateCleanupRegions;
if (!targetOp.getPrivateVars().empty()) {
builder.restoreIP(allocaIP);
@@ -3384,16 +3484,18 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
SymbolRefAttr privSym = cast<SymbolRefAttr>(privatizerNameAttr);
omp::PrivateClauseOp privatizer = findPrivatizer(&opInst, privSym);
if (privatizer.getDataSharingType() ==
- omp::DataSharingClauseType::FirstPrivate ||
- !privatizer.getDeallocRegion().empty()) {
+ omp::DataSharingClauseType::FirstPrivate) {
opInst.emitError("Translation of omp.target from MLIR to LLVMIR "
- "failed because translation of firstprivate and "
- " private allocatables is not supported yet");
+ "failed because translation of firstprivate is not "
+ "supported yet");
bodyGenStatus = failure();
} else {
- moduleTranslation.mapValue(privatizer.getAllocMoldArg(),
- moduleTranslation.lookupValue(privVar));
Region &allocRegion = privatizer.getAllocRegion();
+ BlockArgument allocRegionArg = allocRegion.getArgument(0);
+ moduleTranslation.mapValue(
+ allocRegionArg,
+ findHostAssociatedValue(privVar, targetOp, mappedPrivateVars,
+ builder, moduleTranslation));
SmallVector<llvm::Value *, 1> yieldedValues;
if (failed(inlineConvertOmpRegions(
allocRegion, "omp.targetop.privatizer", builder,
@@ -3404,7 +3506,12 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bodyGenStatus = failure();
} else {
assert(yieldedValues.size() == 1);
- moduleTranslation.mapValue(privBlockArg, yieldedValues.front());
+ llvm::Value *llvmReplacementValue = yieldedValues.front();
+ moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
+ if (!privatizer.getDeallocRegion().empty()) {
+ llvmPrivateVars.push_back(llvmReplacementValue);
+ privateCleanupRegions.push_back(&privatizer.getDeallocRegion());
+ }
}
moduleTranslation.forgetMapping(allocRegion);
builder.restoreIP(builder.saveIP());
@@ -3414,6 +3521,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
llvm::BasicBlock *exitBlock = convertOmpOpRegions(
targetRegion, "omp.target", builder, moduleTranslation, bodyGenStatus);
builder.SetInsertPoint(exitBlock);
+ if (!llvmPrivateVars.empty()) {
+ assert(llvmPrivateVars.size() == privateCleanupRegions.size() &&
+ "Number of private variables needing cleanup not equal to number"
+ "of privatizers with dealloc regions");
+ if (failed(inlineOmpRegionCleanup(
+ privateCleanupRegions, llvmPrivateVars, moduleTranslation,
+ builder, "omp.targetop.private.cleanup",
+ /*shouldLoadCleanupRegionArg=*/false))) {
+ opInst.emitError("failed to inline `dealloc` region of `omp.private` "
+ "op in the target region");
+ bodyGenStatus = failure();
+ }
+ }
return builder.saveIP();
};
diff --git a/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
new file mode 100644
index 00000000000000..2491747152895c
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
@@ -0,0 +1,116 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer0 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+omp.private {type = private} @box.heap_privatizer1 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %6 = llvm.mlir.constant(1 : i64) : i64
+ %7 = llvm.alloca %6 x i32 {bindc_name = "mapped_var"} : (i64) -> !llvm.ptr
+ %13 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var0"} : (i64) -> !llvm.ptr
+ %14 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var1"} : (i64) -> !llvm.ptr
+ %53 = omp.map.info var_ptr(%7 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "mapped_var"}
+ %54 = omp.map.info var_ptr(%13 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ %55 = omp.map.info var_ptr(%14 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ omp.target map_entries(%53 -> %arg3, %54 -> %arg4, %55 ->%arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr) private(@box.heap_privatizer0 %13 -> %arg6, @box.heap_privatizer1 %14 -> %arg7 : !llvm.ptr, !llvm.ptr) {
+ %64 = llvm.mlir.constant(1 : i32) : i32
+ %65 = llvm.alloca %64 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %67 = llvm.alloca %64 x i32 : (i32) -> !llvm.ptr
+ %66 = llvm.mlir.constant(19 : i32) : i32
+ %68 = llvm.mlir.constant(18 : i32) : i32
+ %69 = llvm.mlir.constant(10 : i32) : i32
+ %70 = llvm.mlir.constant(5 : i32) : i32
+ llvm.store %70, %arg3 : i32, !llvm.ptr
+ llvm.store %69, %67 : i32, !llvm.ptr
+ %75 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %90 = llvm.insertvalue %67, %75[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %90, %65 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ %91 = llvm.mlir.zero : !llvm.ptr
+ %92 = llvm.call @_FortranAAssign(%arg6, %65, %91, %68) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ %93 = llvm.call @_FortranAAssign(%arg7, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ omp.terminator
+ }
+ llvm.return
+}
+
+
+llvm.func @_FortranAAssign(!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()> attributes {fir.runtime, sym_visibility = "private"}
+
+// The first set of checks ensure that we are calling the offloaded function
+// with the right arguments, especially the second argument which needs to
+// be a memory reference to the descriptor for the privatized allocatable
+// CHECK: define void @target_allocatable_
+// CHECK-NOT: define internal void
+// CHECK: %[[DESC_ALLOC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: %[[DESC_ALLOC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: call void @__omp_offloading_[[OFFLOADED_FUNCTION:.*]](ptr {{[^,]+}},
+// CHECK-SAME: ptr %[[DESC_ALLOC0]], ptr %[[DESC_ALLOC1]])
+
+// CHECK: define internal void @__omp_offloading_[[OFFLOADED_FUNCTION]]
+// CHECK-SAME: (ptr {{[^,]+}}, ptr %[[DESCRIPTOR_ARG0:[^,]+]],
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1:.*]]) {
+// CHECK: %[[I0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG0]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG0:.*]] = load i64, ptr %[[I0]]
+// CHECK: %[[PRIV_DESC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR0:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG0]])
+// CHECK: %[[TMP0:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR0]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP0]], ptr %[[PRIV_DESC0]]
+
+// CHECK: %[[I1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG1:.*]] = load i64, ptr %[[I1]]
+// CHECK: %[[PRIV_DESC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR1:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG1]])
+// CHECK: %[[TMP1:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR1]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP1]], ptr %[[PRIV_DESC1]]
+
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC0]]
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC1]]
+
+// CHECK: %[[PTR0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC0]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF0:.*]] = load ptr, ptr %[[PTR0]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF0]])
+// CHECK: %[[PTR1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC1]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF1:.*]] = load ptr, ptr %[[PTR1]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF1]])
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
new file mode 100644
index 00000000000000..6d65b25a7fbc01
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
@@ -0,0 +1,207 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var", pinned} : (i32) -> !llvm.ptr
+ %8 = llvm.mlir.constant(0 : i64) : i64
+ %10 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %11 = llvm.load %10 : !llvm.ptr -> !llvm.ptr
+ %12 = llvm.ptrtoint %11 : !llvm.ptr to i64
+ %13 = llvm.icmp "ne" %12, %8 : i64
+ llvm.cond_br %13, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ %14 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.ptrtoint %14 : !llvm.ptr to i64
+ %17 = llvm.call @malloc(%16) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb2: // pred: ^bb0
+ %39 = llvm.mlir.zero : !llvm.ptr
+ %44 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %59 = llvm.insertvalue %39, %44[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %59, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb3: // 2 preds: ^bb1, ^bb2
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ %10 = llvm.ptrtoint %9 : !llvm.ptr to i64
+ %11 = llvm.icmp "ne" %10, %6 : i64
+ llvm.cond_br %11, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ %15 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %17 = llvm.insertvalue %15, %16[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %17, %arg0 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb2
+^bb2: // 2 preds: ^bb0, ^bb1
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %1 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1 : i32) : i32
+ %3 = llvm.alloca %2 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ %5 = llvm.alloca %4 x f32 {bindc_name = "real_var"}...
[truncated]
|
@llvm/pr-subscribers-mlir-llvm Author: Pranav Bhandarkar (bhandarkar-pranav) ChangesThis PR adds support to translate the Patch is 34.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113208.diff 4 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 7c45e89cd8ac4b..2e013053a7d1c0 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -3316,7 +3316,51 @@ createDeviceArgumentAccessor(MapInfoData &mapData, llvm::Argument &arg,
return builder.saveIP();
}
+static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
+ Region &allocRegion = privatizer.getAllocRegion();
+ Value blockArg0 = allocRegion.getArgument(0);
+ return !blockArg0.use_empty();
+}
+
+// Return the llvm::Value * corresponding to the privateVar that
+// is being privatized. It isn't always as simple as looking up
+// moduleTranslation with privateVar. For instance, in case of
+// an allocatable, the descriptor for the allocatable is privatized.
+// This descriptor is mapped using an MapInfoOp. So, this function
+// will return a pointer to the llvm::Value corresponding to the
+// block argument for the mapped descriptor.
+static llvm::Value *
+findHostAssociatedValue(Value privateVar, omp::TargetOp targetOp,
+ llvm::DenseMap<Value, int> &mappedPrivateVars,
+ llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation) {
+ if (mappedPrivateVars.contains(privateVar)) {
+ int blockArgIndex = mappedPrivateVars[privateVar];
+ Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
+ mlir::Type privVarType = privateVar.getType();
+ mlir::Type blockArgType = blockArg.getType();
+ assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
+ "A block argument corresponding to a mapped var should have "
+ "!llvm.ptr type");
+
+ if (privVarType == blockArg.getType()) {
+ llvm::Value *v = moduleTranslation.lookupValue(blockArg);
+ return v;
+ }
+ if (!isa<LLVM::LLVMPointerType>(privVarType)) {
+ // This typically happens when the privatized type is lowered from
+ // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
+ // struct/pair is passed by value. But, mapped values are passed only as
+ // pointers, so before we privatize, we must load the pointer.
+ llvm::Value *load =
+ builder.CreateLoad(moduleTranslation.convertType(privVarType),
+ moduleTranslation.lookupValue(blockArg));
+ return load;
+ }
+ }
+ return moduleTranslation.lookupValue(privateVar);
+}
static LogicalResult
convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
LLVM::ModuleTranslation &moduleTranslation) {
@@ -3329,6 +3373,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
auto targetOp = cast<omp::TargetOp>(opInst);
auto &targetRegion = targetOp.getRegion();
+ // Holds the private vars that have been mapped along with
+ // the block argument that corresponds to the MapInfoOp
+ // corresponding to the private var in question.
+ // So, for instance
+ //
+ // %10 = omp.map.info var_ptr(%6#0 : !fir.ref<!fir.box<!fir.heap<i32>>>, ..)
+ // omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
+ //
+ // Then, %10 has been created so that the descriptor can be used by the
+ // privatizer
+ // @box.privatizer on the device side. Here we'd record {%6#0, 0} in the
+ // mappedPrivateVars map.
+ llvm::DenseMap<Value, int> mappedPrivateVars;
DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
SmallVector<Value> mapVars = targetOp.getMapVars();
ArrayRef<BlockArgument> mapBlockArgs =
@@ -3340,6 +3397,48 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bool isOffloadEntry =
isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
+ // For some private variables, the MapsForPrivatizedVariablesPass
+ // creates MapInfoOp instances. Go through the private variables and
+ // the mapped variables so that during codegeneration we are able
+ // to quickly look up the corresponding map variable, if any for each
+ // private variable.
+ if (!targetOp.getPrivateVars().empty() && !targetOp.getMapVars().empty()) {
+ auto argIface = llvm::cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
+ unsigned lastMapBlockArgsIdx =
+ argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs() - 1;
+ OperandRange privateVars = targetOp.getPrivateVars();
+ std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
+ auto reverseIt = mapVars.rbegin();
+ for (auto [privVar, privSym] :
+ llvm::reverse(llvm::zip_equal(privateVars, *privateSyms))) {
+ SymbolRefAttr privatizerName = llvm::cast<SymbolRefAttr>(privSym);
+ omp::PrivateClauseOp privatizer =
+ findPrivatizer(targetOp, privatizerName);
+ if (!privatizerNeedsMap(privatizer))
+ continue;
+
+ // The MapInfoOp defining the map var isn't really needed later.
+ // So, we don't store it in any datastructure. Instead, we just
+ // do some sanity checks on it right now.
+ omp::MapInfoOp mapInfoOp =
+ llvm::cast<omp::MapInfoOp>((*reverseIt).getDefiningOp());
+ Type varType = mapInfoOp.getVarType();
+
+ // Check #1: Check that the type of the private variable matches
+ // the type of the variable being mapped.
+ if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
+ assert(
+ varType == privVar.getType() &&
+ "Type of private var doesn't match the type of the mapped value");
+
+ // Ok, only 1 sanity check for now.
+ // Record the index of the block argument corresponding to this
+ // mapvar.
+ mappedPrivateVars.insert({privVar, lastMapBlockArgsIdx});
+ lastMapBlockArgsIdx--;
+ reverseIt++;
+ }
+ }
LogicalResult bodyGenStatus = success();
using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
auto bodyCB = [&](InsertPointTy allocaIP,
@@ -3367,9 +3466,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
moduleTranslation.lookupValue(mapInfoOp.getVarPtr());
moduleTranslation.mapValue(arg, mapOpValue);
}
-
// Do privatization after moduleTranslation has already recorded
// mapped values.
+ SmallVector<llvm::Value *> llvmPrivateVars;
+ SmallVector<Region *> privateCleanupRegions;
if (!targetOp.getPrivateVars().empty()) {
builder.restoreIP(allocaIP);
@@ -3384,16 +3484,18 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
SymbolRefAttr privSym = cast<SymbolRefAttr>(privatizerNameAttr);
omp::PrivateClauseOp privatizer = findPrivatizer(&opInst, privSym);
if (privatizer.getDataSharingType() ==
- omp::DataSharingClauseType::FirstPrivate ||
- !privatizer.getDeallocRegion().empty()) {
+ omp::DataSharingClauseType::FirstPrivate) {
opInst.emitError("Translation of omp.target from MLIR to LLVMIR "
- "failed because translation of firstprivate and "
- " private allocatables is not supported yet");
+ "failed because translation of firstprivate is not "
+ "supported yet");
bodyGenStatus = failure();
} else {
- moduleTranslation.mapValue(privatizer.getAllocMoldArg(),
- moduleTranslation.lookupValue(privVar));
Region &allocRegion = privatizer.getAllocRegion();
+ BlockArgument allocRegionArg = allocRegion.getArgument(0);
+ moduleTranslation.mapValue(
+ allocRegionArg,
+ findHostAssociatedValue(privVar, targetOp, mappedPrivateVars,
+ builder, moduleTranslation));
SmallVector<llvm::Value *, 1> yieldedValues;
if (failed(inlineConvertOmpRegions(
allocRegion, "omp.targetop.privatizer", builder,
@@ -3404,7 +3506,12 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bodyGenStatus = failure();
} else {
assert(yieldedValues.size() == 1);
- moduleTranslation.mapValue(privBlockArg, yieldedValues.front());
+ llvm::Value *llvmReplacementValue = yieldedValues.front();
+ moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
+ if (!privatizer.getDeallocRegion().empty()) {
+ llvmPrivateVars.push_back(llvmReplacementValue);
+ privateCleanupRegions.push_back(&privatizer.getDeallocRegion());
+ }
}
moduleTranslation.forgetMapping(allocRegion);
builder.restoreIP(builder.saveIP());
@@ -3414,6 +3521,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
llvm::BasicBlock *exitBlock = convertOmpOpRegions(
targetRegion, "omp.target", builder, moduleTranslation, bodyGenStatus);
builder.SetInsertPoint(exitBlock);
+ if (!llvmPrivateVars.empty()) {
+ assert(llvmPrivateVars.size() == privateCleanupRegions.size() &&
+ "Number of private variables needing cleanup not equal to number"
+ "of privatizers with dealloc regions");
+ if (failed(inlineOmpRegionCleanup(
+ privateCleanupRegions, llvmPrivateVars, moduleTranslation,
+ builder, "omp.targetop.private.cleanup",
+ /*shouldLoadCleanupRegionArg=*/false))) {
+ opInst.emitError("failed to inline `dealloc` region of `omp.private` "
+ "op in the target region");
+ bodyGenStatus = failure();
+ }
+ }
return builder.saveIP();
};
diff --git a/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
new file mode 100644
index 00000000000000..2491747152895c
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
@@ -0,0 +1,116 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer0 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+omp.private {type = private} @box.heap_privatizer1 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %6 = llvm.mlir.constant(1 : i64) : i64
+ %7 = llvm.alloca %6 x i32 {bindc_name = "mapped_var"} : (i64) -> !llvm.ptr
+ %13 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var0"} : (i64) -> !llvm.ptr
+ %14 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var1"} : (i64) -> !llvm.ptr
+ %53 = omp.map.info var_ptr(%7 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "mapped_var"}
+ %54 = omp.map.info var_ptr(%13 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ %55 = omp.map.info var_ptr(%14 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ omp.target map_entries(%53 -> %arg3, %54 -> %arg4, %55 ->%arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr) private(@box.heap_privatizer0 %13 -> %arg6, @box.heap_privatizer1 %14 -> %arg7 : !llvm.ptr, !llvm.ptr) {
+ %64 = llvm.mlir.constant(1 : i32) : i32
+ %65 = llvm.alloca %64 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %67 = llvm.alloca %64 x i32 : (i32) -> !llvm.ptr
+ %66 = llvm.mlir.constant(19 : i32) : i32
+ %68 = llvm.mlir.constant(18 : i32) : i32
+ %69 = llvm.mlir.constant(10 : i32) : i32
+ %70 = llvm.mlir.constant(5 : i32) : i32
+ llvm.store %70, %arg3 : i32, !llvm.ptr
+ llvm.store %69, %67 : i32, !llvm.ptr
+ %75 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %90 = llvm.insertvalue %67, %75[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %90, %65 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ %91 = llvm.mlir.zero : !llvm.ptr
+ %92 = llvm.call @_FortranAAssign(%arg6, %65, %91, %68) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ %93 = llvm.call @_FortranAAssign(%arg7, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ omp.terminator
+ }
+ llvm.return
+}
+
+
+llvm.func @_FortranAAssign(!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()> attributes {fir.runtime, sym_visibility = "private"}
+
+// The first set of checks ensure that we are calling the offloaded function
+// with the right arguments, especially the second argument which needs to
+// be a memory reference to the descriptor for the privatized allocatable
+// CHECK: define void @target_allocatable_
+// CHECK-NOT: define internal void
+// CHECK: %[[DESC_ALLOC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: %[[DESC_ALLOC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: call void @__omp_offloading_[[OFFLOADED_FUNCTION:.*]](ptr {{[^,]+}},
+// CHECK-SAME: ptr %[[DESC_ALLOC0]], ptr %[[DESC_ALLOC1]])
+
+// CHECK: define internal void @__omp_offloading_[[OFFLOADED_FUNCTION]]
+// CHECK-SAME: (ptr {{[^,]+}}, ptr %[[DESCRIPTOR_ARG0:[^,]+]],
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1:.*]]) {
+// CHECK: %[[I0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG0]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG0:.*]] = load i64, ptr %[[I0]]
+// CHECK: %[[PRIV_DESC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR0:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG0]])
+// CHECK: %[[TMP0:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR0]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP0]], ptr %[[PRIV_DESC0]]
+
+// CHECK: %[[I1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG1:.*]] = load i64, ptr %[[I1]]
+// CHECK: %[[PRIV_DESC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR1:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG1]])
+// CHECK: %[[TMP1:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR1]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP1]], ptr %[[PRIV_DESC1]]
+
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC0]]
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC1]]
+
+// CHECK: %[[PTR0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC0]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF0:.*]] = load ptr, ptr %[[PTR0]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF0]])
+// CHECK: %[[PTR1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC1]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF1:.*]] = load ptr, ptr %[[PTR1]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF1]])
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
new file mode 100644
index 00000000000000..6d65b25a7fbc01
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
@@ -0,0 +1,207 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var", pinned} : (i32) -> !llvm.ptr
+ %8 = llvm.mlir.constant(0 : i64) : i64
+ %10 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %11 = llvm.load %10 : !llvm.ptr -> !llvm.ptr
+ %12 = llvm.ptrtoint %11 : !llvm.ptr to i64
+ %13 = llvm.icmp "ne" %12, %8 : i64
+ llvm.cond_br %13, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ %14 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.ptrtoint %14 : !llvm.ptr to i64
+ %17 = llvm.call @malloc(%16) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb2: // pred: ^bb0
+ %39 = llvm.mlir.zero : !llvm.ptr
+ %44 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %59 = llvm.insertvalue %39, %44[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %59, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb3: // 2 preds: ^bb1, ^bb2
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ %10 = llvm.ptrtoint %9 : !llvm.ptr to i64
+ %11 = llvm.icmp "ne" %10, %6 : i64
+ llvm.cond_br %11, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ %15 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %17 = llvm.insertvalue %15, %16[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %17, %arg0 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb2
+^bb2: // 2 preds: ^bb0, ^bb1
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %1 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1 : i32) : i32
+ %3 = llvm.alloca %2 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ %5 = llvm.alloca %4 x f32 {bindc_name = "real_var"}...
[truncated]
|
@llvm/pr-subscribers-mlir-openmp Author: Pranav Bhandarkar (bhandarkar-pranav) ChangesThis PR adds support to translate the Patch is 34.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113208.diff 4 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 7c45e89cd8ac4b..2e013053a7d1c0 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -3316,7 +3316,51 @@ createDeviceArgumentAccessor(MapInfoData &mapData, llvm::Argument &arg,
return builder.saveIP();
}
+static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
+ Region &allocRegion = privatizer.getAllocRegion();
+ Value blockArg0 = allocRegion.getArgument(0);
+ return !blockArg0.use_empty();
+}
+
+// Return the llvm::Value * corresponding to the privateVar that
+// is being privatized. It isn't always as simple as looking up
+// moduleTranslation with privateVar. For instance, in case of
+// an allocatable, the descriptor for the allocatable is privatized.
+// This descriptor is mapped using an MapInfoOp. So, this function
+// will return a pointer to the llvm::Value corresponding to the
+// block argument for the mapped descriptor.
+static llvm::Value *
+findHostAssociatedValue(Value privateVar, omp::TargetOp targetOp,
+ llvm::DenseMap<Value, int> &mappedPrivateVars,
+ llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation) {
+ if (mappedPrivateVars.contains(privateVar)) {
+ int blockArgIndex = mappedPrivateVars[privateVar];
+ Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
+ mlir::Type privVarType = privateVar.getType();
+ mlir::Type blockArgType = blockArg.getType();
+ assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
+ "A block argument corresponding to a mapped var should have "
+ "!llvm.ptr type");
+
+ if (privVarType == blockArg.getType()) {
+ llvm::Value *v = moduleTranslation.lookupValue(blockArg);
+ return v;
+ }
+ if (!isa<LLVM::LLVMPointerType>(privVarType)) {
+ // This typically happens when the privatized type is lowered from
+ // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
+ // struct/pair is passed by value. But, mapped values are passed only as
+ // pointers, so before we privatize, we must load the pointer.
+ llvm::Value *load =
+ builder.CreateLoad(moduleTranslation.convertType(privVarType),
+ moduleTranslation.lookupValue(blockArg));
+ return load;
+ }
+ }
+ return moduleTranslation.lookupValue(privateVar);
+}
static LogicalResult
convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
LLVM::ModuleTranslation &moduleTranslation) {
@@ -3329,6 +3373,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
auto targetOp = cast<omp::TargetOp>(opInst);
auto &targetRegion = targetOp.getRegion();
+ // Holds the private vars that have been mapped along with
+ // the block argument that corresponds to the MapInfoOp
+ // corresponding to the private var in question.
+ // So, for instance
+ //
+ // %10 = omp.map.info var_ptr(%6#0 : !fir.ref<!fir.box<!fir.heap<i32>>>, ..)
+ // omp.target map_entries(%10 -> %arg0) private(@box.privatizer %6#0-> %arg1)
+ //
+ // Then, %10 has been created so that the descriptor can be used by the
+ // privatizer
+ // @box.privatizer on the device side. Here we'd record {%6#0, 0} in the
+ // mappedPrivateVars map.
+ llvm::DenseMap<Value, int> mappedPrivateVars;
DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
SmallVector<Value> mapVars = targetOp.getMapVars();
ArrayRef<BlockArgument> mapBlockArgs =
@@ -3340,6 +3397,48 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bool isOffloadEntry =
isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
+ // For some private variables, the MapsForPrivatizedVariablesPass
+ // creates MapInfoOp instances. Go through the private variables and
+ // the mapped variables so that during codegeneration we are able
+ // to quickly look up the corresponding map variable, if any for each
+ // private variable.
+ if (!targetOp.getPrivateVars().empty() && !targetOp.getMapVars().empty()) {
+ auto argIface = llvm::cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
+ unsigned lastMapBlockArgsIdx =
+ argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs() - 1;
+ OperandRange privateVars = targetOp.getPrivateVars();
+ std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
+ auto reverseIt = mapVars.rbegin();
+ for (auto [privVar, privSym] :
+ llvm::reverse(llvm::zip_equal(privateVars, *privateSyms))) {
+ SymbolRefAttr privatizerName = llvm::cast<SymbolRefAttr>(privSym);
+ omp::PrivateClauseOp privatizer =
+ findPrivatizer(targetOp, privatizerName);
+ if (!privatizerNeedsMap(privatizer))
+ continue;
+
+ // The MapInfoOp defining the map var isn't really needed later.
+ // So, we don't store it in any datastructure. Instead, we just
+ // do some sanity checks on it right now.
+ omp::MapInfoOp mapInfoOp =
+ llvm::cast<omp::MapInfoOp>((*reverseIt).getDefiningOp());
+ Type varType = mapInfoOp.getVarType();
+
+ // Check #1: Check that the type of the private variable matches
+ // the type of the variable being mapped.
+ if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
+ assert(
+ varType == privVar.getType() &&
+ "Type of private var doesn't match the type of the mapped value");
+
+ // Ok, only 1 sanity check for now.
+ // Record the index of the block argument corresponding to this
+ // mapvar.
+ mappedPrivateVars.insert({privVar, lastMapBlockArgsIdx});
+ lastMapBlockArgsIdx--;
+ reverseIt++;
+ }
+ }
LogicalResult bodyGenStatus = success();
using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
auto bodyCB = [&](InsertPointTy allocaIP,
@@ -3367,9 +3466,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
moduleTranslation.lookupValue(mapInfoOp.getVarPtr());
moduleTranslation.mapValue(arg, mapOpValue);
}
-
// Do privatization after moduleTranslation has already recorded
// mapped values.
+ SmallVector<llvm::Value *> llvmPrivateVars;
+ SmallVector<Region *> privateCleanupRegions;
if (!targetOp.getPrivateVars().empty()) {
builder.restoreIP(allocaIP);
@@ -3384,16 +3484,18 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
SymbolRefAttr privSym = cast<SymbolRefAttr>(privatizerNameAttr);
omp::PrivateClauseOp privatizer = findPrivatizer(&opInst, privSym);
if (privatizer.getDataSharingType() ==
- omp::DataSharingClauseType::FirstPrivate ||
- !privatizer.getDeallocRegion().empty()) {
+ omp::DataSharingClauseType::FirstPrivate) {
opInst.emitError("Translation of omp.target from MLIR to LLVMIR "
- "failed because translation of firstprivate and "
- " private allocatables is not supported yet");
+ "failed because translation of firstprivate is not "
+ "supported yet");
bodyGenStatus = failure();
} else {
- moduleTranslation.mapValue(privatizer.getAllocMoldArg(),
- moduleTranslation.lookupValue(privVar));
Region &allocRegion = privatizer.getAllocRegion();
+ BlockArgument allocRegionArg = allocRegion.getArgument(0);
+ moduleTranslation.mapValue(
+ allocRegionArg,
+ findHostAssociatedValue(privVar, targetOp, mappedPrivateVars,
+ builder, moduleTranslation));
SmallVector<llvm::Value *, 1> yieldedValues;
if (failed(inlineConvertOmpRegions(
allocRegion, "omp.targetop.privatizer", builder,
@@ -3404,7 +3506,12 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
bodyGenStatus = failure();
} else {
assert(yieldedValues.size() == 1);
- moduleTranslation.mapValue(privBlockArg, yieldedValues.front());
+ llvm::Value *llvmReplacementValue = yieldedValues.front();
+ moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
+ if (!privatizer.getDeallocRegion().empty()) {
+ llvmPrivateVars.push_back(llvmReplacementValue);
+ privateCleanupRegions.push_back(&privatizer.getDeallocRegion());
+ }
}
moduleTranslation.forgetMapping(allocRegion);
builder.restoreIP(builder.saveIP());
@@ -3414,6 +3521,19 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
llvm::BasicBlock *exitBlock = convertOmpOpRegions(
targetRegion, "omp.target", builder, moduleTranslation, bodyGenStatus);
builder.SetInsertPoint(exitBlock);
+ if (!llvmPrivateVars.empty()) {
+ assert(llvmPrivateVars.size() == privateCleanupRegions.size() &&
+ "Number of private variables needing cleanup not equal to number"
+ "of privatizers with dealloc regions");
+ if (failed(inlineOmpRegionCleanup(
+ privateCleanupRegions, llvmPrivateVars, moduleTranslation,
+ builder, "omp.targetop.private.cleanup",
+ /*shouldLoadCleanupRegionArg=*/false))) {
+ opInst.emitError("failed to inline `dealloc` region of `omp.private` "
+ "op in the target region");
+ bodyGenStatus = failure();
+ }
+ }
return builder.saveIP();
};
diff --git a/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
new file mode 100644
index 00000000000000..2491747152895c
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-multiple-private.mlir
@@ -0,0 +1,116 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer0 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+omp.private {type = private} @box.heap_privatizer1 : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %10 = llvm.getelementptr %arg0[0, 1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %1 = llvm.load %10 : !llvm.ptr -> i64
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> : (i32) -> !llvm.ptr
+ %17 = llvm.call @malloc(%1) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %6 = llvm.mlir.constant(1 : i64) : i64
+ %7 = llvm.alloca %6 x i32 {bindc_name = "mapped_var"} : (i64) -> !llvm.ptr
+ %13 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var0"} : (i64) -> !llvm.ptr
+ %14 = llvm.alloca %6 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var1"} : (i64) -> !llvm.ptr
+ %53 = omp.map.info var_ptr(%7 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "mapped_var"}
+ %54 = omp.map.info var_ptr(%13 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ %55 = omp.map.info var_ptr(%14 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(to) capture(ByRef) -> !llvm.ptr
+ omp.target map_entries(%53 -> %arg3, %54 -> %arg4, %55 ->%arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr) private(@box.heap_privatizer0 %13 -> %arg6, @box.heap_privatizer1 %14 -> %arg7 : !llvm.ptr, !llvm.ptr) {
+ %64 = llvm.mlir.constant(1 : i32) : i32
+ %65 = llvm.alloca %64 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %67 = llvm.alloca %64 x i32 : (i32) -> !llvm.ptr
+ %66 = llvm.mlir.constant(19 : i32) : i32
+ %68 = llvm.mlir.constant(18 : i32) : i32
+ %69 = llvm.mlir.constant(10 : i32) : i32
+ %70 = llvm.mlir.constant(5 : i32) : i32
+ llvm.store %70, %arg3 : i32, !llvm.ptr
+ llvm.store %69, %67 : i32, !llvm.ptr
+ %75 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %90 = llvm.insertvalue %67, %75[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %90, %65 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ %91 = llvm.mlir.zero : !llvm.ptr
+ %92 = llvm.call @_FortranAAssign(%arg6, %65, %91, %68) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ %93 = llvm.call @_FortranAAssign(%arg7, %65, %91, %66) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
+ omp.terminator
+ }
+ llvm.return
+}
+
+
+llvm.func @_FortranAAssign(!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()> attributes {fir.runtime, sym_visibility = "private"}
+
+// The first set of checks ensure that we are calling the offloaded function
+// with the right arguments, especially the second argument which needs to
+// be a memory reference to the descriptor for the privatized allocatable
+// CHECK: define void @target_allocatable_
+// CHECK-NOT: define internal void
+// CHECK: %[[DESC_ALLOC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: %[[DESC_ALLOC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1
+// CHECK: call void @__omp_offloading_[[OFFLOADED_FUNCTION:.*]](ptr {{[^,]+}},
+// CHECK-SAME: ptr %[[DESC_ALLOC0]], ptr %[[DESC_ALLOC1]])
+
+// CHECK: define internal void @__omp_offloading_[[OFFLOADED_FUNCTION]]
+// CHECK-SAME: (ptr {{[^,]+}}, ptr %[[DESCRIPTOR_ARG0:[^,]+]],
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1:.*]]) {
+// CHECK: %[[I0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG0]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG0:.*]] = load i64, ptr %[[I0]]
+// CHECK: %[[PRIV_DESC0:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR0:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG0]])
+// CHECK: %[[TMP0:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR0]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP0]], ptr %[[PRIV_DESC0]]
+
+// CHECK: %[[I1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[DESCRIPTOR_ARG1]], i32 0, i32 1
+// CHECK: %[[MALLOC_ARG1:.*]] = load i64, ptr %[[I1]]
+// CHECK: %[[PRIV_DESC1:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK: %[[HEAP_PTR1:.*]] = call ptr @malloc(i64 %[[MALLOC_ARG1]])
+// CHECK: %[[TMP1:.*]] = insertvalue { ptr, i64, i32, i8, i8, i8, i8 }
+// CHECK-SAME: undef, ptr %[[HEAP_PTR1]], 0
+// CHECK: store { ptr, i64, i32, i8, i8, i8, i8 } %[[TMP1]], ptr %[[PRIV_DESC1]]
+
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC0]]
+// CHECK: call {} @_FortranAAssign(ptr %[[PRIV_DESC1]]
+
+// CHECK: %[[PTR0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC0]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF0:.*]] = load ptr, ptr %[[PTR0]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF0]])
+// CHECK: %[[PTR1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 },
+// CHECK-SAME: ptr %[[PRIV_DESC1]], i32 0, i32 0
+// CHECK: %[[HEAP_MEMREF1:.*]] = load ptr, ptr %[[PTR1]]
+// CHECK: call void @free(ptr %[[HEAP_MEMREF1]])
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
new file mode 100644
index 00000000000000..6d65b25a7fbc01
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-private-allocatable.mlir
@@ -0,0 +1,207 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @free(!llvm.ptr)
+llvm.func @malloc(i64) -> !llvm.ptr
+omp.private {type = private} @box.heap_privatizer : !llvm.ptr alloc {
+^bb0(%arg0: !llvm.ptr):
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %7 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "alloc_var", pinned} : (i32) -> !llvm.ptr
+ %8 = llvm.mlir.constant(0 : i64) : i64
+ %10 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %11 = llvm.load %10 : !llvm.ptr -> !llvm.ptr
+ %12 = llvm.ptrtoint %11 : !llvm.ptr to i64
+ %13 = llvm.icmp "ne" %12, %8 : i64
+ llvm.cond_br %13, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ %14 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.ptrtoint %14 : !llvm.ptr to i64
+ %17 = llvm.call @malloc(%16) {fir.must_be_heap = true, in_type = i32} : (i64) -> !llvm.ptr
+ %22 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %37 = llvm.insertvalue %17, %22[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %37, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb2: // pred: ^bb0
+ %39 = llvm.mlir.zero : !llvm.ptr
+ %44 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %59 = llvm.insertvalue %39, %44[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %59, %7 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb3
+^bb3: // 2 preds: ^bb1, ^bb2
+ omp.yield(%7 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ %6 = llvm.mlir.constant(0 : i64) : i64
+ %8 = llvm.getelementptr %arg0[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %9 = llvm.load %8 : !llvm.ptr -> !llvm.ptr
+ %10 = llvm.ptrtoint %9 : !llvm.ptr to i64
+ %11 = llvm.icmp "ne" %10, %6 : i64
+ llvm.cond_br %11, ^bb1, ^bb2
+^bb1: // pred: ^bb0
+ llvm.call @free(%9) : (!llvm.ptr) -> ()
+ %15 = llvm.mlir.zero : !llvm.ptr
+ %16 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ %17 = llvm.insertvalue %15, %16[0] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
+ llvm.store %17, %arg0 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>, !llvm.ptr
+ llvm.br ^bb2
+^bb2: // 2 preds: ^bb0, ^bb1
+ omp.yield
+}
+llvm.func @target_allocatable_(%arg0: !llvm.ptr {fir.bindc_name = "lb"}, %arg1: !llvm.ptr {fir.bindc_name = "ub"}, %arg2: !llvm.ptr {fir.bindc_name = "l"}) attributes {fir.internal_name = "_QPtarget_allocatable"} {
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ %1 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1 : i32) : i32
+ %3 = llvm.alloca %2 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ %5 = llvm.alloca %4 x f32 {bindc_name = "real_var"}...
[truncated]
Dropped a few drive-by comments.
Regarding the testing: Generating inputs with flang or clang is undesired, as this results in an unfocused test that is hard to maintain. These tests should contain as little unrelated stuff as possible.
Please do not generate test inputs from actual input programs. These tests are bloated with far too much unrelated content, which makes them a maintenance burden whenever anything unrelated to this change is modified.
Some drive-by comments. It would be good if somebody more familiar with target stuff and mapping could take a look.
// Check #1: Check that the type of the private variable matches
// the type of the variable being mapped.
if (!isa<LLVM::LLVMPointerType>(privVar.getType()))
Why don't we assert the types match when the private variable is a pointer?
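For illustration, something along these lines (a sketch only, reusing privVar and blockArg from the snippet above; this is an assumption about how it could be written, not code from the patch):

// Illustrative sketch: when the privatized value is already a pointer,
// assert that it has the same type as the mapped block argument instead
// of silently accepting a mismatch, then return the mapped value.
if (isa<LLVM::LLVMPointerType>(privVar.getType())) {
  assert(privVar.getType() == blockArg.getType() &&
         "privatized pointer must have the same type as its mapped "
         "block argument");
  return moduleTranslation.lookupValue(blockArg);
}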
llvm::Value *llvmReplacementValue = yieldedValues.front();
moduleTranslation.mapValue(privBlockArg, llvmReplacementValue);
if (!privatizer.getDeallocRegion().empty()) {
  llvmPrivateVars.push_back(llvmReplacementValue);
nit: maybe this should be called something like llvmPrivateVarsNeedingCleanup so it isn't confused with the similarly named vectors in other directives that do contain all of the private variables.
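A sketch of the renaming being suggested, purely illustrative:

// Hypothetical renaming: the vector only holds values whose privatizer has
// a dealloc region, i.e. values that will need cleanup later.
SmallVector<llvm::Value *> llvmPrivateVarsNeedingCleanup;
if (!privatizer.getDeallocRegion().empty())
  llvmPrivateVarsNeedingCleanup.push_back(llvmReplacementValue);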
Thanks Pranav, one question about the approach and a few small nits.
if (mappedPrivateVars.contains(privateVar)) {
  int blockArgIndex = mappedPrivateVars[privateVar];
  Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
  mlir::Type privVarType = privateVar.getType();
  mlir::Type blockArgType = blockArg.getType();
  assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
         "A block argument corresponding to a mapped var should have "
         "!llvm.ptr type");

  if (privVarType == blockArg.getType()) {
    llvm::Value *v = moduleTranslation.lookupValue(blockArg);
    return v;
  }

  if (!isa<LLVM::LLVMPointerType>(privVarType)) {
    // This typically happens when the privatized type is lowered from
    // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
    // struct/pair is passed by value. But, mapped values are passed only as
    // pointers, so before we privatize, we must load the pointer.
    llvm::Value *load =
        builder.CreateLoad(moduleTranslation.convertType(privVarType),
                           moduleTranslation.lookupValue(blockArg));
    return load;
  }
}
return moduleTranslation.lookupValue(privateVar);
nit: Early return, just to follow the code style.
Suggested change:

if (!mappedPrivateVars.contains(privateVar))
  return moduleTranslation.lookupValue(privateVar);

int blockArgIndex = mappedPrivateVars[privateVar];
Value blockArg = targetOp.getRegion().getArgument(blockArgIndex);
mlir::Type privVarType = privateVar.getType();
mlir::Type blockArgType = blockArg.getType();
assert(isa<LLVM::LLVMPointerType>(blockArgType) &&
       "A block argument corresponding to a mapped var should have "
       "!llvm.ptr type");
if (privVarType == blockArg.getType()) {
  llvm::Value *v = moduleTranslation.lookupValue(blockArg);
  return v;
}
if (!isa<LLVM::LLVMPointerType>(privVarType)) {
  // This typically happens when the privatized type is lowered from
  // boxchar<KIND> and gets lowered to !llvm.struct<(ptr, i64)>. That is the
  // struct/pair is passed by value. But, mapped values are passed only as
  // pointers, so before we privatize, we must load the pointer.
  llvm::Value *load =
      builder.CreateLoad(moduleTranslation.convertType(privVarType),
                         moduleTranslation.lookupValue(blockArg));
  return load;
}
OperandRange privateVars = targetOp.getPrivateVars();
std::optional<ArrayAttr> privateSyms = targetOp.getPrivateSyms();
auto reverseIt = mapVars.rbegin();
for (auto [privVar, privSym] :
I know the following point should probably have been raised in an earlier PR, but I think this algorithm is a bit brittle since it always assumes the last mappings are the ones corresponding to the private vars. Can we instead record that info at the private op level? So each private op holds an optional MLIR value that corresponds to its map if needed.
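For illustration, one way that could look (the private_maps attribute name and its encoding here are assumptions made for the sketch, not something the PR currently defines):

// Sketch: build an explicit private-var -> mapped-operand-index table from
// information recorded on the op, instead of assuming the trailing map
// entries line up with the private vars.
llvm::DenseMap<Value, int64_t> mappedPrivateVars;
if (auto privateMapIndices =
        targetOp->getAttrOfType<DenseI64ArrayAttr>("private_maps")) {
  for (auto [privVar, mapIdx] : llvm::zip_equal(
           targetOp.getPrivateVars(), privateMapIndices.asArrayRef()))
    if (mapIdx >= 0) // A negative index could mean "no map for this var".
      mappedPrivateVars.try_emplace(privVar, mapIdx);
}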
Thanks, I agonized over this scheme as I wasn't too happy with it either. I like your suggestion and I'll work on it.
/// privatizer is a PrivateClauseOp that privatizes an MLIR value.
/// privatizerNeedsMap returns true if the value being privatized in an
/// omp.target p should additionally be mapped to the target region
/// using a MapInfoOp. This is most common when an allocatable is privatized.
/// In such cases, the descriptor is use in privatization and needs to be
/// mapped on to the device.
// static bool privatizerNeedsMap(omp::PrivateClauseOp &privatizer) {
//   Region &allocRegion = privatizer.getAllocRegion();
//   Value blockArg0 = allocRegion.getArgument(0);
//   return !blockArg0.use_empty();
// }
nit
Yikes, I'll take care of this as part of my next update.
Hey all, I was out for a few days but am back in the office now until the 11th of Nov. I hope to address all comments on this PR before I leave for an extended break.
This PR will be replaced by #116576 since Pranav is out of office for some time, so I am taking this over. I will handle the current open comments on the new PR.