-
Notifications
You must be signed in to change notification settings - Fork 12.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flang][OpenMP] Support lowering of simd reductions #112194
Conversation
This patch enables lowering to MLIR of the reduction clause of `simd` constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at that stage it will result in errors being emitted rather than silently ignoring it as it is currently done. On composite `do simd` constructs, this lowering error will remain untriggered, as the `omp.simd` operation in that case is currently ignored. The MLIR representation, however, will now contain `reduction` information.
@llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-flang-openmp Author: Sergio Afonso (skatrak) ChangesThis patch enables lowering to MLIR of the reduction clause of On composite Full diff: https://github.com/llvm/llvm-project/pull/112194.diff 3 Files Affected:
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index a89029b720e788..83bd18bc9fbe31 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2073,7 +2073,9 @@ static void genStandaloneSimd(lower::AbstractConverter &converter,
loopNestClauseOps, iv);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
@@ -2231,7 +2233,9 @@ static void genCompositeDistributeParallelDoSimd(
wsloopOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
@@ -2288,7 +2292,9 @@ static void genCompositeDistributeSimd(lower::AbstractConverter &converter,
distributeOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
@@ -2345,7 +2351,9 @@ static void genCompositeDoSimd(lower::AbstractConverter &converter,
wsloopOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
diff --git a/flang/test/Lower/OpenMP/simd.f90 b/flang/test/Lower/OpenMP/simd.f90
index 3b2aeceb4c3f9f..36b26cffaa765f 100644
--- a/flang/test/Lower/OpenMP/simd.f90
+++ b/flang/test/Lower/OpenMP/simd.f90
@@ -274,3 +274,25 @@ subroutine lastprivate_with_simd
sum = i + 1
end do
end subroutine
+
+!CHECK-LABEL: func @_QPsimd_with_reduction_clause()
+subroutine simd_with_reduction_clause
+ integer :: i, x
+ x = 0
+ ! CHECK: %[[LB:.*]] = arith.constant 1 : i32
+ ! CHECK-NEXT: %[[UB:.*]] = arith.constant 9 : i32
+ ! CHECK-NEXT: %[[STEP:.*]] = arith.constant 1 : i32
+ ! CHECK-NEXT: omp.simd reduction(@{{.*}} %[[X:.*]]#0 -> %[[X_RED:.*]] : !fir.ref<i32>) {
+ ! CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[LB]]) to (%[[UB]]) inclusive step (%[[STEP]]) {
+ !$omp simd reduction(+:x)
+ do i=1, 9
+ ! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_RED]] {uniq_name = "_QFsimd_with_reduction_clauseEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+ ! CHECK: fir.store %[[I]] to %[[LOCAL:.*]]#1 : !fir.ref<i32>
+ ! CHECK: %[[X_LD:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref<i32>
+ ! CHECK: %[[I_LD:.*]] = fir.load %[[LOCAL]]#0 : !fir.ref<i32>
+ ! CHECK: %[[SUM:.*]] = arith.addi %[[X_LD]], %[[I_LD]] : i32
+ ! CHECK: hlfir.assign %[[SUM]] to %[[X_DECL]]#0 : i32, !fir.ref<i32>
+ x = x+i
+ end do
+ !$OMP end simd
+end subroutine
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index f552221bbdcaf9..fb11cd5f258285 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2016,14 +2016,16 @@ void SimdOp::build(OpBuilder &builder, OperationState &state,
const SimdOperands &clauses) {
MLIRContext *ctx = builder.getContext();
// TODO Store clauses in op: linearVars, linearStepVars, privateVars,
- // privateSyms, reductionVars, reductionByref, reductionSyms.
+ // privateSyms.
SimdOp::build(builder, state, clauses.alignedVars,
makeArrayAttr(ctx, clauses.alignments), clauses.ifExpr,
/*linear_vars=*/{}, /*linear_step_vars=*/{},
clauses.nontemporalVars, clauses.order, clauses.orderMod,
/*private_vars=*/{}, /*private_syms=*/nullptr,
- /*reduction_vars=*/{}, /*reduction_byref=*/nullptr,
- /*reduction_syms=*/nullptr, clauses.safelen, clauses.simdlen);
+ clauses.reductionVars,
+ makeDenseBoolArrayAttr(ctx, clauses.reductionByref),
+ makeArrayAttr(ctx, clauses.reductionSyms), clauses.safelen,
+ clauses.simdlen);
}
LogicalResult SimdOp::verify() {
|
@llvm/pr-subscribers-mlir-openmp Author: Sergio Afonso (skatrak) ChangesThis patch enables lowering to MLIR of the reduction clause of On composite Full diff: https://github.com/llvm/llvm-project/pull/112194.diff 3 Files Affected:
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index a89029b720e788..83bd18bc9fbe31 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2073,7 +2073,9 @@ static void genStandaloneSimd(lower::AbstractConverter &converter,
loopNestClauseOps, iv);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
@@ -2231,7 +2233,9 @@ static void genCompositeDistributeParallelDoSimd(
wsloopOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
@@ -2288,7 +2292,9 @@ static void genCompositeDistributeSimd(lower::AbstractConverter &converter,
distributeOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
@@ -2345,7 +2351,9 @@ static void genCompositeDoSimd(lower::AbstractConverter &converter,
wsloopOp.setComposite(/*val=*/true);
EntryBlockArgs simdArgs;
- // TODO: Add private, reduction syms and vars.
+ // TODO: Add private syms and vars.
+ simdArgs.reduction.syms = simdReductionSyms;
+ simdArgs.reduction.vars = simdClauseOps.reductionVars;
auto simdOp =
genWrapperOp<mlir::omp::SimdOp>(converter, loc, simdClauseOps, simdArgs);
simdOp.setComposite(/*val=*/true);
diff --git a/flang/test/Lower/OpenMP/simd.f90 b/flang/test/Lower/OpenMP/simd.f90
index 3b2aeceb4c3f9f..36b26cffaa765f 100644
--- a/flang/test/Lower/OpenMP/simd.f90
+++ b/flang/test/Lower/OpenMP/simd.f90
@@ -274,3 +274,25 @@ subroutine lastprivate_with_simd
sum = i + 1
end do
end subroutine
+
+!CHECK-LABEL: func @_QPsimd_with_reduction_clause()
+subroutine simd_with_reduction_clause
+ integer :: i, x
+ x = 0
+ ! CHECK: %[[LB:.*]] = arith.constant 1 : i32
+ ! CHECK-NEXT: %[[UB:.*]] = arith.constant 9 : i32
+ ! CHECK-NEXT: %[[STEP:.*]] = arith.constant 1 : i32
+ ! CHECK-NEXT: omp.simd reduction(@{{.*}} %[[X:.*]]#0 -> %[[X_RED:.*]] : !fir.ref<i32>) {
+ ! CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[LB]]) to (%[[UB]]) inclusive step (%[[STEP]]) {
+ !$omp simd reduction(+:x)
+ do i=1, 9
+ ! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_RED]] {uniq_name = "_QFsimd_with_reduction_clauseEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+ ! CHECK: fir.store %[[I]] to %[[LOCAL:.*]]#1 : !fir.ref<i32>
+ ! CHECK: %[[X_LD:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref<i32>
+ ! CHECK: %[[I_LD:.*]] = fir.load %[[LOCAL]]#0 : !fir.ref<i32>
+ ! CHECK: %[[SUM:.*]] = arith.addi %[[X_LD]], %[[I_LD]] : i32
+ ! CHECK: hlfir.assign %[[SUM]] to %[[X_DECL]]#0 : i32, !fir.ref<i32>
+ x = x+i
+ end do
+ !$OMP end simd
+end subroutine
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index f552221bbdcaf9..fb11cd5f258285 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2016,14 +2016,16 @@ void SimdOp::build(OpBuilder &builder, OperationState &state,
const SimdOperands &clauses) {
MLIRContext *ctx = builder.getContext();
// TODO Store clauses in op: linearVars, linearStepVars, privateVars,
- // privateSyms, reductionVars, reductionByref, reductionSyms.
+ // privateSyms.
SimdOp::build(builder, state, clauses.alignedVars,
makeArrayAttr(ctx, clauses.alignments), clauses.ifExpr,
/*linear_vars=*/{}, /*linear_step_vars=*/{},
clauses.nontemporalVars, clauses.order, clauses.orderMod,
/*private_vars=*/{}, /*private_syms=*/nullptr,
- /*reduction_vars=*/{}, /*reduction_byref=*/nullptr,
- /*reduction_syms=*/nullptr, clauses.safelen, clauses.simdlen);
+ clauses.reductionVars,
+ makeDenseBoolArrayAttr(ctx, clauses.reductionByref),
+ makeArrayAttr(ctx, clauses.reductionSyms), clauses.safelen,
+ clauses.simdlen);
}
LogicalResult SimdOp::verify() {
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look okay, but please either add a proper TODO for the LLVM lowering or don't merge this until after the LLVM lowering is merged.
Thanks for the review! There was already a check there, but I updated it to trigger different errors based on the unsupported clause found. Hopefully this addresses your concerns. |
!$omp simd reduction(+:x) | ||
do i=1, 9 | ||
! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_RED]] {uniq_name = "_QFsimd_with_reduction_clauseEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) | ||
! CHECK: fir.store %[[I]] to %[[LOCAL:.*]]#1 : !fir.ref<i32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not this PR, but reading the full IR locally it looks like the loop iteration variable was not privatized.
As I understand it, OpenMP 5.2 section 5.1.1 line 16 seems to say that it should be implicitly lastprivate.
Did I miss something here? If not I'll create an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I think the previous statement in the spec is the one that applies, since it has a single associated loop. So I guess it should actually be linear with a linear-step equal to the increment of the loop. But I believe you're right, since neither that nor lastprivatization in the case of multiple associated loops seem to be currently implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. That isn't the normal "not yet implemented" message generated by the TODO macro but I think the error message now looks clean and easy enough to understand:
error: loc(...): reduction clause not yet supported
error: loc(...): LLVM Translation failed for operation: omp.simd
error: failed to create the LLVM module
Yes, we're triggering different errors when something is not yet supported in Flang lowering vs. in omp dialect translation to LLVM IR. Should we create an equivalent TODO macro for use there too? |
Does the Driver report this as a crash?
Why don't we issue the TODO in flang lowering? This is best and uniform from the Flang point of view. We can remove the TODO when translation support is added. |
The driver reports an error exit value but does not print a stack trace. |
If it is not a crash it is probably OK. If you do not want to change for already implemented lowering, it is fine. But going forward we should try to error out in Flang vs translation. |
What is the reason for this preference? In my opinion, it has the disadvantage that we wouldn't be able to independently work on Flang lowering before the translation to LLVM IR for that operation/clause is implemented. I understand that we don't want to silently ignore clauses by lowering them to MLIR and then not taking them into account in the LLVM IR translation pass. But if we emit an error and return early when we find these operations/clauses there, the result from a user perspective would be the same. At the moment the error message is different, but if that's important it wouldn't take much to add a similar TODO macro to the MLIR to LLVM IR translation as well. Either way we should trigger errors there because if another dialect or frontend decides to introduce operations/clauses for which we have no translation to LLVM IR implemented, we should emit errors instead of crashing/ignoring them. |
This is to present a uniform way in the flang compiler to message to the user that a feature is not implemented. As you know the user does not care about which part of the compiler does not support the feature. Lowering is currently the place where this is implemented for all features including OpenMP features and we have full control over this.
Yes, we should provide appropriate errors in translation if a feature/clause/directive/option is not supported. Adding a similar TODO macro to the MLIR to LLVM IR translation would also work. But I am not sure whether it will be accepted since it is not the way things are done in MLIR. But since at the moment it is only Flang work that uses this it might be OK. Would it be possible to catch this in the call to the translation and issue a TODO in Flang codegen or Driver? We can see the worst-case issue in this distribute construct translation issue where it presents as a crash. |
I'll try to explain a bit better my concern with triggering TODOs in Flang lowering for everything that's not fully supported. Let's say we define the MLIR representation for a given clause. At that point we should be unblocking two lines of work: translation of that clause to LLVM IR, as part of operations that accept it, and Flang lowering to MLIR of that clause and attaching it to the applicable operations. These are two independent features that both need to come together to actually support the new clause by Flang. However, if we must raise TODO errors in Flang lowering for that clause until it is implemented in both places, then we're artificially creating a dependency. We aren't able to implement and test the lowering to MLIR in Flang because we can't remove the TODO until the other part is also implemented. My thinking is that, by triggering the errors at the point we're actually missing an implementation, we're able to work on both parts independently, since the "contract" between them is the definition of the MLIR operation / clause. From a user perspective, from where we trigger errors doesn't matter, as long as error messages convey the same information.
Looking a bit into this crash, I think the problem here is not only that we introduced an Adding the macro would be mostly to ensure every TODO message is formatted the same way, possibly like Flang lowering, but it's not really necessary if printing the compiler source location of where the error was triggered or defining it as a macro goes against conventions in MLIR.
I can't think of a way to do that. We can detect whether translation to LLVM IR failed and then raise an error, but we wouldn't know if it was a TODO that we hit or for what operation it was. I think the functionally equivalent version to that would be to fix the current error handling issues in the translation stage and update error messages to state that it's not yet implemented. One thing I'd like to propose is to merge this PR so that there is an error triggered for 'simd' but not for 'do simd' reductions (even if it's not a Flang lowering TODO), and work on improving the state of error propagation in the MLIR to LLVM IR translation stage to prevent compiler crashes. This has to be done either way eventually, but it also perhaps helps with concerns of letting Flang lowering produce MLIR that is potentially not yet translatable to LLVM IR. |
+1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'm okay as well with merging this PR.
This PR triggered a Fujistu testsuite regression: LLVM-1403. I've looked at it and the problem is related to the fact that we currently ignore the inner loop wrapper in 'do simd' during MLIR to LLVM IR translation. I'll create another PR to address this shortly. |
This patch enables lowering to MLIR of the reduction clause of `simd` constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at that stage it will result in errors being emitted rather than silently ignoring it as it is currently done. On composite `do simd` constructs, this lowering error will remain untriggered, as the `omp.simd` operation in that case is currently ignored. The MLIR representation, however, will now contain `reduction` information.
This patch enables lowering to MLIR of the reduction clause of `simd` constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at that stage it will result in errors being emitted rather than silently ignoring it as it is currently done. On composite `do simd` constructs, this lowering error will remain untriggered, as the `omp.simd` operation in that case is currently ignored. The MLIR representation, however, will now contain `reduction` information.
This patch enables lowering to MLIR of the reduction clause of
simd
constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at that stage it will result in errors being emitted rather than silently ignoring it as it is currently done.On composite
do simd
constructs, this lowering error will remain untriggered, as theomp.simd
operation in that case is currently ignored. The MLIR representation, however, will now containreduction
information.