[flang] Enable alias tags pass by default #73111

Merged (8 commits) on Nov 27, 2023

Conversation

Contributor

@tblah tblah commented Nov 22, 2023

Enable by default for optimization levels higher than 0 (same behavior as clang).

For simplicity, only forward the flag to the frontend driver when it contradicts what is implied by the optimization level.

Since #72903 there are now no known performance regressions.

Original PR was #68597
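
As a rough illustration of the forwarding rule described above, here is a minimal sketch. It is not the code in this patch: `aliasFlagToForward` is a hypothetical helper, `optimizing` is assumed to mean "an optimization level above 0 was requested", and `userChoice` is assumed to be empty when neither `-falias-analysis` nor `-fno-alias-analysis` was given.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not part of clang or flang): decides which flag, if
// any, the driver should forward to the frontend driver.
//   optimizing : assumed true when an optimization level above 0 is requested
//   userChoice : true for -falias-analysis, false for -fno-alias-analysis,
//                empty when the user gave neither flag
std::optional<std::string> aliasFlagToForward(bool optimizing,
                                              std::optional<bool> userChoice) {
  // No explicit request: the frontend applies its own default.
  if (!userChoice)
    return std::nullopt;
  // The request matches what the optimization level already implies, so
  // forwarding it would be redundant.
  if (*userChoice == optimizing)
    return std::nullopt;
  // The request contradicts the implied default: forward it.
  return std::string(*userChoice ? "-falias-analysis" : "-fno-alias-analysis");
}
```

Under these assumptions, `aliasFlagToForward(true, false)` yields `-fno-alias-analysis`, while `aliasFlagToForward(true, true)` forwards nothing.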

Enable by default when optimizing for speed.

For simplicity, only forward the flag to the frontend driver when it
contradicts what is implied by the optimization level.
@llvmbot llvmbot added the clang, clang:driver, flang:driver, and flang labels Nov 22, 2023
Member

llvmbot commented Nov 22, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-flang-driver

Author: Tom Eccles (tblah)

Changes

Enable by default when optimizing for speed.

For simplicity, only forward the flag to the frontend driver when it contradicts what is implied by the optimization level.

Since #72903 there are now no known performance regressions.

Original PR was #68597


Full diff: https://github.com/llvm/llvm-project/pull/73111.diff

8 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+20)
  • (modified) flang/include/flang/Tools/CLOptions.inc (+4-4)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+18-4)
  • (modified) flang/test/Driver/falias-analysis.f90 (+4)
  • (modified) flang/test/Driver/mlir-pass-pipeline.f90 (+2)
  • (modified) flang/test/Driver/optimization-remark.f90 (+9-13)
  • (modified) flang/test/Fir/basic-program.fir (+4)
  • (modified) flang/tools/tco/tco.cpp (+1)
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index 8bdd920c3dcbb796..9382433b94dadfd4 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -142,6 +142,26 @@ void Flang::addCodegenOptions(const ArgList &Args,
   if (shouldLoopVersion(Args))
     CmdArgs.push_back("-fversion-loops-for-stride");
 
+  Arg *aliasAnalysis = Args.getLastArg(options::OPT_falias_analysis,
+                                       options::OPT_fno_alias_analysis);
+  Arg *optLevel =
+      Args.getLastArg(options::OPT_Ofast, options::OPT_O, options::OPT_O4);
+  if (aliasAnalysis) {
+    bool falias_analysis =
+        aliasAnalysis->getOption().matches(options::OPT_falias_analysis);
+    // only pass on the argument if it does not match that implied by the
+    // optimization level
+    if (optLevel) {
+      if (!falias_analysis) {
+        CmdArgs.push_back("-fno-alias-analysis");
+      }
+    } else {
+      if (falias_analysis)
+        // requested alias analysis but no optimization enabled
+        CmdArgs.push_back("-falias-analysis");
+    }
+  }
+
   Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
                             options::OPT_flang_deprecated_no_hlfir,
                             options::OPT_flang_experimental_polymorphism,
diff --git a/flang/include/flang/Tools/CLOptions.inc b/flang/include/flang/Tools/CLOptions.inc
index c452c023b4a80ce1..5a17385fb3dae87a 100644
--- a/flang/include/flang/Tools/CLOptions.inc
+++ b/flang/include/flang/Tools/CLOptions.inc
@@ -157,11 +157,11 @@ inline void addDebugFoundationPass(mlir::PassManager &pm) {
       [&]() { return fir::createAddDebugFoundationPass(); });
 }
 
-inline void addFIRToLLVMPass(
-    mlir::PassManager &pm, llvm::OptimizationLevel optLevel = defaultOptLevel) {
+inline void addFIRToLLVMPass(mlir::PassManager &pm,
+    llvm::OptimizationLevel optLevel = defaultOptLevel, bool applyTbaa = true) {
   fir::FIRToLLVMPassOptions options;
   options.ignoreMissingTypeDescriptors = ignoreMissingTypeDescriptors;
-  options.applyTBAA = optLevel.isOptimizingForSpeed();
+  options.applyTBAA = applyTbaa;
   options.forceUnifiedTBAATree = useOldAliasTags;
   addPassConditionally(pm, disableFirToLlvmIr,
       [&]() { return fir::createFIRToLLVMPass(options); });
@@ -311,7 +311,7 @@ inline void createDefaultFIRCodeGenPassPipeline(
   if (config.VScaleMin != 0)
     pm.addPass(fir::createVScaleAttrPass({config.VScaleMin, config.VScaleMax}));
 
-  fir::addFIRToLLVMPass(pm, config.OptLevel);
+  fir::addFIRToLLVMPass(pm, config.OptLevel, config.AliasAnalysis);
 }
 
 /// Create a pass pipeline for lowering from MLIR to LLVM IR
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index cb4f2d6a6225205b..cfb1dd91ead30564 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -242,10 +242,24 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
                    clang::driver::options::OPT_fno_loop_versioning, false))
     opts.LoopVersioning = 1;
 
-  opts.AliasAnalysis =
-      args.hasFlag(clang::driver::options::OPT_falias_analysis,
-                   clang::driver::options::OPT_fno_alias_analysis,
-                   /*default=*/false);
+  bool aliasAnalysis = false;
+  bool noAliasAnalysis = false;
+  if (auto *arg =
+          args.getLastArg(clang::driver::options::OPT_falias_analysis,
+                          clang::driver::options::OPT_fno_alias_analysis)) {
+    if (arg->getOption().matches(clang::driver::options::OPT_falias_analysis))
+      aliasAnalysis = true;
+    else
+      noAliasAnalysis = true;
+  }
+  opts.AliasAnalysis = 0;
+  if (opts.OptimizationLevel > 0) {
+    if (!noAliasAnalysis)
+      opts.AliasAnalysis = 1;
+  } else {
+    if (aliasAnalysis)
+      opts.AliasAnalysis = 1;
+  }
 
   for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ))
     opts.LLVMPassPlugins.push_back(a->getValue());
diff --git a/flang/test/Driver/falias-analysis.f90 b/flang/test/Driver/falias-analysis.f90
index f2c5dbde6d2c878c..1c74276974d47204 100644
--- a/flang/test/Driver/falias-analysis.f90
+++ b/flang/test/Driver/falias-analysis.f90
@@ -4,10 +4,14 @@
 ! RUN: %flang -c -emit-llvm -falias-analysis %s -o - | llvm-dis | FileCheck %s --check-prefix=CHECK-AA --check-prefix=CHECK-ALL
 ! RUN: %flang -c -emit-llvm -falias-analysis -fno-alias-analysis %s -o - | llvm-dis | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
 ! RUN: %flang -c -emit-llvm %s -o - | llvm-dis | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
+! RUN: %flang -c -emit-llvm -Ofast %s -o - | llvm-dis | FileCheck %s --check-prefix=CHECK-AA --check-prefix=CHECK-ALL
+! RUN: %flang -c -emit-llvm -Ofast -fno-alias-analysis %s -o - | llvm-dis | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
 
 ! RUN: %flang -fc1 -emit-llvm -falias-analysis %s -o - | FileCheck %s --check-prefix=CHECK-AA --check-prefix=CHECK-ALL
 ! RUN: %flang -fc1 -emit-llvm -falias-analysis -fno-alias-analysis %s -o - | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
 ! RUN: %flang -fc1 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
+! RUN: %flang -fc1 -emit-llvm -O3 %s -o - | FileCheck %s --check-prefix=CHECK-AA --check-prefix=CHECK-ALL
+! RUN: %flang -fc1 -emit-llvm -O3 -fno-alias-analysis %s -o - | FileCheck %s --check-prefix=CHECK-NOAA --check-prefix=CHECK-ALL
 
 subroutine simple(a)
   integer, intent(inout) :: a(:)
diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90
index 7f92ec25bef98ec7..3d8c42f123e2eb06 100644
--- a/flang/test/Driver/mlir-pass-pipeline.f90
+++ b/flang/test/Driver/mlir-pass-pipeline.f90
@@ -51,6 +51,8 @@
 
 ! ALL-NEXT: 'func.func' Pipeline
 ! ALL-NEXT:   PolymorphicOpConversion
+! O2-NEXT:  AddAliasTags
+! O2-NEXT:  'func.func' Pipeline
 ! ALL-NEXT:   CFGConversion
 
 ! ALL-NEXT: SCFToControlFlow
diff --git a/flang/test/Driver/optimization-remark.f90 b/flang/test/Driver/optimization-remark.f90
index 13fc24346eac68b8..20ff9eb59a6702d6 100644
--- a/flang/test/Driver/optimization-remark.f90
+++ b/flang/test/Driver/optimization-remark.f90
@@ -41,28 +41,24 @@
 ! Once we start filtering, this is reduced to 1 one of the loop passes.
 
 ! PASS-REGEX-LOOP-ONLY-NOT:     optimization-remark.f90:77:7: remark: hoisting load [-Rpass=licm]
-! PASS-REGEX-LOOP-ONLY:         optimization-remark.f90:83:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]
+! PASS-REGEX-LOOP-ONLY:         optimization-remark.f90:79:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]
 
 ! MISSED-REGEX-LOOP-ONLY-NOT:   optimization-remark.f90:77:7: remark: failed to hoist load with loop-invariant address because load is conditionally executed [-Rpass-missed=licm]
-! MISSED-REGEX-LOOP-ONLY:       optimization-remark.f90:76:4: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
+! MISSED-REGEX-LOOP-ONLY:       optimization-remark.f90:72:4: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
 
 
-! ANALYSIS-REGEX-LOOP-ONLY:     optimization-remark.f90:79:7: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-! ANALYSIS-REGEX-LOOP-ONLY:     Unknown data dependence. Memory location is the same as accessed at optimization-remark.f90:78:7 [-Rpass-analysis=loop-vectorize]
+! ANALYSIS-REGEX-LOOP-ONLY:     optimization-remark.f90:73:7: remark: loop not vectorized: cannot identify array bounds [-Rpass-analysis=loop-vectorize]
 ! ANALYSIS-REGEX-LOOP-ONLY-NOT: remark: {{.*}}: IR instruction count changed from {{[0-9]+}} to {{[0-9]+}}; Delta: {{-?[0-9]+}} [-Rpass-analysis=size-info]
 
-! PASS:                         optimization-remark.f90:77:7: remark: hoisting load [-Rpass=licm]
-! PASS:                         optimization-remark.f90:83:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]
+! PASS:                         optimization-remark.f90:79:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]
 
-! MISSED:                       optimization-remark.f90:77:7: remark: failed to hoist load with loop-invariant address because load is conditionally executed [-Rpass-missed=licm]
-! MISSED:                       optimization-remark.f90:76:4: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
-! MISSED-NOT:                   optimization-remark.f90:79:7: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+! MISSED:                       optimization-remark.f90:73:7: remark: failed to move load with loop-invariant address because the loop may invalidate its value [-Rpass-missed=licm]
+! MISSED:                       optimization-remark.f90:72:4: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
+! MISSED-NOT:                   optimization-remark.f90:75:7: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
 ! MISSED-NOT:                   Unknown data dependence. Memory location is the same as accessed at optimization-remark.f90:78:7 [-Rpass-analysis=loop-vectorize]
 
-! ANALYSIS:                     optimization-remark.f90:79:7: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-! ANALYSIS:                     Unknown data dependence. Memory location is the same as accessed at optimization-remark.f90:78:7 [-Rpass-analysis=loop-vectorize]
-! ANALYSIS:                     remark: {{.*}}: IR instruction count changed from {{[0-9]+}} to {{[0-9]+}}; Delta: {{-?[0-9]+}} [-Rpass-analysis=size-info]
-! ANALYSIS-NOT:                 optimization-remark.f90:77:7: remark: failed to hoist load with loop-invariant address because load is conditionally executed [-Rpass-missed=licm]
+! ANALYSIS:                     optimization-remark.f90:74:7: remark: loop not vectorized: unsafe dependent memory operations in loop.
+! ANALYSIS:                     remark: {{.*}} instructions in function [-Rpass-analysis=asm-printer]
 
 subroutine swap_real(a1, a2)
    implicit none
diff --git a/flang/test/Fir/basic-program.fir b/flang/test/Fir/basic-program.fir
index 0e82f7dfdedb447d..d8a9e74c318ce186 100644
--- a/flang/test/Fir/basic-program.fir
+++ b/flang/test/Fir/basic-program.fir
@@ -57,6 +57,10 @@ func.func @_QQmain() {
 
 // PASSES-NEXT: 'func.func' Pipeline
 // PASSES-NEXT:   PolymorphicOpConversion
+
+// PASSES-NEXT: AddAliasTags
+
+// PASSES-NEXT: 'func.func' Pipeline
 // PASSES-NEXT:   CFGConversion
 
 // PASSES-NEXT: SCFToControlFlow
diff --git a/flang/tools/tco/tco.cpp b/flang/tools/tco/tco.cpp
index 31d6bac142dc421b..a649535a39b74b31 100644
--- a/flang/tools/tco/tco.cpp
+++ b/flang/tools/tco/tco.cpp
@@ -120,6 +120,7 @@ compileFIR(const mlir::PassPipelineCLParser &passPipeline) {
       return mlir::failure();
   } else {
     MLIRToLLVMPassPipelineConfig config(llvm::OptimizationLevel::O2);
+    config.AliasAnalysis = true; // enabled when optimizing for speed
     if (codeGenLLVM) {
       // Run only CodeGen passes.
       fir::createDefaultFIRCodeGenPassPipeline(pm, config);

Contributor

@Leporacanthicus Leporacanthicus left a comment

LGTM

Contributor

@banach-space banach-space left a comment

Nice!

Enable by default when optimizing for speed.

Please, can you be more specific and define what qualifies as "optimizing for speed"?

Comment on lines 245 to 246
bool aliasAnalysis = false;
bool noAliasAnalysis = false;
Contributor

Why do we need two bools to model one thing? What's the logic that we are trying to model here?

Contributor Author

If -falias-analysis is specified then we should enable alias analysis even when it would not be enabled by the optimization level.

If -fno-alias-analysis is specified then we should not enable alias analysis even if it would be enabled by the optimization level.

This doesn't fit neatly into a single boolean, because we also need to support the state where both of these are false (indicating that we should follow the default behavior).

An alternative implementation would be a single boolean inside a std::optional. Would that be clearer?
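
For illustration, a minimal sketch of the three states using a `std::optional<bool>`. The function `resolveAliasAnalysis` and its parameters are hypothetical names made up for this example, not code from the patch; it assumes the default is "enabled for any optimization level above 0".

```cpp
#include <optional>

// Hypothetical helper (not part of flang): computes the effective setting.
//   optLevel   : numeric optimization level, 0 meaning no optimization
//   userChoice : true for -falias-analysis, false for -fno-alias-analysis,
//                empty when neither flag was given (follow the default)
bool resolveAliasAnalysis(unsigned optLevel, std::optional<bool> userChoice) {
  // Default implied by the optimization level (assumed: on above level 0).
  const bool impliedByOptLevel = optLevel > 0;
  // An explicit flag overrides the implied default; otherwise use it.
  return userChoice.value_or(impliedByOptLevel);
}
```

The empty optional carries the "no flag given" state that the two separate bools currently encode as both being false.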

Comment on lines +147 to +148
Arg *optLevel =
Args.getLastArg(options::OPT_Ofast, options::OPT_O, options::OPT_O4);
Contributor

What about other opt levels? Do we enable or disable alias analysis for these opt levels?

Contributor Author

It would be disabled unless -falias-analysis is specified directly.

What I am trying to accomplish here is for the frontend driver to usually just do what you expect, without having to remember to use this option. But I want there to still be a separate flag available to override this default behavior.

Contributor Author

I suppose alias analysis could make sense at -Os too, because it could enable better common sub-expression elimination and hoisting. What do you think?

Contributor

It looks like clang generates TBAA metadata at all opt levels except -O0. I think this makes sense: the optimizations themselves need to decide how to use it, e.g. for improving performance, code size, etc.

Contributor

But I want there to still be a separate flag available to override this default behavior.

That's fine, but then one has to decide whether -f{no}-alias-analysis overrides -O{n} or not? I think that "explicit" request from a user should always take precedence. This leads to (pseudo code):

opts.AliasAnalysis = 0;
if (opt level requiring alias analysis)
  opts.AliasAnalysis  = 1;

// User request takes precedence when it comes to alias analysis.
if (-falias-analysis or -fno-alias-analysis) then
  "do whatever the user requested"

Separately, could you check what Clang does and make sure that that would be consistent?

Contributor

That's fine, but then one has to decide whether -f{no}-alias-analysis overrides -O{n} or not? I think that "explicit" request from a user should always take precedence. This leads to (pseudo code):

opts.AliasAnalysis = 0;
if (opt level requiring alias analysis)
  opts.AliasAnalysis  = 1;

// User request takes precedence when it comes to alias analysis.
if (-falias-analysis or -fno-alias-analysis) then
  "do whatever the user requested"

Separately, could you check what Clang does and make sure that that would be consistent?

@banach-space This is exactly the handling in the front-end driver as given below (and in lib/Frontend/CompilerInvocation). The flang driver is only deciding whether to forward or not.

  opts.AliasAnalysis = opts.OptimizationLevel > 0;
  if (auto *arg =
          args.getLastArg(clang::driver::options::OPT_falias_analysis,
                          clang::driver::options::OPT_fno_alias_analysis))
    opts.AliasAnalysis =
        arg->getOption().matches(clang::driver::options::OPT_falias_analysis);


github-actions bot commented Nov 22, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@vzakhari vzakhari left a comment

Thank you for the changes, Tom!

I have one minor comment, but I would like to ask to merge this after US holidays, if possible. Could you please postpone the merging until Monday GMT?


Contributor Author

tblah commented Nov 22, 2023

Thank you for the changes, Tom!

I have one minor comment, but I would like to ask to merge this after US holidays, if possible. Could you please postpone the merging until Monday GMT?

Sure. I'll wait until Monday.

Contributor

@banach-space banach-space left a comment

Makes sense, but the summary should document when exactly the alias analysis is enabled/disabled. And the relationship between -f{no}-alias-analysis and the optimisation flags.

Could you also add a note whether the implemented behaviour is consistent with Clang?


@tblah tblah requested a review from banach-space November 27, 2023 10:22
Contributor

@banach-space banach-space left a comment

Thanks for addressing my comments - this is looking really good now!

Contributor

@kiranchandramohan kiranchandramohan left a comment

LGTM.


Contributor

@banach-space banach-space left a comment

I think that what's being proposed here is quite non-standard. In particular, what should happen here:

flang-new {a very long list of options copied from somewhere, including -fno-alias-analysis} -O3 file.f90

How is the user meant to know that they need to add -falias-analysis at the end to enable alias analysis? And in general, how are they supposed to know that -fno-alias-analysis overrides -O3?

If it was the case of "the last relevant option takes priority" (as is the case with most/all options) then that would be easy - identical logic would always apply.

This should be easy to fix if you use this instead of what's currently implemented (apologies for GitHub being unable to format this properly):

Args.getLastArg(options::OPT_falias_analysis,
                              options::OPT_fno_alias_analysis,
                               options::OPT_Ofast,
                               options::OPT_O,
                               options::OPT_O4);

(I'm skipping other changes that would also be required - hopefully this is clear enough). This should still give you all the flexibility that you need for testing and be less surprising for end users.
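
To make the difference concrete, here is a small sketch that contrasts the two policies on a plain list of strings. It is only a model under stated assumptions: `Args`, `isOptFlag`, `explicitFlagWins` and `lastOptionWins` are hypothetical names, this is not the `clang::driver` `ArgList` API, and any `-O` flag other than `-O0` is assumed to imply alias analysis.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed command line (not clang's ArgList).
using Args = std::vector<std::string>;

// Assumption for this sketch: every argument starting with "-O" is an
// optimization-level flag.
static bool isOptFlag(const std::string &arg) { return arg.rfind("-O", 0) == 0; }

// Policy as currently implemented: an explicit -f[no-]alias-analysis
// overrides any -O flag, regardless of where it appears.
bool explicitFlagWins(const Args &args) {
  std::optional<bool> user; // last explicit request, if any
  bool implied = false;     // default implied by the last -O flag
  for (const auto &arg : args) {
    if (arg == "-falias-analysis")
      user = true;
    else if (arg == "-fno-alias-analysis")
      user = false;
    else if (isOptFlag(arg))
      implied = (arg != "-O0");
  }
  return user.value_or(implied);
}

// Suggested policy: the last relevant option takes priority, whether it is
// an -O flag or an -f[no-]alias-analysis flag.
bool lastOptionWins(const Args &args) {
  bool enabled = false;
  for (const auto &arg : args) {
    if (arg == "-falias-analysis")
      enabled = true;
    else if (arg == "-fno-alias-analysis")
      enabled = false;
    else if (isOptFlag(arg))
      enabled = (arg != "-O0");
  }
  return enabled;
}
```

For `-fno-alias-analysis -O3`, `explicitFlagWins` leaves alias analysis disabled while `lastOptionWins` enables it; for `-O3 -fno-alias-analysis` both disable it.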

If you are in a rush to land this then this LGTM, but I would like the relationship between -O{1|2|3|4} and -f{no}-alias-analysis to be refined in a follow-up patch. Unless there's a good reason to avoid that? WDYT?

Sorry, I've only really realised this after @tblah updated falias-analysis.f90.

Contributor Author

tblah commented Nov 27, 2023

If you are in a rush to land this then this LGTM, but I would like the relationship between -O{1|2|3|4} and -f{no}-alias-analysis to be refined in a follow-up patch. Unless there's a good reason to avoid that? WDYT?

Thanks @banach-space, I will land this now and follow up later today.

The behavior you're commenting on was deliberate: to me, it feels wrong to enable alias analysis if there is an -fno-alias-analysis flag anywhere (and no -falias-analysis), as it may not be obvious that -O implies -falias-analysis, and what -O does isn't documented anywhere.

But I don't feel strongly about it so I will follow up with a new patch later if you still feel that is worthwhile.

@tblah tblah merged commit caba031 into llvm:main Nov 27, 2023
2 checks passed
@banach-space
Contributor

I don't feel strongly about it

ACK.

I mostly care about consistency in the interface exposed to the end-user. TBH, I've never really investigated the relationship of -O{0|1|2|3|4} with various feature flags. What I described definitely holds for -f{no-}some_feature flags. But not 100% sure about -O{0|1|2|3|4}.

TBH, this feels like just too much control in users' hands. So I would keep this option as hidden and use strictly for compiler development.

You could also just update the relevant help text:

  PosFlag<SetTrue, [], [], "Pass alias information on to LLVM (overrides -O0 which disables alias analysis)">,
  NegFlag<SetFalse, [], [], "Do not pass alias information on to LLVM (overrides -O{1|2|3|4} which enable alias analysis)">>;

Btw, thanks for working on this - really great to see progress on this front! 🙏🏻

tblah added a commit to tblah/llvm-project that referenced this pull request Nov 27, 2023
As requested by @banach-space on llvm#73111. This makes it clearer that
-f[no-]alias-analysis will always override -O flags, no matter their
ordering.
tblah added a commit to tblah/llvm-project that referenced this pull request Nov 29, 2023
This reverts commit caba031.

Serious performance regressions were reported by @vzakhari
llvm#58303 (comment)

Fixing this doesn't look quick so I will revert for now.
tblah added a commit that referenced this pull request Nov 29, 2023
This reverts commit caba031.

Serious performance regressions were reported by @vzakhari
#58303 (comment)

Fixing this doesn't look quick so I will revert for now.
tblah added a commit to tblah/llvm-project that referenced this pull request Dec 3, 2023
Enable by default for optimization levels higher than 0 (same behavior
as clang).

For simplicity, only forward the flag to the frontend driver when it
contradicts what is implied by the optimization level.

This was first landed in
llvm#73111 but was later reverted
due to a performance regression. That regression was fixed by
llvm#74065.
tblah added a commit that referenced this pull request Dec 4, 2023
Enable by default for optimization levels higher than 0 (same behavior
as clang).

For simplicity, only forward the flag to the frontend driver when it
contradicts what is implied by the optimization level.

This was first landed in
#73111 but was later reverted
due to a performance regression. That regression was fixed by
#74065.