[WPD]Regard unreachable function as a possible devirtualizable target #115668

Merged: 4 commits, Nov 13, 2024


@minglotus-6 minglotus-6 commented Nov 10, 2024

https://reviews.llvm.org/D115492 skips unreachable functions and potentially allows more static de-virtualizations. The motivation is to ignore the virtual deleting destructor of an abstract class (e.g., Base::~Base() in https://gcc.godbolt.org/z/dWMsdT9Kz).

  • Note that WPD already handles most pure virtual functions (like Base::x() in the godbolt example above), each of which becomes a __cxa_pure_virtual entry in the vtable slot.

This PR proposes to undo the change, because it turns out there are other unreachable functions that a general program wants to run and fail intentionally, with LOG(FATAL) or CHECK [1] for example. While many real-world applications are encouraged to check-fail sparingly, they are allowed to do so on critical errors (e.g., when a misconfiguration or bug is detected during server startup).

  • Implementation-wise, this PR keeps the one-bit 'unreachable' state in bitcode and updates WPD analysis.

https://gcc.godbolt.org/z/T1aMhczYr is a minimal reproducible example extracted from a unit test. Base::func is a one-liner of LOG(FATAL) << "message", and is lowered to one basic block ending with unreachable. A real-world program is allowed to invoke Base::func to terminate the program as a way to report errors (during server initialization, for example), even if errors on the serving path should be handled more gracefully.

[1] https://abseil.io/docs/cpp/guides/logging#CHECK and https://abseil.io/docs/cpp/guides/logging#configuration-and-flags
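
To make the effect on target selection concrete, here is a minimal C++ sketch of the filtering decision described above. It is a toy model and not LLVM code: `Candidate`, `singleImplTarget`, and the `keepUnreachable` parameter are hypothetical names, and `keepUnreachable` only stands in for the behavior that this PR makes the default.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one candidate implementation in a vtable slot.
// This is not an LLVM type; it only mirrors the one-bit 'unreachable' state.
struct Candidate {
  std::string name;
  bool entryEndsInUnreachable;  // true if the entry block ends with 'unreachable'
};

// Returns the unique target name if single-implementation devirtualization
// would apply in this toy model, or an empty string otherwise.
// keepUnreachable == true models the default proposed by this PR.
std::string singleImplTarget(const std::vector<Candidate>& slot,
                             bool keepUnreachable) {
  assert(!slot.empty() && "a vtable slot needs at least one candidate");
  std::vector<std::string> targets;
  for (const Candidate& c : slot) {
    // The old behavior dropped unreachable functions from the target set.
    const bool skipped = !keepUnreachable && c.entryEndsInUnreachable;
    if (!skipped)
      targets.push_back(c.name);
  }
  return targets.size() == 1 ? targets.front() : std::string();
}
```

With two candidates modeled on the example (an unreachable `Base::func` and a hypothetical reachable `Derived::func`), `singleImplTarget(slot, false)` returns `Derived::func`, so a call made on a `Base` object would be redirected and would no longer terminate. `singleImplTarget(slot, true)` returns an empty string, and the virtual call is kept.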


llvmbot commented Nov 10, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Mingming Liu (minglotus-6)

Changes

https://reviews.llvm.org/D115492 skips unreachable functions (pure virtual functions) and potentially allows more static de-virtualizations.

This PR proposes to undo the change, because it turns out there are other unreachable functions that a general program wants to run and fail intentionally, with LOG(FATAL) or CHECK [0] for example. While many real-world applications are encouraged to check-fail sparingly, they are allowed to do so on critical errors (e.g., when a misconfiguration or bug is detected during server startup).

  • Implementation-wise, this PR keeps the one-bit 'unreachable' state in bitcode and updates WPD analysis.

Where the observation comes from:

  • To roll out WPD to a wider set of production binaries, we ran a pre-release test to catch any source code UB [1] that WPD might expose.
  • With WPD enabled, a couple of unit tests failed their death tests [2]: the test program is expected to terminate but does not. The test failure is consistently reproducible (i.e., not flaky) and the pattern is illustrated as
    Death test: ptr->VirtualMethod(args)
      Result: failed to die.
    Error msg:
      ...
    
  • Debugging shows that the test doesn't terminate as expected because virtual methods that contain LOG(FATAL) or CHECK failures are treated as unreachable and are therefore not counted as WPD de-virtualizable targets.

[0] https://abseil.io/docs/cpp/guides/logging#CHECK and https://abseil.io/docs/cpp/guides/logging#configuration-and-flags
[1] For example, the source code makes a virtual call through a pointer that was reinterpret_cast from a different type, which relies on undefined behavior.
[2] https://google.github.io/googletest/reference/assertions.html#death
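
The failing pattern can be sketched without googletest. The following is a stand-in under stated assumptions: `diesOnCall` is a hypothetical helper built on POSIX `fork`/`waitpid` rather than the real death-test machinery, `std::abort` stands in for `LOG(FATAL)`, and `Derived` is an illustrative override that is not taken from the failing tests.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct Base {
  virtual ~Base() = default;
  // Stand-in for LOG(FATAL) << "message": reports the error and never returns.
  virtual void func() {
    std::fputs("fatal: Base::func called\n", stderr);
    std::abort();
  }
};

// Illustrative override that returns normally (an assumption of this sketch).
struct Derived : Base {
  void func() override {}
};

// Hypothetical death-test helper: runs ptr->func() in a forked child and
// reports whether the child was killed by a signal or exited with an error.
bool diesOnCall(Base* ptr) {
  const pid_t pid = fork();
  if (pid < 0) {
    std::perror("fork");
    std::abort();
  }
  if (pid == 0) {
    ptr->func();  // the virtual call under test
    _exit(0);     // reached only if func() returned normally
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) {
    std::perror("waitpid");
    std::abort();
  }
  return WIFSIGNALED(status) ||
         (WIFEXITED(status) && WEXITSTATUS(status) != 0);
}
```

Here `diesOnCall` is expected to report death for a `Base` object and survival for a `Derived` object. If a whole-program optimization rewrote `ptr->func()` to call `Derived::func` directly because `Base::func` was filtered out as unreachable, the `Base` case would "fail to die", which matches the symptom in the list above.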


Full diff: https://github.com/llvm/llvm-project/pull/115668.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp (+17)
  • (modified) llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll (+8-4)
diff --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 78516cadcf2313..f72cffcf75d2dd 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -167,6 +167,18 @@ static cl::list<std::string>
                       cl::desc("Prevent function(s) from being devirtualized"),
                       cl::Hidden, cl::CommaSeparated);
 
+/// A function is unreachable if its entry block ends with 'unreachable' IR
+/// instruction. In some cases, the program intends to run such functions and
+/// terminate, for instance, a unit test may run a death test. A non-test
+/// program might (or allowed to) invoke such functions to report failures
+/// (whether/when it's a good practice or not is a different topic). Regard
+/// unreachable function as possible devirtualize targets to keep the program
+/// behavior.
+static cl::opt<bool> WholeProgramDevirtKeepUnreachableFunction(
+    "wholeprogramdevirt-keep-unreachable-function",
+    cl::desc("Regard unreachable functions as possible devirtualize targets."),
+    cl::Hidden, cl::init(true));
+
 /// If explicitly specified, the devirt module pass will stop transformation
 /// once the total number of devirtualizations reach the cutoff value. Setting
 /// this option to 0 explicitly will do 0 devirtualization.
@@ -386,6 +398,9 @@ template <> struct DenseMapInfo<VTableSlotSummary> {
 //   2) All function summaries indicate it's unreachable
 //   3) There is no non-function with the same GUID (which is rare)
 static bool mustBeUnreachableFunction(ValueInfo TheFnVI) {
+  if (WholeProgramDevirtKeepUnreachableFunction)
+    return false;
+
   if ((!TheFnVI) || TheFnVI.getSummaryList().empty()) {
     // Returns false if ValueInfo is absent, or the summary list is empty
     // (e.g., function declarations).
@@ -2241,6 +2256,8 @@ DevirtModule::lookUpFunctionValueInfo(Function *TheFn,
 
 bool DevirtModule::mustBeUnreachableFunction(
     Function *const F, ModuleSummaryIndex *ExportSummary) {
+  if (WholeProgramDevirtKeepUnreachableFunction)
+    return false;
   // First, learn unreachability by analyzing function IR.
   if (!F->isDeclaration()) {
     // A function must be unreachable if its entry block ends with an
diff --git a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
index 457120b9c6f410..599d3296bb163b 100644
--- a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
+++ b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
@@ -1,11 +1,15 @@
+; Test that static devirtualization doesn't happen because there are two
+; devirtualizable targets. Unreachable functions are kept in the devirtualizable
+; target set by default.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s  --implicit-check-not="single-impl"
+
 ; Test that regular LTO will analyze IR, detect unreachable functions and discard unreachable functions
 ; when finding virtual call targets.
 ; In this test case, the unreachable function is the virtual deleting destructor of an abstract class.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-keep-unreachable-function=false %s 2>&1 | FileCheck %s --check-prefix=DEVIRT
 
-; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s
-
-; CHECK: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
-; CHECK: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
+; DEVIRT: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
+; DEVIRT: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
 
 source_filename = "tmp.cc"
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

@teresajohnson teresajohnson left a comment

lgtm with a couple of comment suggestions

; devirtualizable targets. Unreachable functions are kept in the devirtualizable
; target set by default.
; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s --implicit-check-not="single-impl"

; Test that regular LTO will analyze IR, detect unreachable functions and discard unreachable functions
Contributor

Note that this is only done under the -wholeprogramdevirt-keep-unreachable-function=false option

Contributor Author

done.

@@ -167,6 +167,18 @@ static cl::list<std::string>
cl::desc("Prevent function(s) from being devirtualized"),
cl::Hidden, cl::CommaSeparated);

/// A function is unreachable if its entry block ends with 'unreachable' IR
Contributor

Can you note that this is useful for removing abstract class virtual destructors, which are emitted with a trap and unreachable instruction, and add a TODO to identify these more precisely?

Contributor

related - probably update the description which currently says that this is about pure virtual functions, but most of those are already handled due to the __cxa_pure_virtual in the vtable slot.

Contributor Author

Updated the surrounding comment with the motivating case and a TODO for Clang's codegen, and updated PR description as suggested.

@minglotus-6 minglotus-6 left a comment

thanks for looking into this together and figuring out a viable solution forward!

@minglotus-6
Contributor Author

I didn't find a test failure in the buildkite log (with cat <buildkite>.log | grep -B 10 -A 10 "Failed Tests ").

Local tests pass. Will merge.

@minglotus-6 minglotus-6 merged commit 47cc9db into main Nov 13, 2024
6 of 8 checks passed
@minglotus-6 minglotus-6 deleted the users/minglotus-6/spr/wpd branch November 13, 2024 19:28