[IR lower bug] Apple's clang generated code will not abort when empty std::string call pop_back(), while llvm/clang do #7920

Open
dreampiggy opened this issue Dec 26, 2023 · 0 comments

Background

We recently found that calling pop_back() on an empty C++ std::string makes the program abort.
Code:

#include <string>

int main(int argc, const char * argv[]) {
    std::string str_audioEffect;
    str_audioEffect.pop_back(); // llvm/clang build aborts here; Apple's build is a no-op
}

Our internal project may switch to the open-source llvm/clang instead of the clang bundled with Apple's Xcode, so any behavior difference matters to us and we need to understand the details.

compiler version:

  • llvm/clang
Apple clang version 13.0.0 (https://github.com/apple/llvm-project.git 8ee3f51668ac68de50d541a815f00859f4922f98)
Target: arm64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Users/Anonymous/Library/Developer/Toolchains/swift-5.9-RELEASE.xctoolchain/usr/bin/.
  • apple/clang
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Applications/Xcode-15.0.0.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

compiler options (ignore the local paths; a reproducible demo project is provided in the attachment):

-target arm64-apple-macos13.5 -fmessage-length\=0 -fdiagnostics-show-note-include-stack -fmacro-backtrace-limit\=0 -fno-color-diagnostics -std\=c++17 -fmodules -gmodules -fmodules-prune-interval\=86400 -fmodules-prune-after\=345600 -fmodules-validate-once-per-build-session -Wnon-modular-include-in-framework-module -Werror\=non-modular-include-in-framework-module -Wno-trigraphs -fpascal-strings -Oz -fno-common -Wno-missing-field-initializers -Wno-missing-prototypes -Werror\=return-type -Wdocumentation -Wunreachable-code -Wquoted-include-in-framework-header -Werror\=deprecated-objc-isa-usage -Werror\=objc-root-class -Wno-non-virtual-dtor -Wno-overloaded-virtual -Wno-exit-time-destructors -Wno-missing-braces -Wparentheses -Wswitch -Wunused-function -Wno-unused-label -Wno-unused-parameter -Wunused-variable -Wunused-value -Wempty-body -Wuninitialized -Wconditional-uninitialized -Wno-unknown-pragmas -Wno-shadow -Wno-four-char-constants -Wno-conversion -Wconstant-conversion -Wint-conversion -Wbool-conversion -Wenum-conversion -Wno-float-conversion -Wnon-literal-null-conversion -Wobjc-literal-conversion -Wshorten-64-to-32 -Wno-newline-eof -Wno-c++11-extensions -Wno-implicit-fallthrough -DDEBUG\=1 -isysroot /Applications/Xcode-15.0.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk -fstrict-aliasing -Wdeprecated-declarations -Winvalid-offsetof -g -fvisibility-inlines-hidden -Wno-sign-conversion -Winfinite-recursion -Wmove -Wcomma -Wblock-capture-autoreleasing -Wstrict-prototypes -Wrange-loop-analysis -Wno-semicolon-before-method-body -Wunguarded-availability a -c main.cpp

llvm/clang behavior

With the open-source llvm/clang compiler, a brk instruction is generated in the binary, so the program aborts at that line.

ASM (llvm/clang):

testLibCxx`main:
->  0x100003fa4 <+0>: brk    #0x1

apple/clang behavior

With Apple's clang (from the xctoolchain bundled with Xcode 15.0.0), we were surprised to find that the brk disappears: the compiler generates the inlined pop_back code instead, and the program does not hit any runtime abort.

ASM (apple/clang):

testLibCxx`main:
    0x100003f64 <+0>:  sub    sp, sp, #0x30
    0x100003f68 <+4>:  stp    x29, x30, [sp, #0x20]
    0x100003f6c <+8>:  add    x29, sp, #0x20
    0x100003f70 <+12>: stp    xzr, xzr, [sp, #0x10]
    0x100003f74 <+16>: str    xzr, [sp, #0x8]
    0x100003f78 <+20>: mov    w8, #0xff
    0x100003f7c <+24>: strb   w8, [sp, #0x1f]
    0x100003f80 <+28>: sturb  wzr, [sp, #0x7]
    0x100003f84 <+32>: add    x0, sp, #0x8
    0x100003f88 <+36>: bl     0x100003f9c               ; symbol stub for: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::~basic_string()
->  0x100003f8c <+40>: mov    w0, #0x0
    0x100003f90 <+44>: ldp    x29, x30, [sp, #0x20]
    0x100003f94 <+48>: add    sp, sp, #0x30
    0x100003f98 <+52>: ret    

Expected behavior

According to the C++ documentation for pop_back, this is undefined behavior, so both results are strictly OK.

If the string is empty, it causes undefined behavior.
Otherwise, the function never throws exceptions (no-throw guarantee).
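Since the call is undefined behavior on an empty string, a caller-side guard is one way to get the same result from both toolchains. The following is only a minimal sketch: pop_back_if_not_empty is a hypothetical helper name, not part of libc++ or the standard library.

```cpp
#include <string>

// Hypothetical helper (not part of libc++): removes the last character only
// when the string is non-empty, and reports whether anything was removed.
inline bool pop_back_if_not_empty(std::string& s) {
    if (s.empty()) {
        return false;  // calling s.pop_back() here would be undefined behavior
    }
    s.pop_back();
    return true;
}
```

With this guard, the empty-string case from the reproducer never reaches pop_back(), so the optimizer has no violated assumption to act on.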

But I'm still curious why the generated code is so different (llvm/clang does not even throw a C++ exception, it just emits brk).

Is this strange behavior specific to this libc++ code, or is it a more general bug or difference between upstream LLVM and Apple's internal LLVM? I need to know what actually happens so that I can provide a correct fix.

So I used -mllvm -print-after-all -mllvm -filter-print-funcs='main' to compare the pass output of apple/clang and llvm/clang.

IR dump differences

  • apple/clang

The IR dump shows that apple/clang does something surprising in the CorrelatedValuePropagationPass IR pass.

before CorrelatedValuePropagationPass:

%3 = i8 0 // This value is from previous basic block

%12 = zext i8 %3 to i64, !dbg !2274
%13 = add i64 %12, -1, !dbg !2218
%14 = icmp ult i64 %13, 23, !dbg !2281
call void @llvm.assume(i1 %14), !dbg !2281
%15 = trunc i64 %13 to i8, !dbg !2282
store i8 %15, ptr %2, align 1, !dbg !2283
br label %16

after CorrelatedValuePropagationPass:

%3 = i8 0 // This value is from previous basic block

%12 = zext i8 %3 to i64, !dbg !2274
%13 = add nsw i64 %12, -1, !dbg !2218
call void @llvm.assume(i1 true), !dbg !2281
%14 = trunc i64 %13 to i8, !dbg !2282
store i8 %14, ptr %2, align 1, !dbg !2283
br label %15

Note the add -> add nsw change, which seems to affect the IR semantics.
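For reference, nsw ("no signed wrap") only states that the signed addition does not overflow; the result becomes poison if it does. One possible reading, assuming the operand really is a zero-extended i8 as in the dump above, is that the flag alone is a valid inference, because a value in [0, 255] plus -1 cannot overflow a signed 64-bit integer. The sketch below checks that by enumeration; the function names are made up for illustration, and it says nothing about why the assume condition was folded to true.

```cpp
#include <cstdint>
#include <limits>

// Returns true if lhs + rhs would overflow a signed 64-bit integer,
// i.e. the case in which an `add nsw` result would be poison.
constexpr bool signed_add_overflows(std::int64_t lhs, std::int64_t rhs) {
    return (rhs > 0 && lhs > std::numeric_limits<std::int64_t>::max() - rhs) ||
           (rhs < 0 && lhs < std::numeric_limits<std::int64_t>::min() - rhs);
}

// Checks every possible value of a zero-extended 8-bit operand against `+ -1`.
inline bool nsw_is_safe_for_all_size_bytes() {
    for (int v = 0; v <= 255; ++v) {
        if (signed_add_overflows(static_cast<std::int64_t>(v), -1)) {
            return false;
        }
    }
    return true;
}
```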

  • llvm/clang

However, llvm/clang does nothing like Apple's version in CorrelatedValuePropagationPass. The IR is the same going in and coming out:

before and after CorrelatedValuePropagationPass:

%8 = i8 0 // This value is from previous basic block
%9 = icmp slt i8 %8, 0, !dbg !2204

%12 = zext i8 %8 to i64, !dbg !2205
%13 = select i1 %9, i64 %11, i64 %12, !dbg !2205
%14 = add i64 %13, -1, !dbg !2206
%18 = icmp ult i64 %14, 23, !dbg !2260
call void @llvm.assume(i1 %18), !dbg !2260
%19 = trunc i64 %14 to i8, !dbg !2261
store i8 %19, ptr %7, align 1, !dbg !2262
br label %20

So where does the brk (actually, an IR unreachable) come from?

After some investigation, I found that another pass in llvm/clang, GVNPass, does the transform. The input IR before GVNPass is the same as the one above.

before GVNPass:

%8 = i8 0 // This value is from previous basic block
%9 = icmp slt i8 %8, 0, !dbg !2204

%12 = zext i8 %8 to i64, !dbg !2205
%13 = select i1 %9, i64 %11, i64 %12, !dbg !2205
%14 = add i64 %13, -1, !dbg !2206
%18 = icmp ult i64 %14, 23, !dbg !2260
tail call void @llvm.assume(i1 %18), !dbg !2260
%19 = trunc i64 %14 to i8, !dbg !2261
store i8 %19, ptr %7, align 1, !dbg !2262
br label %20

after GVNPass:

store i8 poison, ptr null, align 1
store i8 -1, ptr %7, align 1, !dbg !2258
br label %12

This seems to be expected. %14 is 0 + (-1) = -1, and because ult (unsigned less than) interprets it as an unsigned value, it becomes UINT64_MAX, which is not less than 23, so %18 becomes false.

ult: interprets the operands as unsigned values and yields true if op1 is less than op2.

For the resulting @llvm.assume(i1 false), GVNPass emits a store of poison to a null pointer, which SimplifyCFGPass then transforms into unreachable, and which is finally lowered to brk on arm64.

@llvm.assume The intrinsic allows the optimizer to assume that the provided condition is always true whenever the control flow reaches the intrinsic call. No code is generated for this intrinsic, and instructions that contribute only to the provided condition are not used for code generation. If the condition is violated during execution, the behavior is undefined.
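The arithmetic behind the false condition can be reproduced in plain C++. This is a minimal sketch that mirrors the zext, add, icmp ult and trunc instructions from the dump with fixed-width integers; the function names are made up for illustration, and the constant 23 is taken from the IR as-is.

```cpp
#include <cstdint>

// Mirrors `zext i8 ... to i64` followed by `add i64 ..., -1`:
// the byte is widened to 64 bits, then decremented with unsigned wraparound.
constexpr std::uint64_t size_minus_one(std::uint8_t size_byte) {
    return static_cast<std::uint64_t>(size_byte) - 1u;
}

// Mirrors `icmp ult i64 ..., 23`: an unsigned less-than comparison.
constexpr bool assumed_condition(std::uint8_t size_byte) {
    return size_minus_one(size_byte) < 23u;
}

// Mirrors `trunc i64 ... to i8`: keeps only the low 8 bits.
constexpr std::uint8_t stored_size_byte(std::uint8_t size_byte) {
    return static_cast<std::uint8_t>(size_minus_one(size_byte));
}

// For the empty string (byte value 0) the subtraction wraps to UINT64_MAX,
// so the assumed condition is false and the truncated byte is 0xFF.
static_assert(size_minus_one(0) == UINT64_MAX, "0 - 1 wraps around");
static_assert(!assumed_condition(0), "condition is false for size 0");
static_assert(stored_size_byte(0) == 0xFF, "low byte of UINT64_MAX");
```

The 0xFF result matches both the store i8 -1 in the GVNPass output and the mov w8, #0xff / strb pair in the apple/clang assembly.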

Conclusion

From the investigation above, we can draw the following conclusions:

  1. apple/clang's CorrelatedValuePropagationPass seems to perform an incorrect optimization, turning an add instruction into an add nsw instruction. Can anyone explain the reason for and the details of this behavior?
  2. llvm/clang seems to optimize more aggressively around @llvm.assume. The code is lowered from libc++'s std::string::pop_back implementation, which uses __builtin_assume:
# define _LIBCPP_ASSERT(expression, message)                                        \
    (_LIBCPP_DIAGNOSTIC_PUSH                                                        \
    _LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wassume")                                    \
    __builtin_assume(static_cast<bool>(expression))                                 \
    _LIBCPP_DIAGNOSTIC_POP)

template <class _CharT, class _Traits, class _Allocator>
inline _LIBCPP_CONSTEXPR_SINCE_CXX20
void
basic_string<_CharT, _Traits, _Allocator>::pop_back()
{
    _LIBCPP_ASSERT(!empty(), "string::pop_back(): string is already empty");
    __erase_to_end(size() - 1);
}

I hope someone at Apple, or an expert on llvm/clang optimization passes, can reply. Thanks.

Attachment

testLibCxx.zip
Apple's radar ID: FB13493508
