Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE] Apply Rust patches on release/8.x #1

Closed
wants to merge 25 commits into from

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Jan 11, 2019

This PR should not actually be merged, but pushed to a new branch. I can do this once we're happy with the changes, and decide on a branch name. FWIW, upstream is naming their stable branches like release/7.x, so I suggest a rust/ prefix here, and indicating it's pre-8.0 would be nice too.

  • llvm/: All patches applied cleanly, or were already upstream, except 8b036feacf91 from Misc  llvm#131. That was a revert which I believe has since been fixed properly by D54997 -- cc @nikic for verification.

  • clang/: Rust has no patches AFAICS.

  • lld/: The one MSVC cmake patch applied fine.

  • lldb/: All patches applied cleanly except for 8e114ffe6b6a, now 40e425a255be, but I think I resolved that OK. @tromey, are these patches headed upstream?

I haven't actually tested that this works yet, nor integrated it into the rust repo. 😅

@nikic
Copy link

nikic commented Jan 11, 2019

llvm/: All patches applied cleanly, or were already upstream, except 8b036feacf91 from rust-lang/llvm#131. That was a revert which I believe has since been fixed properly by D54997 -- cc @nikic for verification.

That's correct, the revert is no longer necessary.

@alexcrichton
Copy link
Member

🎊 thanks @cuviper!

For naming, how about:

  • rustc/pre-8.0-2019-01-11 - used when we're between releases
  • rustc/8.0 - used when we've forked off an actual release

(of course open to any other suggestions)

@nikic
Copy link

nikic commented Jan 11, 2019

@alexcrichton I'd go for always using the rustc/8.0-2019-01-11 format. Even if we switch to following a release branch rather than master, we'd likely still want to rebase at some point (to switch to the point release if nothing else).

@cuviper
Copy link
Member Author

cuviper commented Jan 11, 2019

@nikic without any kind of pre-release signifier? I guess I'm OK with that if we're including the snapshot date, as opposed to our current opaque v1/v2 branches.

@tromey
Copy link

tromey commented Jan 11, 2019

[... lldb patches ...]

are these patches headed upstream?

No, upstream did not want new language plugins, and instead preferred us to fork. I'm not completely sure that the troublesome Python config patch is even needed, it might have only been for Linux.

@cuviper
Copy link
Member Author

cuviper commented Jan 15, 2019

Status: This branch is working well with my rust llvm-monorepo branch (compare). LLVM is scheduled to create the 8.x branch tomorrow, so I plan to rebase once more on that and then submit a rust PR.

bitshifter and others added 20 commits January 16, 2019 09:41
This is needed for `-C target-cpu=help` and `-C target-feature=help` in rustc
If this lines are present then we apparently get errors [1] when compiling in
the current [2] dist-i686-linux container. Attempts to upgrade both gcc and
binutils did not fix the error, so it appears that this may just be a bug in the
super old glibc we're using on the dist-i686-linux container.

We don't actually need this code anyway, so just work around these issues by
removing references to the `*64` functions. This'll get things compiling
locally and shouldn't be a regression in functionality.

[1]: https://travis-ci.org/rust-lang/rust/jobs/257578199
[2]: https://github.com/rust-lang/rust/tree/eba9d7f08ce5c90549ee52337aca0010ad566f0d/src/ci/docker/dist-i686-linux
For whatever reason this is failing the i686-freebsd builder in the Rust repo
as-of this red-hot moment. The build seems to work fine without it so let's just
remove it for now and pray there's a better fix later.

Although if you're reading this and know of a better fix, we'd love to remove
this!
Apparently glibc is so old it doesn't have the _POSIX_ARG_MAX constant. This
shouldn't affect anything we use anyway though.

https://travis-ci.org/rust-lang/rust/jobs/399333071
Can't seem to figure out how to do this without this patch...
This adds Rust support to Mangled.  I am not completely certain that
this is needed (or alternatively that it does enough, maybe
Mangled::GuessLanguage needs a Rust case).  This should be checked
before attempting to upstream.
This was needed for the Rust plugin
Add a TypeAndOrName constructor that was declared but not defined.
This is used in the Rust plugin.  See https://reviews.llvm.org/D44752
Introduce LLDB_PY_LIB_SUFFIX and use it in various places in the
build.  This lets the x.py-based build work properly without having to
set LLVM_LIBDIR_SUFFIX.

See https://bugs.llvm.org/show_bug.cgi?id=18957 for some discussion.
Sometimes the DWARF can omit information about a discriminant, for
example when an Option shares a discriminant slot with an enum that it
wraps.  In this case, lldb could crash, because the discriminant was
not found and because there was no default variant.

No test case because this relies on a compiler bug that will soon be
fixed.

Fixes llvm#16
While rebasing to master, I missed a spot where an include file was
moved.  I believe my local build was picking up an installed copy of
the header, causing it to succeed locally.
This adds "rust-enabled" to the --version output, so it's easier to
tell if lldb has rust support.
This fixes a couple of problems noticed while debugging the rust
compiler change to use DW_TAG_variant_part:

* IterableDIEChildren returned one extra DIE, because it did not
  preserve the CU in end()

* The entire block dealing with DW_TAG_variant_part was erroneously
  inside the DW_TAG_member case.
This gives numeric names to tuple fields, because lldb clients expect
fields to have names, and because using plain numbers seemed most
rust-like.

Closes llvm#21
When the discriminant is removed from an enum's members, be sure to
rename the fields of any tuple type.  This fixes a bug introduced in
yesterday's patch.
Prepend an underscore to field names when emitting a C structure, to
ensure that tuple fields have valid names.
Remove the by-name cache from RustASTContext.  This was not needed and
could interact badly with the DWARF parser.  Closes llvm#22
nikic referenced this pull request in nikic/llvm-project Jul 11, 2023
The original MFS work D85368 shows good performance improvement with
Instrumented FDO. However, AutoFDO or Flow-Sensitive AutoFDO (FSAFDO)
does not show performance gain. This is mainly caused by a less
accurate profile compared to the iFDO profile.

For the past few months, we have been working to improve FSAFDO
quality, like in D145171. Taking advantage of this improvement, MFS
now shows performance improvements over FSAFDO profiles.

That being said, 2 minor changes need to be made, 1) An FS-AutoFDO
profile generation pass needs to be added right before MFS pass and an
FSAFDO profile load pass is needed when FS-AutoFDO is enabled and the
MFS flag is present. 2) MFS only applies to hot functions, because we
believe (and experiment also shows) FS-AutoFDO is more accurate about
functions that have plenty of samples than those with no or very few
samples.

With this improvement, we see a 1.2% performance improvement in clang
benchmark, 0.9% QPS improvement in our internal search benchmark, and
3%-5% improvement in internal storage benchmark.

This is #1 of the two patches that enables the improvement.

Reviewed By: wenlei, snehasish, xur

Differential Revision: https://reviews.llvm.org/D152399
nikic referenced this pull request in nikic/llvm-project Jul 13, 2023
…tput

The crash happens in clang::driver::tools::SplitDebugName when Output is
InputInfo::Nothing. It doesn't happen with standalone clang driver because
output is created in Driver::BuildJobsForActionNoCache.

Example backtrace:
```
* thread #1, name = 'clangd', stop reason = hit program assert
  * frame #0: 0x00007ffff5c4eacf libc.so.6`raise + 271
    frame #1: 0x00007ffff5c21ea5 libc.so.6`abort + 295
    frame rust-lang#2: 0x00007ffff5c21d79 libc.so.6`__assert_fail_base.cold.0 + 15
    frame rust-lang#3: 0x00007ffff5c47426 libc.so.6`__assert_fail + 70
    frame rust-lang#4: 0x000055555dc0923c clangd`clang::driver::InputInfo::getFilename(this=0x00007fffffff9398) const at InputInfo.h:84:5
    frame rust-lang#5: 0x000055555dcd0d8d clangd`clang::driver::tools::SplitDebugName(JA=0x000055555f6c6a50, Args=0x000055555f6d0b80, Input=0x00007fffffff9678, Output=0x00007fffffff9398) at CommonArgs.cpp:1275:40
    frame rust-lang#6: 0x000055555dc955a5 clangd`clang::driver::tools::Clang::ConstructJob(this=0x000055555f6c69d0, C=0x000055555f6c64a0, JA=0x000055555f6c6a50, Output=0x00007fffffff9398, Inputs=0x00007fffffff9668, Args=0x000055555f6d0b80, LinkingOutput=0x0000000000000000) const at Clang.cpp:5690:33
    frame rust-lang#7: 0x000055555dbf6b54 clangd`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5618:10
    frame rust-lang#8: 0x000055555dbf4ef0 clangd`clang::driver::Driver::BuildJobsForAction(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5306:26
    frame rust-lang#9: 0x000055555dbeb590 clangd`clang::driver::Driver::BuildJobs(this=0x00007fffffffb5e0, C=0x000055555f6c64a0) const at Driver.cpp:4844:5
    frame rust-lang#10: 0x000055555dbe6b0f clangd`clang::driver::Driver::BuildCompilation(this=0x00007fffffffb5e0, ArgList=ArrayRef<const char *> @ 0x00007fffffffb268) at Driver.cpp:1496:3
    frame rust-lang#11: 0x000055555b0cc0d9 clangd`clang::createInvocation(ArgList=ArrayRef<const char *> @ 0x00007fffffffbb38, Opts=CreateInvocationOptions @ 0x00007fffffffbb90) at CreateInvocationFromCommandLine.cpp:53:52
    frame rust-lang#12: 0x000055555b378e7b clangd`clang::clangd::buildCompilerInvocation(Inputs=0x00007fffffffca58, D=0x00007fffffffc158, CC1Args=size=0) at Compiler.cpp:116:44
    frame rust-lang#13: 0x000055555895a6c8 clangd`clang::clangd::(anonymous namespace)::Checker::buildInvocation(this=0x00007fffffffc760, TFS=0x00007fffffffe570, Contents= Has Value=false ) at Check.cpp:212:9
    frame rust-lang#14: 0x0000555558959cec clangd`clang::clangd::check(File=(Data = "build/test.cpp", Length = 64), TFS=0x00007fffffffe570, Opts=0x00007fffffffe600) at Check.cpp:486:34
    frame rust-lang#15: 0x000055555892164a clangd`main(argc=4, argv=0x00007fffffffecd8) at ClangdMain.cpp:993:12
    frame rust-lang#16: 0x00007ffff5c3ad85 libc.so.6`__libc_start_main + 229
    frame rust-lang#17: 0x00005555585bbe9e clangd`_start + 46
```

Test Plan: ninja ClangDriverTests && tools/clang/unittests/Driver/ClangDriverTests

Differential Revision: https://reviews.llvm.org/D154602
nikic referenced this pull request in nikic/llvm-project Jul 25, 2023
BlockDecl should be invalidated because of its invalid ParmVarDecl.

Fixes #1 of llvm#64005

Differential Revision: https://reviews.llvm.org/D155984
nikic referenced this pull request in nikic/llvm-project Aug 9, 2023
TSan reports the following data race:

  Write of size 4 at 0x000109e0b160 by thread T2 (mutexes: write M0, write M1):
    #0 NativeFile::Close() File.cpp:329
    #1 ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:232
    rust-lang#2 Communication::Disconnect(lldb_private::Status*) Communication.cpp:61
    rust-lang#3 process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164
    rust-lang#4 Process::SetExitStatus(int, char const*) Process.cpp:1097
    rust-lang#5 process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387

  Previous read of size 4 at 0x000109e0b160 by main thread (mutexes: write M2):
    #0 NativeFile::IsValid() const File.h:393
    #1 ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121
    rust-lang#2 Communication::IsConnected() const Communication.cpp:79
    rust-lang#3 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256
    rust-lang#4 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...l) GDBRemoteCommunication.cpp:244
    rust-lang#5 process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(llvm::StringRef, StringExtractorGDBRemote&) GDBRemoteClientBase.cpp:246

The problem is that in WaitForPacketNoLock's run loop, it checks that
the connection is still connected. This races with the
ConnectionFileDescriptor disconnecting. Most (but not all) access to the
IOObject in ConnectionFileDescriptorPosix is already gated by the mutex.
This patch just protects IsConnected in the same way.

Differential revision: https://reviews.llvm.org/D157347
nikic referenced this pull request in nikic/llvm-project Aug 11, 2023
TSan reports the following race:

  Write of size 8 at 0x000107707ee8 by main thread:
    #0 lldb_private::ThreadedCommunication::StartReadThread(...) ThreadedCommunication.cpp:175
    #1 lldb_private::Process::SetSTDIOFileDescriptor(...) Process.cpp:4533
    rust-lang#2 lldb_private::Platform::DebugProcess(...) Platform.cpp:1121
    rust-lang#3 lldb_private::PlatformDarwin::DebugProcess(...) PlatformDarwin.cpp:711
    rust-lang#4 lldb_private::Target::Launch(...) Target.cpp:3235
    rust-lang#5 CommandObjectProcessLaunch::DoExecute(...) CommandObjectProcess.cpp:256
    rust-lang#6 lldb_private::CommandObjectParsed::Execute(...) CommandObject.cpp:751
    rust-lang#7 lldb_private::CommandInterpreter::HandleCommand(...) CommandInterpreter.cpp:2054

  Previous read of size 8 at 0x000107707ee8 by thread T5:
    #0 lldb_private::HostThread::IsJoinable(...) const HostThread.cpp:30
    #1 lldb_private::ThreadedCommunication::StopReadThread(...) ThreadedCommunication.cpp:192
    rust-lang#2 lldb_private::Process::ShouldBroadcastEvent(...) Process.cpp:3420
    rust-lang#3 lldb_private::Process::HandlePrivateEvent(...) Process.cpp:3728
    rust-lang#4 lldb_private::Process::RunPrivateStateThread(...) Process.cpp:3914
    rust-lang#5 std::__1::__function::__func<lldb_private::Process::StartPrivateStateThread(...) function.h:356
    rust-lang#6 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(...) HostNativeThreadBase.cpp:62
    rust-lang#7 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(...) HostThreadMacOSX.mm:18

The problem is the lack of synchronization between starting and stopping
the read thread. This patch fixes that by protecting those operations
with a mutex.

Differential revision: https://reviews.llvm.org/D157361
nikic referenced this pull request in nikic/llvm-project Aug 11, 2023
TSan reports the following data race:

  Write of size 4 at 0x000109e0b160 by thread T2 (...):
    #0 lldb_private::NativeFile::Close() File.cpp:329
    #1 lldb_private::ConnectionFileDescriptor::Disconnect(...) ConnectionFileDescriptorPosix.cpp:232
    rust-lang#2 lldb_private::Communication::Disconnect(...) Communication.cpp:61
    rust-lang#3 lldb_private::process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164
    rust-lang#4 lldb_private::Process::SetExitStatus(...) Process.cpp:1097
    rust-lang#5 lldb_private::process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387

  Previous read of size 4 at 0x000109e0b160 by main thread (...):
    #0 lldb_private::NativeFile::IsValid() const File.h:393
    #1 lldb_private::ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121
    rust-lang#2 lldb_private::Communication::IsConnected() const Communication.cpp:79
    rust-lang#3 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256
    rust-lang#4 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:244
    rust-lang#5 lldb_private::process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(...) GDBRemoteClientBase.cpp:246

I originally tried fixing the problem at the ConnectionFileDescriptor
level, but that operates on an IOObject which can have different thread
safety guarantees depending on its implementation.

For this particular issue, the problem is specific to NativeFile.
NativeFile can hold a file descriptor and/or a file stream. Throughout
its implementation, it checks if the descriptor or stream is valid and
do some operation on it if it is. While that works in a single threaded
environment, nothing prevents another thread from modifying the
descriptor or stream between the IsValid check and when it's actually
being used.

This patch prevents such issues by returning a ValueGuard RAII object.
As long as the object is in scope, the value is guaranteed by a lock.

Differential revision: https://reviews.llvm.org/D157347
nikic referenced this pull request in nikic/llvm-project Aug 17, 2023
Thread sanitizer reports the following data race:

```
WARNING: ThreadSanitizer: data race (pid=43201)
  Write of size 4 at 0x00010520c474 by thread T1 (mutexes: write M0, write M1):
    #0 lldb_private::PipePosix::CloseWriteFileDescriptor() PipePosix.cpp:242 (liblldb.18.0.0git.dylib:arm64+0x414700) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::PipePosix::Close() PipePosix.cpp:217 (liblldb.18.0.0git.dylib:arm64+0x4144e8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#2 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:239 (liblldb.18.0.0git.dylib:arm64+0x40a620) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#3 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#4 lldb_private::process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1167 (liblldb.18.0.0git.dylib:arm64+0x8ed984) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)

  Previous read of size 4 at 0x00010520c474 by main thread (mutexes: write M2, write M3):
    #0 lldb_private::PipePosix::CanWrite() const PipePosix.cpp:229 (liblldb.18.0.0git.dylib:arm64+0x4145e4) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:212 (liblldb.18.0.0git.dylib:arm64+0x40a4a8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#2 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#3 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:373 (liblldb.18.0.0git.dylib:arm64+0x8b9c48) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    rust-lang#4 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:243 (liblldb.18.0.0git.dylib:arm64+0x8b9904) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
```

Fix this by adding a mutex to PipePosix.

Differential Revision: https://reviews.llvm.org/D157654
nikic referenced this pull request in nikic/llvm-project Aug 25, 2023
ThreadSanitizer reports the following issue:

```
  Write of size 8 at 0x00010a70abb0 by thread T3 (mutexes: write M0):
    #0 lldb_private::ThreadList::Update(lldb_private::ThreadList&) ThreadList.cpp:741 (liblldb.18.0.0git.dylib:arm64+0x5dedf4) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    #1 lldb_private::Process::UpdateThreadListIfNeeded() Process.cpp:1212 (liblldb.18.0.0git.dylib:arm64+0x53bbec) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)

  Previous read of size 8 at 0x00010a70abb0 by main thread (mutexes: write M1):
    #0 lldb_private::ThreadList::GetMutex() const ThreadList.cpp:785 (liblldb.18.0.0git.dylib:arm64+0x5df138) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    #1 lldb_private::ThreadList::DidResume() ThreadList.cpp:656 (liblldb.18.0.0git.dylib:arm64+0x5de5c0) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    rust-lang#2 lldb_private::Process::PrivateResume() Process.cpp:3130 (liblldb.18.0.0git.dylib:arm64+0x53cd7c) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
```

Fix this by only using the mutex in ThreadList and removing the one in
process entirely.

Differential Revision: https://reviews.llvm.org/D158034
nikic referenced this pull request in nikic/llvm-project Aug 25, 2023
Replace `BPFMIPeepholeTruncElim` by adding an overload for
`TargetLowering::isZExtFree()` aware that zero extension is
free for `ISD::LOAD`.

Short description
=================

The `BPFMIPeepholeTruncElim` handles two patterns:

Pattern #1:

    %1 = LDB %0, ...              %1 = LDB %0, ...
    %2 = AND_ri %1, 0xff      ->  %2 = MOV_ri %1    <-- (!)

Pattern rust-lang#2:

    bb.1:                         bb.1:
      %a = LDB %0, ...              %a = LDB %0, ...
      br %bb3                       br %bb3
    bb.2:                         bb.2:
      %b = LDB %0, ...        ->    %b = LDB %0, ...
      br %bb3                       br %bb3
    bb.3:                         bb.3:
      %1 = PHI %a, %b               %1 = PHI %a, %b
      %2 = AND_ri %1, 0xff          %2 = MOV_ri %1  <-- (!)

Plus variations:
- AND_ri_32 instead of AND_ri
- SLL/SLR instead of AND_ri
- LDH, LDW, LDB32, LDH32, LDW32

Both patterns could be handled by built-in transformations at
instruction selection phase if suitable `isZExtFree()` implementation
is provided. The idea is borrowed from `ARMTargetLowering::isZExtFree`.

When evaluating on BPF kernel selftests and remove_truncate_*.ll LLVM
test cases this revisions performs slightly better than
BPFMIPeepholeTruncElim, see "Impact" section below for details.

Commit also adds a few test cases to make sure that patterns in
question are handled.

Long description
================

Why this works: Pattern #1
--------------------------

Consider the following example:

    define i1 @foo(ptr %p) {
    entry:
      %a = load i8, ptr %p, align 1
      %cond = icmp eq i8 %a, 0
      ret i1 %cond
    }

Log for `llc -mcpu=v2 -mtriple=bpfel -debug-only=isel` command:

    ...
    Type-legalized selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
              t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t16: i64,ch = load<(load (s8) from %ir.p), anyext from i8> t0, t2, undef:i64
          t19: i64 = and t16, Constant:i64<255>
        t17: i64 = setcc t19, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t17
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Replacing.1 t19: i64 = and t16, Constant:i64<255>
    With: t16: i64,ch = load<(load (s8) from %ir.p), anyext from i8> t0, t2, undef:i64
     and 0 other values
    ...
    Optimized type-legalized selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 11 nodes:
      t0: ch,glue = EntryToken
            t2: i64,ch = CopyFromReg t0, Register:i64 %0
          t20: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
        t17: i64 = setcc t20, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t17
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...

Note:
- Optimized type-legalized selection DAG:
  - `t19 = and t16, 255` had been replaced by `t16` (load).
  - Patterns like `(and (load ... i8), 255)` are replaced by `load`
    in `DAGCombiner::BackwardsPropagateMask` called from
    `DAGCombiner::visitAND`.
  - Similarly patterns like `(shl (srl ..., 56), 56)` are replaced by
    `(and ..., 255)` in `DAGCombiner::visitSRL` (this function is huge,
    look for `TLI.shouldFoldConstantShiftPairToMask()` call).

Why this works: Pattern rust-lang#2
--------------------------

Consider the following example:

    define i1 @foo(ptr %p) {
    entry:
      %a = load i8, ptr %p, align 1
      br label %next

    next:
      %cond = icmp eq i8 %a, 0
      ret i1 %cond
    }

Consider log for `llc -mcpu=v2 -mtriple=bpfel -debug-only=isel` command.
Log for first basic block:

    Initial selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 9 nodes:
      t0: ch,glue = EntryToken
      t3: i64 = Constant<0>
            t2: i64,ch = CopyFromReg t0, Register:i64 %1
          t5: i8,ch = load<(load (s8) from %ir.p)> t0, t2, undef:i64
        t6: i64 = zero_extend t5
      t8: ch = CopyToReg t0, Register:i64 %0, t6
    ...
    Replacing.1 t6: i64 = zero_extend t5
    With: t9: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
     and 0 other values
    ...
    Optimized lowered selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 7 nodes:
      t0: ch,glue = EntryToken
          t2: i64,ch = CopyFromReg t0, Register:i64 %1
        t9: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
      t8: ch = CopyToReg t0, Register:i64 %0, t9

Note:
- Initial selection DAG:
  - `%a = load ...` is lowered as `t6 = (zero_extend (load ...))`
    w/o special `isZExtFree()` overload added by this commit
    it is instead lowered as `t6 = (any_extend (load ...))`.
  - The decision to generate `zero_extend` or `any_extend` is
    done in `RegsForValue::getCopyToRegs` called from
    `SelectionDAGBuilder::CopyValueToVirtualRegister`:
    - if `isZExtFree()` for load returns true `zero_extend` is used;
    - `any_extend` is used otherwise.
- Optimized lowered selection DAG:
  - `t6 = (any_extend (load ...))` is replaced by
    `t9 = load ..., zext from i8`
    This is done by `DagCombiner.cpp:tryToFoldExtOfLoad()` called from
    `DAGCombiner::visitZERO_EXTEND`.

Log for second basic block:

    Initial selection DAG: %bb.1 'foo:next'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
                t2: i64,ch = CopyFromReg t0, Register:i64 %0
              t4: i64 = AssertZext t2, ValueType:ch:i8
            t5: i8 = truncate t4
          t8: i1 = setcc t5, Constant:i8<0>, seteq:ch
        t9: i64 = any_extend t8
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t9
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Replacing.2 t18: i64 = and t4, Constant:i64<255>
    With: t4: i64 = AssertZext t2, ValueType:ch:i8
    ...
    Type-legalized selection DAG: %bb.1 'foo:next'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
              t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t4: i64 = AssertZext t2, ValueType:ch:i8
          t18: i64 = and t4, Constant:i64<255>
        t16: i64 = setcc t18, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t16
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Optimized type-legalized selection DAG: %bb.1 'foo:next'
    SelectionDAG has 11 nodes:
      t0: ch,glue = EntryToken
            t2: i64,ch = CopyFromReg t0, Register:i64 %0
          t4: i64 = AssertZext t2, ValueType:ch:i8
        t16: i64 = setcc t4, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t16
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...

Note:
- Initial selection DAG:
  - `t0` is an input value for this basic block, it corresponds load
    instruction (`t9`) from the first basic block.
  - It is accessed within basic block via
    `t4` (AssertZext (CopyFromReg t0, ...)).
  - The `AssertZext` is generated by RegsForValue::getCopyFromRegs
    called from SelectionDAGBuilder::getCopyFromRegs, it is generated
    only when `LiveOutInfo` with known number of leading zeros is
    present for `t0`.
  - Known register bits in `LiveOutInfo` are computed by
    `SelectionDAG::computeKnownBits` called from
    `SelectionDAGISel::ComputeLiveOutVRegInfo`.
  - `computeKnownBits()` generates leading zeros information for
    `(load ..., zext from ...)` but *does not* generate leading zeros
    information for `(load ..., anyext from ...)`.
    This is why `isZExtFree()` added in this commit is important.
- Type-legalized selection DAG:
  - `t5 = truncate t4` is replaced by `t18 = and t4, 255`
- Optimized type-legalized selection DAG:
  - `t18 = and t4, 255` is replaced by `t4`, this is done by
    `DAGCombiner::SimplifyDemandedBits` called from
    `DAGCombiner::visitAND`, which simplifies patterns like
    `(and (assertzext ...))`

Impact
------

This change covers all remove_truncate_*.ll test cases:
- for -mcpu=v4 there are no changes in the generated code;
- for -mcpu=v2 code generated for remove_truncate_7 and
  remove_truncate_8 improved slightly, for other tests it is
  unchanged.

For remove_truncate_7:

    Before this revision                 After this revision
    --------------------                 -------------------
        r1 <<= 0x20                          r1 <<= 0x20
        r1 >>= 0x20                          r1 >>= 0x20
        if r1 == 0x0 goto +0x2 <LBB0_2>      if r1 == 0x0 goto +0x2 <LBB0_2>
        r1 = *(u32 *)(r2 + 0x0)              r0 = *(u32 *)(r2 + 0x0)
        goto +0x1 <LBB0_3>                   goto +0x1 <LBB0_3>
    <LBB0_2>:                            <LBB0_2>:
        r1 = *(u32 *)(r2 + 0x4)              r0 = *(u32 *)(r2 + 0x4)
    <LBB0_3>:                            <LBB0_3>:
        r0 = r1                              exit
        exit

For remove_truncate_8:

    Before this revision                 After this revision
    --------------------                 -------------------
        r2 = *(u32 *)(r1 + 0x0)              r2 = *(u32 *)(r1 + 0x0)
        r3 = r2                              r3 = r2
        r3 <<= 0x20                          r3 <<= 0x20
        r4 = r3                              r3 s>>= 0x20
        r4 s>>= 0x20
        if r4 s> 0x2 goto +0x5 <LBB0_3>      if r3 s> 0x2 goto +0x4 <LBB0_3>
        r4 = *(u32 *)(r1 + 0x4)              r3 = *(u32 *)(r1 + 0x4)
        r3 >>= 0x20
        if r3 >= r4 goto +0x2 <LBB0_3>       if r2 >= r3 goto +0x2 <LBB0_3>
        r2 += 0x2                            r2 += 0x2
        *(u32 *)(r1 + 0x0) = r2              *(u32 *)(r1 + 0x0) = r2
    <LBB0_3>:                            <LBB0_3>:
        r0 = 0x3                             r0 = 0x3
        exit                                 exit

For kernel BPF selftests statistics is as follows: (-mcpu=v4):
- For -mcpu=v4: 9 out of 655 object files have differences,
  in all cases total number of instructions marginally decreased
  (-27 instructions).
- For -mcpu=v2: 9 out of 655 object files have differences:
  - For 19 object files number of instruction decreased
    (-129 instruction in total): some redundant `rX &= 0xffff`
    and register to register assignments removed;
  - For 2 object files number of instructions increased +2
    instructions in each file.

Both -mcpu=v2 instruction increases could be reduced to the same
example:

    define void @foo(ptr %p) {
    entry:
      %a = load i32, ptr %p, align 4
      %b = sext i32 %a to i64
      %c = icmp ult i64 1, %b
      br i1 %c, label %next, label %end

    next:
      call void inttoptr (i64 62 to ptr)(i32 %a)
      br label %end

    end:
      ret void
    }

Note that this example uses value loaded to `%a` both as a sign
extended (`%b`) and as zero extended (`%a` passed as parameter).
Here is the difference in final assembly code:

    Before this revision          After this revision
    --------------------          -------------------
        r1 = *(u32 *)(r1 + 0)         r1 = *(u32 *)(r1 + 0)
        r1 <<= 32                     r1 <<= 32
        r1 s>>= 32                    r1 s>>= 32
        if r1 < 2 goto <LBB0_2>       if r1 < 2 goto <LBB0_2>
                                      r1 <<= 32
                                      r1 >>= 32
        call 62                       call 62
    <LBB0_2>:                     <LBB0_2>:
        exit                          exit

Before this commit `%a` is passed to call as a sign extended value,
after this commit `%a` is passed to call as a zero extended value,
both are correct as 32-bit sub-register is the same.

The difference comes from `DAGCombiner` operation on the initial DAG:

Initial selection DAG before this commit:

    t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
          t6: i64 = any_extend t5         <--------------------- (1)
        t8: ch = CopyToReg t0, Register:i64 %0, t6
            t9: i64 = sign_extend t5
          t12: i1 = setcc Constant:i64<1>, t9, setult:ch

Initial selection DAG after this commit:

    t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
          t6: i64 = zero_extend t5        <--------------------- (2)
        t8: ch = CopyToReg t0, Register:i64 %0, t6
            t9: i64 = sign_extend t5
          t12: i1 = setcc Constant:i64<1>, t9, setult:ch

The node `t9` is processed before node `t6` and `load` instruction is
combined to load with sign extension:

    Replacing.1 t9: i64 = sign_extend t5
    With: t30: i64,ch = load<(load (s32) from %ir.p), sext from i32> t0, t2, undef:i64
     and 0 other values
    Replacing.1 t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
    With: t31: i32 = truncate t30
     and 1 other values

This is done by `DAGCombiner.cpp:tryToFoldExtOfLoad` called from
`DAGCombiner::visitSIGN_EXTEND`. Note that `t5` is used by `t6` which
is `any_extend` in (1) and `zero_extend` in (2).
`tryToFoldExtOfLoad()` rewrites such uses of `t5` differently:
- `any_extend` is simply removed
- `zero_extend` is replaced by `and t30, 0xffffffff`, which is later
  converted to a pair of shifts. This pair of shifts survives till the
  end of translation.

Differential Revision: https://reviews.llvm.org/D157870
nikic referenced this pull request in nikic/llvm-project Aug 29, 2023
This reverts commit 0e63f1a.

clang-format started to crash with contents like:
a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 rust-lang#2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 rust-lang#3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 rust-lang#4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 rust-lang#5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 rust-lang#6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 rust-lang#7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 rust-lang#8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 rust-lang#9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
nikic referenced this pull request in nikic/llvm-project Sep 14, 2023
…ttempting to dereferencing iterators.

Runnign some tests with asan built of LLD would throw errors similar to the following:

AddressSanitizer:DEADLYSIGNAL
    #0 0x55d8e6da5df7 in operator() /mnt/ssd/repo/lld/llvm-project/lld/MachO/Arch/ARM64.cpp:612
    #1 0x55d8e6daa514 in operator() /mnt/ssd/repo/lld/llvm-project/lld/MachO/Arch/ARM64.cpp:650

Differential Revision: https://reviews.llvm.org/D157027
nikic referenced this pull request in nikic/llvm-project Sep 26, 2023
Summary:
Thread sanitizer reports the following data race:

```
  Write of size 8 at 0x000103303e70 by thread T1 (mutexes: write M0):
    #0 RNBRemote::CommDataReceived(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) RNBRemote.cpp:1075 (debugserver:arm64+0x100038db8) (BuildId: f130b34f693c4f3eba96139104af2b7132000000200000000100000000000e00)
    #1 RNBRemote::ThreadFunctionReadRemoteData(void*) RNBRemote.cpp:1180 (debugserver:arm64+0x1000391dc) (BuildId: f130b34f693c4f3eba96139104af2b7132000000200000000100000000000e00)

  Previous read of size 8 at 0x000103303e70 by main thread:
    #0 RNBRemote::GetPacketPayload(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) RNBRemote.cpp:797 (debugserver:arm64+0x100037c5c) (BuildId: f130b34f693c4f3eba96139104af2b7132000000200000000100000000000e00)
    #1 RNBRemote::GetPacket(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, RNBRemote::Packet&, bool) RNBRemote.cpp:907 (debugserver:arm64+0x1000378cc) (BuildId: f130b34f693c4f3eba96139104af2b7132000000200000000100000000000e00)
```

RNBRemote already has a mutex, extend its usage to protect the read of
m_rx_packets.

Reviewers: jdevlieghere, bulbazord, jingham

Subscribers:
nikic referenced this pull request in nikic/llvm-project Oct 2, 2023
…fine.parallel verifier

This patch updates AffineParallelOp::verify() to check each result type matches
its corresponding reduction op (i.e, the result type must be a `FloatType` if
the reduction attribute is `addf`)

affine.parallel will crash on --lower-affine if the corresponding result type
cannot match the reduction attribute.

```
      %128 = affine.parallel (%arg2, %arg3) = (0, 0) to (8, 7) reduce ("maxf") -> (memref<8x7xf32>) {
        %alloc_33 = memref.alloc() : memref<8x7xf32>
        affine.yield %alloc_33 : memref<8x7xf32>
      }
```
This will crash and report a type conversion issue when we run `mlir-opt --lower-affine`

```
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: mlir-opt --lower-affine temp.mlir
 #0 0x0000000102a18f18 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/workspacebin/mlir-opt+0x1002f8f18)
 #1 0x0000000102a171b4 llvm::sys::RunSignalHandlers() (/workspacebin/mlir-opt+0x1002f71b4)
 rust-lang#2 0x0000000102a195c4 SignalHandler(int) (/workspacebin/mlir-opt+0x1002f95c4)
 rust-lang#3 0x00000001be7894c4 (/usr/lib/system/libsystem_platform.dylib+0x1803414c4)
 rust-lang#4 0x00000001be771ee0 (/usr/lib/system/libsystem_pthread.dylib+0x180329ee0)
 rust-lang#5 0x00000001be6ac340 (/usr/lib/system/libsystem_c.dylib+0x180264340)
 rust-lang#6 0x00000001be6ab754 (/usr/lib/system/libsystem_c.dylib+0x180263754)
 rust-lang#7 0x0000000106864790 mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (.cold.4) (/workspacebin/mlir-opt+0x104144790)
 rust-lang#8 0x0000000102ba66ac mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x1004866ac)
 rust-lang#9 0x0000000102ba6910 mlir::arith::getIdentityValue(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x100486910)
...
```

Fixes llvm#64068

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157985
nikic referenced this pull request in nikic/llvm-project Oct 5, 2023
This reverts commit a1e81d2.

Revert "Fix test hip-offload-compress-zlib.hip"

This reverts commit ba01ce6.

Revert due to sanity fail at

https://lab.llvm.org/buildbot/#/builders/5/builds/37188

https://lab.llvm.org/buildbot/#/builders/238/builds/5955

/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25: runtime error: load of misaligned address 0xaaaae2d90e7c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment
0xaaaae2d90e7c: note: pointer points here
  bc 00 00 00 94 dc 29 9a  89 fb ca 2b 78 9c 8b 8f  77 f6 71 f4 73 8f f7 77  73 f3 f1 77 74 89 77 0a
              ^
    #0 0xaaaaba125f70 in clang::CompressedOffloadBundle::decompress(llvm::MemoryBuffer const&, bool) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1012:25
    #1 0xaaaaba126150 in clang::OffloadBundler::ListBundleIDsInFile(llvm::StringRef, clang::OffloadBundlerConfig const&) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/clang/lib/Driver/OffloadBundler.cpp:1089:7

Will reland after fixing it.
alexcrichton pushed a commit to alexcrichton/llvm-project that referenced this pull request Mar 18, 2024
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc
on failure. And then the wrapped malloc appears to access thread local
storage using global dynamic accesses, thus calling
___interceptor___tls_get_addr, before REAL(__tls_get_addr) has
been set, so we get a crash inside ___interceptor___tls_get_addr. For
example, this can happen when looking up __isoc23_scanf which might not
exist in some libcs.

Fix this by marking the thread local variable accessed inside the
debug checks as "initial-exec", which does not require __tls_get_addr.

This is probably a better alternative to llvm#83886.

This fixes a different crash but is related to llvm#46204.

Backtrace:
```
#0 0x0000000000000000 in ?? ()
rust-lang#1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759
rust-lang#2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218
rust-lang#3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129
rust-lang#4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167
rust-lang#5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
rust-lang#6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207
rust-lang#7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103
rust-lang#8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39
rust-lang#9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69
rust-lang#10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198
rust-lang#11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223
rust-lang#12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666
rust-lang#13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56
rust-lang#14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so",
 fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157
rust-lang#15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793
--Type <RET> for more, q to quit, c to continue without paging--
rust-lang#16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146
rust-lang#17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195
rust-lang#18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40
rust-lang#19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237
rust-lang#20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256
rust-lang#21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138
rust-lang#22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54
rust-lang#23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68
rust-lang#24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42
rust-lang#25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448)
 at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61
rust-lang#26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315
```

Reviewed By: vitalybuka, MaskRay

Pull Request: llvm#83890
alexcrichton pushed a commit to alexcrichton/llvm-project that referenced this pull request Mar 18, 2024
Modifies the privatization logic so that the emitted code only used the
HLFIR base (i.e. SSA value `#0` returned from `hlfir.declare`). Before
that, that emitted privatization logic was a mix of using `#0` and `rust-lang#1`
which leads to some difficulties trying to move to delayed privatization
(see the discussion on llvm#84033).
alexcrichton pushed a commit to alexcrichton/llvm-project that referenced this pull request Mar 18, 2024
…p canonicalization (llvm#84225)

The current canonicalization of `memref.dim` operating on the result of
`memref.reshape` into `memref.load` is incorrect as it doesn't check
whether the `index` operand of `memref.dim` dominates the source
`memref.reshape` op. It always introduces `memref.load` right after
`memref.reshape` to ensure the `memref` is not mutated before the
`memref.load` call. As a result, the following error is observed:

```
$> mlir-opt --canonicalize input.mlir

func.func @reshape_dim(%arg0: memref<*xf32>, %arg1: memref<?xindex>, %arg2: index) -> index {
    %c4 = arith.constant 4 : index
    %reshape = memref.reshape %arg0(%arg1) : (memref<*xf32>, memref<?xindex>) -> memref<*xf32>
    %0 = arith.muli %arg2, %c4 : index
    %dim = memref.dim %reshape, %0 : memref<*xf32>
    return %dim : index
  }
```

results in:

```
dominator.mlir:22:12: error: operand rust-lang#1 does not dominate this use
    %dim = memref.dim %reshape, %0 : memref<*xf32>
           ^
dominator.mlir:22:12: note: see current operation: %1 = "memref.load"(%arg1, %2) <{nontemporal = false}> : (memref<?xindex>, index) -> index
dominator.mlir:21:10: note: operand defined here (op in the same block)
    %0 = arith.muli %arg2, %c4 : index
```

Properly fixing this issue requires a dominator analysis which is
expensive to run within a canonicalization pattern. So, this patch fixes
the canonicalization pattern by being more strict/conservative about the
legality condition in which we perform this canonicalization.
The more general pattern is also added to `tensor.dim`. Since tensors are
immutable we don't need to worry about where to introduce the
`tensor.extract` call after canonicalization.
alexcrichton pushed a commit to alexcrichton/llvm-project that referenced this pull request Mar 18, 2024
…lvm#85653)

This reverts commit daebe5c.

This commit causes the following asan issue:

```
<snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir | <snip>/llvm-project/build/bin/FileCheck <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# executed command: <snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# .---command stderr------------
# | =================================================================
# | ==2772558==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fd2c2c42b90 at pc 0x55e406d54614 bp 0x7ffc810e4070 sp 0x7ffc810e4068
# | READ of size 8 at 0x7fd2c2c42b90 thread T0
# |     #0 0x55e406d54613 in operator()<long int const*> /usr/include/c++/13/bits/predefined_ops.h:318
# |     rust-lang#1 0x55e406d54613 in __count_if<long int const*, __gnu_cxx::__ops::_Iter_pred<mlir::verifyListOfOperandsOrIntegers(Operation*, llvm::StringRef, unsigned int, llvm::ArrayRef<long int>, ValueRange)::<lambda(int64_t)> > > /usr/include/c++/13/bits/stl_algobase.h:2125
# |     rust-lang#2 0x55e406d54613 in count_if<long int const*, mlir::verifyListOfOperandsOrIntegers(Operation*, 
...
```
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Jul 17, 2024
This patch adds a frame recognizer for Clang's
`__builtin_verbose_trap`, which behaves like a
`__builtin_trap`, but emits a failure-reason string into debug-info in
order for debuggers to display
it to a user.

The frame recognizer triggers when we encounter
a frame with a function name that begins with
`__clang_trap_msg`, which is the magic prefix
Clang emits into debug-info for verbose traps.
Once such frame is encountered we display the
frame function name as the `Stop Reason` and display that frame to the
user.

Example output:
```
(lldb) run
warning: a.out was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 35942 launched: 'a.out' (arm64)
Process 35942 stopped
* thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented
    frame rust-lang#1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt]
   1    struct Dummy {
   2      void func() {
-> 3        __builtin_verbose_trap("Misc.", "Function is not implemented");
   4      }
   5    };
   6
   7    int main() {
(lldb) bt
* thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented
    frame #0: 0x0000000100003fa4 a.out`main [inlined] __clang_trap_msg$Misc.$Function is not implemented$ at verbose_trap.cpp:0 [opt]
  * frame rust-lang#1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt]
    frame rust-lang#2: 0x0000000100003fa4 a.out`main at verbose_trap.cpp:8:13 [opt]
    frame rust-lang#3: 0x0000000189d518b4 dyld`start + 1988
```
DianQK pushed a commit that referenced this pull request Aug 6, 2024
```
  UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).

(cherry picked from commit 3368a32)
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Aug 11, 2024
…linux (llvm#99613)

Examples of the output:

ARM:
```
# ./a.out 
AddressSanitizer:DEADLYSIGNAL
=================================================================
==122==ERROR: AddressSanitizer: SEGV on unknown address 0x0000007a (pc 0x76e13ac0 bp 0x7eb7fd00 sp 0x7eb7fcc8 T0)
==122==The signal is caused by a READ memory access.
==122==Hint: address points to the zero page.
    #0 0x76e13ac0  (/lib/libc.so.6+0x7cac0)
    rust-lang#1 0x76dce680 in gsignal (/lib/libc.so.6+0x37680)
    rust-lang#2 0x005c2250  (/root/a.out+0x145250)
    rust-lang#3 0x76db982c  (/lib/libc.so.6+0x2282c)
    rust-lang#4 0x76db9918 in __libc_start_main (/lib/libc.so.6+0x22918)

==122==Register values:
 r0 = 0x00000000   r1 = 0x0000007a   r2 = 0x0000000b   r3 = 0x76d95020  
 r4 = 0x0000007a   r5 = 0x00000001   r6 = 0x005dcc5c   r7 = 0x0000010c  
 r8 = 0x0000000b   r9 = 0x76f9ece0  r10 = 0x00000000  r11 = 0x7eb7fd00  
r12 = 0x76dce670   sp = 0x7eb7fcc8   lr = 0x76e13ab4   pc = 0x76e13ac0  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/libc.so.6+0x7cac0) 
==122==ABORTING
```

AArch64:
```
# ./a.out 
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==99==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000063 (pc 0x007fbbbc5860 bp 0x007fcfdcb700 sp 0x007fcfdcb700 T99)
==99==The signal is caused by a UNKNOWN memory access.
==99==Hint: address points to the zero page.
    #0 0x007fbbbc5860  (/lib64/libc.so.6+0x82860)
    rust-lang#1 0x007fbbb81578  (/lib64/libc.so.6+0x3e578)
    rust-lang#2 0x00556051152c  (/root/a.out+0x3152c)
    rust-lang#3 0x007fbbb6e268  (/lib64/libc.so.6+0x2b268)
    rust-lang#4 0x007fbbb6e344  (/lib64/libc.so.6+0x2b344)
    rust-lang#5 0x0055604e45ec  (/root/a.out+0x45ec)

==99==Register values:
 x0 = 0x0000000000000000   x1 = 0x0000000000000063   x2 = 0x000000000000000b   x3 = 0x0000007fbbb41440  
 x4 = 0x0000007fbbb41580   x5 = 0x3669288942d44cce   x6 = 0x0000000000000000   x7 = 0x00000055605110b0  
 x8 = 0x0000000000000083   x9 = 0x0000000000000000  x10 = 0x0000000000000000  x11 = 0x0000000000000000  
x12 = 0x0000007fbbdb3360  x13 = 0x0000000000010000  x14 = 0x0000000000000039  x15 = 0x00000000004113a0  
x16 = 0x0000007fbbb81560  x17 = 0x0000005560540138  x18 = 0x000000006474e552  x19 = 0x0000000000000063  
x20 = 0x0000000000000001  x21 = 0x000000000000000b  x22 = 0x0000005560511510  x23 = 0x0000007fcfdcb918  
x24 = 0x0000007fbbdb1b50  x25 = 0x0000000000000000  x26 = 0x0000007fbbdb2000  x27 = 0x000000556053f858  
x28 = 0x0000000000000000   fp = 0x0000007fcfdcb700   lr = 0x0000007fbbbc584c   sp = 0x0000007fcfdcb700  
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV (/lib64/libc.so.6+0x82860) 
==99==ABORTING
```
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Aug 11, 2024
```
  UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    rust-lang#1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    rust-lang#2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Aug 21, 2024
…lvm#104148)

`hasOperands` does not always execute matchers in the order they are
written. This can cause issue in code using bindings when one operand
matcher is relying on a binding set by the other. With this change, the
first matcher present in the code is always executed first and any
binding it sets are available to the second matcher.

Simple example with current version (1 match) and new version (2
matches):
```bash
> cat tmp.cpp
int a = 13;
int b = ((int) a) - a;
int c = a - ((int) a);

> clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match rust-lang#1:

tmp.cpp:1:1: note: "d" binds here
int a = 13;
^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
int b = ((int)a) - a;
        ^~~~~~~~~~~~
1 match.

> ./build/bin/clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match rust-lang#1:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
    2 | int b = ((int)a) - a;
      |         ^~~~~~~~~~~~

Match rust-lang#2:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:3:9: note: "root" binds here
    3 | int c = a - ((int)a);
      |         ^~~~~~~~~~~~
2 matches.
```

If this should be documented or regression tested anywhere please let me
know where.
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Aug 21, 2024
…104523)

Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.

This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.

My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.

rdar://126629381


Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).

before:
```
(lldb) thread backtrace --filtered=false
* thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame rust-lang#1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame rust-lang#2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame rust-lang#3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame rust-lang#4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame rust-lang#5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame rust-lang#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame rust-lang#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame rust-lang#8: 0x0000000183cdf154 dyld`start + 2476
(lldb) 
```

after

```
(lldb) bt
* thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame rust-lang#1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame rust-lang#2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame rust-lang#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame rust-lang#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame rust-lang#8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
Patryk27 pushed a commit to Patryk27/rust-llvm-project that referenced this pull request Dec 10, 2024
…abort (llvm#117603)

Hey guys, I found that Flang's built-in ABORT function is incomplete
when I was using it. Compared with gfortran's ABORT (which can both
abort and print out a backtrace), flang's ABORT implementation lacks the
function of printing out a backtrace. This feature is essential for
debugging and understanding the call stack at the failure point.

To solve this problem, I completed the "// TODO:" of the abort function,
and then implemented an additional built-in function BACKTRACE for
flang. After a brief reading of the relevant source code, I used
backtrace and backtrace_symbols in "execinfo.h" to quickly implement
this. But since I used the above two functions directly, my
implementation is slightly different from gfortran's implementation (in
the output, the function call stack before main is additionally output,
and the function line number is missing). In addition, since I used the
above two functions, I did not need to add -g to embed debug information
into the ELF file, but needed -rdynamic to ensure that the symbols are
added to the dynamic symbol table (so that the function name will be
printed out).

Here is a comparison of the output between gfortran 's backtrace and my
implementation:
gfortran's implemention output:
```
#0  0x557eb71f4184 in testfun2_
        at /home/hunter/plct/fortran/test.f90:5
rust-lang#1  0x557eb71f4165 in testfun1_
        at /home/hunter/plct/fortran/test.f90:13
rust-lang#2  0x557eb71f4192 in test_backtrace
        at /home/hunter/plct/fortran/test.f90:17
rust-lang#3  0x557eb71f41ce in main
        at /home/hunter/plct/fortran/test.f90:18
```
my impelmention output:
```
Backtrace:
#0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92]
rust-lang#1 ./test(testfun2_+0x14) [0x574f07efc7b4]
rust-lang#2 ./test(testfun1_+0xd) [0x574f07efc7cd]
rust-lang#3 ./test(_QQmain+0x9) [0x574f07efc7e9]
rust-lang#4 ./test(main+0x12) [0x574f07efc802]
rust-lang#5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08]
rust-lang#6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc]
rust-lang#7 ./test(_start+0x25) [0x574f07efc6c5]
```
test program is:
```
function testfun2() result(err)
  implicit none
  integer :: err
  err = 1
  call backtrace
end function testfun2

subroutine testfun1()
  implicit none
  integer :: err
  integer :: testfun2

  err = testfun2()
end subroutine testfun1

program test_backtrace
  call testfun1()
end program test_backtrace
```
I am well aware of the importance of line numbers, so I am now working
on implementing line numbers (by parsing DWARF information) and
supporting cross-platform (Windows) support.
Patryk27 pushed a commit to Patryk27/rust-llvm-project that referenced this pull request Dec 10, 2024
…ne symbol size as symbols are created (llvm#117079)"

This reverts commit ba668eb.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame rust-lang#1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame rust-lang#2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame rust-lang#1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame rust-lang#2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame rust-lang#3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame rust-lang#4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame rust-lang#5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
Patryk27 pushed a commit to Patryk27/rust-llvm-project that referenced this pull request Dec 10, 2024
## Description

This PR fixes a segmentation fault that occurs when passing options
requiring arguments via `-Xopenmp-target=<triple>`. The issue was that
the function `Driver::getOffloadArchs` did not properly parse the
extracted option, but instead assumed it was valid, leading to a crash
when incomplete arguments were provided.

## Backtrace

```sh
llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o
1.      Compilation construction
2.      Building compilation actions
 #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b)
 rust-lang#1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0
 rust-lang#2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 rust-lang#3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70)
 rust-lang#4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d)
 rust-lang#5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0
 rust-lang#6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093)
 rust-lang#7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d)
 rust-lang#8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c)
 rust-lang#9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6)
rust-lang#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
rust-lang#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e)
[1]    2628042 segmentation fault (core dumped)   main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu  -o
```
Patryk27 pushed a commit to Patryk27/rust-llvm-project that referenced this pull request Dec 10, 2024
llvm#118923)

…d reentry.

These utilities provide new, more generic and easier to use support for
lazy compilation in ORC.

LazyReexportsManager is an alternative to LazyCallThroughManager. It
takes requests for lazy re-entry points in the form of an alias map:
lazy-reexports = {
  ( <entry point symbol rust-lang#1>, <implementation symbol rust-lang#1> ),
  ( <entry point symbol rust-lang#2>, <implementation symbol rust-lang#2> ),
  ...
  ( <entry point symbol #n>, <implementation symbol #n> )
}

LazyReexportsManager then:
1. binds the entry points to the implementation names in an internal
table.
2. creates a JIT re-entry trampoline for each entry point.
3. creates a redirectable symbol for each of the entry point name and
binds redirectable symbol to the corresponding reentry trampoline.

When an entry point symbol is first called at runtime (which may be on
any thread of the JIT'd program) it will re-enter the JIT via the
trampoline and trigger a lookup for the implementation symbol stored in
LazyReexportsManager's internal table. When the lookup completes the
entry point symbol will be updated (via the RedirectableSymbolManager)
to point at the implementation symbol, and execution will proceed to the
implementation symbol.

Actual construction of the re-entry trampolines and redirectable symbols
is delegated to an EmitTrampolines functor and the
RedirectableSymbolsManager respectively.

JITLinkReentryTrampolines.h provides a JITLink-based implementation of
the EmitTrampolines functor. (AArch64 only in this patch, but other
architectures will be added in the near future).

Register state save and reentry functionality is added to the ORC
runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation
functions (the latter is generic, the former will need custom
implementations for each ABI and architecture to be supported, however
this should be much less effort than the existing OrcABISupport
approach, since the ORC runtime allows this code to be written as native
assembly).

The resulting system:
1. Works equally well for in-process and out-of-process JIT'd code.
2. Requires less boilerplate to set up.

Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC
runtime), setup is just:

```c++
auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL);
if (!RSMgr)
  return RSMgr.takeError();

auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD);
if (!LRMgr)
  return LRMgr.takeError();
```

after which lazy reexports can be introduced with:

```c++
JD.define(lazyReexports(LRMgr, <alias map>));
```

LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR
level CompileOnDemandLayer will continue to use LazyCallThroughManager
and OrcABISupport until the new system supports a wider range of
architectures and ABIs.

The llvm-jitlink utility's -lazy option now uses the new scheme. Since
it depends on the ORC runtime, the lazy-link.ll testcase and associated
helpers are moved to the ORC runtime.
Patryk27 pushed a commit to Patryk27/rust-llvm-project that referenced this pull request Dec 22, 2024
According to the documentation described at
https://github.com/loongson/la-abi-specs/blob/release/ladwarf.adoc, the
dwarf numbers for floating-point registers range from 32 to 63.

An incorrect dwarf number will prevent the register values from being
properly restored during unwinding.

This test reflects this problem:

```
loongson@linux:~$ cat test.c

void foo() {
  asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
}
int main() {
  asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
  foo();
  return 0;
}

loongson@linux:~$ clang -g test.c  -o test

```
Without this patch:
```
loongson@linux:~$ ./_build/bin/lldb ./t
(lldb) target create "./t"
Current executable set to
'/home/loongson/llvm-project/_build_lldb/t' (loongarch64).
(lldb) b foo
Breakpoint 1: where = t`foo + 20 at test.c:4:1, address =
0x0000000000000714
(lldb) r
Process 2455626 launched: '/home/loongson/llvm-project/_build_lldb/t' (loongarch64)
Process 2455626 stopped
* thread rust-lang#1, name = 't', stop reason = breakpoint 1.1
    frame #0: 0x0000555555554714 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) si
Process 2455626 stopped
* thread rust-lang#1, name = 't', stop reason = instruction step into
    frame #0: 0x0000555555554718 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) f 1
frame rust-lang#1: 0x0000555555554768 t`main at test.c:8:1
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
-> 8    foo();
   9    return 0;
   10   }
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f13 = 0x00007ffffffef780 !!!!! wrong register
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x0000555555554768  t`main + 40 at test.c:8:1
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
32 registers were unavailable.
```
With this patch:
```
The previous operations are the same.
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x00007ffffffef780
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
33 registers were unavailable.
```

Reviewed By: SixWeining

Pull Request: llvm#120391
wesleywiser pushed a commit to wesleywiser/llvm-project that referenced this pull request Jan 22, 2025
Fix for the Coverity hit with CID1579964 in VPlan.cpp.

Coverity message with some context follows.

[Cov] var_compare_op: Comparing TermBr to null implies that TermBr might
be null.
434    } else if (TermBr && !TermBr->isConditional()) {
435      TermBr->setSuccessor(0, NewBB);
436    } else {
437 // Set each forward successor here when it is created, excluding
438 // backedges. A backward successor is set when the branch is
created.
439      unsigned idx = PredVPSuccessors.front() == this ? 0 : 1;
     	
[Cov] CID 1579964: (rust-lang#1 of 1): Dereference after null check
(FORWARD_NULL)
[Cov] var_deref_model: Passing null pointer TermBr to getSuccessor,
which dereferences it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants