Skip to content

Conversation

@jimingham
Copy link
Collaborator

This is an odd corner case of the use of scripts loaded from dSYM's - a macOS only feature, which can load OS Plugins that re-present the thread state of the program we attach to. If we find out about and load the dSYM scripts when we discover a target in the course of attaching to it, we can end up running the OS plugin before we've started up the private state thread. However, the os_plugin in that case will be running before we broadcast the stop event to the public event listener. So it should formally use the private state and not the public state for the Python code environment.

This patch says that if we have not yet started up the private state thread, then any thread that is servicing events is doing so on behalf of the private state machinery, and should see the private state, not the public state.

Most of the patch is getting a test that will actually reproduce the error. Only the test test_python_os_plugin_remote actually reproduced the error. In test_python_os_plugin we actually do start up the private state thread before handling the event. test_python_os_plugin is there for completeness sake.

callbacks see the correct process state.
@jimingham jimingham requested a review from labath July 1, 2025 00:34
@jimingham jimingham requested a review from JDevlieghere as a code owner July 1, 2025 00:34
@llvmbot llvmbot added the lldb label Jul 1, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 1, 2025

@llvm/pr-subscribers-lldb

Author: None (jimingham)

Changes

This is an odd corner case of the use of scripts loaded from dSYM's - a macOS only feature, which can load OS Plugins that re-present the thread state of the program we attach to. If we find out about and load the dSYM scripts when we discover a target in the course of attaching to it, we can end up running the OS plugin before we've started up the private state thread. However, the os_plugin in that case will be running before we broadcast the stop event to the public event listener. So it should formally use the private state and not the public state for the Python code environment.

This patch says that if we have not yet started up the private state thread, then any thread that is servicing events is doing so on behalf of the private state machinery, and should see the private state, not the public state.

Most of the patch is getting a test that will actually reproduce the error. Only the test test_python_os_plugin_remote actually reproduced the error. In test_python_os_plugin we actually do start up the private state thread before handling the event. test_python_os_plugin is there for completeness sake.


Full diff: https://github.com/llvm/llvm-project/pull/146441.diff

9 Files Affected:

  • (modified) lldb/include/lldb/Host/HostThread.h (+6)
  • (modified) lldb/include/lldb/Target/Process.h (+2)
  • (modified) lldb/source/Target/Process.cpp (+19-10)
  • (modified) lldb/source/Target/StackFrameList.cpp (+1-1)
  • (modified) lldb/test/API/functionalities/plugins/python_os_plugin/operating_system.py (+4)
  • (added) lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/Makefile (+4)
  • (added) lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/TestOSIndSYM.py (+145)
  • (added) lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/main.c (+10)
  • (added) lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/operating_system.py (+78)
diff --git a/lldb/include/lldb/Host/HostThread.h b/lldb/include/lldb/Host/HostThread.h
index d3477e115e2d8..d8b3610210aac 100644
--- a/lldb/include/lldb/Host/HostThread.h
+++ b/lldb/include/lldb/Host/HostThread.h
@@ -10,6 +10,7 @@
 #define LLDB_HOST_HOSTTHREAD_H
 
 #include "lldb/Host/HostNativeThreadForward.h"
+#include "lldb/Host/HostNativeThreadBase.h"
 #include "lldb/Utility/Status.h"
 #include "lldb/lldb-types.h"
 
@@ -42,6 +43,11 @@ class HostThread {
   lldb::thread_result_t GetResult() const;
 
   bool EqualsThread(lldb::thread_t thread) const;
+  
+  bool HasThread() const { 
+    if (!m_native_thread)
+      return false;
+    return m_native_thread->GetSystemHandle() != LLDB_INVALID_HOST_THREAD ; }
 
 private:
   std::shared_ptr<HostNativeThreadBase> m_native_thread;
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index a8892e9c43225..1329abba9b17b 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -2546,6 +2546,8 @@ void PruneThreadPlans();
   ProcessRunLock &GetRunLock();
 
   bool CurrentThreadIsPrivateStateThread();
+  
+  bool CurrentThreadPosesAsPrivateStateThread();
 
   virtual Status SendEventData(const char *data) {
     return Status::FromErrorString(
diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp
index bba1230c79920..9cdef7ef524fc 100644
--- a/lldb/source/Target/Process.cpp
+++ b/lldb/source/Target/Process.cpp
@@ -1271,7 +1271,7 @@ uint32_t Process::AssignIndexIDToThread(uint64_t thread_id) {
 }
 
 StateType Process::GetState() {
-  if (CurrentThreadIsPrivateStateThread())
+  if (CurrentThreadPosesAsPrivateStateThread())
     return m_private_state.GetValue();
   else
     return m_public_state.GetValue();
@@ -3144,16 +3144,17 @@ void Process::CompleteAttach() {
     }
   }
 
-  if (!m_os_up) {
-    LoadOperatingSystemPlugin(false);
-    if (m_os_up) {
-      // Somebody might have gotten threads before now, but we need to force the
-      // update after we've loaded the OperatingSystem plugin or it won't get a
-      // chance to process the threads.
-      m_thread_list.Clear();
-      UpdateThreadListIfNeeded();
-    }
+  if (!m_os_up) 
+  LoadOperatingSystemPlugin(false);
+
+  if (m_os_up) {
+    // Somebody might have gotten threads before now, but we need to force the
+    // update after we've loaded the OperatingSystem plugin or it won't get a
+    // chance to process the threads.  
+    m_thread_list.Clear();
+    UpdateThreadListIfNeeded();
   }
+
   // Figure out which one is the executable, and set that in our target:
   ModuleSP new_executable_module_sp;
   for (ModuleSP module_sp : GetTarget().GetImages().Modules()) {
@@ -5856,6 +5857,14 @@ bool Process::CurrentThreadIsPrivateStateThread()
   return m_private_state_thread.EqualsThread(Host::GetCurrentThread());
 }
 
+bool Process::CurrentThreadPosesAsPrivateStateThread()
+{
+  // If we haven't started up the private state thread yet, then whatever thread
+  // is fetching this event should be temporarily the private state thread.
+  if (!m_private_state_thread.HasThread())
+    return true;
+  return m_private_state_thread.EqualsThread(Host::GetCurrentThread());
+}
 
 void Process::Flush() {
   m_thread_list.Flush();
diff --git a/lldb/source/Target/StackFrameList.cpp b/lldb/source/Target/StackFrameList.cpp
index 9c6208e9e0a65..16cd2548c2784 100644
--- a/lldb/source/Target/StackFrameList.cpp
+++ b/lldb/source/Target/StackFrameList.cpp
@@ -723,7 +723,7 @@ void StackFrameList::SelectMostRelevantFrame() {
   // Don't call into the frame recognizers on the private state thread as
   // they can cause code to run in the target, and that can cause deadlocks
   // when fetching stop events for the expression.
-  if (m_thread.GetProcess()->CurrentThreadIsPrivateStateThread())
+  if (m_thread.GetProcess()->CurrentThreadPosesAsPrivateStateThread())
     return;
 
   Log *log = GetLog(LLDBLog::Thread);
diff --git a/lldb/test/API/functionalities/plugins/python_os_plugin/operating_system.py b/lldb/test/API/functionalities/plugins/python_os_plugin/operating_system.py
index f4404d78492f9..de9900cae4b75 100644
--- a/lldb/test/API/functionalities/plugins/python_os_plugin/operating_system.py
+++ b/lldb/test/API/functionalities/plugins/python_os_plugin/operating_system.py
@@ -24,6 +24,10 @@ def create_thread(self, tid, context):
         return None
 
     def get_thread_info(self):
+        if self.process.state != lldb.eStateStopped:
+            print("Error: get_thread_info called with state not stopped")
+            return []
+
         if not self.threads:
             self.threads = [
                 {
diff --git a/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/Makefile b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/Makefile
new file mode 100644
index 0000000000000..93618844a7a4d
--- /dev/null
+++ b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/Makefile
@@ -0,0 +1,4 @@
+C_SOURCES := main.c
+ENABLE_THREADS := YES
+
+include Makefile.rules
diff --git a/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/TestOSIndSYM.py b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/TestOSIndSYM.py
new file mode 100644
index 0000000000000..06524fb70566a
--- /dev/null
+++ b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/TestOSIndSYM.py
@@ -0,0 +1,145 @@
+"""
+Test that an OS plugin in a dSYM sees the right process state
+when run from a dSYM on attach
+"""
+
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+import lldbsuite.test.lldbutil as lldbutil
+from lldbgdbserverutils import get_debugserver_exe
+
+import os
+import lldb
+import time
+import socket
+import shutil
+
+class TestOSPluginIndSYM(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    # The port used by debugserver.
+    PORT = 54638
+
+    # The number of attempts.
+    ATTEMPTS = 10
+
+    # Time given to the binary to launch and to debugserver to attach to it for
+    # every attempt. We'll wait a maximum of 10 times 2 seconds while the
+    # inferior will wait 10 times 10 seconds.
+    TIMEOUT = 2
+
+    def no_debugserver(self):
+        if get_debugserver_exe() is None:
+            return "no debugserver"
+        return None
+
+    def port_not_available(self):
+        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        if s.connect_ex(("127.0.0.1", self.PORT)) == 0:
+            return "{} not available".format(self.PORT)
+        return None
+
+    @skipUnlessDarwin
+    def test_python_os_plugin(self):
+        self.do_test_python_os_plugin(False)
+
+    @skipTestIfFn(no_debugserver)
+    @skipTestIfFn(port_not_available)
+    def test_python_os_plugin_remote(self):
+        self.do_test_python_os_plugin(True)
+
+    def do_test_python_os_plugin(self, remote):
+        """Test that the environment for os plugins in dSYM's is correct"""
+        executable = self.build_dsym("my_binary")
+
+        # Make sure we're set up to load the symbol file's python
+        self.runCmd("settings set target.load-script-from-symbol-file true")
+        
+        target = self.dbg.CreateTarget(None)
+        
+        error = lldb.SBError()
+
+        # Now run the process, and then attach.  When the attach
+        # succeeds, make sure that we were in the right state when
+        # the OS plugins were run.
+        if not remote:
+            popen = self.spawnSubprocess(executable, [])
+
+            process = target.AttachToProcessWithID(lldb.SBListener(), popen.pid, error)
+            self.assertSuccess(error, "Attach succeeded")
+        else:
+            self.setup_remote_platform(executable)
+            process = target.process
+            self.assertTrue(process.IsValid(), "Got a valid process from debugserver")
+
+        # We should have figured out the target from the result of the attach:
+        self.assertTrue(target.IsValid, "Got a valid target")
+
+        # Make sure that we got the right plugin:
+        self.expect("settings show target.process.python-os-plugin-path", substrs=["operating_system.py"])
+
+        for thread in process.threads:
+            stack_depth = thread.num_frames
+            reg_threads = thread.frames[0].reg
+
+        # OKAY, that realized the threads, now see if the creation
+        # state was correct.  The way we use the OS plugin, it doesn't need
+        # to create a thread, and doesn't have to call get_register_info,
+        # so we don't expect those to get called.
+        self.expect("test_report_command", substrs = ["in_init=1",
+                                                 "in_get_thread_info=1",
+                                                 "in_create_thread=2",
+                                                 "in_get_register_info=2",
+                                                 "in_get_register_data=1"])
+        
+        
+        
+        
+    def build_dsym(self, name):
+        self.build(debug_info="dsym", dictionary={"EXE": name})
+        executable = self.getBuildArtifact(name)
+        dsym_path = self.getBuildArtifact(name + ".dSYM")
+        python_dir_path = dsym_path
+        python_dir_path = os.path.join(dsym_path, "Contents", "Resources", "Python")
+        if not os.path.exists(python_dir_path):
+            os.mkdir(python_dir_path)
+        python_file_name = name + ".py"
+
+        os_plugin_dir = os.path.join(python_dir_path, "OS_Plugin") 
+        if not os.path.exists(os_plugin_dir):
+            os.mkdir(os_plugin_dir)
+
+        plugin_dest_path = os.path.join(os_plugin_dir, "operating_system.py")
+        plugin_origin_path = os.path.join(self.getSourceDir(), "operating_system.py")
+        shutil.copy(plugin_origin_path, plugin_dest_path)
+
+        module_dest_path = os.path.join(python_dir_path, python_file_name)
+        with open(module_dest_path, "w") as f:
+            f.write( "def __lldb_init_module(debugger, unused):\n")
+            f.write(f"    debugger.HandleCommand(\"settings set target.process.python-os-plugin-path '{plugin_dest_path}'\")\n")
+            f.close()
+
+        return executable
+        
+    def setup_remote_platform(self, exe):
+        # Get debugserver to start up our process for us, and then we
+        # can use `process connect` to attach to it.
+        debugserver = get_debugserver_exe()
+        debugserver_args = ["localhost:{}".format(self.PORT), exe]
+        self.spawnSubprocess(debugserver, debugserver_args)
+
+        # Select the platform.
+        self.runCmd("platform select remote-gdb-server")
+
+        # Connect to debugserver
+        interpreter = self.dbg.GetCommandInterpreter()
+        connected = False
+        for i in range(self.ATTEMPTS):
+            result = lldb.SBCommandReturnObject()
+            interpreter.HandleCommand(f"gdb-remote localhost:{self.PORT}", result)
+            connected = result.Succeeded()
+            if connected:
+                break
+            time.sleep(self.TIMEOUT)
+
+        self.assertTrue(connected, "could not connect to debugserver")
diff --git a/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/main.c b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/main.c
new file mode 100644
index 0000000000000..ded9da5e18313
--- /dev/null
+++ b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/main.c
@@ -0,0 +1,10 @@
+#include <unistd.h>
+
+int
+main()
+{
+  while (1) {
+    sleep(1);
+  }
+  return 0;
+}
diff --git a/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/operating_system.py b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/operating_system.py
new file mode 100644
index 0000000000000..b8736f8b8a12e
--- /dev/null
+++ b/lldb/test/API/functionalities/plugins/python_os_plugin/os_plugin_in_dsym/operating_system.py
@@ -0,0 +1,78 @@
+#!/usr/bin/env python
+
+import lldb
+import struct
+
+# Value is:
+#   0 called - state is not stopped
+#   1 called - state is stopped
+#   2 not called
+
+stop_state = {"in_init" : 2,
+              "in_get_thread_info" : 2,
+              "in_create_thread" : 2,
+              "in_get_register_info" : 2,
+              "in_get_register_data" : 2}
+
+def ReportCommand(debugger, command, exe_ctx, result, unused):
+    global stop_state
+    for state in stop_state:
+        result.AppendMessage(f"{state}={stop_state[state]}\n")
+    result.SetStatus(lldb.eReturnStatusSuccessFinishResult)
+    
+class OperatingSystemPlugIn():
+    """This class checks that all the 
+    """
+
+    def __init__(self, process):
+        """Initialization needs a valid.SBProcess object.
+        global stop_state
+
+        This plug-in will get created after a live process is valid and has stopped for the
+        first time."""
+        self.process = process
+        stop_state["in_init"] = self.state_is_stopped()
+        interp = process.target.debugger.GetCommandInterpreter()
+        result = lldb.SBCommandReturnObject()
+        cmd_str = f"command script add test_report_command -o -f {__name__}.ReportCommand"
+        interp.HandleCommand(cmd_str, result)
+
+    def state_is_stopped(self):
+        if self.process.state == lldb.eStateStopped:
+            return 1
+        else:
+            return 0
+
+    def does_plugin_report_all_threads(self):
+        return True
+
+    def create_thread(self, tid, context):
+        global stop_state
+        stop_state["in_create_thread"] = self.state_is_stopped()
+        
+        return None
+
+    def get_thread_info(self):
+        global stop_state
+        stop_state["in_get_thread_info"] = self.state_is_stopped()
+        idx = self.process.threads[0].idx
+        return [
+            {
+                "tid": 0x111111111,
+                "name": "one",
+                "queue": "queue1",
+                "state": "stopped",
+                "stop_reason": "breakpoint",
+                "core": idx,
+            }
+        ]
+
+    def get_register_info(self):
+        global stop_state
+        stop_state["in_get_register_info"] = self.state_is_stopped()
+        return None
+
+    def get_register_data(self, tid):
+        global stop_state
+        stop_state["in_get_register_data"] = self.state_is_stopped()
+        return None

@jimingham
Copy link
Collaborator Author

I added Pavel because we were talking about the issue of how to more properly hand the process state out to workers that are doing their jobs before the event it fetched by the primary Process Listener. Don't feel obliged to peer at the dSYM making code.

@github-actions
Copy link

github-actions bot commented Jul 1, 2025

✅ With the latest revision this PR passed the Python code formatter.

@github-actions
Copy link

github-actions bot commented Jul 1, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jimingham
Copy link
Collaborator Author

In the case we were discussing, I'll need to add "m_poses_as_private_state_thread" and add Process::SetPoseAs & Process::ClearPosesAs, and then have PosesAsPrivateStateThread consult that as well. Then just before we call DoOnRemoval for the process events, you set the state running the DoOnRemoval to pose as the private state thread, and clear it just before setting the public state and delivering the event to the Primary Listener.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Jim and I have been discussing this issue so I'm also not totally impartial.

UpdateThreadListIfNeeded();
}

if (m_os_up) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first instinct here was "should this be an else" before I realized that LoadOperatingSystemPlugin probably initializes m_os_up. Might be wroth a comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment to make this clear.

@jimingham jimingham merged commit 9adc8dd into llvm:main Jul 11, 2025
9 checks passed
qxy11 added a commit to qxy11/llvm-project that referenced this pull request Jul 15, 2025
Summary:
There was a deadlock was introduced by [PR llvm#146441](llvm#146441) which changed `CurrentThreadIsPrivateStateThread()` to `CurrentThreadPosesAsPrivateStateThread()`. This change caused the execution path in `ExecutionContextRef::SetTargetPtr()` to now enter a code block that was previously skipped, triggering `GetSelectedFrame()` which leads to a deadlock.

In particular,  one thread held `m_modules_mutex` and tried to acquire `m_language_runtimes_mutex` (via the notifier call chain), and another thread held `m_language_runtimes_mutex` and tried to acquire `m_modules_mutex` (via `ScanForGNUstepObjCLibraryCandidate`)

This fixes the deadlock by adding a scoped block around the mutex lock before the call to the notifier, and moved the notifier call outside of the mutex-guarded section.

Test Plan:
Tested manually
@dmpots
Copy link
Contributor

dmpots commented Jul 15, 2025

@jimingham we found a deadlock that we suspect is related to this patch. Please take a look at the proposed fix in #148774. Thanks!

qxy11 added a commit that referenced this pull request Aug 4, 2025
…erver (#148774)

Summary:
There was a deadlock was introduced by [PR
#146441](#146441) which changed
`CurrentThreadIsPrivateStateThread()` to
`CurrentThreadPosesAsPrivateStateThread()`. This change caused the
execution path in
[`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513)
to now enter a code block that was previously skipped, triggering
[`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522)
which leads to a deadlock.

Thread 1 gets m_modules_mutex in
[`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218),
Thread 3 gets m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501),
but then Thread 1 waits for m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501)
while Thread 3 waits for m_modules_mutex in
[`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57).

This fixes the deadlock by adding a scoped block around the mutex lock
before the call to the notifier, and moved the notifier call outside of
the mutex-guarded section. The notifier call
[`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810)
should be thread-safe, since the module should be added to the
`ModuleList` before the mutex is released, and the notifier doesn't
modify the module list further, and the call is operates on local state
and the `Target` instance.

### Deadlocked Thread backtraces:
```
* thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at    futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19
    frame #8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41
    frame #9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8
    frame #10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36
    frame #11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9
    ...
    frame #21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48
    frame #22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32
    frame #23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3
    frame #24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23
    frame #25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41
    ...
* thread #1, name = 'lldb', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19
    frame #8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41
    frame #9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36
    ...
    frame #13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19
    frame #14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3
    ...
    frame #19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9
    frame #20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23
```

## Test Plan:
Built a hello world a.out
Run server in one terminal:
```
~/llvm/build/Debug/bin/lldb-server g :1234 a.out
```
Run client in another terminal
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
```

Before:
Client hangs indefinitely
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main"
(lldb) gdb-remote 1234

^C^C
```

After:
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
(lldb) gdb-remote 1234
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fe4a60
ld-linux-x86-64.so.2`_start:
->  0x7ffff7fe4a60 <+0>: movq   %rsp, %rdi
    0x7ffff7fe4a63 <+3>: callq  0x7ffff7fe5780 ; _dl_start at rtld.c:522:1

ld-linux-x86-64.so.2`_dl_start_user:
    0x7ffff7fe4a68 <+0>: movq   %rax, %r12
    0x7ffff7fe4a6b <+3>: movl   0x18067(%rip), %eax ; _dl_skip_args
(lldb) b hello.cc:3
Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf
(lldb) c
Process 837068 resuming
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13
   1   	#include <iostream>
   2
   3   	int main() {
-> 4   	  std::cout << "Hello World" << std::endl;
   5   	  return 0;
   6   	}
```
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 4, 2025
…db client/server (#148774)

Summary:
There was a deadlock was introduced by [PR
#146441](llvm/llvm-project#146441) which changed
`CurrentThreadIsPrivateStateThread()` to
`CurrentThreadPosesAsPrivateStateThread()`. This change caused the
execution path in
[`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513)
to now enter a code block that was previously skipped, triggering
[`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522)
which leads to a deadlock.

Thread 1 gets m_modules_mutex in
[`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218),
Thread 3 gets m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501),
but then Thread 1 waits for m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501)
while Thread 3 waits for m_modules_mutex in
[`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57).

This fixes the deadlock by adding a scoped block around the mutex lock
before the call to the notifier, and moved the notifier call outside of
the mutex-guarded section. The notifier call
[`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810)
should be thread-safe, since the module should be added to the
`ModuleList` before the mutex is released, and the notifier doesn't
modify the module list further, and the call is operates on local state
and the `Target` instance.

### Deadlocked Thread backtraces:
```
* thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at    futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19
    frame #8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41
    frame #9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8
    frame #10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36
    frame #11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9
    ...
    frame #21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48
    frame #22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32
    frame #23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3
    frame #24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23
    frame #25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41
    ...
* thread #1, name = 'lldb', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19
    frame #8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41
    frame #9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36
    ...
    frame #13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19
    frame #14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3
    ...
    frame #19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9
    frame #20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23
```

## Test Plan:
Built a hello world a.out
Run server in one terminal:
```
~/llvm/build/Debug/bin/lldb-server g :1234 a.out
```
Run client in another terminal
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
```

Before:
Client hangs indefinitely
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main"
(lldb) gdb-remote 1234

^C^C
```

After:
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
(lldb) gdb-remote 1234
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fe4a60
ld-linux-x86-64.so.2`_start:
->  0x7ffff7fe4a60 <+0>: movq   %rsp, %rdi
    0x7ffff7fe4a63 <+3>: callq  0x7ffff7fe5780 ; _dl_start at rtld.c:522:1

ld-linux-x86-64.so.2`_dl_start_user:
    0x7ffff7fe4a68 <+0>: movq   %rax, %r12
    0x7ffff7fe4a6b <+3>: movl   0x18067(%rip), %eax ; _dl_skip_args
(lldb) b hello.cc:3
Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf
(lldb) c
Process 837068 resuming
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13
   1   	#include <iostream>
   2
   3   	int main() {
-> 4   	  std::cout << "Hello World" << std::endl;
   5   	  return 0;
   6   	}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants