Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock in RemoteDebugger::debug #87169

Merged
merged 1 commit into from
Jan 16, 2024

Conversation

0x0ACB
Copy link
Contributor

@0x0ACB 0x0ACB commented Jan 14, 2024

This raw mutex.lock will cause a deadlock in the debugger if the peer is not connected. Additionally, calling mutex.unlock if the mutex is not held by the current thread is undefined behavior. This is causing deadlocks for me when a different thread than the MainThread enters this loop due to eg an error in a GDScript thread as the microsoft inplementation will decrease the lock counter of the recursive mutex below 0 causing the mutex to appear as locked.

Not sure why this mutex.lock is here in the first place as every additional iteration won't relock the mutex. If this lock is indeed needed the loop needs to be converted to while(true) and the check placed inside but in my testing I didn't find any issues with completely removing this.

@0x0ACB 0x0ACB requested a review from a team as a code owner January 14, 2024 10:41
@AThousandShips AThousandShips added this to the 4.3 milestone Jan 14, 2024
@RandomShaper
Copy link
Member

RandomShaper commented Jan 15, 2024

Looks like it's a responsibility of the various peer implementations to lock internally if they need. In particular, RemoteDebuggerPeerTCP would need to leverage its mutex for connected, because its thread function can change it. So, in principle, I'd agree that the lock removed by this PR has indeed to go away. It may be helping somehow, for the wrong reasons. I'd say that, to be sure this fix won't cause extra issues, RemoteDebuggerPeerTCP (and maybe others?) would need to have their thread-safety revised.

@0x0ACB
Copy link
Contributor Author

0x0ACB commented Jan 15, 2024

Hmm I don't think there should be any new issues in that particular part of code from removing the lock.

  1. it always got unlocked before the second iteration anyways only in the 2nd iteration it would get permanently locked
  2. even if you synchronize access to the connected state. The peer might still disconnect between you checking and you executing the next action so you are not really gaining anything there

@RandomShaper
Copy link
Member

First of all, I agree that the current code features a little locking catastrophe there and that this PR does the right thing by taking it away. What I mean is that RemoteDebuggerPeerTCP should likely have its internals made thread-safe, because its handling its connected variable with no sync at all. This PR may be the one to fix that in addition (even if in practice things will be fine anyway, it's good that the code reflects the theoretically required sync needs). If you want to keep this PR's scope unchanged, I can make a PR to fix the other thing.

@akien-mga akien-mga changed the title Fix deadlock in RemoteDebugger::debug Fix deadlock in RemoteDebugger::debug Jan 16, 2024
@0x0ACB
Copy link
Contributor Author

0x0ACB commented Jan 16, 2024

Unless just making connected atomic is what you had in mind I don't have enough expertise with the RemoteDebuggerPeerTCP class to fix that. So it's probably best if you do those changes.

@akien-mga akien-mga merged commit 1debbaa into godotengine:master Jan 16, 2024
15 checks passed
@akien-mga
Copy link
Member

Thanks!

@AThousandShips AThousandShips added the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release topic:core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants