Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recent changes seem to cause a deadlock when thread spawns #30

Closed
ian-h-chamberlain opened this issue May 8, 2024 · 3 comments · Fixed by #31
Closed

Recent changes seem to cause a deadlock when thread spawns #30

ian-h-chamberlain opened this issue May 8, 2024 · 3 comments · Fixed by #31

Comments

@ian-h-chamberlain
Copy link
Member

I noticed this while trying to make some other changes to ctru-rs — I think because we are pulling in the new pthread-3ds changes, something in the test runner isn't working anymore and the tests hang forever, e.g. https://github.com/rust3ds/ctru-rs/actions/runs/9009948051/job/24755098299?pr=187

Comparing to the most recent successful run, this is the diff of compiler output:

4a5
> Compiling bindgen v0.65.1
7a9
> Compiling cc v1.0.97
14a17
> Compiling doxygen-rs v0.4.2
20a24
> Compiling futures-macro v0.3.30
38a43,44
> Compiling phf v0.11.2
> Compiling phf_macros v0.11.2
41a48,49
> Compiling prettyplease v0.2.20
> Compiling proc-macro2 v1.0.82
43c51,52
< Compiling pthread-3ds v0.1.0 (https://github.com/rust3ds/pthread-3ds.git#c885d8cd)
---
> Compiling pthread-3ds v0.1.0 (https://github.com/rust3ds/pthread-3ds.git#b15d26df)
> Compiling quote v1.0.36
47a57
> Compiling serde v1.0.201
55a66
> Compiling syn v2.0.61
63a75,76
> Compiling tokio-macros v2.2.0
> Compiling toml v0.5.11

Most of the other changes, as far as I can tell, are just build dependencies and I wouldn't expect them to affect the unit tests, so pthread having recent changes is my prime suspect.

Maybe #19 had some side effect on spawning new threads somehow? Otherwise, I'm not sure what might have changed, but this seems like somewhere to start.

Meanwhile, I'm going to try checking in Cargo.lock with the old hash (c885d8c) and see if that helps at all. If not, we can close this issue / move it to ctru-rs to try and track it down there.

@ian-h-chamberlain
Copy link
Member Author

Ok, pinning the older hash passed CI again, so I think that's pretty strong evidence it wasn't from something else unrelated: https://github.com/rust3ds/ctru-rs/actions/runs/9010075712/job/24755466567?pr=187

@Meziu
Copy link
Member

Meziu commented May 9, 2024

A deadlock it seems indeed. It's very weird though, the code pushed in #19 should only have effect once a thread is detached (which may still be the case for when a test ends its run).

I want to note two things:

  1. It might be related to the spinlock that handles the TLS variables, but it's hard for me to understand why the TLS variables would cause a deadlock (there shouldn't be anything used?).
  2. The tests are run "concurrently" by default, but it's a bit hard to say whether they should. Tests aren't written with cooperative concurrency in mind, so they end up running sequentially either way.

@ian-h-chamberlain
Copy link
Member Author

Hmm, initial thoughts from looking at the changes:

  1. If any of the destructors call some pthread_key_* function, the KEYS spinlock could be held already and deadlock would occur. Maybe it would be safer to collect all the destructor pointers first, then release the lock before actually calling the destructors?
  2. I think the test runner should be avoiding concurrency entirely since it sets the number of threads to 1: https://github.com/rust3ds/ctru-rs/blob/master/test-runner/src/lib.rs#L56
    It's possible that there are still some issues that could arise from having the main thread waiting on the individual test thread, but I think it's necessary to handle panics appropriately during tests and stuff like that. Without writing a whole new test runner from scratching I'm not sure how much we can do about this in particular (and up until now, it's seemed to work okay?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants