
kernel threads and kernel stacks deadlock in many scenarios #32145

@jharris-intel

Description


Describe the bug

The `kernel threads` and `kernel stacks` shell commands call `shell_print` from the callback of `k_thread_foreach`. `shell_print` attempts to take a mutex with a non-zero timeout, so the calling thread can be context-switched out while still holding `z_thread_monitor_lock`, which results in incorrect behavior.
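For reference, here is a simplified sketch of the pattern in question; function names are illustrative rather than verbatim from the tree:

```c
#include <zephyr.h>
#include <shell/shell.h>

static void thread_print_cb(const struct k_thread *thread, void *user_data)
{
	const struct shell *sh = user_data;

	/* Runs with z_thread_monitor_lock held by k_thread_foreach().
	 * shell_print() may block on the shell's mutex, context-switching
	 * out while the spinlock is still held.
	 */
	shell_print(sh, "%p", thread);
}

static int cmd_kernel_threads(const struct shell *sh, size_t argc, char **argv)
{
	ARG_UNUSED(argc);
	ARG_UNUSED(argv);

	k_thread_foreach(thread_print_cb, (void *)sh);
	return 0;
}
```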

To Reproduce

n/a

Expected behavior

...I'm not actually sure what the desired behavior is, to be honest. At a high level, I suppose you could take the mutex outside of `k_thread_foreach`, but even then (API issues aside) you'd still have a problem if/when your shell backend applied backpressure.

I'm mainly opening this ticket as an invitation for discussion. Note that there are other issues discussing tangential topics, e.g. #13318, #14172, #20937, and #22841.

Impact

`kernel threads` and `kernel stacks` (and presumably any other `k_thread_foreach` user whose callback can block) sometimes deadlock if any of their calls to `shell_print` block. In particular:

  • If the thread is resumed on a different core, you end up clobbering one core's interrupt state with the interrupt state of a different core. (Note: I'm going to be filing a question about this in a bit; the documented semantics do not work well in SMP.)
  • If the callback to `k_thread_foreach` calls anything that ends up trying to take `z_thread_monitor_lock`, you deadlock (or assert).
  • If any other task that gets context-switched in on the same core while you are inside `k_thread_foreach` ends up trying to take `z_thread_monitor_lock`, you deadlock (or assert).
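To make the second and third cases concrete, here is a hypothetical (untested) sketch of the interleaving, assuming `CONFIG_THREAD_MONITOR` is enabled; `k_sleep` stands in for a `shell_print` that blocks on a contended mutex:

```c
#include <zephyr.h>

/* Thread A blocks inside the k_thread_foreach() callback while
 * z_thread_monitor_lock is held; thread B then creates a thread, which
 * also needs z_thread_monitor_lock, and deadlocks (or asserts).
 */

static struct k_thread thread_b;
static K_THREAD_STACK_DEFINE(stack_b, 1024);
static struct k_thread victim;
static K_THREAD_STACK_DEFINE(victim_stack, 1024);

static void victim_entry(void *p1, void *p2, void *p3)
{
}

static void blocking_cb(const struct k_thread *thread, void *user_data)
{
	ARG_UNUSED(thread);
	ARG_UNUSED(user_data);

	/* Stand-in for a shell_print() that blocks. */
	k_sleep(K_MSEC(100));
}

static void thread_b_entry(void *p1, void *p2, void *p3)
{
	/* k_thread_create() links the new thread into the monitor list
	 * under z_thread_monitor_lock -> deadlock with thread A.
	 */
	k_thread_create(&victim, victim_stack,
			K_THREAD_STACK_SIZEOF(victim_stack),
			victim_entry, NULL, NULL, NULL,
			K_PRIO_PREEMPT(1), 0, K_NO_WAIT);
}

void main(void)
{
	/* Thread B starts after a short delay, mid-iteration. */
	k_thread_create(&thread_b, stack_b, K_THREAD_STACK_SIZEOF(stack_b),
			thread_b_entry, NULL, NULL, NULL,
			K_PRIO_PREEMPT(0), 0, K_MSEC(10));

	/* Thread A: iterate with the monitor lock held, then block. */
	k_thread_foreach(blocking_cb, NULL);
}
```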

Logs and console output

n/a

Additional context

Note that there appear to be multiple workarounds in place that use `k_thread_foreach_unlocked` instead, which results in different incorrect behavior: if anything in the system happens to delete or add a thread while you are in the call, you can follow an invalid pointer off into the weeds.
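To see why, here is roughly what the unlocked variant does (a paraphrase, not verbatim kernel code): the monitor lock is dropped around each callback, so the iteration can chase a stale `next_thread` pointer.

```c
void k_thread_foreach_unlocked(k_thread_user_cb_t user_cb, void *user_data)
{
	struct k_thread *thread;
	k_spinlock_key_t key;

	key = k_spin_lock(&z_thread_monitor_lock);
	for (thread = _kernel.threads; thread != NULL;
	     thread = thread->next_thread) {
		k_spin_unlock(&z_thread_monitor_lock, key);
		/* Lock dropped: another thread may abort/delete `thread`
		 * (or its neighbors) here, so the loop's next read of
		 * thread->next_thread can be a use-after-free.
		 */
		user_cb(thread, user_data);
		key = k_spin_lock(&z_thread_monitor_lock);
	}
	k_spin_unlock(&z_thread_monitor_lock, key);
}
```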

The only ways I can see `kernel threads` and `kernel stacks` working correctly at the moment are:

  1. Allocate enough memory up front for every thread's information. In the `k_thread_foreach` callback, copy everything needed out of the kernel structures into that memory, then print and release it after `k_thread_foreach` returns (see the sketch below). Unfortunately, "enough for every thread's information" isn't particularly well-defined, and you can't allocate during the execution of `k_thread_foreach`.
  2. Like 1, but in multiple chunks, "abusing" `k_thread_foreach` to process, say, the first 10 threads, then the 10 threads starting at the 10th, and so on. This uses less memory, but can miss or duplicate threads, and is O(n^2) time w.r.t. the number of threads in the system (as you have to walk the linked list all the way from the front every time).
  3. Change the shell to busy-wait and do the backend work directly, instead of taking mutexes, when called with a lock held, and ensure that the shell backend can never be interrupted by something that calls `k_thread_foreach` and in turn ends up calling `shell_print`.
  4. Change thread creation and deletion (or rather, anything that potentially updates `next_thread`/`prev_thread`) to be mutually exclusive with `k_thread_foreach`, and document that e.g. logging backends can never dynamically spin up threads.

...none of which are exactly ideal.
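As a rough illustration of option 1 (a hedged sketch: `MAX_SNAPSHOT`, the captured fields, and all names here are arbitrary choices, not an existing API):

```c
#include <zephyr.h>
#include <shell/shell.h>
#include <string.h>

#define MAX_SNAPSHOT 32		/* the ill-defined "enough" */

struct thread_snapshot {
	const struct k_thread *tid;
	char name[32];
	int prio;
};

struct snapshot_ctx {
	struct thread_snapshot entries[MAX_SNAPSHOT];
	size_t count;
	bool overflow;
};

/* Copy-only callback: no blocking calls while the monitor lock is held. */
static void snapshot_cb(const struct k_thread *thread, void *user_data)
{
	struct snapshot_ctx *ctx = user_data;
	const char *name;

	if (ctx->count == MAX_SNAPSHOT) {
		ctx->overflow = true;
		return;
	}

	struct thread_snapshot *e = &ctx->entries[ctx->count++];

	name = k_thread_name_get((k_tid_t)thread);
	e->tid = thread;
	e->prio = thread->base.prio;
	strncpy(e->name, name != NULL ? name : "", sizeof(e->name) - 1);
	e->name[sizeof(e->name) - 1] = '\0';
}

static int cmd_kernel_threads(const struct shell *sh, size_t argc, char **argv)
{
	/* static: avoids both stack pressure and allocating in the command */
	static struct snapshot_ctx ctx;

	ARG_UNUSED(argc);
	ARG_UNUSED(argv);

	memset(&ctx, 0, sizeof(ctx));
	k_thread_foreach(snapshot_cb, &ctx);

	/* Printing (and thus any blocking) happens after the lock is gone. */
	for (size_t i = 0; i < ctx.count; i++) {
		shell_print(sh, "%p prio %d %s", ctx.entries[i].tid,
			    ctx.entries[i].prio, ctx.entries[i].name);
	}
	if (ctx.overflow) {
		shell_print(sh, "(snapshot full: some threads omitted)");
	}
	return 0;
}
```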

I'm hoping there are other options I'm not seeing here, because the commands are rather useful in general.

Metadata

Labels

area: Kernel, area: Shell, bug, priority: low
