Thread creation with race-free access to the TCB #17542
Comments
How about asking the caller of |
IMO that would just require two allocations instead of one: thread_t and the stack itself. I know there's stack wiggling, but it's straightforward, maybe just a bit awkward implementation-wise with room for improvement.
I think "sleep, get, wake" is totally viable. So would be some API that returns the thread_t* (or error). Can the second be implemented with the first? |
If we make the wiggling public (given this stack area, tell me where the wiggling would place the thread_t), I think this would suffice. The mechanism could even be private. If two allocations are bad (it's not usually on the heap, so IDC too much), it might be an option to create a type that is passed in instead of a What I don't like too much about sleep-get-wake is that it's not the default, and that the default is presumably more efficient; I'd prefer Rust wrappers not to add cost just to gain the safety benefits.
Yes, it can -- but given that it takes extra steps it's not my preferred option. Right now, my preferred solution would be a public "get TCB from uint8_t[] stack" function. That would be accessible to the optimizer. |
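A minimal sketch of what such a public "get TCB from uint8_t[] stack" function might look like, assuming the TCB is carved out of the top of the aligned buffer. The names sketch_tcb_t and sketch_tcb_from_stack are hypothetical, and the layout is an assumption for illustration only; the real wiggling may differ by platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for thread_t; the real TCB has more fields. */
typedef struct {
    char *sp;
    int16_t pid;
    uint8_t status;
    uint8_t priority;
} sketch_tcb_t;

/* Assumed layout: align the start of the buffer, reserve room for the TCB
 * at the top, and round the remaining size down to the TCB's alignment.
 * Returns NULL if the buffer is too small to hold a TCB. */
static sketch_tcb_t *sketch_tcb_from_stack(uint8_t *stack, size_t stacksize)
{
    size_t misalignment = (uintptr_t)stack % _Alignof(void *);

    if (misalignment) {
        size_t skip = _Alignof(void *) - misalignment;
        if (skip > stacksize) {
            return NULL;
        }
        stack += skip;
        stacksize -= skip;
    }
    if (stacksize < sizeof(sketch_tcb_t)) {
        return NULL;
    }
    stacksize -= sizeof(sketch_tcb_t);
    stacksize -= stacksize % _Alignof(sketch_tcb_t);

    return (sketch_tcb_t *)(void *)(stack + stacksize);
}
```

Because the function is a pure computation on the buffer address and size, a caller (or a Rust wrapper) could evaluate it before or after thread creation without touching scheduler state.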
I don't see the problem with
thread_t thread;
uint8_t stack[FOO];
thread_create(&thread, stack, sizeof(stack), ...);
It would also make debugging much easier. Right now, all |
If we go there, that should be aligned right away as well. Provided the stack wiggling is well enough understood to be public (even if the implementation varies by platform), then both APIs can be provided, with the one that takes a thread-and-a-stack as the main work horse and the other going through wiggling to split the unaligned blob into the two parts. |
I strongly agree. Also @benpicco was in favor, but @kaspar030 shot that idea down. |
That was the discussion in #1298, whose last two comments I read as "there was offline discussion". The keyword there was futureproofing -- and I do see that requirements of exotic platforms might be exotic. (They might be so exotic that even the provided u8 array is nothing to forge a thread stack from, hello WASM, but that's a different affair. But at least weirdo platforms like "my stack has uint16_t growing down, and doubles growing up", not that I'd know such a platform, can be accommodated with the current API). So considering previous discussion the evolution-not-revolution step here would be to make the way from u8-stack-to-structured public, and build on that. |
I changed my mind on that one :/ |
That would change to sth like
But more importantly, as long as we hand out PIDs and they can be checked for validity, it ensures a bit more that the pointed to stack (and thread_t) are valid. Otherwise, that's up to the user to ensure. |
Hm, even if it's some lines more, the bulk of thread creation is elsewhere. Performance wise I doubt it would matter, even if threads would be started in a loop (instead of a couple times at boot). I mean, we're talking about a three line wrapper, right? |
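To make the size of the "sleep, get, wake" wrapper concrete, here is a small self-contained sketch against a mock thread table. Everything prefixed mock_ or MOCK_ is a hypothetical stand-in, not the RIOT API; the sketch only assumes a create call that returns a PID or a negative error, a PID-to-TCB lookup, and a wakeup call.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAXTHREADS    4
#define MOCK_FLAG_SLEEPING 1

typedef enum { MOCK_UNUSED = 0, MOCK_SLEEPING, MOCK_PENDING } mock_status_t;
typedef struct { mock_status_t status; } mock_thread_t;

static mock_thread_t mock_threads[MOCK_MAXTHREADS];

/* Mock create: returns a PID >= 0, or a negative value if no slot is free. */
static int mock_thread_create(int flags)
{
    for (int pid = 0; pid < MOCK_MAXTHREADS; pid++) {
        if (mock_threads[pid].status == MOCK_UNUSED) {
            mock_threads[pid].status =
                (flags & MOCK_FLAG_SLEEPING) ? MOCK_SLEEPING : MOCK_PENDING;
            return pid;
        }
    }
    return -1;
}

/* Mock lookup: PID to TCB pointer, NULL for invalid or unused PIDs. */
static mock_thread_t *mock_thread_get(int pid)
{
    if (pid < 0 || pid >= MOCK_MAXTHREADS ||
        mock_threads[pid].status == MOCK_UNUSED) {
        return NULL;
    }
    return &mock_threads[pid];
}

/* Mock wakeup: a sleeping thread becomes runnable. */
static void mock_thread_wakeup(int pid)
{
    mock_thread_t *t = mock_thread_get(pid);
    if (t != NULL && t->status == MOCK_SLEEPING) {
        t->status = MOCK_PENDING;
    }
}

/* The wrapper itself: always create sleeping, fetch the TCB while the
 * child cannot have run, then wake it unless the caller asked for sleep. */
static mock_thread_t *create_and_get(int flags)
{
    int pid = mock_thread_create(flags | MOCK_FLAG_SLEEPING);
    if (pid < 0) {
        return NULL;
    }
    mock_thread_t *tcb = mock_thread_get(pid);
    assert(tcb != NULL); /* a thread that never ran cannot have exited */
    if (!(flags & MOCK_FLAG_SLEEPING)) {
        mock_thread_wakeup(pid);
    }
    return tcb;
}
```

The wrapper body is indeed only a few lines; the cost relative to a plain create is one lookup and one wakeup per thread creation.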
The PIDs have these checks more, but they lack one significant other check: "Is this still the thread I meant, or has that crashed, stopped, and the PID reassigned?" Given we don't have a waiting-to-be-reaped state that'd need to be wait()ed for by the parent or any other responsible party and only then allows reassigning of the PID, the PID may already be in use by someone else. The TCB doesn't have that issue, as long as one is in agreement with the owner of the TCB that the TCB won't be repurposed until all references to it have been returned.
I've looked at the code where I'm doing that right now again, and it's two things:
|
The problem of having a handle that can be sure never to mean a different thread than the one just created is, to me, solved with this. (Given the TCB may now be repurposed just as the thread ID is, I'd need to do a different check, but "is the sp of the process still in the stack I gave it" should be a suitable criterion). Its actionable result is tracked in RIOT-OS/rust-riot-wrappers#23. There are different issues that were touched in and around this issue ("how would a thread creator that knows its entry function's stack requirements well know how much more than the required space to allocate?", "can a thread block on the completion of another one?", "how do you get the return value of a thread?"), but those probably deserve their own issues. |
Description
Right now, when creating a thread, it is hard to reliably get the child's TCB:
One could start the thread sleeping, maybe that's a sufficient workaround.
One could obtain the TCB pointer right after start and check whether it's within the stack that was provided.
One could use IPC to send the TCB of the child to the parent.
Otherwise, it's hard to distinguish between cases of the child running, and the child having died real quick and another task having been created with the same PID.
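The "is the TCB within the stack I provided" check from the second option can be sketched as a plain address-range test. This is an illustration only: tcb_is_in_stack is a hypothetical helper, and it assumes the TCB is placed somewhere inside the caller's stack buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if an object of tcb_size bytes starting at tcb lies entirely
 * inside the buffer [stack, stack + stacksize). Addresses are compared as
 * integers so that unrelated pointers can be tested safely. */
static bool tcb_is_in_stack(const void *tcb, size_t tcb_size,
                            const uint8_t *stack, size_t stacksize)
{
    uintptr_t start = (uintptr_t)stack;
    uintptr_t addr = (uintptr_t)tcb;

    if (tcb == NULL || tcb_size > stacksize) {
        return false;
    }
    return addr >= start && addr - start <= stacksize - tcb_size;
}
```

If the PID has been reassigned to a thread created by someone else, the TCB looked up for that PID lives in a different buffer and the test fails, which is what distinguishes "my child" from "a newcomer with the same PID".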
Why is it important to obtain the TCB? Without the TCB, it's hard to be certain the task stopped. Sure, if the PID is gone it's sure that the task has stopped (especially as it'd be very impolite to reap someone else's zombie children), but if the PID is still around, the parent might be misled into thinking the child still lives.
This is all more important when the parent must be (under penalty of UB) sure that the child is done for good with its stack before the memory can be used (as is the case when spawning threads from Rust), but (at least in theory) affects C too with similar considerations as in #17502.
Possible ways forward
This is not all that easy because thread_create may also return an error. An alternative (also with a new function shimmable to the old) is to provide a **thread_t that gets populated unless NULL.
I'm originally not a fan of doing 1 (and the pointer comparison seems too ugly even though it's what riot-wrappers are doing right now) because I want the Rust wrappers to be zero-cost. But having played around a bit with 2, I'm questioning whether adding the API, with its one more indirection, wouldn't just add the same few bytes to the builds that 1 adds. Really, what is the benefit of having zero-cost anything if to get there you have to raise the cost for everyone (who, realistically speaking, usually don't start and stop tasks all the time) rather than biting down on the small cost yourself?
Still, before committing to that (and closing the issue with "1 is right"), I'd like to hear others' opinions, maybe I missed use cases.
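For comparison, the out-parameter alternative with a shim for the old signature could look roughly like this. It is a sketch against a mock thread table; demo_thread_create_ex and the other demo_ names are hypothetical and do not exist in RIOT.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXTHREADS 2

typedef struct { int in_use; } demo_thread_t;

static demo_thread_t demo_threads[DEMO_MAXTHREADS];

/* Hypothetical extended create: returns a PID >= 0 or a negative error.
 * On success, *tcb_out receives the new TCB unless tcb_out is NULL.
 * On error, *tcb_out is left untouched. */
static int demo_thread_create_ex(demo_thread_t **tcb_out)
{
    for (int pid = 0; pid < DEMO_MAXTHREADS; pid++) {
        if (!demo_threads[pid].in_use) {
            demo_threads[pid].in_use = 1;
            if (tcb_out != NULL) {
                *tcb_out = &demo_threads[pid];
            }
            return pid;
        }
    }
    return -1;
}

/* The old signature becomes a thin shim over the extended call. */
static int demo_thread_create(void)
{
    return demo_thread_create_ex(NULL);
}
```

The extra cost for existing callers is the NULL argument and the check inside the extended function, which is the "few bytes for everyone" trade-off weighed above.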
Related discussions
Using TCB pointers over PIDs was discussed in #1004; not done at least partially because you can pack a PID and extra data (like a flag to be set or message type) in a void *arg for callbacks, and can be multiplexed also with a negative integer.
Ages ago, in #1298 there was discussion of having the TCB separate from the stack pool for alignment reasons. Any such thing would break the "is the TCB really where I provided stack" trick, but would also come with its own way of getting the TCB in or out, probably. Either way, AIUI that all didn't explain where the TCB would be allocated. If it were to come from a pool (like PIDs do), threads that want to be finished properly by their parents would need to go to zombie state (not a bad thing, really, but not sufficient on its own).
If a parent wants to wait for a child to stop, we don't support that, and there's no need to -- the child could just as well use IPC to signal to the parent that it's now ready to be collected from its Stopped state (which the parent can see because it has a TCB pointer; without the TCB pointer, going to zombie mode is insufficient because the zombie could still be someone else's child, and then how would anyone know how to reap them). Due to races and priorities the parent can't immediately know that the child is done, but thanks to #17093 it can raise the child's priority to its own (and then yield) while the child is not stopped.