-
-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance & fix WorkerThreadPool
#86587
Conversation
040a3e6
to
a75613b
Compare
The changes to the xml documentation are straightforward changes, but any idea why the asan and such is failing. |
f9b9178
to
e030ada
Compare
Marking as Needs work because I'm rethinking one of the changes a bit, as I've found room for extra improvement. |
e030ada
to
498e846
Compare
Now I'm happy with this. |
f690123
to
a0c2cee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I like the new approach!
a0c2cee
to
51adc0a
Compare
I just cancelled CI, the tests that run the unit tests appeared to be totally stuck. They had been running for 5 hours when I cancelled the workflow |
Sounds like a deadlock somewhere, suspect the threads aren't joined correctly As a first step I'd suggest using RAII locking where possible to avoid problems there |
I've reproduced it locally. Thankfully the current tests stress the thing enough to make the issue apparent. When group tasks are involved in huge amounts, some low-prio tasks are kept in the queue with no threads realizing the need to claim them. Under investigation. 🔍 |
5e04514
to
9bbe65e
Compare
This commits rewrites the sync logic in a way that the `use_system_threads_for_low_priority_tasks` setting, which was added due to the lack of a cross-platform wait-for-multiple-objects functionality, can be removed (it's as if it was effectively hardcoded to `false`). With the new implementation, we have the best of both worlds: threads don't have to poll, plus no bespoke threads are used. In addition, regarding deadlock prevention, since not every possible case of wait-deadlock could be avoided, this commits removes the current best-effort avoidance mechanisms and keeps only a simple, pessimistic way of detection. It turns out that the only current user of deadlock prevention, ResourceLoader, works fine with it and so every possible situation in resource loading is now properly handled, with no possibilities of deadlocking. There's a comment in the code with further details. Lastly, a potential for load tasks never being awaited/disposed is cleared.
This commit lets CommandQueueMT play nicely with the WorkerThreadPool to avoid non-progressable situations caused by an interdependence between both. While a command queue is being flushed, it allows the WTP to release its lock while tasks are being awaited so they can make progress in case they need in turn to post to the command queue.
Just a little optimization. **NOTE:** With `RID_Owner` we could replace each pair of `PagedAllocator` and `HashMap`-of-ids-to-pointers. However, that would force us to expose `RID` as the task/group id, instead of `int`, which would break the API. Too bad. Let's wait until Godot 5.0.
9bbe65e
to
a731774
Compare
OK. I've found and fixed the logic error. |
Thanks! |
A number of changes to the
WorkerThreadPool
, split into commits with specific goals:2023-01-05: Heavily edited after squashing two commits together and greatly polishing some changes.
WorkerThreadPool: Overhaul scheduling and synchronization
This commits rewrites the sync logic in a way that the
use_system_threads_for_low_priority_tasks
setting, which was added due to the lack of a cross-platform wait-for-multiple-objects functionality, can be removed (it's as if it was effectively hardcoded tofalse
).With the new implementation, we have the best of both worlds: threads don't have to poll, plus no bespoke threads are used.
In addition, regarding deadlock prevention, since not every possible case of wait-deadlock could be avoided, this commits removes the current best-effort avoidance mechanisms and keeps only a simple, pessimistic way of detection.
It turns out that the only current user of deadlock prevention, ResourceLoader, works fine with it and so every possible situation in resource loading is now properly handled, with no possibilities of deadlocking. There's a comment in the code with further details.
Lastly, a potential for load tasks never being awaited/disposed is cleared.
WorkerThreadPool: Avoid deadlocks when
CommandQueueMT
is involvedThis commit lets
CommandQueueMT
play nicely with the WorkerThreadPool to avoid non-progressable situations caused by an interdependence between both. While a command queue is being flushed, it allows theWorkerThreadPool
to releasethe lock of the former while tasks are being awaited so they can make progress in case they need in turn to post to the command queue.
WorkerThreadPool: Avoid most runtime allocations
Just a little optimization.
NOTE:
With
RID_Owner
we could replace each pair ofPagedAllocator
andHashMap
-of-ids-to-pointers. However, that would force us to exposeRID
as the task/group id, instead ofint
, which would break the API. Too bad. Let's wait until Godot 5.0.Includes #86555.Fixes #78790.
Testing appreciated. I'll keep testing it on different projects.