TestActor startup deadlock #643
Merged
Explanation by ChatGPT:
TL;DR
Akka.Hosting.TestKit currently calls TestKitBase.InitializeTest inside a WithActors startup callback. InitializeTest creates the TestActor on the CallingThreadDispatcher (or otherwise thread-affine behavior), while the host startup thread is waiting for that callback to return. In parallel, your actor wiring (also running in WithActors) constructs actors that (directly or indirectly) send to the TestActor / EventStream during their PreStart / DI initialization.
Because the TestActor is bound to the “calling thread” (the startup thread) and that thread is blocked until the callback returns, any message that needs the TestActor to run won’t be processed → circular wait → deadlock. Turning timeouts up doesn’t help because nothing is running.
This shows up only under parallel test execution because you get concurrent hosts doing the above at the same time, increasing the chance that one host’s startup thread is the same thread that a second host needs to run a TestActor turn on.
Why this happens (step-by-step)
Host starts → ActorSystem created.
Akka.Hosting executes all WithActors((system, registry) => …) callbacks synchronously as part of startup.
Hosting.TestKit’s callback runs and calls TestKitBase.InitializeTest(system, …) inside that same callback.
InitializeTest creates the TestActor (and wires event stream subscriptions). In classic TestKit this actor is usually constructed so that it behaves calling-thread-synchronously (CallingThreadDispatcher semantics) to make Expect… deterministic.
Your actor wiring (same WithActors batch) constructs application actors. During PreStart / registry setup they send, directly or indirectly, to the TestActor or the EventStream.
Those sends require the TestActor to run on the startup thread (calling-thread semantics), but that thread is blocked waiting for the WithActors callback(s) to finish.
→ Circular wait: startup thread can’t return because work wants to run on it; the work can’t run because the thread is blocked.
Parallel execution amplifies this because multiple hosts share the same thread pool; chances of the above interleaving go way up.
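The circular wait described in the steps above can be modelled without Akka at all. The sketch below is plain .NET, not Akka.NET or Hosting.TestKit code. CircularWaitSketch, StartupCompletes, mailbox and replied are hypothetical names, and the BlockingCollection stands in, by assumption, for a TestActor that only the startup thread is allowed to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical model of the circular wait; none of these names come from Akka.NET.
public static class CircularWaitSketch
{
    // Returns true if "startup" observed the reply within the timeout.
    public static bool StartupCompletes(TimeSpan timeout)
    {
        // Stand-in for the TestActor mailbox: only the startup thread may drain it.
        using var mailbox = new BlockingCollection<Action>();
        using var replied = new ManualResetEventSlim(false);

        // Stand-in for an application actor's PreStart running on another thread:
        // it enqueues work that must execute on the startup thread.
        var appActor = Task.Run(() => mailbox.Add(() => replied.Set()));
        appActor.Wait();

        // The startup thread blocks here instead of draining the mailbox it owns,
        // so the queued work never runs and the reply is never produced.
        return replied.Wait(timeout);
    }

    public static void Main()
    {
        // A longer timeout changes nothing, because no thread is making progress.
        Console.WriteLine(StartupCompletes(TimeSpan.FromMilliseconds(200)));  // False
        Console.WriteLine(StartupCompletes(TimeSpan.FromMilliseconds(1000))); // False
    }
}
```

In this model the wait only succeeds if the owning thread drains mailbox before it blocks, or if the queued work is allowed to run on some other thread. That matches the explanation's point that raising timeouts cannot help.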
Changes