@Arkatufus
Contributor

Explanation by ChatGPT:

TL;DR

Akka.Hosting.TestKit currently calls TestKitBase.InitializeTest inside a WithActors startup callback.
InitializeTest creates the TestActor on the CallingThreadDispatcher (or with otherwise thread-affine behavior), while the host startup thread is waiting for that callback to return. At the same time, your actor wiring (also running in WithActors) constructs actors that (directly or indirectly) send to the TestActor / EventStream during their PreStart/DI initialization.

Because the TestActor is bound to the “calling thread” (the startup thread) and that thread is blocked until the callback returns, any message that needs the TestActor to run won’t be processed → circular wait → deadlock. Turning timeouts up doesn’t help because nothing is running.

This shows up only under parallel test execution because you get concurrent hosts doing the above at the same time, increasing the chance that one host’s startup thread is the same thread that a second host needs to run a TestActor turn on.


Why this happens (step-by-step)

  1. Host starts → ActorSystem created.
    Akka.Hosting executes all WithActors((system, registry) => …) callbacks synchronously as part of startup.

  2. Hosting.TestKit’s callback runs and calls TestKitBase.InitializeTest(system, …) inside that same callback.

  3. InitializeTest creates TestActor (and wires event stream subscriptions). In classic TestKit this actor is usually constructed so that it behaves calling-thread-synchronously (CallingThreadDispatcher semantics) to make Expect… deterministic.

  4. Your actor wiring (same WithActors batch) constructs application actors. During PreStart / registry setup they:

    • log (event stream),
    • Watch/Unwatch,
    • or (some test infra) send to the TestActor.

  5. Those sends require the TestActor to run on the startup thread (calling-thread semantics), but that thread is blocked waiting for the WithActors callback(s) to finish.
    Circular wait: the startup thread can’t return because work wants to run on it; the work can’t run because the thread is blocked.
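
The circular wait in step 5 can be sketched with a small, language-neutral model. This is not Akka.NET code: `run_startup`, `mailbox`, and `app_actor_prestart` are hypothetical names, and the mailbox only stands in for "work that may run solely on the startup thread". The timeout exists only so the stalled case returns instead of hanging.

```python
import queue
import threading


def run_startup(drain_before_wait: bool, timeout: float = 0.5) -> bool:
    """Return True if startup completes, False if it stalls (circular wait).

    Hypothetical model, not Akka.NET: the mailbox holds work that only the
    startup thread is allowed to execute (thread-affine semantics).
    """
    mailbox = queue.Queue()          # work bound to the startup thread
    reply = threading.Event()        # set once the thread-affine actor has run
    wiring_done = threading.Event()  # set once the application actor finished PreStart

    def app_actor_prestart():
        # The application actor sends to the thread-affine actor
        # and cannot finish until that actor has processed the message.
        mailbox.put(reply.set)
        if reply.wait(timeout):
            wiring_done.set()

    worker = threading.Thread(target=app_actor_prestart)
    worker.start()

    if drain_before_wait:
        # The startup thread runs the thread-affine work before blocking.
        mailbox.get(timeout=timeout)()

    # The startup callback blocks until actor wiring has finished.
    completed = wiring_done.wait(timeout)
    worker.join()
    return completed
```

With `drain_before_wait=False` the startup thread blocks while the only work that could unblock it is queued on that same thread, so the call times out and returns `False`. With `drain_before_wait=True` the queued work runs first and startup completes.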

Parallel execution amplifies this because multiple hosts share the same thread pool; chances of the above interleaving go way up.

You don’t need Akka.Remote for this to happen; it’s entirely local and dispatcher/thread-affinity driven.

Changes

  • Add reproduction/regression test

@Arkatufus Arkatufus marked this pull request as draft August 19, 2025 16:52
@Arkatufus Arkatufus marked this pull request as ready for review August 22, 2025 15:40
@Arkatufus
Contributor Author

Fixed by Akka.NET 1.5.48 changes

@Arkatufus Arkatufus merged commit b2768cb into akkadotnet:dev Aug 22, 2025
2 checks passed