Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] AsyncAutoResetEvent deadlocks #82

Closed
jasonswearingen opened this issue Oct 28, 2021 · 12 comments
Closed

[bug] AsyncAutoResetEvent deadlocks #82

jasonswearingen opened this issue Oct 28, 2021 · 12 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@jasonswearingen
Copy link

jasonswearingen commented Oct 28, 2021

Run this test. On my computer, it deadlocks after about 6 seconds

	[TestMethod]
	public async Task DotNextAsyncAutoResetEvent()
	{
		DotNext.Threading.AsyncAutoResetEvent autoResetEvent = new(false);

		var loopCount = 0;
		var setCount = 0;
		var consumedCount = 0;
		var timer =Stopwatch.StartNew();
		var lastSecondReported = 0;
		var producerTask = Task.Run(() => {

			while (true)
			{
				loopCount++;
				var didSet = autoResetEvent.Set();
				if (didSet)
				{
					setCount++;
				}

				if (timer.Elapsed > TimeSpan.FromSeconds(lastSecondReported))
				{
					var tup1 = new { loopCount };
					var tup = new { loopCount, setCount, consumedCount };
					Console.WriteLine($"t={lastSecondReported}sec:  {new { loopCount, setCount, consumedCount }}");
					lastSecondReported++;
				}

				if (timer.Elapsed > TimeSpan.FromSeconds(30))
				{
					break;
				}
			}
		});


		var consumerTask = Task.Run(async () => {

			while (true)
			{
				var success = await autoResetEvent.WaitAsync(TimeSpan.FromMilliseconds(1));
				if (success)
				{
					consumedCount++;
				}
				if (producerTask.IsCompleted)
				{
					break;
				}
			}
		});


		await producerTask;
		autoResetEvent.Set();
		await consumerTask;
	}

using DotNext 4.0.0-beta.8 from Nuget

@jasonswearingen
Copy link
Author

using await autoResetEvent.WaitAsync(); instead of the TimSpan overload does not seem to deadlock

@sakno sakno self-assigned this Oct 28, 2021
@sakno sakno added the bug Something isn't working label Oct 28, 2021
@sakno sakno added this to the 4.0 milestone Oct 28, 2021
@sakno
Copy link
Collaborator

sakno commented Oct 28, 2021

@jasonswearingen , thanks for the report! Regardless of potential bug, I'm trying to figure out what happening in your test. I would appreciate if you can help me with that. I see the following:

  1. Logical thread trying to set the event to signaled state during 30 seconds (Producer thread)
  2. Another thread trying to wait and reset the event automatically (Consumer thread). If Producer thread is finished, then finalize Consumer thread as well
  3. The test method waits for Producer thread only

I see that Producer thread has no async code (no await operators). Moreover, the test waits for Producer completion only. So I'm not sure how AsyncAutoResetEvent can block the entire test. Even if it is bugged, it's not observable by the test. I think, you miss await consumerTask.

@jasonswearingen
Copy link
Author

yes you are correct regarding the test workflow.

And yes, there is a small bug in the test that would leave the consumer task stalled if the deadlock did not exist. The last line should have:

await producerTask;
autoResetEvent.Set();

I have updated the test code accordingly.

@sakno
Copy link
Collaborator

sakno commented Oct 28, 2021

Also, there is an interesting bug in .NET itself that was fixed: dotnet/runtime#60182. It is observable for very small timeouts only. 1 milliseconds is enough to blow up on that mine. Unfortunately, the fix is not included in .NET 6 RC2. Timeout tracking in all asynchronous primitives rely on CancellationTokenSource and its ability to reset the state. The bug unpredictably crashes the internal state of DotNext.Threading.Tasks.ValueTaskCompletionSource<T> that might be a source of deadlock.

@sakno
Copy link
Collaborator

sakno commented Oct 28, 2021

@jasonswearingen , maybe

await producerTask;
autoResetEvent.Set();
await consumerTask;

?

@jasonswearingen
Copy link
Author

sure that would be better :)

btw when I ran it, the producer thread is hung inside the .Set() call, and further up the callstack it is stuck inside the ValueTaskCompletionSource. So your theory seems valid.

@sakno
Copy link
Collaborator

sakno commented Oct 28, 2021

Unfortunately, we have to wait for final release of .NET 6 and then check again.

@sakno
Copy link
Collaborator

sakno commented Nov 10, 2021

@jasonswearingen , .NEXT RC1 has been released. It uses GA version of .NET 6 so you can try tests again.

@jasonswearingen
Copy link
Author

jasonswearingen commented Nov 11, 2021

The bug is still there. deadlocks in same location. tested with dotnet6.0 release and dotNext 4.0.0-rc.1

@sakno
Copy link
Collaborator

sakno commented Nov 11, 2021

I found the root cause:

  1. Thread # 1: ManualResetCompletionSource.CancellationRequested is trying to enter monitor infinitely
  2. Thread # 2: ManualResetCompletionSource.OnCompleted is trying to enter monitor infinitely
  3. Thread # 3: ManualResetCompletionSource.StopTrackingCancellation stuck on timeoutTracker.Dispose() method

timeoutTracker is of type CancellationTokenRegistration. I think that Dispose waits for completion of cancellation callback. In the same time, cancellation callback stuck on Thread # 1. So Dispose is called within monitor and cancellation callback too. As a result, we observe a dead lock.

@sakno
Copy link
Collaborator

sakno commented Nov 11, 2021

According to sources in .NET repo, Dispose call is blocking while Unregister not.

@sakno
Copy link
Collaborator

sakno commented Nov 11, 2021

@jasonswearingen , many thanks for reporting this bug! The potential fix is in develop branch. The related commit is specified above.

@sakno sakno closed this as completed in 495e56f Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants