Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition with cancellation tokens in Read/Flush operations #59090

Merged
merged 4 commits into from
Sep 15, 2021

Conversation

davidfowl
Copy link
Member

@davidfowl davidfowl commented Sep 14, 2021

  • There's a tight race where UnsafeRegister can fire inline and then ReadAsync never completes. This is because the cancellation token property has not been assigned yet so the callback noops. This change checks the result of the UnsafeRegister operation to see if ran synchronously and throws if it did. We also move any state transitions to after these checks to make sure the PipeAwaitable state doesn't change before throwing.

Fixes #58909

cc @zlatanov

PS: We conveniently added an overload that takes the cancellation token in .NET 6 but I didn't try to use it here.

- There's a tight race where UnsafeRegister can fire inline and then ReadAsync never completes. This is because the cancellation token property has not been assigned yet so the callback noops. This change checks the result of the UnsafeRegister operation to see if ran synchronously and throws if it did. We also move any state transitions to after these checks to make sure the PipeAwaitable state doesn't change before throwing.
@davidfowl davidfowl changed the title Fix race condition with cancellation tokens in Read/Write operations Fix race condition with cancellation tokens in Read/Flush operations Sep 14, 2021
@zlatanov
Copy link
Contributor

@davidfowl wouldn't it be a problem that the completion data isn't being zeroed out? Also the Pipe seem to use the completion data to schedule the next read / write.

Here Cancel(out completionData) will not be called:

public void CancellationTokenFired(out CompletionData completionData)
{
// We might be getting stale callbacks that we already unsubscribed from
if (CancellationToken.IsCancellationRequested)
{
Cancel(out completionData);
}
else
{
completionData = default;
}
}

And here the next TrySchedule will have default CompletionData:

private void ReaderCancellationRequested()
{
CompletionData completionData;
lock (SyncObj)
{
_readerAwaitable.CancellationTokenFired(out completionData);
}
TrySchedule(ReaderScheduler, completionData);
}

And it seems that TrySchedule with default completion data is a noop.

private static void TrySchedule(PipeScheduler scheduler, in CompletionData completionData)
{
Action<object?> completion = completionData.Completion;
// Nothing to do
if (completion is null)
{
return;
}
// Ultimately, we need to call either
// 1. The sync context with a delegate
// 2. The scheduler with a delegate
// That delegate and state will either be the action passed in directly
// or it will be that specified delegate wrapped in ExecutionContext.Run
if (completionData.SynchronizationContext is null && completionData.ExecutionContext is null)
{
// Common fast-path
scheduler.UnsafeSchedule(completion, completionData.CompletionState);
}
else
{
ScheduleWithContext(scheduler, in completionData);
}
}

This might as well be the desired behavior, I'm just pointing it out :)

@davidfowl
Copy link
Member Author

davidfowl commented Sep 14, 2021

In the cases where the PipeAwaitable returns no CompletionData it's a noop schedule. That's by design. It could be optimized by making these methods return a bool but this is how it has always worked.

@zlatanov
Copy link
Contributor

zlatanov commented Sep 14, 2021

In the cases where the PipeAwaitable returns no CompletionData it's a noop schedule. That's by design. It could be optimized by making these methods return a bool but this is how it has always worked.

This actually is my point. In case of synchronous cancellation callback, the completion data will be defaulted (but not reset). If it was asynchronous, it could contain actual data.

@davidfowl
Copy link
Member Author

davidfowl commented Sep 14, 2021

This actually is my point. In case of synchronous cancellation callback, the completion data will be defaulted (but not reset). If it was asynchronous, it could contain actual data.

That's the case that's being handled by this PR. It will run the callback, result in a noop, then throw synchronously from ReadAsync/FlushAsync (aka BeginOperation)

@adityamandaleeka
Copy link
Member

@davidfowl Will you be backporting this to 6 once it's in?

@davidfowl
Copy link
Member Author

6, 5, 3 yes

@ericstj
Copy link
Member

ericstj commented Sep 15, 2021

Runtime test failiures are known issue on Win11, disabled tests in #59135
image

Runtime-staging failure is build race condition fixed by #59181

@ericstj ericstj merged commit 467f5c6 into main Sep 15, 2021
@davidfowl
Copy link
Member Author

/backport to release/6.0

@github-actions
Copy link
Contributor

Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1239603000

@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
@davidfowl
Copy link
Member Author

/backport to release/6.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

System.IO.Pipelines.Pipe.ReadAsync fails to cancel on CancellationToken
6 participants