Fix WriteAsyncCancelledFile test failure #56895
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details
Try to fix #56894
Please ignore this PR until the CI is 100% green
@jozkee some explanation: in #56716 I've changed the code to use |
- get => _filePosition;
- set => _filePosition = value;
+ get => Interlocked.Read(ref _filePosition);
+ set => Interlocked.Exchange(ref _filePosition, value);
What kind of performance impact might this end up having, in particular on 32-bit? Is this just a case of a test doing something unsupported, or do we think the access pattern driving the need for this is valid?
> What kind of performance impact might this end up having, in particular on 32-bit?
For the following micro-benchmark:
[Benchmark]
[Arguments(OneKibibyte, FileOptions.None)]
[Arguments(OneKibibyte, FileOptions.Asynchronous)]
public void PositionGetSet(long fileSize, FileOptions options)
{
    string filePath = _sourceFilePaths[fileSize];
    using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, FourKibibytes, options))
    {
        for (long i = 0; fileStream.Position < fileSize; i++)
        {
            fileStream.Position = i;
        }
    }
}
For x86:
Method | Job | fileSize | options | Mean |
---|---|---|---|---|
PositionGetSet | .NET 5 | 1024 | None | 730.0 us |
PositionGetSet | .NET 6 | 1024 | None | 69.95 us |
PositionGetSet | this PR | 1024 | None | 329.94 us |
PositionGetSet | .NET 5 | 1024 | Asynchronous | 2,821.5 us |
PositionGetSet | .NET 6 | 1024 | Asynchronous | 80.85 us |
PositionGetSet | this PR | 1024 | Asynchronous | 346.32 us |
So the impact on x86 is quite large (roughly 4x to 5x slower than .NET 6). However, we are still about 2x faster than .NET 5 for FileOptions.None and about 8x faster for FileOptions.Asynchronous, and I don't expect this property to be on a hot path.
> Is this just a case of a test doing something unsupported, or do we think the access pattern driving the need for this is valid?
The test is not doing anything unsupported. It starts an async write operation and tries to cancel it; if the cancellation succeeds, we check that the file length and position are both 0.
Based on my analysis of the source code and of how WriteFile return values and callbacks are handled, my only idea was false sharing, and the extra interlocked seems to solve the problem.
I am going to see if marking _position as volatile would help.
a volatile field cannot be of the type 'long'
So I don't have any other ideas except disabling this test for x86.
Actually, I have one more idea, but let's wait for the CI results before we get into details (it might not be worth it).
- get => _filePosition;
- set => _filePosition = value;
+ get => Interlocked.Read(ref _filePosition);
+ set => Interlocked.Exchange(ref _filePosition, value);
How can this have an outdated value? The failing test (WriteAsyncCancelledFile) is creating a new file and a new FileStream instance in each run, right?
It might be caused by false sharing, but I am just guessing: https://docs.microsoft.com/en-us/archive/msdn-magazine/2008/october/net-matters-false-sharing
Wow, that's a good article. The lead author must really know what he's talking about. 😄
That said, this isn't false sharing. @jozkee is correct that this value shouldn't be outdated, as it comes linearly after the operations should have completed and quiesced with appropriate synchronization to ensure that all updates were published. I don't know why it's failing, nor why this fixes it, but it suggests a real problem with something potentially being done in a problematic order in the code, such as accidentally changing _filePosition after completing the task.
(When I previously commented, I hadn't actually looked at the test in question. I thought this was trying to fix a problem where the test was accessing position concurrently with asynchronous operations being in flight, in which case there could be torn reads/writes of a 64-bit value on a 32-bit platform. But that doesn't appear to be what this test is doing. Maybe something in the implementation itself is doing so?)
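For readers unfamiliar with the torn read/write concern mentioned above, here is a minimal sketch of what the Interlocked accessors in the diff guard against. It is illustrative only: PositionHolder and TornReadDemo are hypothetical names, not types from the runtime, and the sketch does not reproduce the WriteAsyncCancelledFile failure. A writer thread alternates a 64-bit field between two values whose upper and lower halves both differ, while the reader counts any value that is neither of the two. With Interlocked.Read and Interlocked.Exchange the count should stay at 0 on both 32-bit and 64-bit platforms.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a type with a 64-bit position field;
// it only mirrors the accessor shape shown in the diff.
public sealed class PositionHolder
{
    private long _filePosition;

    public long Position
    {
        // Interlocked.Read and Interlocked.Exchange treat the 64-bit value
        // as a single atomic unit, including on 32-bit platforms.
        get => Interlocked.Read(ref _filePosition);
        set => Interlocked.Exchange(ref _filePosition, value);
    }
}

public static class TornReadDemo
{
    // Both 32-bit halves differ between A and B, so a torn read would
    // surface as a third value (half zeros, half ones).
    public const long A = 0L;
    public const long B = -1L;

    // Returns how many reads observed a value other than A or B.
    public static int CountTornReads(int iterations)
    {
        var holder = new PositionHolder();
        using var cts = new CancellationTokenSource();

        // Writer: keeps flipping the field between A and B until cancelled.
        Task writer = Task.Run(() =>
        {
            while (!cts.IsCancellationRequested)
            {
                holder.Position = A;
                holder.Position = B;
            }
        });

        // Reader: samples the field concurrently with the writer.
        int torn = 0;
        for (int i = 0; i < iterations; i++)
        {
            long observed = holder.Position;
            if (observed != A && observed != B)
            {
                torn++;
            }
        }

        cts.Cancel();
        writer.Wait();
        return torn;
    }

    public static void Main()
    {
        Console.WriteLine(CountTornReads(1_000_000));
    }
}
```

Swapping the accessors back to plain field reads and writes and running as a 32-bit process is the configuration in which a nonzero count becomes possible. As noted above, though, the failing test reads the position only after the operation has completed, so this pattern by itself does not explain the failure.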
…ration finishes, but track the next async operation offset in a separate field and update it accordingly
I am going to close this PR and send a new one with the fix once I figure out what exactly is going on. Thank you all for the feedback!