-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Win32 FileStream will issue a seek on every ReadAsync call #16354
Comments
cc; @ericstj, @ianhays. @JeremyKuhne |
The uber problem here is that FileStream's contract is to update the Length and Position as part completing the synchronous portion of the Read. Folks are seeing the same sorts of problems on Writes where we call setlength before the write. The fix here is to change that contract. If FileStream kept track of it's own Length and Position in the case of an async handle and was less aggressive about making sure the handle was in sync we'd fix these perf issues. The downside is that this changes the externally observable behavior of the file/handle (if shared) so we'd need to quirk it. |
Why do you need to change the contract? The code is already tracking this: If |
I was referring more to the write side of this bug. The length of the file is observable from an external process. |
Still the same thing, though. It is the OS responsibility to update the file length if the file length changed. What is the scenario where an external process require you to set the file pointer if the handle isn't available externally? |
Ideally yes, but we set a precedent in desktop otherwise (set length before write), thus the need to be careful about this fix. I'm not saying we don't do this, just saying we need to be careful about existing externally observable behavior. |
seek() + write() vs pwrite() |
My test was run on Windows. *Hibernating Rhinos Ltd * Oren Eini* l CEO l *Mobile: + 972-52-548-6969 Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811 On Fri, Feb 12, 2016 at 7:03 PM, OmariO notifications@github.com wrote:
|
@omariom You can simulate pwrite on Windows by using WriteFile with an OVERLAPPED structure with hEvent set to 0, and the 64bit file pointer set via Offset and OffsetHigh. |
Another problem with An excellent read for Linux: https://learning.oreilly.com/library/view/The+Linux+Programming+Interface/9781593272203/xhtml/ch05.xhtml#ch05lev1sec06 I am going to try to figure a way to use |
Why is that a problem? |
We perform the extra seek to verify that nobody else has changed the file offset in the meantime (because we want to throw in such situations). Despite doing it and paying the perf price for it we can still miss the situation when it got modified after the call to seek and before the call to read? |
Personally, I think this existing attempt at protection is inherently flawed in purpose. The implementation doesn't try to make it work, but rather simply tries to detect misuse, and as you say, such protection is broken, anyway. Maybe I'm misunderstanding the suggestion, but switching to pread suggests to me you're not just trying to be more accurate in the misuse detection, but actually aim to make the scenario work (rather than detect misuse). I don't think that needs to be a goal; personally I think we can just delete the flawed attempt at protection. What did you have in mind for pread? |
It is allowed to have multiple concurrent On runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs Lines 523 to 530 in c7fda3b
So that will ensure that the position is changed in step. That make me think that on Linux, you can just drop this (assuming that the semaphore applies to write as well). But on Windows, there is this comment that is interesting: runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Windows.cs Lines 879 to 883 in c7fda3b
Seems to indicate that the position may change unexpected with writes. It looks like the Seems far fetched. |
Fixed by #49975, please expect a blog post with all the details and perf numbers in April (before we ship preview 4) |
Full description is here:
https://ayende.com/blog/173345/the-cost-of-async-i-o?key=227b21e219594be48ca03bc8312b08b9
But the gist of it is that we are trying to call ReadAsync on a file that is located on a remote share.
Using ReadAsync, we saw a really horrible performance degradation.
We tracked it down to this issue:
And here is the problematic code:
https://github.com/dotnet/corefx/blob/master/src/System.IO.FileSystem/src/System/IO/Win32FileStream.cs#L1201
Given that all async calls are always passing the position they use in the overlap, and how expensive this call is, it make for a very bad combination.
The text was updated successfully, but these errors were encountered: