Data corruption in BinaryReader.ReadChars #30637
Comments
Since this issue has existed since .NET Framework, I will give it the 5.0 milestone.
@carlossanlop, you wrote above for the UTF8 issue "The code below passes on .NET Core 2.2 and fails on 3.0 when using UTF-8"... does that issue pass or fail on .NET Framework? Regardless, it sounds like a regression from 2.2 that should be addressed in 3.0, no?
It fails on .NET Framework, which is why it was originally opened as an internal bug. Then we tested it on Core, and since it fails, I opened a bug here too.
I just tried your UTF8 example from above... it outputs true/true/true/true on both .NET Framework 4.8 and .NET Core 2.2, and fails with an exception on .NET Core 3.0.
Ah, you're correct. You quoted the text I wrote above the second case.
This issue can be split into two:
I believe there is only one root cause, with two different symptoms. The code attempts to establish an upper limit for the number of bytes remaining in the stream that represent the char array being read, but it fails to account for the internal state of the decoder being used, and sometimes chooses an upper limit that’s too high. In 3.0, it ignores such state altogether, so the UTF-8 and UTF-16 symptoms manifest. In 2.2, it tries to account for the state, but in a manner that’s only sufficient for UTF-8, not for UTF-16, so only the UTF-16 symptom manifests. However, the root cause in either case is the same.
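To make the "upper limit that's too high" point concrete, here is a minimal sketch of the arithmetic. It is not the `BinaryReader` source: `DecoderStateDemo`, `Measure`, and the 3-byte short read are made up for illustration. It only shows that once a `Decoder` is holding a pending byte, `charsRemaining * 2` overestimates the bytes still belonging to the char array.

```csharp
using System;
using System.Text;

// Illustrative only: class and method names are hypothetical, not from BinaryReader.
static class DecoderStateDemo
{
    public static (int CharsDecoded, int NaiveLimit, int BytesLeft) Measure()
    {
        byte[] encoded = Encoding.Unicode.GetBytes("abcd"); // 4 chars, 8 bytes in UTF-16
        Decoder decoder = Encoding.Unicode.GetDecoder();
        char[] chars = new char[4];

        // Pretend the stream returned only 3 of the 8 bytes (a short read).
        const int shortRead = 3;
        int charsDecoded = decoder.GetChars(encoded, 0, shortRead, chars, 0, false);

        // One char was produced; the third byte is now held inside the decoder.
        int charsRemaining = chars.Length - charsDecoded;

        // Bound that ignores decoder state: two bytes per remaining char.
        int naiveLimit = charsRemaining * 2;

        // Bytes of the char array that are actually still in the stream.
        int bytesLeft = encoded.Length - shortRead;

        return (charsDecoded, naiveLimit, bytesLeft);
    }

    static void Main() => Console.WriteLine(Measure()); // prints (1, 6, 5)
}
```

The naive bound is 6 while only 5 bytes of the array remain, so a read sized by that bound can consume one byte of whatever follows the char data.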
It is even listed as a known problem in the documentation: https://docs.microsoft.com/en-us/dotnet/api/system.io.binaryreader.readchars?view=netcore-3.0#remarks
I ran the UTF8 example in both 2.2 and 3.0 and compared the code that is being executed:
This is the exception callstack:
@jkotas since your PR refactored the BinaryReader/Writer code, can you please take a look at the UTF8 regression? CC @JeremyKuhne
BinaryReader.ReadChars incorrectly read more than necessary from the underlying Stream when multi-byte characters straddled the read chunks. Fixes https://github.com/dotnet/corefx/issues/40455
Changes: mono/api-snapshot@fc50bc4...45a61d9 $ git diff --shortstat fc50bc4f...45a61d93 22 files changed, 775 insertions(+), 474 deletions(-) Changes: dotnet/cecil@a6c8f5e...a6a7f5c $ git diff --shortstat a6c8f5e1...a6a7f5c0 55 files changed, 818 insertions(+), 530 deletions(-) Changes: mono/corefx@1f87de3...49f1c45 $ git diff --shortstat e4f7102b...49f1c453 38 files changed, 1171 insertions(+), 419 deletions(-) Changes: dotnet/linker@ebe2a1f...e8d054b $ git diff --shortstat ebe2a1f4...e8d054bf 137 files changed, 5360 insertions(+), 1781 deletions(-) Changes: mono/mono@8946e49...18920a8 $ git diff --shortstat 8946e49a...18920a83 1811 files changed, 47240 insertions(+), 48331 deletions(-) Changes: xamarin/xamarin-android-api-compatibility@a61271e...50a3c52 $ git diff --shortstat a61271e0...50a3c52d 1 file changed, 2 insertions(+), 791 deletions(-) Fixes: #3619 Context: https://dev.azure.com/devdiv/DevDiv/_workitems/edit/1005448 Context: https://devdiv.visualstudio.com/DefaultCollection/DevDiv/_workitems/edit/967582 Context: https://github.com/dotnet/coreclr/issues/26370 Context: https://github.com/dotnet/coreclr/issues/26479 Context: https://github.com/dotnet/corefx/issues/40455 Context: https://github.com/dotnet/corefx/issues/40578 Context: mono/mono#7377 Context: mono/mono#12421 Context: mono/mono#12586 Context: mono/mono#14080 Context: mono/mono#14725 Context: mono/mono#14772 Context: mono/mono#15261 Context: mono/mono#15262 Context: mono/mono#15263 Context: mono/mono#15307 Context: mono/mono#15308 Context: mono/mono#15310 Context: mono/mono#15646 Context: mono/mono#15687 Context: mono/mono#15805 Context: mono/mono#15992 Context: mono/mono#15994 Context: mono/mono#15999 Context: mono/mono#16032 Context: mono/mono#16034 Context: mono/mono#16046 Context: mono/mono#16192 Context: mono/mono#16308 Context: mono/mono#16310 Context: mono/mono#16369 Context: mono/mono#16380 Context: mono/mono#16381 Context: mono/mono#16395 Context: mono/mono#16411 Context: 
mono/mono#16415 Context: mono/mono#16486 Context: mono/mono#16570 Context: mono/mono#16605 Context: mono/mono#16616 Context: mono/mono#16689 Context: mono/mono#16701 Context: mono/mono#16712 Context: mono/mono#16742 Context: mono/mono#16759 Context: mono/mono#16803 Context: mono/mono#16808 Context: mono/mono#16824 Context: mono/mono#16876 Context: mono/mono#16879 Context: mono/mono#16918 Context: mono/mono#16943 Context: mono/mono#16950 Context: mono/mono#16974 Context: mono/mono#17004 Context: mono/mono#17017 Context: mono/mono#17038 Context: mono/mono#17040 Context: mono/mono#17083 Context: mono/mono#17084 Context: mono/mono#17133 Context: mono/mono#17139 Context: mono/mono#17151 Context: mono/mono#17180 Context: mono/mono#17278 Context: mono/mono#17549 Context: mono/mono#17569 Context: mono/mono#17665 Context: mono/mono#17687 Context: mono/mono#17737 Context: mono/mono#17790 Context: mono/mono#17924 Context: mono/mono#17931 Context: https://github.com/mono/mono/issues/26758 Context: https://github.com/mono/mono/issues/37913 Context: dotnet/macios#7005
This issue was first reported internally as a bug in .NET Framework, but it is also reproducible in .NET Core.
UTF-16
When using a 2-byte encoding (i.e. UTF-16), and `_stream.Read()` returns an odd number of bytes (e.g. from a network stream), then in the next iteration of the loop, `numBytes` can be off by one, and the next read can go past the end of the encoded data in the stream. This can cause an unexpected end of stream and/or corruption of any subsequent data read from it.

The repro is supposed to print all "True"s, as it does when a `MemoryStream` is used. However, when using a mock `NetStream` that simulates short reads, data corruption occurs.

Update: It seems that this issue has always been there, so it's not a new regression.
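The original repro code did not survive in this copy of the issue. The following is a rough reconstruction of the idea, not the author's code: `ShortReadStream` is a hypothetical stand-in for the mock `NetStream`, and the 3-byte cap, the test string, and the sentinel value are arbitrary choices. It writes UTF-16 chars followed by an `int`, then checks whether both come back intact.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for the mock NetStream: returns at most 3 bytes per Read call,
// so UTF-16 code units regularly straddle two reads.
sealed class ShortReadStream : MemoryStream
{
    public override int Read(byte[] buffer, int offset, int count)
        => base.Read(buffer, offset, Math.Min(count, 3));
}

static class ReadCharsRepro
{
    const string Text = "abcdefgh";
    const int Sentinel = 0x12345678;

    public static (bool CharsOk, bool SentinelOk) Run(Stream stream)
    {
        using (var writer = new BinaryWriter(stream, Encoding.Unicode, leaveOpen: true))
        {
            writer.Write(Text.ToCharArray()); // raw UTF-16 chars, no length prefix
            writer.Write(Sentinel);           // data that must not be disturbed by ReadChars
            writer.Write(0);                  // padding so an over-read shows as bad data
        }

        stream.Position = 0;
        using (var reader = new BinaryReader(stream, Encoding.Unicode, leaveOpen: true))
        {
            string text = new string(reader.ReadChars(Text.Length));
            int sentinel = reader.ReadInt32();
            return (text == Text, sentinel == Sentinel);
        }
    }

    static void Main()
    {
        Console.WriteLine(Run(new MemoryStream()));    // baseline: plain MemoryStream
        Console.WriteLine(Run(new ShortReadStream())); // short reads
    }
}
```

If `ReadChars` consumes a byte beyond the char data, the sentinel check on the short-read stream should come back false on an affected runtime, while a runtime that contains the fix should report true for both checks on both streams.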
UTF-8
The code below passes on .NET Framework and .NET Core 2.2, but fails on 3.0, so it's a new regression:
Remarks
All versions of `BinaryReader` fail the UTF-16 repro.
.NET Core 2.2 has a `BinaryReader` that has a special case for handling this edge case for single-byte encodings:
The special case in 2.2: dotnet/coreclr/blob/ce1d090d33b400a25620c0145046471495067cc7/src/mscorlib/src/System/IO/BinaryReader.cs#L377-L386
No special case in 3.0 and 5.0: dotnet/coreclr/blob/9642de76d4f5e563150a90f5923b304d87d7a8b1/src/System.Private.CoreLib/shared/System/IO/BinaryReader.cs#L387-L389