[release/9.0-rc2] NRBF Fuzzer and bug fixes #107788
* bug #1: don't allow for values out of the SerializationRecordType enum range
* bug #2: throw SerializationException rather than KeyNotFoundException when the referenced record is missing or it points to a record of different type
* bug #3: throw SerializationException rather than FormatException when it's being thrown by BinaryReader (or sth else that we use)
* bug #4: document the fact that IOException can be thrown
* bug #5: throw SerializationException rather than OverflowException when parsing the decimal fails
* bug #6: 0 and 17 are illegal values for PrimitiveType enum
* bug #7: throw SerializationException when a surrogate character is read (so far an ArgumentException was thrown)

# Conflicts:
# src/libraries/System.Formats.Nrbf/src/System/Formats/Nrbf/NrbfDecoder.cs
dotnet#107532) (so far an ArgumentException was thrown)
- Don't use `Debug.Fail` not followed by an exception (it may cause problems for apps deployed in Debug)
- avoid Int32 overflow
- throw for unexpected enum values just in case parsing has not rejected them
- validate the number of chars read by BinaryReader.ReadChars
- pass serialization record id to ex message
- return false rather than throw EndOfStreamException when provided Stream has not enough data
- don't restore the position in finally
- limit max SZ and MD array length to Array.MaxLength, stop using LinkedList<T> as List<T> will be able to hold all elements now
- remove internal enum values that were always illegal, but needed to be handled everywhere
- Fix DebuggerDisplay
* copy comments and asserts from Levi's internal code review
* apply Levi's suggestion: don't store Array.MaxLength as a const, as it may change in the future
* add missing and fix some of the existing comments
* first bug fix: SerializationRecord.TypeNameMatches should throw ArgumentNullException for null Type argument
* second bug fix: SerializationRecord.TypeNameMatches should know the difference between SZArray and single-dimension, non-zero offset arrays (example: int[] and int[*])
* third bug fix: don't cast bytes to booleans
* fourth bug fix: don't cast bytes to DateTimes
* add one test case that I forgot in the previous PR

# Conflicts:
# src/libraries/System.Formats.Nrbf/src/System/Formats/Nrbf/SerializationRecord.cs
* introduce ArrayRecord.FlattenedLength
* do not include invalid Type or Assembly names in the exception messages, as it's most likely corrupted/tampered/malicious data and could be used as a vector of attack.
* It is possible to have binary array records have an element type of array without being marked as jagged
@MihuBot fuzz NrbfDecoder
@@ -75,29 +75,30 @@ ulong value when TypeNameMatches(typeof(UIntPtr)) => Create(new UIntPtr(value)),
 _ => this
 };
 }
-else if (HasMember("_ticks") && MemberValues[0] is long ticks && TypeNameMatches(typeof(TimeSpan)))
+else if (HasMember("_ticks") && GetRawValue("_ticks") is long ticks && TypeNameMatches(typeof(TimeSpan)))
This is double-dipping the dictionary. Ideally it would be enhanced to use a TryGetValue approach.
That doubles the perf cost of this, and "double" isn't a threat, so it's not critical to fix, but would maybe make sense to do for vNext/main.
I was going to comment that the HasMember call was unnecessary, but it looks like GetMember (called by GetRawValue) will throw for an unknown name.
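A minimal sketch of the "double-dipping" point, assuming the member values live in a `Dictionary<string, object>`. The type `MemberLookupSketch` and its methods are hypothetical illustrations, not the actual `System.Formats.Nrbf` implementation; they only contrast a two-lookup pattern with a single `TryGetValue` call.

```csharp
using System;
using System.Collections.Generic;

public static class MemberLookupSketch
{
    // Hypothetical stand-in for a record's member-name -> value map.
    private static readonly Dictionary<string, object> s_members = new Dictionary<string, object>
    {
        ["_ticks"] = 1234L,
    };

    // Two dictionary lookups: ContainsKey hashes the key once, the indexer hashes it again.
    public static bool TryGetTicksTwoLookups(out long ticks)
    {
        if (s_members.ContainsKey("_ticks") && s_members["_ticks"] is long value)
        {
            ticks = value;
            return true;
        }

        ticks = 0;
        return false;
    }

    // One dictionary lookup: TryGetValue reports presence and returns the value together.
    public static bool TryGetTicksOneLookup(out long ticks)
    {
        if (s_members.TryGetValue("_ticks", out var raw) && raw is long value)
        {
            ticks = value;
            return true;
        }

        ticks = 0;
        return false;
    }
}
```

Both methods return the same result; the second simply avoids hashing the key twice, and it never throws for a missing name.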
@adamsitnik this won't make it before the RC2, which I will do in a few minutes. Please retarget this PR to the |
Assert.True(a);
bool b = bools[1];
Assert.True(b);
bool c = a && b;
I believe you need to use single-ampersand to test what you were trying to test (reliably)
Sharplab shows that sometimes the compiler writes down the IL instruction "and" (which turns into the x86 "and") for both operations, and sometimes it uses brfalse for "&&".
The only time I've seen it use brfalse involved spans, e.g. this example on sharplab... but `&` always ends up as IL:and.
Probably doesn't super matter, as this probably uses IL:and.
After finding some boundary examples, I believe that Roslyn turns `expr1 && expr2` into `and` if `expr2` is a literal-, local-, or parameter- expression, and `brfalse` otherwise. (In my linked example it used `brfalse` because the RHS was an indexer-expression).
If I'm right, then your `&&` here will use the IL `and` instruction, hence the CPU `and` instruction, so it is a valid test.
But, wow, that took a long time to "prove".
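To illustrate why the operator choice matters for this test, here is a hedged sketch. `BoolNormalizationSketch` is a hypothetical helper, not the decoder's code: it shows one way to turn raw bytes into canonical booleans, and what a bitwise `and` would compute if the raw bytes 1 and 2 were instead reinterpreted as booleans without normalization.

```csharp
using System;

public static class BoolNormalizationSketch
{
    // Converts each byte to a canonical bool: any non-zero byte becomes true.
    // This is an assumed approach; the PR only states "don't cast bytes to booleans".
    public static bool[] ToBooleans(byte[] raw)
    {
        var result = new bool[raw.Length];
        for (int i = 0; i < raw.Length; i++)
        {
            result[i] = raw[i] != 0;
        }

        return result;
    }

    // Models what a bitwise AND sees when the operands are the raw bytes themselves:
    // 1 & 2 == 0, so two "true-looking" values can combine to false.
    public static bool BitwiseAndOfRawBytes(byte left, byte right) => (left & right) != 0;
}
```

With normalized values, `a & b` and `a && b` agree; with reinterpreted bytes such as 1 and 2, a bitwise `and` yields 0, which is the failure mode the test is meant to catch.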
writer.Write((byte)PrimitiveType.Char);
}

writer.Write((byte)0xC0); // a surrogate character
Are there also tests that show that a valid surrogate pair works? Or does BinaryFormatter not support characters outside the BMP?
{
result = reader.ReadChars(count);
}
catch (ArgumentException) // A surrogate character was read.
For a single character I can maybe see it throwing for a surrogate half; but I would expect a multiple character read to work for correctly paired surrogates.
Is the comment wrong, and it meant "an unpaired surrogate character was read" (emphasis on "unpaired") or does whatever calls this not support characters outside the BMP? (question is slightly repeated in the tests)
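The paired-versus-unpaired distinction raised here can be sketched as follows. `SurrogateSketch.HasUnpairedSurrogate` is a hypothetical helper written for illustration; it makes no claim about how `BinaryReader.ReadChars` or `NrbfDecoder` actually treat surrogates.

```csharp
using System;

public static class SurrogateSketch
{
    // Returns true when the text contains a high surrogate without a following
    // low surrogate, or a low surrogate without a preceding high surrogate.
    public static bool HasUnpairedSurrogate(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsHighSurrogate(text[i]))
            {
                if (i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
                {
                    i++; // skip the low half of a well-formed pair
                    continue;
                }

                return true;
            }

            if (char.IsLowSurrogate(text[i]))
            {
                return true;
            }
        }

        return false;
    }
}
```

A well-formed pair such as `"\uD83D\uDE00"` (a character outside the BMP) passes this check, while either half on its own does not.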
SerializationRecord record = NrbfDecoder.Decode(Serialize(input));

Assert.Throws<ArgumentNullException>(() => record.TypeNameMatches(type: null));
-Assert.Throws<ArgumentNullException>(() => record.TypeNameMatches(type: null));
+AssertExtensions.Throws<ArgumentNullException>("type", () => record.TypeNameMatches(type: null));
Is a better test, as it ensures that the parameter name is correct.
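As a rough, framework-free illustration of what checking the parameter name adds, the sketch below uses a hypothetical `TypeNameMatchesStub` and a hypothetical `ThrowsWithParamName` helper. It is not the `AssertExtensions` implementation from the runtime test utilities; it only shows that `ArgumentNullException.ParamName` carries the name being verified.

```csharp
using System;

public static class ParamNameSketch
{
    // Hypothetical stand-in for a method that validates its 'type' argument.
    public static bool TypeNameMatchesStub(Type type)
    {
        if (type is null)
        {
            throw new ArgumentNullException(nameof(type));
        }

        return true;
    }

    // Returns true only if the action throws ArgumentNullException
    // and the exception reports the expected parameter name.
    public static bool ThrowsWithParamName(string expectedParamName, Action action)
    {
        try
        {
            action();
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName == expectedParamName;
        }

        return false;
    }
}
```

A check on the exception type alone would still pass if the wrong parameter name were reported; comparing `ParamName` closes that gap.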
throw new SerializationException(ex.Message, ex);
}

return Unsafe.As<long, DateTime>(ref data);
[MethodImpl(MethodImplOptions.NoInlining)]
static DateTime CreateFromAmbiguousDst(ulong ticks)
I don't see new tests for this code. Hopefully there are existing ones.
I don't see anything that would preclude backporting; but there are a few "room for improvement" things for main/vNext.
We need to retarget this PR to the |
Tagging subscribers to 'binaryformatter-migration': @adamsitnik, @bartonjs, @jeffhandley, @terrajobst
This PR combines the PRs that added fuzzing for `NrbfDecoder` with the most recent bug fixes. It consists of:
- […] `DateTime` and are now compliant
- […] `SerializationException` rather than `ArgumentException` for surrogate characters
- […] `Stream` code path
- […] `Debug.Fail` and limited the max array length to `Array.MaxLength` (so far we were doing that only for single dimension arrays, now we do it for multi dimensional arrays as well)
- […] `boolean` and `DateTime`, added `null` handling to `SerializationRecord.TypeNameMatches`
- copied a lot of valuable comments from the internal code review
Customer Impact
Half of the bugs were discovered by the Fuzzer, the other half were reported internally by @GrabYourPitchforks.
Regression
[If yes, specify when the regression was introduced. Provide the PR or commit if known.]
Testing
All bugs have been turned into unit tests (and of course are passing now).
Risk
The bug fixes present in this PR were relatively simple and each of them individually represents a low risk. But the fact that there are so many increases the risk. Because of this risk, we decided to mark the System.Formats.Nrbf assembly as [Experimental] with SYSLIB5005 (#107905).