JsonSerializer deserialization using overloads taking Utf8JsonReader is unexpectedly slower than equivalents taking string #99674
Comments
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
I believe the serializer does this for custom converter validation, and it is done in a performant way: runtime/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonConverterOfT.cs, line 503 (commit 4cf19ee).

Perhaps the …
The current behavior is very much intentional, but IIRC this is mostly forced by assumptions that the serializer makes about the input data (it must be a self-contained JSON value without trailing data). In principle I think it might be possible to avoid some of that double parsing by reworking the core serialization routines (allowing trailing data, adding extra checks ensuring deserialization does not escape its current scope), but that would require benchmarking to ensure it results in a net perf improvement.
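To illustrate the scoping contract being described here, a minimal sketch (the `Item` type and payload are placeholders, not taken from the issue): the reader-based overload consumes exactly one JSON value, so the serializer must first locate the end of that value before deserializing it.

```csharp
// Hedged sketch (not from the issue): shows how the Utf8JsonReader-based
// overload consumes exactly one JSON value from a larger payload.
using System;
using System.Text;
using System.Text.Json;

byte[] utf8 = Encoding.UTF8.GetBytes("[{\"Id\":1},{\"Id\":2}]");

var reader = new Utf8JsonReader(utf8);
reader.Read();  // StartArray
reader.Read();  // StartObject of the first element

// Deserializes only the first object; the serializer must internally find the
// end of this value (hence the skip discussed above) before handing it off.
var first = JsonSerializer.Deserialize<Item>(ref reader);

// On return the reader is positioned at the EndObject token of that value;
// the remaining array elements are untouched.
Console.WriteLine(first?.Id);        // 1
Console.WriteLine(reader.TokenType); // EndObject

record Item(int Id);
```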
@eiriktsarpalis - The scenario that caused me to encounter this unexpected performance difference was the need to deserialize from a `ReadOnlySequence<byte>`. Given the apparent difficulty in changing the behaviour of the existing `Deserialize(ref Utf8JsonReader, ...)` overloads, perhaps first-class support for deserializing from a `ReadOnlySequence<byte>` would be the way forward - unless there's already a way to do that which I've missed?
That is one possibility. FWIW, `ReadOnlySequence` was never a first-class citizen in the serializer layer (in many cases converters assume the underlying data is a span); however, #68586 is going to change this.
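For context, a `ReadOnlySequence<byte>` can already be passed to the serializer by wrapping it in a `Utf8JsonReader`; a minimal sketch follows (the `Payload` type and data are illustrative), though the skip overhead discussed in this issue still applies to this path.

```csharp
// Hedged sketch: deserializing from a ReadOnlySequence<byte> today by going
// through Utf8JsonReader. The double-scan overhead described in this issue
// applies to this path as well.
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

ReadOnlySequence<byte> sequence =
    new ReadOnlySequence<byte>(Encoding.UTF8.GetBytes("{\"Name\":\"example\"}"));

var reader = new Utf8JsonReader(sequence);
var model = JsonSerializer.Deserialize<Payload>(ref reader);
Console.WriteLine(model?.Name); // example

record Payload(string Name);
```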
Description
Whilst testing JSON deserialization performance of moderately-sized (10-15 KiB) UTF-8 data with the latest System.Text.Json NuGet package (8.0.3), I found that the `JsonSerializer.Deserialize(ref Utf8JsonReader, ...)` overloads seem to be unexpectedly much slower than those taking the data as a `string`. The difference is such that it can be faster to convert the underlying UTF-8 bytes to a `string` first and then use a `JsonSerializer.Deserialize(string, ...)` overload instead.

The problem seems most pronounced when the JSON data contains string values with many escape sequences (escaped double-quotes in my testing, but others may also exhibit the issue).
Configuration
Regression?
I don't believe so.
Data
The following code was used to perform the benchmark. I have omitted the JSON data itself from the snippet below due to its size, but can provide it separately.
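The original snippet is not reproduced here; the following is a hedged sketch of an equivalent BenchmarkDotNet comparison, with the `MyModel` type and the inline payload as placeholders rather than the actual 10-15 KiB document.

```csharp
// Illustrative sketch only, not the original benchmark: compares the
// string-based and Utf8JsonReader-based Deserialize overloads with
// BenchmarkDotNet. The payload and model are placeholders.
using System.Text;
using System.Text.Json;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class DeserializeBenchmarks
{
    // Placeholder payload; the real data was a 10-15 KiB document whose
    // string values contained many escaped double-quotes.
    private static readonly byte[] Utf8Json = Encoding.UTF8.GetBytes(
        "{\"Name\":\"value with \\\"escaped\\\" quotes\",\"Items\":[1,2,3]}");

    [Benchmark(Baseline = true)]
    public MyModel? FromString()
    {
        // Convert the UTF-8 bytes to a string first, then use the string overload.
        string json = Encoding.UTF8.GetString(Utf8Json);
        return JsonSerializer.Deserialize<MyModel>(json);
    }

    [Benchmark]
    public MyModel? FromUtf8JsonReader()
    {
        // Read directly from the UTF-8 bytes via Utf8JsonReader.
        var reader = new Utf8JsonReader(Utf8Json);
        return JsonSerializer.Deserialize<MyModel>(ref reader);
    }
}

public class MyModel
{
    public string? Name { get; set; }
    public int[]? Items { get; set; }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<DeserializeBenchmarks>();
}
```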
Analysis
Cursory profiling suggests that the overhead of the `Deserialize(ref Utf8JsonReader, ...)` overloads stems from the calls to `GetReaderScopedToNextValue()`, which performs a `reader.TrySkip()` that would seem to result in essentially parsing the JSON data twice: first while skipping the current array/object in `GetReaderScopedToNextValue`, then again when actually deserializing the data. This perhaps explains why it's approximately twice as slow as the baseline in the above results.

Unfortunately, all `Deserialize(ref Utf8JsonReader, ...)` overloads appear to call `GetReaderScopedToNextValue`, so this overhead cannot be avoided, even if the caller knows that the reader is already appropriately "scoped".
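As a rough illustration of the double scan (a hedged sketch of the effect, not the serializer's internal code): `Utf8JsonReader.TrySkip()` has to walk every token of the current array or object to find its end, and the subsequent deserialization then tokenizes the same bytes again.

```csharp
// Hedged sketch of the double-scan effect: TrySkip() walks the whole value to
// find its end, and deserialization then tokenizes the same bytes again.
// This approximates the behaviour described above; it is not the serializer's
// actual implementation.
using System;
using System.Text;
using System.Text.Json;

byte[] utf8 = Encoding.UTF8.GetBytes("{\"Items\":[1,2,3],\"Name\":\"x\"}");

// First pass: skip over the value (comparable to GetReaderScopedToNextValue).
var scanner = new Utf8JsonReader(utf8);
scanner.Read();          // StartObject
scanner.TrySkip();       // walks every token until the matching EndObject
long bytesScanned = scanner.BytesConsumed;

// Second pass: actually deserialize, reading the same bytes again.
var reader = new Utf8JsonReader(utf8);
var model = JsonSerializer.Deserialize<Doc>(ref reader);

Console.WriteLine($"Skipped {bytesScanned} bytes, then deserialized Name={model?.Name}");

record Doc(int[] Items, string Name);
```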