Developers should be able to pass state to custom converters. #63795

eiriktsarpalis · 2022-01-14T15:14:16Z

Background and Motivation

The current model for authoring custom converters in System.Text.Json is general-purpose and powerful enough to address most serialization customization requirements. Where it currently falls short is the ability to accept user-provided state scoped to the current serialization operation; this effectively blocks a few relatively common scenarios:

- Custom converters requiring dependency injection scoped to the current serialization. The lack of a reliable state-passing mechanism can prompt users to rebuild the converter cache every time a serialization operation is performed.
- Custom converters do not currently support streaming serialization. Built-in converters avail of the internal "resumable converter" abstraction, a pattern which allows partial serialization and deserialization by marshalling the serialization state/stack into a state object that gets passed along converters. It lets converters suspend and resume serialization as soon as the need to flush the buffer or read more data arises. This pattern is implemented using the internal JsonConverter<T>.TryWrite and JsonConverter<T>.TryRead methods.

  Since resumable converters are an internal implementation detail, custom converters cannot support resumable serialization. This can create performance problems in both serialization and deserialization:
  - In async serialization, System.Text.Json will delay flushing the buffer until the custom converter (and any child converters) have completed writing.
  - In async deserialization, System.Text.Json will need to read ahead all JSON data at the current depth to ensure that the custom converter has access to all required data at the first read attempt.

  We should consider exposing a variant of that abstraction to advanced custom converter authors.
- Custom converters are not capable of passing internal serialization state, often resulting in functional bugs when custom converters are encountered in an object graph (cf. ReferenceHandler.IgnoreCycles doesn't work with Custom Converters #51715, Incorrect JsonException.Path when using non-leaf custom JsonConverters #67403, Keep LineNumber, BytePositionInLine and Path when calling JsonSerializer.Deserialize<TValue>(ref reader) #77345).
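
To illustrate the first scenario, here is a minimal sketch of the workaround that is available today. It assumes a hypothetical `MyPoco` type and a hypothetical `SessionConverter` that receives its state through its constructor; neither is part of System.Text.Json. Because the state lives on the converter instance, a fresh `JsonSerializerOptions` (and therefore a fresh metadata cache) is needed for each operation:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical POCO, used only for illustration.
public class MyPoco
{
    public string? Value { get; set; }
}

// Hypothetical converter: the per-operation state is captured in the constructor.
public class SessionConverter : JsonConverter<MyPoco>
{
    private readonly string _sessionId;

    public SessionConverter(string sessionId) => _sessionId = sessionId;

    public override void Write(Utf8JsonWriter writer, MyPoco value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        writer.WriteString("sessionId", _sessionId);
        writer.WriteString("value", value.Value);
        writer.WriteEndObject();
    }

    public override MyPoco? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => throw new NotSupportedException("Deserialization is out of scope for this sketch.");
}

public static class PerOperationState
{
    // A new options instance per call means cached metadata cannot be reused across calls.
    public static string SerializeWithSession(MyPoco value, string sessionId)
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new SessionConverter(sessionId));
        return JsonSerializer.Serialize(value, options);
    }
}
```

Calling `PerOperationState.SerializeWithSession(new MyPoco { Value = "value" }, "myId")` works, but it pays the metadata warm-up cost on every call, which is the overhead the proposal aims to remove.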

Proposal

Here is a rough sketch of what this API could look like:

namespace System.Text.Json.Serialization;

public struct JsonWriteState
{
    public CancellationToken CancellationToken { get; }
    public Dictionary<string, object> UserState { get; }
}

public struct JsonReadState
{
    public CancellationToken CancellationToken { get; }
    public Dictionary<string, object> UserState { get; }
}

public abstract class JsonStatefulConverter<T> : JsonConverter<T>
{
    public abstract void Write(Utf8JsonWriter writer, T value, JsonSerializerOptions options, ref JsonWriteState state);
    public abstract T? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options, ref JsonReadState state);

    // Override the base methods: implemented in terms of the new virtuals and marked as sealed
    public sealed override void Write(Utf8JsonWriter writer, T value, JsonSerializerOptions options) {}
    public sealed override T? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) {}
}

public abstract class JsonResumableConverter<T> : JsonConverter<T>
{
    public abstract bool TryWrite(Utf8JsonWriter writer, T value, JsonSerializerOptions options, ref JsonWriteState state);
    public abstract bool TryRead(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options, ref JsonReadState state, out T? result);

    // Override the base methods: implemented in terms of the new virtuals and marked as sealed
    public sealed override void Write(Utf8JsonWriter writer, T value, JsonSerializerOptions options) {}
    public sealed override T? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) {}
}

public static partial class JsonSerializer
{
    // Overloads to existing methods accepting state
    public static string Serialize<T>(T value, JsonSerializerOptions options, ref JsonWriteState state);
    public static string Serialize<T>(T value, JsonTypeInfo<T> typeInfo, ref JsonWriteState state);
    public static T? Deserialize<T>(string json, JsonSerializerOptions options, ref JsonReadState state);
    public static T? Deserialize<T>(string json, JsonTypeInfo<T> typeInfo, ref JsonReadState state);

    // New method groups enabling low-level streaming serialization
    public static bool TrySerialize<T>(T value, JsonTypeInfo<T> typeInfo, ref JsonWriteState state);
    public static bool TryDeserialize<T>(string json, JsonTypeInfo<T> typeInfo, ref JsonReadState state);
}

Users should be able to author custom converters that can take full advantage of async serialization, and compose correctly with the contextual serialization state. This is particularly important in the case of library authors, who might want to extend async serialization support for custom sets of types. It could also be used to author top-level async serialization methods that target other data sources (e.g. using System.IO.Pipelines cf. #29902)

Usage Examples

MyPoco value = new() { Value = "value" };
JsonWriteState state = new() { UserState = { ["sessionId"] = "myId" }};
JsonSerializer.Serialize(value, options, ref state); // { "sessionId" : "myId", "value" : "value" }

public class MyConverter : JsonStatefulConverter<MyPoco>
{
    public override void Write(Utf8JsonWriter writer, MyPoco value, JsonSerializerOptions options, ref JsonWriteState state)
    {
        writer.WriteStartObject();
        writer.WriteString("sessionId", (string)state.UserState["sessionId"]);
        writer.WriteString("value", value.Value);
        writer.WriteEndObject();
    }

    // Read override omitted for brevity.
}

Alternative designs

We might want to consider the viability of attaching the state values as properties on Utf8JsonWriter and Utf8JsonReader. It would avoid the need to introduce certain overloads, but on the flip side it could break scenarios where the writer/reader objects are being passed to nested serialization operations.

Goals

- Support custom resumable converters.
- Support custom converters that are passing the serialization state to child converters.
- Support async serialization using data sources other than Stream (à la [API Proposal]: JsonSerializer.TryReadValue(ref Utf8JsonReader) #29902).
- Support users attaching custom state to serialization operations ([API Proposal]: Support Custom Data on JsonSerializerOptions #71718).

Progress

- Author prototype
- API proposal & review
- Implementation & tests
- Conceptual documentation & blog posts.


ghost · 2022-01-14T15:14:19Z

Tagging subscribers to this area: @dotnet/area-system-text-json
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

Async serialization in System.Text.Json is a powerful feature that lets users serialize and deserialize from streaming JSON data, without the need to load the entire payload in-memory. At the converter level, this is achieved with "resumable converters", a pattern that allows for partial serialization and deserialization by marshalling the serialization state/stack into a mutable struct that gets passed along converters. It lets converters suspend and resume serialization as soon as the need to flush the buffer or read more data arises. This pattern is implemented using the internal JsonConverter<T>.TryWrite and JsonConverter<T>.TryRead methods.

Since resumable converters are an internal implementation detail, custom converters cannot support resumable serialization. This can create performance problems in both serialization and deserialization:

- In async serialization, System.Text.Json will delay flushing the buffer until the custom converter (and any child converters) have completed writing.
- In async deserialization, System.Text.Json will need to read ahead all JSON data at the current depth to ensure that the custom converter has access to all required data at the first read attempt.

Because custom converters cannot pass the serialization state, System.Text.Json suffers from a class of functional bugs that arise because of the serialization state resetting every time a custom converter is encountered in the object graph (cf. #51715).

Proposal

Users should be able to author custom converters that can take full advantage of async serialization, and compose correctly with the contextual serialization state. This is particularly important in the case of library authors, who might want to extend async serialization support for custom sets of types. It could also be used to author top-level async serialization methods that target other data sources (e.g. using System.IO.Pipelines cf. #29902)

This is a proposal to publicize aspects of the TryWrite/TryRead pattern to the users of System.Text.Json. It should be noted that this is a feature aimed exclusively towards "advanced" users, primarily third-party library and framework authors.

Goals

- Support custom resumable converters.
- Support custom converters that are passing the serialization state to child converters.
- Support async serialization using data sources other than Stream (à la [API Proposal]: JsonSerializer.TryReadValue(ref Utf8JsonReader) #29902).

Progress

- Author prototype
- API proposal & review
- Implementation & tests
- Conceptual documentation & blog posts.

Author:	eiriktsarpalis
Assignees:	-
Labels:	`area-System.Text.Json`, `User Story`, `Priority:2`, `Cost:M`, `Team:Libraries`
Milestone:	7.0.0

eiriktsarpalis · 2022-01-14T15:14:31Z

Tagging @steveharter who might be interested in this.

jeffhandley · 2022-04-06T01:11:57Z

I'm updating this feature's milestone to Future as it is not likely to make it into .NET 7.

layomia · 2022-10-14T18:05:17Z

There are scenarios where a custom converter might want metadata info about the type (including its properties) or property it is processing, e.g. #35240 (comment). I understand that with JsonSerializerOptions.GetTypeInfo(Type), it is now possible to retrieve type metadata within converters, but should there be a first-class mechanism, e.g. adding relevant type/property metadata to the state object passed to converters?

I believe I've also come across scenarios where a converter might want to know where in the object graph it is being invoked, i.e. the root object vs. property values. Is that also state that should be passed?
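
For reference, a minimal sketch of the `JsonSerializerOptions.GetTypeInfo(Type)` lookup mentioned above, as it can be used inside a converter today (.NET 7 or later). The `Payload`, `Envelope`, and `EnvelopeConverter` types are hypothetical and exist only for this illustration. The converter queries metadata for the child type rather than for its own type, on the assumption that a type handled by a custom converter does not expose an object contract with properties:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Text.Json.Serialization.Metadata;

// Hypothetical types, used only for illustration.
public class Payload
{
    public int Id { get; set; }
    public string? Name { get; set; }
}

public class Envelope
{
    public Payload? Body { get; set; }
}

public class EnvelopeConverter : JsonConverter<Envelope>
{
    public override void Write(Utf8JsonWriter writer, Envelope value, JsonSerializerOptions options)
    {
        // Look up contract metadata for the child type from within the converter.
        JsonTypeInfo payloadInfo = options.GetTypeInfo(typeof(Payload));

        writer.WriteStartObject();

        // Emit the JSON property names that the serializer resolved for Payload.
        writer.WriteStartArray("bodyProperties");
        foreach (JsonPropertyInfo property in payloadInfo.Properties)
        {
            writer.WriteStringValue(property.Name);
        }
        writer.WriteEndArray();

        // Delegate the child value to the serializer using the same options.
        writer.WritePropertyName("body");
        JsonSerializer.Serialize(writer, value.Body, options);

        writer.WriteEndObject();
    }

    public override Envelope? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => throw new NotSupportedException("Deserialization is out of scope for this sketch.");
}
```

This covers type-level metadata only; it gives the converter no information about where in the object graph it was invoked, which is the gap raised in the second paragraph.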

eiriktsarpalis · 2022-10-14T18:29:55Z

Yes that's plausible. Effectively we should investigate what parts of ReadStack/WriteStack could/should be exposed to the user. I'm not sure if we could meaningfully expose what amounts to WriteStack.Current, since that only gets updated consistently when the internal converters are being called. Prototyping is certainly required to validate feasibility.

thomaslevesque · 2022-10-19T01:36:45Z

To be honest, with the proposed design, this feature would only be moderately useful, because it requires passing state explicitly to the Serialize/Deserialize methods. This doesn't help when you don't have control over these callsites, as is the case in ASP.NET Core JSON input/output formatters.

I realize what I'm talking about isn't really "state", at least not callsite-specific state, but it's what was requested in #71718, which has been closed in favor of this issue... I don't think the proposed design addresses the requirements of #71718.

eiriktsarpalis · 2022-10-19T07:34:53Z

> I realize what I'm talking about isn't really "state", at least not callsite-specific state, but it's what was requested in #71718, which has been closed in favor of this issue... I don't think the proposed design addresses the requirements of #71718.

Yes, this proposal specifically concerns state scoped to the operation rather than the options instance. The latter is achievable if you really need it whereas the former is outright impossible.

stevejgordon · 2022-10-19T08:10:25Z

@eiriktsarpalis I must admit that I'd not read this through and assumed it solved the suggestion from my original proposal. I agree that this solves a different problem (which I've not personally run into). It would not help at all with the scenario I have when building a library. Is there any reason not to re-open #71718 to complement this? I think both are valid scenarios that should be possible to achieve with less ceremony.

eiriktsarpalis · 2022-10-19T08:12:19Z

Agree, I've reopened the issue based on your feedback.

osexpert · 2023-05-09T19:06:17Z

"We might want to consider the viability attaching the state values as properties on Utf8JsonWriter and Utf8JsonReader"
I see that when deserializing from a Stream, a new Utf8JsonReader instance is created on every iteration of a while loop, so this does not seem like a good fit.

dennis-yemelyanov · 2024-03-04T21:29:05Z

Was just looking for something like this. I'd like to be able to extract some value from the object during serialization and make it available after the serialization is done. It seems like currently the only way to achieve this is to instantiate a new converter for each serialization, which is not great for perf.

Is this still planned to be implemented at some point?

brantburnett · 2024-03-22T21:02:41Z

Personally, I'm more interested in the performance benefits of the resuming bits of this proposal. While I realize they're somewhat related, I wonder if the lack of progress here could be partially related to the scope of including both resuming and user state in the same proposal. Should it be separated so the more valuable one (whichever that is) could be done independently, so long as the design has a path forward to the other?

eiriktsarpalis · 2024-03-22T21:14:51Z

> Should it be separated so the more valuable one (whichever that is) could be done independently, so long as the design has a path forward to the other?

I think both are equally valuable. At the same time, we should be designing the serialization state types in a way that satisfies the requirements for both use cases.

andrewjsaid · 2024-05-31T09:38:51Z

What is the purpose of the JsonReadState and JsonWriteState parameters being passed in as ref?
The dictionary is already mutable so it's probably not that...

eiriktsarpalis · 2024-05-31T10:03:35Z

The types already exist as an internal implementation detail and hold additional state which wouldn't be visible to users.

JustDre · 2024-11-11T19:23:22Z

Given that #64182 was closed in deference to this issue, I don't know why the solution needs to involve async converters when DeserializeAsyncEnumerable does not and yet provides significant memory savings over DeserializeAsync (whenever streaming deserialization is appropriate). It seems the ability to call DeserializeAsyncEnumerable at any arbitrary point of an incoming stream, and leaving the stream open at the appropriate byte location upon exit, would be immensely useful. While the solutions discussed above are optimal from a resource perspective, forcing all developers to fully support async converters seems like a bigger lift than necessary, simply to support non-root array deserialization. If an overload can be implemented to achieve this, then once async converters become available, developers can opt in to implement them as appropriate.

JustDre · 2024-11-11T19:30:27Z

Maybe it's not important, but issue #77018 (that tracks this one) is currently closed. I don't see it being tracked by an equivalent issue for .NET 9, but since it's days (hours?) away from release, it probably needs to be tracked by the equivalent for .NET 10.

Harpush · 2025-04-03T23:11:28Z

Any chance of non-root deserialization to IAsyncEnumerable? Many APIs return the array inside an object, which rules out using IAsyncEnumerable.

eiriktsarpalis · 2025-04-04T12:32:45Z

> Any chance of non-root deserialization to IAsyncEnumerable? Many APIs return the array inside an object, which rules out using IAsyncEnumerable.

How would this work? Assuming you were trying to stream a large JSON object like so

{ "prop1" : 1, "largeProp" : [1,2,3,4,5,....], "prop2": 2 }

into the type

record Poco(int prop1, IAsyncEnumerable<int> largeProp, int prop2);

the returned value should have been hydrated with data for all its properties. How could this be reconciled with streaming?
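
For contrast, the existing `JsonSerializer.DeserializeAsyncEnumerable` API streams elements when the array is the root JSON value. The sketch below shows that root-level case only; the `RootArrayStreaming` helper is a hypothetical name used for illustration:

```csharp
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class RootArrayStreaming
{
    // Sums the elements of a root-level JSON array, consuming them one at a time
    // instead of materializing the whole array first.
    public static async Task<int> SumAsync(Stream utf8Json)
    {
        int sum = 0;
        await foreach (int item in JsonSerializer.DeserializeAsyncEnumerable<int>(utf8Json))
        {
            sum += item;
        }
        return sum;
    }
}
```

A payload such as `[1,2,3,4,5]` can be consumed this way, whereas the wrapped shape shown above cannot, because the array is not the root value.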

airbreather · 2025-04-04T13:13:01Z

I'm guessing the idea (because I have a similar use case to this) would be that we have a root object that is essentially nothing more than a container for an array? GeoJSON FeatureCollection comes to mind.

Edit to elaborate: the implication being that we would deserialize only the inner array for a given property as though it were top-level, and all other properties would be ignored (except maybe for the purposes of raising format errors or something).

IMO not worth significant investment because it only works for narrow use cases, but not fundamentally incompatible or useless either.

eiriktsarpalis · 2025-04-04T13:19:01Z

> Edit to elaborate: the implication being that we would deserialize only the inner array for a given property as though it were top-level, and all other properties would be ignored (except maybe for the purposes of raising format errors or something).

That is definitely achievable. The potential shape we've been thinking about is a JsonSerializer.DeserializeAsyncEnumerable overload that accepts a JSON pointer to the nested location that needs to be streamed. It means that any other contextual information would be discarded.

Harpush · 2025-04-04T14:05:17Z

Discarding is the way to go. Let's put aside streaming non-array JSON, as I am really not sure there is any way to do it except with tokens.
The idea is for cases like { "data": [/* huge array */] }. I have zero interest in the data wrapper and would be happy with the service returning just an array. So ideally I would expect a way to specify the path, or even just the property, where the array lives.

Another, more low-level approach would be to allow getting a token stream from the HTTP stream, skipping whatever you want, and then an option to say "deserialize async enumerable from here".
This would even make it easy to parse (as one big concatenated array) JSON like {data1: [], data2: []}.
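
As a rough sketch of the token-level idea, the code below uses the existing `Utf8JsonReader` to skip everything except one top-level property and then deserializes that property's array elements one by one. Note the assumptions: it operates on a fully buffered payload rather than a stream, so it shows only the skipping logic and not the async streaming being requested, and `NestedArrayReader.ReadArrayProperty` is a hypothetical helper, not an existing or proposed API:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class NestedArrayReader
{
    // Returns the elements of the array stored under a top-level property,
    // ignoring every other property of the root object.
    public static List<int> ReadArrayProperty(ReadOnlySpan<byte> utf8Json, string propertyName)
    {
        var results = new List<int>();
        var reader = new Utf8JsonReader(utf8Json);

        if (!reader.Read() || reader.TokenType != JsonTokenType.StartObject)
        {
            throw new JsonException("Expected a JSON object at the root.");
        }

        while (reader.Read() && reader.TokenType == JsonTokenType.PropertyName)
        {
            bool isTarget = reader.ValueTextEquals(propertyName);
            reader.Read(); // Advance to the property value.

            if (!isTarget)
            {
                reader.Skip(); // Skips nested objects and arrays; does nothing for scalars.
                continue;
            }

            if (reader.TokenType != JsonTokenType.StartArray)
            {
                throw new JsonException($"Property '{propertyName}' is not an array.");
            }

            // Deserialize each element in turn until the end of the array.
            while (reader.Read() && reader.TokenType != JsonTokenType.EndArray)
            {
                results.Add(JsonSerializer.Deserialize<int>(ref reader));
            }
        }

        return results;
    }
}
```

The same loop could collect several properties (the data1/data2 case) by matching more than one name, but doing this incrementally over a network stream is exactly the part that needs new API.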

eiriktsarpalis · 2025-04-04T14:08:34Z

Makes sense. I would say though that this is out of scope for this particular issue. Could you create a new one please?

Harpush · 2025-04-04T14:53:47Z

> Makes sense. I would say though that this is out of scope for this particular issue. Could you create a new one please?

#114265

eiriktsarpalis added the area-System.Text.Json, User Story, Priority:2, Cost:M, and Team:Libraries labels Jan 14, 2022

eiriktsarpalis added this to the 7.0.0 milestone Jan 14, 2022

ghost added the untriaged label Jan 14, 2022

eiriktsarpalis removed the untriaged label Jan 14, 2022

eiriktsarpalis mentioned this issue Jan 27, 2022

[API Proposal]: Support streaming deserialization of JSON objects #64182

Closed

joelverhagen mentioned this issue Jan 31, 2022

Use System.Text.Json instead of Newtonsoft.Json NuGet/Insights#44

Closed

mattchidley mentioned this issue Mar 2, 2022

Passing a Stream Utf8JsonWriter to a JsonSerializer.Serialize method results in Pending bytes being misreported #66102

Closed

This was referenced Apr 2, 2022

JsonSerializer.Deserialize can't be reliably used with Utf8JsonReaders that have been resumed #67454

Open

Incorrect JsonException.Path when using non-leaf custom JsonConverters #67403

Closed

jeffhandley modified the milestones: 7.0.0, Future Apr 6, 2022

eiriktsarpalis mentioned this issue Apr 18, 2022

NullReferenceException when composing a custom converter with a default converter #57280

Open

eiriktsarpalis mentioned this issue Apr 27, 2022

[API Proposal]: Add PipeReader and PipeWriter overloads to JsonSerializer.SerializeAsync/DeserializeAsync #68586

Open

eiriktsarpalis mentioned this issue Jul 6, 2022

[API Proposal]: Support Custom Data on JsonSerializerOptions #71718

Open

eiriktsarpalis modified the milestones: Future, 8.0.0 Jul 6, 2022

eiriktsarpalis mentioned this issue Jul 19, 2022

[API Proposal]: System.Text.Json serialization callback context #59892

Closed

layomia mentioned this issue Aug 18, 2022

JsonConverter.Read call inside another converter throws InvalidOperationException: Cannot skip tokens on partial JSON. #74108

Closed

eiriktsarpalis mentioned this issue Oct 4, 2022

Consider clearing JsonSerializerOptions.Caching cache using a timer, not just based on incoming calls #76548

Closed

eiriktsarpalis mentioned this issue Oct 23, 2022

Keep LineNumber, BytePositionInLine and Path when calling JsonSerializer.Deserialize<TValue>(ref reader) #77345

Closed

eiriktsarpalis added the Cost:L label and removed the Cost:M label Jan 23, 2023

eiriktsarpalis modified the milestones: 8.0.0, Future Jan 23, 2023

eiriktsarpalis mentioned this issue Dec 2, 2023

Add ability to stream large strings in Utf8JsonWriter/Utf8JsonReader #67337

Open

eiriktsarpalis mentioned this issue Jan 22, 2024

Text.Json Serialization Context and Reference Handler #97315

Closed

eiriktsarpalis mentioned this issue Apr 18, 2024

[Blazor] Use JSON source generator during WebAssembly startup dotnet/aspnetcore#54956

Merged

eiriktsarpalis mentioned this issue Jun 3, 2024

IAsyncEnumerableOfTConverter<TAsyncEnumerable, TElement> throws OutOfMemoryException when custom JsonConverter involved #102984

Closed
