[API Proposal]: Add PipeReader and PipeWriter overloads to JsonSerializer.SerializeAsync/DeserializeAsync #68586
Comments
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis

Issue Details

Background and motivation

When we originally did System.Text.Json there was some discussion about adding overloads to support PipeReader and PipeWriter. Those discussions died because we didn't want to decide on forcing every API that had a Stream overload to have one for PipeReader/PipeWriter (some of that is here #28325). ASP.NET Core could save on allocations and excess buffer copies if we had support for these primitives in the JSON serializer. This likely isn't the only change we need to make to get great performance (there are some tweaks to pipelines that can be made to improve buffer sizes), but the idea is to add the APIs so that we can improve on top of them. Today in ASP.NET Core, we use the Stream overloads and:

Of course we should measure and see how this materializes, but we need to expose APIs to do so (at least privately, so that ASP.NET Core can properly plumb them through the chain).

API Proposal

```csharp
namespace System.Text.Json;

public static class JsonSerializer
{
    public static Task SerializeAsync<TValue>(PipeWriter utf8Json, TValue value, JsonSerializerOptions? options = null, CancellationToken cancellationToken = default);
    public static ValueTask<TValue?> DeserializeAsync<TValue>(PipeReader utf8Json, JsonSerializerOptions? options = null, CancellationToken cancellationToken = default);
}
```

API Usage

```csharp
var pipe = new Pipe();

await JsonSerializer.SerializeAsync(pipe.Writer, new Person { Name = "David" });
await pipe.Writer.CompleteAsync();

var result = await JsonSerializer.DeserializeAsync<Person>(pipe.Reader);
await pipe.Reader.CompleteAsync();
```

Alternative Designs

No response

Risks

No response
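For context, here is a minimal sketch of the workaround available today without these overloads: wrapping the PipeWriter in a Stream adapter via PipeWriter.AsStream(), which is exactly the extra layer the proposal aims to remove. The Person type is illustrative.

```csharp
using System.IO.Pipelines;
using System.Text.Json;

var pipe = new Pipe();

// Today: adapt the PipeWriter to a Stream, paying for the adapter and an
// extra round of buffering inside the serializer's Stream code path.
await JsonSerializer.SerializeAsync(pipe.Writer.AsStream(), new Person { Name = "David" });

// Proposed: the serializer would write directly into the PipeWriter's
// buffers and flush with backpressure, with no Stream adapter in between:
// await JsonSerializer.SerializeAsync(pipe.Writer, new Person { Name = "David" });

await pipe.Writer.CompleteAsync();

record Person
{
    public string? Name { get; init; }
}
```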
cc @stephentoub

If we are reviewing this, it should probably be looked at in conjunction with #63795. I'm not sure if STJ should be taking a dependency on pipes, but it should certainly be possible to write extensions targeting pipes or any other streaming abstraction.

Nothing need be merged for such experimentation. A motivated developer can do it all in the privacy of their own machine. Until there's meaningful data demonstrating this is a significant improvement, I don't want to pursue this, for the same reasons we've not done so in the past.
Async converters aren't needed.
That's fair. I'll bring this back for API review when I have those numbers.
The linked issue title might be slightly misleading, but the proposal concerns exposing resumable serialization in the STJ public APIs. This starts with extending converters of course, but it would include exposing resumable

So the idea would be to implement these APIs in another package outside of the
Maybe we can do the QUIC model?:
Alternatively, we can build a new experimental package that references System.Text.Json and has IVT so that it can use the internal APIs to build new static serializer methods that add the pipelines API. @stephentoub I am motivated to add these APIs, but I want to reduce the friction and make it so that we can experiment with these changes and get benchmarks in a reasonable way. I'd love to help figure out the most effective way to do that when changes cut across repos like this.
We have dotnet/runtimelab. If we want to do pure experiments and merge and iterate on them, we can add them there. If we want to add a dotnet/aspnetcorelab, or have branches in dotnet/aspnetcore that consume dependencies from dotnet/runtimelab, we can do that. But I want to avoid churning the production code in dotnet/runtime for things that are pure experimentation. QUIC is a different beast: it's a major piece of work that's fully intended to ship and that's used by production code in .NET 6, even if it didn't itself expose public APIs.
So your suggestion for this change is to use dotnet/runtimelab to add 2 APIs to the JsonSerializer, and then make an aspcorelab (we have a labs repo already, but it's not a fork today) that consumes this change so we can make more public API changes. Feels like using a bazooka to kill an ant 😄. I'll spend some time investigating this. But you're right, this experiment isn't at the same level as QUIC, UTF8String, or native AOT. It's 2 APIs that we need to add.
I'm suggesting that if the goal is to make end-to-end experimentation easier, we make end-to-end experimentation easier, not limited to this specific case.

That's fair; then I'd suggest runtimelab should contain ASP.NET Core. The 2 repos should be merged into one for experimentation purposes. IMO that's a saner approach and would allow quicker experimentation.
@eiriktsarpalis I'll make a branch with these APIs that I'd like your eyes on to make sure I didn't do anything dumb. I'll see if I can get some basic dependency flow going for this change so I can test it out in the meantime while we figure out the bigger labs changes.

Is there any update on this API proposal?
Moving this out of Future to 9.0.0. I'll update the API design with more overloads for JsonSerializerContext.

The recommendation is for this to go into the System.Text.Json assembly, with System.IO.Pipelines going into the shared framework as a result. Related to #28760.
For the writing side, there's also a discussion around adding FlushAsync to IBufferWriter to make it usable for async IO.

```diff
public interface IBufferWriter<T>
{
+    ValueTask FlushAsync(CancellationToken cancellationToken = default) => default;
}
```

This would be a DIM and would only be implemented on .NET 9+. That would make the SerializeAsync overloads more general purpose:

```csharp
public static class JsonSerializer
{
    public static Task SerializeAsync<TValue>(
        IBufferWriter<byte> utf8Json,
        TValue value,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync(
        IBufferWriter<byte> utf8Json,
        object? value,
        JsonTypeInfo jsonTypeInfo,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync(
        IBufferWriter<byte> utf8Json,
        object? value,
        Type inputType,
        JsonSerializerContext context,
        CancellationToken cancellationToken = default);
}
```

We could also consider:

```csharp
interface IAsyncBufferWriter<T> : IBufferWriter<T>
{
    ValueTask FlushAsync(CancellationToken cancellationToken = default);
}
```

Since it's likely that most code paths with
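For context on why the DIM is attractive: PipeWriter already implements IBufferWriter&lt;byte&gt; and exposes its own FlushAsync, so code targeting IBufferWriter&lt;byte&gt; today can only flush asynchronously via a type test — the part the proposed DIM would eliminate. A rough sketch (WriteAsync is a hypothetical helper, not a shipped API):

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static async Task WriteAsync(IBufferWriter<byte> writer, string text, CancellationToken ct = default)
{
    // Acquire a buffer from the writer, copy the payload in, and commit it.
    byte[] utf8 = Encoding.UTF8.GetBytes(text);
    Span<byte> span = writer.GetSpan(utf8.Length);
    utf8.AsSpan().CopyTo(span);
    writer.Advance(utf8.Length);

    // Without the DIM, async flushing requires a concrete-type test; the
    // proposed IBufferWriter<T>.FlushAsync would make this a plain
    // interface call that no-ops for in-memory writers like ArrayBufferWriter.
    if (writer is PipeWriter pipeWriter)
    {
        await pipeWriter.FlushAsync(ct);
    }
}
```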
Sharing from offline conversation: adding
There are likely more latent bugs to be discovered here. Non-allocating converters need to use special logic for handling ReadOnlySequence, so there is a chance there could be latent bugs both in our built-in converters and in user-defined custom converters. If and when aspnetcore switches to PipeReader-based methods, we should make sure all these issues have been ironed out to avoid regressing users.

For completeness, here's the implementation we spiked ~5 years ago:

Here's a recent PipeWriter implementation by @BrennanConroy
I have a |
We have a similar issue here with a chain of custom converters and large amounts of data (…). The only way around this currently seems to be to base the serialization on an … For a Kestrel output it's more straightforward, as it has handling for this sort of behaviour in the Http pipeline, but we also need to support it via WebSocket and IpcSocket, which don't, so we will have to construct a
@benaadams Have a look at the Utf8JsonAsyncStreamReader. This was designed to work with any stream. You could write the |
@gragra33 I am going the other way and writing out; I also blow the limits of

@benaadams You won't blow out memory if done correctly. There is a link to an article with sample code and data so you can try it out. The samples provided use eBay data, as it was the largest public stream available for the purposes of demonstration. They also demonstrate both file and web streams, both raw and zipped. FYI, we're processing JSON streams over 10GB in size.
@eiriktsarpalis The issue I found was with the |
We had a quick conversation among @davidfowl, @halter73, @eiriktsarpalis, and myself. Points discussed:
```csharp
namespace System.Text.Json;

public static class JsonSerializer
{
    public static Task SerializeAsync<TValue>(
        PipeWriter utf8Json,
        TValue value,
        JsonTypeInfo<TValue> jsonTypeInfo,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync<TValue>(
        PipeWriter utf8Json,
        TValue value,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync(
        PipeWriter utf8Json,
        object? value,
        JsonTypeInfo jsonTypeInfo,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync(
        PipeWriter utf8Json,
        object? value,
        Type inputType,
        JsonSerializerContext context,
        CancellationToken cancellationToken = default);
    public static Task SerializeAsync(
        Stream utf8Json,
        object? value,
        Type inputType,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static ValueTask<TValue?> DeserializeAsync<TValue>(
        PipeReader utf8Json,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static ValueTask<TValue?> DeserializeAsync<TValue>(
        PipeReader utf8Json,
        JsonTypeInfo<TValue> jsonTypeInfo,
        CancellationToken cancellationToken = default);
    public static ValueTask<object?> DeserializeAsync(
        PipeReader utf8Json,
        JsonTypeInfo jsonTypeInfo,
        CancellationToken cancellationToken = default);
    public static ValueTask<object?> DeserializeAsync(
        PipeReader utf8Json,
        Type returnType,
        JsonSerializerContext context,
        CancellationToken cancellationToken = default);
    public static ValueTask<object?> DeserializeAsync(
        Stream utf8Json,
        Type returnType,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static IAsyncEnumerable<TValue> DeserializeAsyncEnumerable<TValue>(
        Stream utf8Json,
        JsonSerializerOptions? options = null,
        CancellationToken cancellationToken = default);
    public static IAsyncEnumerable<TValue> DeserializeAsyncEnumerable<TValue>(
        Stream utf8Json,
        JsonTypeInfo<TValue> jsonTypeInfo,
        CancellationToken cancellationToken = default);
}
```
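A hedged usage sketch of the proposed PipeWriter/PipeReader overloads above with a source-generated context; Person and PersonContext are illustrative names, and the pipe-taking overloads are the proposed APIs, not yet shipped at the time of this discussion:

```csharp
using System.IO.Pipelines;
using System.Text.Json;
using System.Text.Json.Serialization;

var pipe = new Pipe();

// Serialize straight into the pipe using source-generated metadata,
// then complete the writer so the reader observes end-of-data.
await JsonSerializer.SerializeAsync(pipe.Writer, new Person("David"), PersonContext.Default.Person);
await pipe.Writer.CompleteAsync();

Person? result = await JsonSerializer.DeserializeAsync(pipe.Reader, PersonContext.Default.Person);
await pipe.Reader.CompleteAsync();

public record Person(string Name);

// Source generator produces the serialization metadata at compile time,
// so the pipe overloads need no reflection at runtime.
[JsonSerializable(typeof(Person))]
public partial class PersonContext : JsonSerializerContext { }
```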
FWIW this was prototyped when STJ was just being created, before being removed due to the assembly dependency on System.IO.Pipelines:
Moving to 10.0.0 since it's unlikely we'll get to |
This whole … Whenever I see stuff that always takes the same pairs of overloads like this, it looks like a design red flag to me. Asking out of curiosity (I don't know if there are any proposals out there discussing this already; if so, please share them).
Hi, would the new serialization API work in tandem with IAsyncEnumerable (perhaps as …)? I am interested in a solution for streaming data directly from a SQL database to one microservice and then to another.

@FLAMESpl All
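For the streaming scenario described above, the existing Stream-based DeserializeAsyncEnumerable already covers the service-to-service shape: elements of a top-level JSON array are yielded as they are parsed off the wire rather than after the whole payload is buffered. A hedged sketch (the URL and Item type are illustrative):

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;

using var http = new HttpClient();

// Illustrative endpoint returning a large JSON array of items.
using Stream body = await http.GetStreamAsync("https://example.org/items");

// Each element is deserialized and yielded as soon as it is available,
// keeping memory usage flat regardless of payload size.
await foreach (Item? item in JsonSerializer.DeserializeAsyncEnumerable<Item>(body))
{
    Console.WriteLine(item?.Id);
}

public record Item(int Id);
```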