Enable Utf8JsonReader to read json from stream #30328
Comments
Maybe, but not really: since it's a ref struct, you'd need to make sure all of the data was in the stream before you parse, which kinda defeats the purpose of a Stream. We'd need to make the reader a class so that you could store the stream as a field, or we'd need some new class that used the reader and a stream together (like the JsonSerializer). Why not use the JsonSerializer to deserialize directly into a JsonElement? Or copy this logic: https://github.com/dotnet/corefx/blob/347412c9a917c71a744d8e20b090da90aa558a79/src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.Stream.cs#L75-L226 😄 |
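A minimal sketch of the first suggestion above, assuming the parsed document fits in memory: `JsonSerializer.DeserializeAsync<JsonElement>` reads the stream asynchronously and hands back the root element. The class name `JsonStreamExample` and the method name `ReadRootAsync` are made up for illustration.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public static class JsonStreamExample
{
    // Hypothetical helper: reads the whole stream, parses it, and returns the root element.
    // This does not stream token by token; the entire document is materialized.
    public static async Task<JsonElement> ReadRootAsync(
        Stream input, CancellationToken cancellationToken = default)
    {
        return await JsonSerializer
            .DeserializeAsync<JsonElement>(input, cancellationToken: cancellationToken)
            .ConfigureAwait(false);
    }
}
```

This sidesteps the ref struct problem entirely, at the cost of holding the whole document in memory, so it is not a fit for very large inputs.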
A ref struct can hold a normal reference just fine, so it could theoretically work. IMO it would be bad, though, because the ValueSpan (or ValueSequence) properties would be returning Spans to buffers the user never owned, making their lifetime ambiguous (at best). Certainly we could make a Stream-based wrapper to do the buffer management, which inverts the flow:
public class Utf8JsonStreamReader
{
    ...
    public JsonTokenType TokenType { get; }
    public int TokenStartIndex { get; }
    public int TokenLength { get; }
    public void CopyTokenValue(Span<byte> destination);
    public void Read();
    ...
}
But that seems awkward. |
But it can't have async operations (Rich didn't mention that in his description, but I expect David was assuming that as a necessity). |
I'd be happy w/o the wrapper (avoiding lifetime and async challenges), and for the ability to provide the json reader with document lines, one at a time. Ideally (for my scenario), I could give the reader an |
I am looking to read JSON from a file stream, but I'm struggling to understand how to do that. My use case is that I want to open a JSON file, navigate to a particular section of it, then deserialise just that section. I thought I best use |
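One possible way to approach this use case without `Utf8JsonReader`, sketched under the assumption that the file is small enough to parse in full: load it with `JsonDocument.ParseAsync`, walk down to the section with `GetProperty`, and deserialize only that element's raw text. `JsonSectionReader` and `ReadSectionAsync` are hypothetical names, and this does not give true streaming.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

public static class JsonSectionReader
{
    // Hypothetical helper: parses the whole stream, follows `path` one property
    // at a time, and deserializes only the subtree found at the end of the path.
    public static async Task<T> ReadSectionAsync<T>(Stream utf8Json, params string[] path)
    {
        using JsonDocument document =
            await JsonDocument.ParseAsync(utf8Json).ConfigureAwait(false);

        JsonElement section = document.RootElement;
        foreach (string name in path)
        {
            // GetProperty throws KeyNotFoundException when the property is missing.
            section = section.GetProperty(name);
        }

        // GetRawText returns the JSON text of just this element.
        return JsonSerializer.Deserialize<T>(section.GetRawText());
    }
}
```

For example, `await JsonSectionReader.ReadSectionAsync<Dictionary<string, int>>(File.OpenRead(path), "a", "b")` would deserialize only the object stored under `a.b`.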
I hope this gets fixed. @dazinator, in the comments on the same answer you linked, somebody found a few bugs in it and fixed them in this repo. |
I came here to figure out
And i have custom Reason: Decision: Increasing |
I get the point above that Especially since |
I came up with the following code. It reads JSON dataset as
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Threading;
using Opw.HttpExceptions;

namespace YourApp
{
    using Record = IEnumerable<KeyValuePair<string, object>>;

    public static class JsonArrayReader
    {
        public static IAsyncEnumerable<Record> ReadJsonRecords(this Stream input, CancellationToken cancellationToken)
        {
            bool isArrayStart = true;

            return Parse(input, cancellationToken, (ref Utf8JsonReader reader) =>
            {
                // The opening '[' is consumed only once, on the first buffer.
                if (isArrayStart)
                {
                    ReadArrayStart(ref reader);
                    isArrayStart = false;
                }

                return ReadRecords(ref reader);
            });
        }

        private delegate IEnumerable<T> Parser<T>(ref Utf8JsonReader reader);

        // inspired by https://github.com/scalablecory/system-text-json-samples/blob/master/json-test/JsonParser.ParseSimpleAsync.cs
        private static async IAsyncEnumerable<T> Parse<T>(Stream input, [EnumeratorCancellation] CancellationToken cancellationToken, Parser<T> parser)
        {
            var buffer = new byte[4096];
            var fill = 0;
            var consumed = 0;
            var done = false;
            var readerState = new JsonReaderState();

            while (!done)
            {
                if (fill == buffer.Length)
                {
                    if (consumed != 0)
                    {
                        // Move the unconsumed tail to the front of the buffer.
                        buffer.AsSpan(consumed).CopyTo(buffer);
                        fill -= consumed;
                        consumed = 0;
                    }
                    else
                    {
                        // Nothing was consumed, so a single token needs more room.
                        Array.Resize(ref buffer, buffer.Length * 3 / 2);
                    }
                }

                int read = await input.ReadAsync(buffer.AsMemory(fill), cancellationToken).ConfigureAwait(false);
                fill += read;
                done = read == 0;

                foreach (var item in ParseBuffer())
                {
                    yield return item;
                }
            }

            // The ref struct reader lives only inside this synchronous local function;
            // its state is carried across awaits via JsonReaderState.
            IEnumerable<T> ParseBuffer()
            {
                var reader = new Utf8JsonReader(buffer.AsSpan(consumed, fill - consumed), done, readerState);
                var result = parser(ref reader);
                consumed += (int)reader.BytesConsumed;
                readerState = reader.CurrentState;
                return result;
            }
        }

        private static void ReadArrayStart(ref Utf8JsonReader reader)
        {
            if (!reader.Read())
            {
                throw new BadRequestException("Unexpected EOF");
            }

            // Skip leading comments. Skip() does not advance past a comment token,
            // so Read() is used instead. Comment tokens are only surfaced when the
            // reader options set CommentHandling to Allow.
            while (reader.TokenType == JsonTokenType.Comment)
            {
                if (!reader.Read())
                {
                    throw new BadRequestException("Unexpected EOF");
                }
            }

            if (reader.TokenType != JsonTokenType.StartArray)
            {
                throw new BadRequestException($"Expect JSON array, but got {reader.TokenType}");
            }
        }

        private static IEnumerable<Record> ReadRecords(ref Utf8JsonReader reader)
        {
            var records = new List<Record>();

            while (true)
            {
                if (!reader.Read())
                {
                    if (reader.TokenType == JsonTokenType.EndArray)
                    {
                        break;
                    }

                    throw new BadRequestException("Unexpected EOF");
                }

                if (reader.TokenType == JsonTokenType.EndArray)
                {
                    break;
                }

                if (reader.TokenType != JsonTokenType.StartObject)
                {
                    throw new BadRequestException($"Expect {JsonTokenType.StartObject}, but got {reader.TokenType}");
                }

                var record = ReadRecord(ref reader);
                if (record == null)
                {
                    // Incomplete object in this buffer; resume after the next read.
                    break;
                }

                records.Add(record);
            }

            return records;
        }

        private static Record ReadRecord(ref Utf8JsonReader reader)
        {
            try
            {
                // Deserialize from a copy so the original reader is left untouched on failure.
                var savePoint = reader;
                var result = JsonSerializer.Deserialize<Dictionary<string, object>>(ref savePoint);
                reader = savePoint;
                return result;
            }
            catch (JsonException)
            {
                return null;
            }
        }
    }
}
It reuses the idea from https://github.com/scalablecory/system-text-json-samples/blob/master/json-test/JsonParser.ParseSimpleAsync.cs. Also if you target .NET 3+ you might have to implement |
@alexandrvslv, can you please file a separate issue with a simplified repro test app of the issue you were seeing. At first glance, this seems like a bug, and you shouldn't need to increase the |
@ahsonkhan, I will try to reproduce the issue. It may take some time to implement a test that serializes to JSON with a custom formatter, processes the data through a pipe, and deserializes it with a custom parser. |
To add to @bartonjs and @stephentoub's initial points, it is currently possible to snapshot a I think this would have been a different conversation if |
See also #30405 where there is a similar request to start deserialization from a particular point in a Stream. |
Basically, implement an analogue of JsonTextReader(TextReader).
My scenario is reading the result of:
docker inspect [image]
(which produces a JSON document), either called via Process.Start() or piped in via standard input. Both scenarios result in TextReader objects. I’d like to see either a new constructor to enable this scenario or some straightforward collaborative mechanism between the two readers.
Related: https://github.com/dotnet/corefx/issues/38581
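As a possible interim workaround for this scenario, both sources also expose a raw `Stream`: `Process.StandardOutput.BaseStream` and `Console.OpenStandardInput()`. The sketch below feeds those to `JsonDocument.ParseAsync`, assuming the output is UTF-8 and fits in memory. The `DockerInspect` class and its method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

public static class DockerInspect
{
    // Parses a JSON document from any UTF-8 byte stream.
    public static Task<JsonDocument> ParseAsync(Stream utf8Json) =>
        JsonDocument.ParseAsync(utf8Json);

    // Runs `docker inspect <image>` and parses its standard output.
    public static async Task<JsonDocument> RunAsync(string image)
    {
        var startInfo = new ProcessStartInfo("docker", $"inspect {image}")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
        };

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Could not start docker.");

        // BaseStream yields the raw bytes underneath the StreamReader.
        JsonDocument document = await ParseAsync(process.StandardOutput.BaseStream)
            .ConfigureAwait(false);
        process.WaitForExit();
        return document;
    }

    // Parses JSON piped in via standard input.
    public static Task<JsonDocument> FromStandardInputAsync() =>
        ParseAsync(Console.OpenStandardInput());
}
```

The caller owns the returned `JsonDocument` and should dispose it. This does not address the request for a `TextReader`-based constructor; it only avoids the `TextReader` layer where a `Stream` is available.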