Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utf8JsonReader\Writer should support all primitive types #29000

Closed
steveharter opened this issue Mar 18, 2019 · 15 comments
Closed

Utf8JsonReader\Writer should support all primitive types #29000

steveharter opened this issue Mar 18, 2019 · 15 comments
Assignees
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Text.Json
Milestone

Comments

@steveharter
Copy link
Member

Currently these primitive types are not directly supported and must have custom code to use, such as what the JsonSerializer currently does:

  • char (treated as a string of length 1)
  • byte
  • sbyte
  • short
  • ushort

cc @ahsonkhan

@ahsonkhan
Copy link
Contributor

ahsonkhan commented Mar 18, 2019

float

{Try}GetSingle exists already.

Same with WriteNumber with float support.

Therefore, I don't think we need custom code like this. @steveharter, please let me know if I am mistaken.
https://github.com/dotnet/corefx/blob/d866aa6eb7f4f5c2df5892b09d694280d2982150/src/System.Text.Json/src/System/Text/Json/Serialization/Converters/JsonValueConverterSingle.cs#L11-L23

New APIs:

public ref partial struct Utf8JsonReader
{
    public byte GetByte() { throw null; }
    public sbyte GetSByte() { throw null; }
    public short GetInt16() { throw null; }
    public ushort GetUInt16() { throw null; }
    public bool TryGetByte(out byte value) { throw null; }
    public bool TryGetSByte(out sbyte value) { throw null; }
    public bool TryGetInt16(out short value) { throw null; }
    public bool TryGetUInt16(out ushort value) { throw null; }

    public char GetChar() { throw null; }
    public bool TryGetChar(out char value) { throw null; }
}

public readonly partial struct JsonElement
{
    public byte GetByte() { throw null; }
    public sbyte GetSByte() { throw null; }
    public short GetInt16() { throw null; }
    public ushort GetUInt16() { throw null; }
    public bool TryGetByte(out byte value) { throw null; }
    public bool TryGetSByte(out sbyte value) { throw null; }
    public bool TryGetInt16(out short value) { throw null; }
    public bool TryGetUInt16(out ushort value) { throw null; }

    public char GetChar() { throw null; }
    public bool TryGetChar(out char value) { throw null; }
}

public sealed partial class Utf8JsonWriter : IDisposable
{
    // Not strictly needed since the int32 overload works as well.
    public void WriteNumberValue(byte value) { }
    public void WriteNumberValue(sbyte value) { }
    public void WriteNumberValue(short value) { }
    public void WriteNumberValue(ushort value) { }

    // Not strictly needed since the int32 overload works as well.
    public void WriteNumber(System.ReadOnlySpan<byte> utf8PropertyName, byte value) { }
    public void WriteNumber(System.ReadOnlySpan<byte> utf8PropertyName, sbyte value) { }
    public void WriteNumber(System.ReadOnlySpan<byte> utf8PropertyName, short value) { }
    public void WriteNumber(System.ReadOnlySpan<byte> utf8PropertyName, ushort value) { }
    public void WriteNumber(System.ReadOnlySpan<char> propertyName, byte value) { }
    public void WriteNumber(System.ReadOnlySpan<char> propertyName, sbyte value) { }
    public void WriteNumber(System.ReadOnlySpan<char> propertyName, short value) { }
    public void WriteNumber(System.ReadOnlySpan<char> propertyName, ushort value) { }
    public void WriteNumber(string propertyName, byte value) { }
    public void WriteNumber(string propertyName, sbyte value) { }
    public void WriteNumber(string propertyName, short value) { }
    public void WriteNumber(string propertyName, ushort value) { }
    public void WriteNumber(JsonEncodedText propertyName, byte value) { }
    public void WriteNumber(JsonEncodedText propertyName, sbyte value) { }
    public void WriteNumber(JsonEncodedText propertyName, short value) { }
    public void WriteNumber(JsonEncodedText propertyName, ushort value) { }

    // For single-char string, user can wrap it in a span.
    public void WriteString(string propertyName, char value) { }
    public void WriteString(JsonEncodedText propertyName, char value) { }
    public void WriteString(System.ReadOnlySpan<char> propertyName, char value) { }
    public void WriteString(System.ReadOnlySpan<byte> utf8PropertyName, char value) { }
    public void WriteStringValue(char value) { }
}

Open Questions:

  1. {Try}GetChar semantics? Throw/return false if not a single-character? Surrogate pair behavior?
  2. WriteStringValue(char) semantics? Write the character as a JSON string surrounded with quotes? Surrogate pair behavior?
  3. @bartonjs, should we add similar APIs to JsonElement as well?
  4. What about WriteStringValue(byte utf8Char)? Is that not appropriate given the encoding is variable length and more often there will be 1-4 bytes (whereas 2-char pairs are less common)?

Current workarounds within the serializer:
https://github.com/dotnet/corefx/blob/d866aa6eb7f4f5c2df5892b09d694280d2982150/src/System.Text.Json/src/System/Text/Json/Serialization/Converters/JsonValueConverterInfoInt16.cs#L11-L33
https://github.com/dotnet/corefx/blob/d866aa6eb7f4f5c2df5892b09d694280d2982150/src/System.Text.Json/src/System/Text/Json/Serialization/Converters/JsonValueConverterChar.cs#L12-L42

Related issue: https://github.com/dotnet/corefx/issues/34690
cc @layomia

cc @GrabYourPitchforks - if you have thoughts on surrogate pair behavior.

@bartonjs
Copy link
Member

For the writer, these values are already covered by the implicit conversions to int.

I don't know what it means to write a single char. If it's a single-character string, just wrap it in a span and call the span one.

For the reader there's arguably benefit for the Try APIs on the smaller numeric types... I could go either way. Again, I don't know what it means for char (a string whose value is representable with a single UTF-16 code unit?).

@ahsonkhan
Copy link
Contributor

For the writer, these values are already covered by the implicit conversions to int.

I don't know what it means to write a single char. If it's a single-character string, just wrap it in a span and call the span one.

Agreed. I don't mind removing it from the writer.

For the reader there's arguably benefit for the Try APIs on the smaller numeric types... I could go either way.

Yep. I think having it on the reader makes more sense.

Again, I don't know what it means for char (a string whose value is representable with a single UTF-16 code unit?).

I had the same question. @steveharter, any thoughts? Otherwise, we can discuss in the review itself.

@steveharter
Copy link
Member Author

The float\single converter in the serializer now uses the appropriate reader API. I'll update the issue.

For char a string of length 1 makes the most sense IMO. That is also what Json.NET uses. The work-around for someone to handle this themselves is not trivial.

@terrajobst
Copy link
Contributor

terrajobst commented May 30, 2019

Video

The char one might be harder to define. We should get @JamesNK input on those. Basically two options:

  • We can treat it as a number. If the primary goal is to define serialization behavior, that might be OK. And given that we have to write UTF8 that might be more honest.

  • Treat it as a single character string. This would mean that we either throw for chars that we can't represent in UTF8 or write garbage.

The other ones seem fine.

@WinCPP
Copy link
Contributor

WinCPP commented Jun 5, 2019

@steveharter @ahsonkhan is this something that I can start working on while the conclusion is reached on API for char?

In that case I could start with TryGet* and Get* APIs on Utf8JsonReader and JsonElement in comment by @ahsonkhan and work on the Write* APIs when concluded?

@ahsonkhan
Copy link
Contributor

@WinCPP, sound good to me. Assigning to you. Thanks.

@WinCPP
Copy link
Contributor

WinCPP commented Jun 8, 2019

@ahsonkhan I followed the steps given in updating-ref-source.md to add new Get* and TryGet* APIs to ref assembly. This generates a lot of changes such as adding readonly to some getters etc. apart from changing order of some existing APIs. Additionally it introduces following change and adds DisposeAsync method. Entire diff is available in this gist.

-    public sealed partial class Utf8JsonWriter : System.IDisposable
+    public sealed partial class Utf8JsonWriter : System.IAsyncDisposable, System.IDisposable
+        public System.Threading.Tasks.ValueTask DisposeAsync() { throw null; }

Looks like the steps in doc are no more being followed? And instead should I directly add the methods by hand? Appreciate your inputs.

@ahsonkhan
Copy link
Contributor

ahsonkhan commented Jun 8, 2019

Additionally it introduces following change and adds DisposeAsync method.

This is a known limitation and requires a manual change to remove (currently the DisposeAsync API surface is within the manual ref since the API surface is different between platforms).
https://github.com/dotnet/corefx/blob/e175429fc706ae9ef6ab6afd9d746937872d922e/src/System.Text.Json/ref/System.Text.Json.Manual.cs#L12
https://github.com/dotnet/corefx/issues/38349

This generates a lot of changes such as adding readonly to some getters etc.

Can you file an issue regarding this in arcade - https://github.com/dotnet/arcade/issues? I am not sure if this is expected (maybe related to recent changes like dotnet/arcade#2526).
cc @tannergooding, @ericstj, @safern

And instead should I directly add the methods by hand?

Sure, if that's easier. One thing you could do is generate the ref to see the fully qualified API surface and what order they should appear in, undo all the other changes, and manually re-add the new APIs. Other than having to deal with some corner cases (for instance where the manual ref file is involved), it should generally work.

@stephentoub
Copy link
Member

currently the DisposeAsync API surface is within the manual ref since the API surface is different between platforms

Can we fix that?

@WinCPP
Copy link
Contributor

WinCPP commented Jun 12, 2019

@ahsonkhan Additions for (Try)Get for byte, sbyte, short, ushort are almost ready.

  1. IIUC we don't want to do the Write APIs. Please confirm.
  2. Do we have conclusion on the (Try)GetChar API?
  3. I see that JsonDocument doesn't have equivalent TryGetValue APIs. (Noticed it when going through changes for Ensure JSON TryGetXXX sets values to default if returning false #37119 corefx#37838). So do we want to have them and if yes, in this issue or in separate issue?

@ahsonkhan
Copy link
Contributor

IIUC we don't want to do the Write APIs. Please confirm.

They are not yet needed, and can be added later (especially after some perf validation related to up-casting to a larger numeric type).

Do we have conclusion on the (Try)GetChar API?

I'll follow up on that and get back to you.

I see that JsonDocument doesn't have equivalent TryGetValue APIs. (Noticed it when going through changes for dotnet/corefx#37838). So do we want to have them and if yes, in this issue or in separate issue?

This is the issue. If you look at the API proposal, these APIs need to be added to JsonElement as well (not JsonDocument).
https://github.com/dotnet/corefx/issues/36125#issuecomment-474074184

@WinCPP
Copy link
Contributor

WinCPP commented Jun 15, 2019

This is the issue. If you look at the API proposal, these APIs need to be added to JsonElement as well (not JsonDocument).
#36125 (comment)

Oops sorry. How did I miss that 🤦‍♂️ : Will will on the same.

ahsonkhan referenced this issue in dotnet/corefx Jun 25, 2019
* CoreFx #36125 (Try)Get for byte sbyte short ushort

* Init out values on failure return paths.

* APIs in JsonElement and related tests

* More test cases, test bug fixes, code fixes
@terrajobst
Copy link
Contributor

We concluded we shouldn't support it unless there is significant customer feedback.

@ahsonkhan
Copy link
Contributor

To be clear, the conclusion was we wouldn't support char (specifically) for the reader/writer APIs. The previously approved APIs for smaller numeric types has already been checked-in by @WinCPP.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
stephentoub pushed a commit to Jacksondr5/runtime that referenced this issue Oct 23, 2020
now that all primitive types are supported (thanks to dotnet#29000), we can
remove custom bound checks.
layomia pushed a commit that referenced this issue Oct 29, 2020
now that all primitive types are supported (thanks to #29000), we can
remove custom bound checks.
@ghost ghost locked as resolved and limited conversation to collaborators Dec 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Text.Json
Projects
None yet
Development

No branches or pull requests

7 participants