Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReferenceHandler.Preserve does not work well with JsonConverters #21777

Closed
YeskaNova opened this issue Nov 21, 2020 · 10 comments · Fixed by #22322
Closed

ReferenceHandler.Preserve does not work well with JsonConverters #21777

YeskaNova opened this issue Nov 21, 2020 · 10 comments · Fixed by #22322
Assignees
Labels
doc-enhancement Improve the current content [org][type][category] dotnet-fundamentals/svc Pri2

Comments

@YeskaNova
Copy link

YeskaNova commented Nov 21, 2020

When I use a simple JsonConverter, with Reference handler equal to Preserve to serialize cyclic objects, the $id property added to each object is not unique, in this example "$id":1 exists two times :

System.Text.Json : 5.0.0

using System;
using System.Diagnostics;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace ConsoleApp2
{
    public class ActionResultDto<T>
    {
        public T Result { get; set; }
    }
    public class Data
    {
    }
    class Program
    {
        static void Main(string[] args)
        {
            var data = new ActionResultDto<object>
            {
                Result = new Data() 
            };
            var options = new JsonSerializerOptions()
            {
                ReferenceHandler = ReferenceHandler.Preserve
            };
            options.Converters.Add(new EmptyJsonConverter());
            var str = JsonSerializer.Serialize(data, options);
            Debug.Assert(str.Split("\"$id\":\"1\"").Length == 2); // we must have only one occurence of "$id":"1"
        }
    }
    internal class EmptyJsonConverter
       : JsonConverter<object>
    {
        public override object Read(ref Utf8JsonReader reader,Type typeToConvert,JsonSerializerOptions options)
        {
            throw new InvalidOperationException("Should not get here.");
        }

        public override void Write(Utf8JsonWriter writer,object objectToWrite,JsonSerializerOptions options)
        {
            JsonSerializer.Serialize(writer, objectToWrite, options);
        }
    }
}

// Output =>    {"$id":"1","Result":{"$id":"1"}}

Document Details

Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

@layomia
Copy link
Contributor

layomia commented Dec 1, 2020

cc @jozkee to confirm the story with JsonConverters and the preserved-reference feature.

Since reference data is only cached per call to Serialize/Deserialize, I don't see a way for existing reference ID metadata to be passed along when calling back into the serializer from a custom converter. Maybe in the future when we expose ReadStack and other metadata types (dotnet/runtime#34456), this scenario will be possible.


On a side note, I'd expect normally a StackOverflow to be thrown given the way the custom converter calls back to the serializer. The serializer would ordinarily fetch the same converter instance to handle the (de)serialization, which would cause the overflow. This is not the case here. I haven't dug deep here, but maybe it has to do with the converter being for System.Object. See https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-migrate-from-newtonsoft-how-to?pivots=dotnet-5-0#required-properties for more information on calling back into the serializer from a custom converter.

@jozkee
Copy link
Member

jozkee commented Dec 1, 2020

You can root the ReferenceResolver instance in the call site of JsonSerializer.Serialize in order to make it persistent on nested calls to Serialize. However, be careful to not expand this pattern to a "global" JsonSerializerOptions, otherwise the reference data will never reset and the underlying dictionaries may grow forever.

class Program
{
    static void Main(string[] args)
    {
        var data = new ActionResultDto<object>
        {
            Result = new Data() 
        };
        var options = new JsonSerializerOptions()
        {
            ReferenceHandler = ReferenceHandler.Preserve
        };
        options.Converters.Add(new EmptyJsonConverter());
        options.ReferenceHandler = new MyReferenceHandler();

        string str = JsonSerializer.Serialize(data, options);

        // Always reset or null-out options.ReferenceHandler after you are done serializing 
        // in order to avoid out of bounds memory growth in the reference resolver.
        options.ReferenceHandler = null;

        System.Console.WriteLine(str); // Output =>    {"$id":"1","Result":{"$id":"2"}}
    }
}

class MyReferenceHandler : ReferenceHandler
{
    public MyReferenceHandler()
    {
        _rootedResolver = new MyReferenceResolver();
    }

    private ReferenceResolver _rootedResolver;
    public override ReferenceResolver CreateResolver() => _rootedResolver;

    class MyReferenceResolver : ReferenceResolver
    {
        private uint _referenceCount;
        private readonly Dictionary<string, object> _referenceIdToObjectMap = new Dictionary<string, object>();
        private readonly Dictionary<object, string> _objectToReferenceIdMap = new Dictionary<object, string>(ReferenceEqualityComparer.Instance);

        public override void AddReference(string referenceId, object value)
        {
            if (!_referenceIdToObjectMap.TryAdd(referenceId, value))
            {
                throw new JsonException();
            }
        }

        public override string GetReference(object value, out bool alreadyExists)
        {
            if (_objectToReferenceIdMap.TryGetValue(value, out string referenceId))
            {
                alreadyExists = true;
            }
            else
            {
                _referenceCount++;
                referenceId = _referenceCount.ToString();
                _objectToReferenceIdMap.Add(value, referenceId);
                alreadyExists = false;
            }

            return referenceId;
        }

        public override object ResolveReference(string referenceId)
        {
            if (!_referenceIdToObjectMap.TryGetValue(referenceId, out object value))
            {
                throw new JsonException();
            }

            return value;
        }
    }
}

@layomia
Copy link
Contributor

layomia commented Dec 1, 2020

Transferring this to dotnet/docs to consider adding a sample for

  • preserving references + custom converters
  • preserving references across multiple (de)serialization calls

cc @tdykstra

@layomia layomia transferred this issue from dotnet/runtime Dec 1, 2020
@dotnet-bot dotnet-bot added the ⌚ Not Triaged Not triaged label Dec 1, 2020
@PRMerger13 PRMerger13 added the Pri3 label Dec 1, 2020
@tdykstra tdykstra added Pri2 and removed ⌚ Not Triaged Not triaged Pri3 labels Dec 1, 2020
@tdykstra tdykstra self-assigned this Dec 1, 2020
@adegeo adegeo added the doc-idea Indicates issues that are suggestions for new topics [org][type][category] label Dec 3, 2020
@dotnet-bot dotnet-bot removed the ⌚ Not Triaged Not triaged label Dec 3, 2020
@adegeo adegeo added doc-enhancement Improve the current content [org][type][category] accessibility Issues that pertain to accessibility problems, such as missing alt-text or illegible screenshots. and removed doc-idea Indicates issues that are suggestions for new topics [org][type][category] doc-enhancement Improve the current content [org][type][category] labels Dec 3, 2020
@dotnet-bot dotnet-bot added the ⌚ Not Triaged Not triaged label Dec 3, 2020
@adegeo adegeo added doc-enhancement Improve the current content [org][type][category] and removed accessibility Issues that pertain to accessibility problems, such as missing alt-text or illegible screenshots. labels Dec 3, 2020
@dotnet-bot dotnet-bot removed the ⌚ Not Triaged Not triaged label Dec 3, 2020
@adegeo
Copy link
Contributor

adegeo commented Dec 3, 2020

@tdykstra FYI I added doc-enhancement as the categorization label. Please correct it if I'm mistaken.

@tdykstra
Copy link
Contributor

tdykstra commented Dec 3, 2020

Thanks, that's the right label.

@tdykstra
Copy link
Contributor

consider adding a sample for

  • preserving references + custom converters
  • preserving references across multiple (de)serialization calls

@layomia or @jozkee can you provide an example suitable for docs for each of these scenarios? Something like the one for non-custom-converter preserve references would be more helpful than a converter for Object.

I don't see how to adapt this approach to types other than Object because, as @layomia noted, we advise not passing in options when calling Deserialize/Serialize in converter code, and it's not clear why doing that in this example doesn't cause a StackOverflow exception.

Also, when I run the example code it fails at options.ReferenceHandler = null; because you can't change an options instance after it's been used with a call to Deserialize/Serialize.

@jozkee
Copy link
Member

jozkee commented Jan 5, 2021

preserving references + custom converters

Assuming that you wrote a custom converter for Company and you don't want to manually serialize the Supervisor, which is an Employee. you want to delegate that to the Serializer and you also want to preserve the references that you have already saved.
https://gist.github.com/Jozkee/88ac585c1912f6a3c6816254cecc54b5

preserving references across multiple (de)serialization calls

Assuming that you have a list of Employees and you have to serialize each one individually and you also want to take advantage of the references saved in the ReferenceHandler's resolver.
https://gist.github.com/Jozkee/0b50c40b58fa93c006a3ba322b4e1165

Also, when I run the example code it fails at options.ReferenceHandler = null; because you can't change an options instance after it's been used with a call to Deserialize/Serialize.

@tdykstra I fixed that in above examples by adding a Reset method that creates a new ReferenceResolver instead of mutating the JsonSerializerOptions.

we advise not passing in options when calling Deserialize/Serialize in converter code, and it's not clear why doing that in this example doesn't cause a StackOverflow exception.

I think that's only problematic when you funnel the call to JsonSerializer.Serialize and T is the same type as your value; or in the case of Deserialize, T is the same type as typeToConvert.

@tdykstra
Copy link
Contributor

Thanks for the code and explanations, @jozkee ! These are added to the docs in #22322

@layomia
Copy link
Contributor

layomia commented Jan 21, 2021

I think that's only problematic when you funnel the call to JsonSerializer.Serialize and T is the same type as your value; or in the case of Deserialize, T is the same type as typeToConvert.

@jozkee @tdykstra - as a sidebar to the main discussion of this issue; given the provided example:

internal class EmptyJsonConverter : JsonConverter<object>
{
    public override object Read(ref Utf8JsonReader reader,Type typeToConvert,JsonSerializerOptions options)
    {
        throw new InvalidOperationException("Should not get here.");
    }

    public override void Write(Utf8JsonWriter writer,object objectToWrite,JsonSerializerOptions options)
    {
        JsonSerializer.Serialize(writer, objectToWrite, options);
    }
}

It just occurred to me that the serializer calls object.GetType() when typeof(object) is passed at the root level when serializing. So, the object is processed as the run-time type of the instance, which is certainly different from typeof(object). This is why we don't have a stack overflow caused by the JsonSerializer.Serialize(writer, objectToWrite, options) line: the "T"s are different.

@YeskaNova
Copy link
Author

YeskaNova commented Apr 18, 2021

@jozkee , I have the same problem but now with IgnoreCycles: I need to add $type property for an array that will contain objects of different types ex : ( [A,B,C] of IEnumerable< A> => [A,{$type:"B",$value:B},{$type:"C",$value:C}] )
My custom JsonConverter will have a JsonSerializer.Serialize in the Write method in case when the type of the object == T in IEnumerable< T> so when there is no need to add $type :

        private abstract class AbstractEnumerableJsonConverter<TObj, TCollection> :
            JsonConverter<TCollection> where TCollection: IEnumerable<TObj>
        {


            public override TCollection? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
            {
                var list = new List<TObj>();
                while (reader.Read())
                {
                    if (reader.TokenType == JsonTokenType.EndArray)
                    {
                        break;
                    }
                    if (reader.TokenType == JsonTokenType.StartObject)
                    {
                        var deserializedObj = default(TObj);
                        var document = JsonDocument.ParseValue(ref reader);
                        if (document.RootElement.TryGetProperty("$type", out var typeName))
                        {
                            var type = Type.GetType(typeName.GetString());
                            if (type.IsAssignableTo(typeof(IDbo)))
                                deserializedObj = (TObj)JsonSerializer.Deserialize(document.RootElement.GetProperty("$value").GetRawText(), type, options);
                        }
                        deserializedObj ??= (TObj)JsonSerializer.Deserialize(document.RootElement.ToString(), typeof(TObj), options);
                        list.Add(deserializedObj);
                    }
                }
                var ret = (IList)Activator.CreateInstance(typeToConvert, list.Count);
                if (ret.Count == 0)
                    for (var i = 0; i < list.Count; ++i)
                        ret.Add(list[i]);
                else
                    for (var i = 0; i < ret.Count; ++i)
                        ret[i] = list[i];

                return (TCollection)ret;
            }
            
            public override void Write(Utf8JsonWriter writer, TCollection value, JsonSerializerOptions options)
            {
                writer.WriteStartArray();
                foreach (var obj in value)
                {
                    if( obj.GetType() != typeof(TObj))
                    {
                        writer.WriteStartObject();
                        writer.WriteString("$type", obj.GetType().FullName);
                        writer.WritePropertyName("$value");
                        JsonSerializer.Serialize(writer, obj, options);   //<---- The state is lost so we can no more check cycles**
                        writer.WriteEndObject();
                    }
                    else
                    JsonSerializer.Serialize(writer, obj, options);       //<---- The state is lost so we can no more check cycles**
                }
                writer.WriteEndArray();
            }
        }

I have tried to do something like #21777 (comment) , but it seems that IgnoreCycles relies on a lot of things that are internal and resides in the Core code ( like PushReferenceForCycleDetection that maybe would help if it's protected and not internal https://github.com/dotnet/runtime/blob/01b7e73cd378145264a7cb7a09365b41ed42b240/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/IgnoreReferenceResolver.cs#L23 ).

Is there any way I can get around these cycles?

(I have also tried to serialize the properties of the object one by one to avoid calling JsonSerializer.Serialize, but the code to enumerate one object's properties is also internal )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-enhancement Improve the current content [org][type][category] dotnet-fundamentals/svc Pri2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants