Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Text.Json metadata writer does not check for property name conflicts #72170

Open
eiriktsarpalis opened this issue Jul 14, 2022 · 11 comments

Comments

@eiriktsarpalis
Copy link
Member

Originally posted by @ilya-scale in #63747 (comment)

This can be reproduced both with ReferenceHandler.Preserve:

var value = new Poco { Value = "string" };
var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
string json = JsonSerializer.Serialize(value, options);
Console.WriteLine(json); // {"$id":"1","$id":"string"}
JsonSerializer.Deserialize<Poco>(json, options); // Fails with 'The metadata property '$id' must be the first reference preservation property in the JSON object. 

public class Poco
{
    [JsonPropertyName("$id")]
    public string Value { get; set; }
}

and polymorphism

Base value = new Derived { Value = "value" };
string json = JsonSerializer.Serialize(value);
Console.WriteLine(json); // {"$type":"derived","$type":"value"}
Base result = JsonSerializer.Deserialize<Base>(json); // System.Text.Json.JsonException: 'Deserialized object contains a duplicate type discriminator metadata property. 
Console.WriteLine(result is Derived);

[JsonDerivedType(typeof(Base), "base")]
[JsonDerivedType(typeof(Derived), "derived")]
public class Base
{
    [JsonPropertyName("$type")]
    public string Value { get; set; }
}

public class Derived : Base
{
}

We should validation for both cases to ensure invalid JSON is not emitted over the wire.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 14, 2022
@ghost
Copy link

ghost commented Jul 14, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Originally posted by @ilya-scale in #63747 (comment)

This can be reproduced both with ReferenceHandler.Preserve:

var value = new Poco { Value = "string" };
var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
string json = JsonSerializer.Serialize(value, options);
Console.WriteLine(json); // {"$id":"1","$id":"string"}
JsonSerializer.Deserialize<Poco>(json, options); // Fails with 'The metadata property '$id' must be the first reference preservation property in the JSON object. 

public class Poco
{
    [JsonPropertyName("$id")]
    public string Value { get; set; }
}

and polymorphism

Base value = new Derived { Value = "value" };
string json = JsonSerializer.Serialize(value);
Console.WriteLine(json); // {"$type":"derived","$type":"value"}
Base result = JsonSerializer.Deserialize<Base>(json); // System.Text.Json.JsonException: 'Deserialized object contains a duplicate type discriminator metadata property. 
Console.WriteLine(result is Derived);

[JsonDerivedType(typeof(Base), "base")]
[JsonDerivedType(typeof(Derived), "derived")]
public class Base
{
    [JsonPropertyName("$type")]
    public string Value { get; set; }
}

public class Derived : Base
{
}

We should validation for both cases to ensure invalid JSON is not emitted over the wire.

Author: eiriktsarpalis
Assignees: -
Labels:

area-System.Text.Json, untriaged

Milestone: -

@eiriktsarpalis eiriktsarpalis added bug and removed untriaged New issue has not been triaged by the area owner labels Jul 14, 2022
@eiriktsarpalis eiriktsarpalis added this to the 7.0.0 milestone Jul 14, 2022
@eiriktsarpalis eiriktsarpalis self-assigned this Jul 14, 2022
@ilya-scale
Copy link

Would there be a way to have both polymorphic deserialization and a possibility to read the property, i.e. the discriminator from the resulting C# object, or you want to explicitly prohibit it so that it will not be possible?

@eiriktsarpalis
Copy link
Member Author

I think it would make sense to explicitly prohibit.

@ilya-scale
Copy link

It is probably not a bad idea though to allow reading the type discriminator unless it will lead to some unforeseen consequences or complicated implementation?

@eiriktsarpalis
Copy link
Member Author

When I created this issue, I had assumed that this would be a metadata validation concern, but this is not the case with dictionary types; we'd need to validate individual key/value pairs at runtime which is not ideal from a performance perspective, and may not even be possible if we have a converter that overrides WriteAsPropertyName.

If not possible to enforce on Dictionary types, does it make sense to enforce it on object types from a consistency perspective?

Moving to Future, we can consider doing this in later releases should there be demand.

@eiriktsarpalis eiriktsarpalis modified the milestones: 7.0.0, Future Aug 2, 2022
@eiriktsarpalis
Copy link
Member Author

eiriktsarpalis commented Aug 4, 2022

Related issue: #1780. That one suggests escaping $ to avoid conflicts, however polymorphism makes it possible to define metadata properties with $ prefixes.

@eiriktsarpalis eiriktsarpalis removed their assignment Oct 3, 2022
@layomia
Copy link
Contributor

layomia commented Dec 2, 2022

JSON technically allows duplicate properties. For non-metadata properties STJ employs a "last one wins" strategy for deserialization.

For these metadata-based features, we seem to be throwing reasonable exceptions to avoid ambiguous metadata interpretation. Seems we can close this issue?

@eiriktsarpalis
Copy link
Member Author

eiriktsarpalis commented Dec 2, 2022

I think the problem primarily here is with serialization: it emits duplicate entries even though these are impossible to deserialize using the same configuration. That being said, it's not a particularly important problem.

@dennis-yemelyanov
Copy link
Contributor

I have a related problem where I can avoid duplicate serialization, however I wasn't able to get the Type object property (also used as the discriminator) populated on deserialization.

For example, for the following class hierarchy:

abstract class Base
{
    public string Type { get; set; }
}

class Person : Base
{
    public string Name { get; set; }
}

class Container
{
    public Base Obj { get; set; }
}

I wrote the following TypeResolver with an option to remove the Type property and avoid duplicate "Type" json keys written on serialization:

public class PolymorphicTypeResolver : DefaultJsonTypeInfoResolver
{
    private readonly bool _removeTypeProperty;

    public PolymorphicTypeResolver(bool removeTypeProperty)
    {
        _removeTypeProperty = removeTypeProperty;
    }

    public override JsonTypeInfo GetTypeInfo(Type type, JsonSerializerOptions options)
    {
        var typeInfo = base.GetTypeInfo(type, options);

        if (typeInfo.Type == typeof(Person))
        {
            var typeProperty = typeInfo.Properties.Single(x => x.Name == "Type");

            if (_removeTypeProperty)
            {
                typeInfo.Properties.Remove(typeProperty);
            }
        }

        if (typeInfo.Type == typeof(Base))
        {
            typeInfo.PolymorphismOptions = new JsonPolymorphismOptions
            {
                TypeDiscriminatorPropertyName = "Type",
                IgnoreUnrecognizedTypeDiscriminators = false,
                UnknownDerivedTypeHandling = JsonUnknownDerivedTypeHandling.FailSerialization,
                DerivedTypes = { new JsonDerivedType(typeof(Person), nameof(Person)) }
            };
        }

        return typeInfo;
    }
}

I use the following code to create and roundtrip an object:

var serializeOptions = new JsonSerializerOptions
{
    TypeInfoResolver = new PolymorphicTypeResolver(removeTypeProperty: true)
};

var container = new Container { Obj = new Person { Name = "John" } };

var json = JsonSerializer.Serialize(container, serializeOptions);

Console.WriteLine(json);

var deserializeOptions = new JsonSerializerOptions
{
    TypeInfoResolver = new PolymorphicTypeResolver(removeTypeProperty: false)
};

var result = JsonSerializer.Deserialize<Container>(json, deserializeOptions);

Console.WriteLine($"Type: {result.Obj.Type}; Name: {((Person)result.Obj).Name}");

The problem is that the last line doesn't optput correct Type value:

{"Obj":{"Type":"Person","Name":"John"}}
Type: ; Name: John

I want it to print Type: Person; Name John instead.

I would appreciate it if there are any workarounds, maybe some customer converters or other tricks can be used to achieve the desired result?

@eiriktsarpalis
Copy link
Member Author

@dennis-yemelyanov the scenario you're describing is not supported. Type discriminators map string identifiers to specific derived types and as such they cannot be bound to the runtime value of any properties of the object being serialized.

@dennis-yemelyanov
Copy link
Contributor

Thank you @eiriktsarpalis for confirming. I came up with a slightly different approach using a converter, but was curious if this approach is safe or could have potential issues like infinite recursion due to passing the same options to Serialize/Deserialize?

Basically in the converter for the base class I first read json as JsonObject, then convert it to a specific class depending on the Type value. And when writing the JSON, simply write the object as a specific class:

class PolymorphicConverter : JsonConverter<Base>
{
    public override Base? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        var jsonObj = JsonSerializer.Deserialize<JsonObject>(ref reader, options);

        var type = (string)jsonObj["Type"];

        return type switch
        {
            "Person" => jsonObj.Deserialize<Person>(options),
            _ => throw new InvalidOperationException($"Unknown type '{type}'")
        };
    }

    public override void Write(Utf8JsonWriter writer, Base value, JsonSerializerOptions options)
    {
        if (value is Person person)
        {
            JsonSerializer.Serialize(writer, person, options);
        }
        else
        {
            throw new InvalidOperationException($"Unknown type '{value.GetType().Name}'");
        }
    }
}

This also requires configuring a modifier for the Type property so it can have the correct value depending on the actual class type:

private static void ConfigureTypePropertyModifier(JsonTypeInfo jsonTypeInfo)
{
    if (jsonTypeInfo.Kind != JsonTypeInfoKind.Object)
        return;

    var typeProperty = jsonTypeInfo.Properties.SingleOrDefault(x => x.Name == "Type");

    if (typeProperty != null)
    {
        typeProperty.Get = obj => obj.GetType().Name;
    }
}

var serializeOptions = new JsonSerializerOptions
{
    Converters = { new PolymorphicConverter() },
    TypeInfoResolver = new DefaultJsonTypeInfoResolver
    {
        Modifiers = { ConfigureTypePropertyModifier }
    }
};

var container = new Container { Obj = new Person { Name = "John" } };

var json = JsonSerializer.Serialize(container, serializeOptions);

Console.WriteLine(json);

var result = JsonSerializer.Deserialize<Container>(json, serializeOptions);

Console.WriteLine($"Type: {result.Obj.Type}; Name: {((Person)result.Obj).Name}");

This seems to work and prints the expected result:

{"Obj":{"Name":"John","Type":"Person"}}
Type: Person; Name: John

Would you recommend going with this approach or are there any potential risks that I missed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants