Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON.NET plugin #334

Closed
roji opened this issue Mar 10, 2018 · 22 comments
Closed

JSON.NET plugin #334

roji opened this issue Mar 10, 2018 · 22 comments
Labels
enhancement New feature or request

Comments

@roji
Copy link
Member

roji commented Mar 10, 2018

With the new JSON.NET plugin coming to Npgsql 3.3, we need to create the corresponding plugin for the EF Core provider (depends on #300). It would provide a db context options extension, UseJsonNet(), which would do the following:

  • Add the Npgsql JsonNet plugin to the global type mapper
  • Add the appropriate RelationalTypeMapping to the NpgsqlTypeMappingSource
  • Add the appropriate translators for JSON operations.
@roji roji added the enhancement New feature or request label Mar 10, 2018
@roji roji added this to the 2.1.0 milestone Mar 10, 2018
@roji
Copy link
Member Author

roji commented May 11, 2018

After some research into this, there's one very annoying mismatch between Newtonsoft.JSON and PostgreSQL's JSON support that makes this difficult. Mapping JObject and JArray is easy enough, but operation translation is problematic.

In Newtonsoft., the way to access a string value in a JSON is:

string itemTitle = (string)rss["channel"]["item"][0]["title"];

It should be easy to map the indexers into PostgreSQL's -> operator (see the docs). However, the cast of the last value into string is problematic... In Newtonsoft, this cast yields the simple string value, but PostgreSQL treats it as a cast of a JSON document, so returns the string value surrounded by double-quotes. In other words, if the string value being access is foo, Newtonsoft will return foo while PostgreSQL will return "foo".

To work around this, we need to customize the casting behavior when it is applied to a json/jsonb. The Newtonsoft casting behavior can be implemented by rendering #>>'{}', which "looks up" the root element of the give JSON document and returns it as a string. However, it's difficult to see how we can override this behavior in NpgsqlQuerySqlGenerator from a plugin.

In theory we can replace the NpgsqlQuerySqlGenerator service from the plugin, although replacing a service with another service wrapping the first is really non-trivial with the dependency-injection system. Npgsql could register a factory lambda for IQuerySqlGenerator, which the JSON extension would replace with another ServiceDescriptor the does the wrapping.

However, if the replacement service extends NpgsqlQuerySqlGenerator, overriding only the relevant cast behavior, it's impossible to have multiple plugins changing bits of behavior in the SQL generator (each one would clobber the current generator with its own).

Because of the above complications I'm going to drop this for now and revisit later. In any case PG 11 is supposed to add SQL/JSON support, maybe it's better to wait for that and see what things look like.

@roji roji modified the milestones: 2.1.0, 2.2.0 May 11, 2018
@seriouz
Copy link

seriouz commented May 17, 2018

Probably @JamesNK has some ideas on this topic?

@JamesNK
Copy link

JamesNK commented May 17, 2018

If you want double quotes then use https://www.newtonsoft.com/json/help/html/M_Newtonsoft_Json_Linq_JToken_WriteTo.htm with a JsonTextWriter over the top of a StringWriter. Or manually check for a string and update the value with a call to JsonConvert.ToString(s) (I think).

@roji
Copy link
Member Author

roji commented May 17, 2018

Thanks @seriouz and @JamesNK.

The problem isn't really with Newtonsoft.JSON or how to do stuff with it... The goal of this plugin would be to translate idiomatic, simple JSON.NET code to PostgreSQL JSON operators in SQL. The problem is that there's an annoying mismatch between the regular way to do JSON.NET and the PostgreSQL operators where it comes to string casts. In other words, I'm sure I can write JSON.NET code that produces the double quotes, as you suggest above @JamesNK, but the point is to allow users to write normal, regular JSON.NET and for that to get evaluated seamlessly and out of the box on PostgreSQL.

Concretely, I want to translate the following JSON.NET sample:

string cpu = (string)o["CPU"];

The expression o["CPU"] in this sample is a JSON object (JToken in JSON.NET). But when we cast to string in JSON.NET, we simply get the string (no quotes), whereas in PostgreSQL we get quotes, since PostgreSQL considers that a text cast of a JSON object needs to be a valid JSON string representation.

So the problem isn't how to get JSON.NET to produce the quotes, but rather how to properly translate JSON.NET code to PostgreSQL in a way that everything works.

@markusschaber
Copy link

Hi, roji,
AFAICS, it should be possible to extract the string as string using json_extract_path_text() instead of using the cast, see https://www.postgresql.org/docs/10/static/functions-json.html.

@roji
Copy link
Member Author

roji commented May 17, 2018

@markusschaber but when would you call that function? The point is that the user wrote the expression (string)o["CPU"], and that's what you have to translate to SQL... You're right that we could translate the C# cast operation to json_extract_path_text() (or to the #>>'{}' operator as I suggested above, although you're proposal's better), but that would involve changing the translation for casts, which is problematic from a plugin, as I wrote above.

@greenboxal
Copy link

What's the current status on this? Can we contribute somehow?

@roji
Copy link
Member Author

roji commented Aug 11, 2018

@greenboxal this is definitely still planned for 2.2.0, but the issues above still need to be addressed. In addition, it seems wise to hold off on this as PG 11 will probably be adding SQL/JSON standard support, which might change the way we want to do this.

Finally, this may be a bit complicated for an outside contributor to tackle as a first foray into an EF Core provider, since it potentially touches on several internal details. However, if you want to help by researching the state of SQL/JSON in PostgreSQL 11, whether it's a good fit (better than the pre-11 JSON support) and whether it helps in any way with the issue described above - that would be really great.

@lmarszal
Copy link

@roji - I think I have an alternative solution. Instead of supporting casting, use Value<> extension:

string cpu = o["CPU"].Value<string>();

With this, you can create a method call translator, that would convert this to string.

I gave it a try and created small proof-of-concept, to see how this would work. You can review it here: https://github.com/lmarszal/NpgsqlJObjectPoC
It's far from being ready but can give us some starting point for the discussion.

Any thoughts?

@seriouz
Copy link

seriouz commented Oct 21, 2018

@lmarszal Nice idea!

@roji
Copy link
Member Author

roji commented Oct 21, 2018

@lmarszal, the point is that we don't want to force a strange constraint on the way users write their LINQ code over JSON.NET... The idea is to have users write normal, regular LINQ just as they would outside of a database context, and have that work as-is on PostgreSQL. I really don't want to see a plugin which forces an unnatural extension method which means the exact same thing as a C# cast, because of some internal issue like this...

I do intend to look at this again at some point, and I'm also hoping that by the time I do PostgreSQL will introduce SQL/JSON support which may make the issue go away (I admit that isn't grounded in anything though).

@lmarszal
Copy link

@roji, I got this. For me more natural than casting is using .Value<> - but this is only my opinion. And casting should be evaluated exactly the same as .Value, as their underlying logic implemented in JObject is almost identical. And last but not least both options should be supported (not to, as you said, force users to do something unnatural for them).

My thinking was, that since (1) at the end of the day .Value has to be supported too, and (2) cast is a problem now: we might introduce something that works now and just add support for cast later on.
Existing code wouldn't change, just new functionality will be added.

The additional benefit would be, that when SQL/JSON appear (and if this helps), we would just need to change the underlying code, as the LINQ expressions users would have written won't have to change.

@roji
Copy link
Member Author

roji commented Oct 21, 2018

For me more natural than casting is using .Value<> - but this is only my opinion.

Apart from subjective feelings, the point is that casting is the pattern shown in the basic JSON.NET documentation. I think that a plugin that fails on the most basic sample is a bit of a non-starter.

I'd rather we at least tried to solve the problem for casting - and as you say, implement translation for Value<T>() as well.

@lmarszal
Copy link

I did a next quick test and handling casting seems doable with IExpressionFragmentTranslator - sth like this:

internal class JTokenCastExpressionFragmentTranslator : IExpressionFragmentTranslator
{
    public Expression Translate(Expression expression)
    {
        if (expression is UnaryExpression unaryExpression &&
            unaryExpression.NodeType == ExpressionType.Convert &&
            unaryExpression.Type == typeof(string) &&
            unaryExpression.Operand.Type == typeof(JToken))
        {
            return new CustomUnaryExpression(unaryExpression.Operand, "->>0", typeof(string), postfix: true);
        }

        return null;
    }
}

This requires small changes to plugin model though:

  • Extend NpgsqlCompositeExpressionFragmentTranslator to iterate through plugins, like NpgsqlCompositeMethodCallTranslator does
  • Add virtual method to NpgsqlEntityFrameworkPlugin (sth like AddExpressionFragmentTranslators)

With this I've managed successfully run a call like this:

string foo = "text";
_dbContext.SampleItems.Where(x => (string)x.Body["Inner"]["Foo"] == foo).ToList();

@roji
Copy link
Member Author

roji commented Oct 21, 2018

Great, that's definitely a step in the right direction! In my original attempt I think I got a bit stuck working on NpgsqlQuerySqlGenerator, where NpgsqlCompositeExpressionFragmentTranslator may be a better extension point for this. I'd still want to evaluate the two approaches to see which is best.

Regardless, note that I'm currently in the process of converting the two plugins to use the new plugin model in EF Core 2.2 (which is based on my previous work). This is issue #658 and I'll probably submit a PR later today or tomorrow (very close to finished). This means that you shouldn't spend any time working on the code as it is, but rather look at how things are implemented in the current EF Core release/2.2 branch - the plugin model is slightly different.

As a stop gap, we can also integrate the special casting logic into the Npgsql provider (not via a plugin), using string comparison to identify the operation on JToken. This isn't ideal, but is a relatively harmless workaround until something clean can be done in the EF Core plugin model.

@lmarszal
Copy link

lmarszal commented Oct 21, 2018

Good - so I'm waiting for your PR and then then will try to figure out how to make this work with new the plugin model. Please keep me posted if, how and when I can contribute to this. I'd really love to see this feature fully implemented, so I can spend some time working on this :)

@roji
Copy link
Member Author

roji commented Oct 21, 2018

Here we go: #672

@roji roji modified the milestones: 2.2.0, 3.0.0 Nov 15, 2018
@nilmas
Copy link

nilmas commented Jan 2, 2019

FYI, in case this has flown under the radar .NET Core team is planning / implementing their version (not Newtonsoft's Json.NET) of JSON support:
dotnet/announcements#90 -> discussion at:
https://github.com/dotnet/corefx/issues/33115

@roji
Copy link
Member Author

roji commented Jan 2, 2019

Yep, we're waiting to see what comes out of that to see what to support in Npgsql.

@roji
Copy link
Member Author

roji commented Jul 5, 2019

Info on the new PostgreSQL 12 jsonpath: https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH, https://paquier.xyz/postgresql-2/postgres-12-jsonpath. We should definitely take another look at this sometime.

@roji
Copy link
Member Author

roji commented Aug 12, 2019

FYI I am currently working on this - a preliminary implementation is working. Will share details soon.

@roji roji closed this as completed Aug 15, 2019
@roji roji mentioned this issue Aug 15, 2019
@roji
Copy link
Member Author

roji commented Aug 15, 2019

Superceded by #981

@roji roji removed this from the 3.0.0 milestone Aug 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants