Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibility of using implicit version of OpenJson? #38

Closed
IanKemp opened this issue Jun 27, 2023 · 3 comments
Closed

Possibility of using implicit version of OpenJson? #38

IanKemp opened this issue Jun 27, 2023 · 3 comments
Labels
question Further information is requested wontfix This will not be worked on

Comments

@IanKemp
Copy link

IanKemp commented Jun 27, 2023

Currently the SQL generated by this library invokes OpenJson with explicit mappings, for example:

declare @p0 nvarchar(max) = '[{"X":0,"I":123},...,{"X:"99","I":987}]' // 100 integer values
declare @__p_1 int = 3000
declare @__p_2 int = 100

SELECT [r].[Id], [r].[Amount], [r.AccountId]
FROM [Transactions] AS [r]
WHERE EXISTS (
    SELECT 1
    FROM (
        SELECT TOP(@p1) [X], [I]
        FROM OPENJSON(@p0) WITH ([X] int, [I] int)
        ORDER BY [X]
    ) AS [b]
    WHERE [b].[I] = [r].[AccountId]) // AccountId is an integer
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY

From my testing the above is slower than using the implicit mappings version:

declare @p0 nvarchar(max) = '[123,...,987]' // 100 integer values
declare @__p_1 int = 3000
declare @__p_2 int = 100

SELECT [r].[Id], [r].[Amount], [r.AccountId]
FROM [Transactions] AS [r]
WHERE EXISTS (
    SELECT 1
    FROM OPENJSON(@p0) AS [b]
    WHERE CONVERT(INT, [b].[value]) = [r].[AccountId])
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY

because the former query incurs Sort and Top operations, versus only a Compute Scalar for the former. The latter also sends a smaller amount of data to the server in the JSON-serialised parameter.

What is the reason for using the explicit mappings version instead of implicit? And would you be willing to add functionality to control whether implicit or explicit is used?

@yv989c
Copy link
Owner

yv989c commented Jun 27, 2023

Hey @IanKemp , thanks for the feedback.

The OPENJSON_EXPLICIT approach currently used by this library should perform better because the query engine does not have to do any implicit casting at runtime, which is what happens when you use the implicit one you are suggesting. If you run your last query in SSMS and check its execution plan you should even see a warning about possible incorrect cardinality estimations. If you're curious, I had a detailed discussion about this with one member of the EF team here: dotnet/efcore#13617 (comment)

In the previous link you can see how EF 8.0.0-preview.4.23259.3 –which implemented the approach you are suggesting– performed significantly worse than QueryableValues, which relies on the explicit approach.

Regarding the TOP and ORDER BY. The TOP is an optimization/attempt to help the query engine with a more accurate memory grant estimation. The ORDER BY does incurs a small performance hit but it's needed for correctness in other scenarios. Unfortunately because of the EF infrastructure I'm using to make this work, I can't remove it in your use case where it isn't relevant.

If you have evidence or some metrics that can help demonstrate the issue you are having, please share them here. You can also try using an older version of this library before I introduced the fix for ordering guarantees.

If this library gives you some value, please consider ⭐ the repo. It helps.🙂

@yv989c
Copy link
Owner

yv989c commented Jun 27, 2023

Oh btw. If possible for your use case, you can also try with a Join instead of Contains. It will typically give you better performance. You can find an example here.

@IanKemp
Copy link
Author

IanKemp commented Jun 29, 2023

Thanks @yv989c for both this package and your response. I've given you a ⭐ as requested, it really is the least that I can do.

@yv989c yv989c added question Further information is requested wontfix This will not be worked on labels Nov 26, 2023
@yv989c yv989c closed this as completed Nov 26, 2023
@yv989c yv989c reopened this Nov 26, 2023
@yv989c yv989c closed this as not planned Won't fix, can't repro, duplicate, stale Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants