Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize multiple consecutive LIMITs #35384

Merged
merged 5 commits into from
Dec 24, 2024

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented Dec 24, 2024

Fixes #35383 for the simple cases of constants/literals and repeated expressions.

@ranma42
Copy link
Contributor Author

ranma42 commented Dec 24, 2024

I believe the same optimization should be done for Cosmos.
I will try to take care of that ASAP

newLimit.TypeMapping
);
}
else if (Limit != null)
{
PushdownIntoSubquery();
Copy link
Contributor Author

@ranma42 ranma42 Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could actually aim for never doing a pushdown and instead replace the limit with LEAST(Limit, sqlExpression, but I am unsure if this is easy to do here (in this context we don't have access to TranslateMin)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you can't guarantee LEAST is supported at this stage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am experimenting with:

  • adding a SelectExpression.SetLimit method
  • replacing the ApplyLimit invocations with SetLimits as they are done from a context that also invokes GenerateLeast (and if that returns null, I can easily fall back to ApplyLimit)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably in the minority but it probably will break EFCore.Jet. I can only have a constant expression for TOP. And since I don't have LIMIT/OFFSET, TOP= LIMIT. And I do have a translation for LEAST

Copy link
Contributor Author

@ranma42 ranma42 Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm... how does that work with non-constant Take()? (parameters are likely the most common, but in theory any kind of expression is allowed there)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably in the minority but it probably will break EFCore.Jet. I can only have a constant expression for TOP. And since I don't have LIMIT/OFFSET, TOP= LIMIT. And I do have a translation for LEAST

Also, note that as long as the code is only LIMITing on constants, you still get a (combined) constant as the new LIMIT.
LEAST/MIN are emitted when combining non-constants (or a constant and a non-constant), so it might be possible to handle it in the same way as you are doing now (I would assume something with row numbers)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I throw a Not Translated error (no row_number or equivlalent in ms access).

But I'm happy as long as a constant expression stays as a plain constant expression

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm... how does that work with non-constant Take()? (parameters are likely the most common, but in theory any kind of expression is allowed there)

Just before I sent it off to Oledb/ODBC, the query is modified to inline the parameter. If it refers to a column, well it just fails

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guys, one of my major gripes with the current query pipeline design is the non-openness of SelectExpression, i.e. code there isn't overridable by providers; this case with integrating LEAST() instead of pushing down is a great example of why that's needed. I do have a plan in mind: I'd like to first make SelectExpression fully mutable, and then extract most/all actual logic out of it, so that SelectExpression becomes a regular, pure, immutable expression just like any other. The logic could go to QueryableMethodTranslatingEV, but some of these operations are probably also needed in postprocessing, so possible a separate service is maybe needed, that would be accessible from everywhere and be overridable by the provider. I hope to make at least some progress in this general direction in 10 (e.g. making SelectExpression fully immutable).

Until then, we're a bit stuck hacking around... @ranma42 what you've done in this draft looks good to me - let me know when this is ready for review.

Fixes dotnet#35383 for trivial cases (constants/literals and repeated expressions).
The new translation makes it work also in Sqlite.
@ranma42 ranma42 force-pushed the simplify-collection-trivial branch from e0e91fd to 08c8569 Compare December 24, 2024 09:10
@ranma42 ranma42 marked this pull request as ready for review December 24, 2024 09:46
@ranma42 ranma42 requested a review from a team as a code owner December 24, 2024 09:46
/// Sets a new limit of the <see cref="SelectExpression" /> to limit the number of rows returned in the result set.
/// </summary>
/// <param name="sqlExpression">An expression representing limit row count.</param>
public void SetLimit(SqlExpression sqlExpression)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add [EntityFrameworkInternal] as while this looks good for now, I don't see this API sticking around (e.g. once SelectExpression is immutable).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I assume that making SelectExpression immutable will change its API quite a bit (which might also be a very good opportunity to re-organize some code around it 😉 )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely... That's high up on my list for 10.

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ranma42!

@@ -299,17 +299,15 @@ public override async Task Skip_navigation_order_by_single_or_default(bool async
"""
SELECT [s0].[Id], [s0].[Name]
FROM [EntityOnes] AS [e]
OUTER APPLY (
SELECT TOP(1) [s].[Id], [s].[Name]
LEFT JOIN (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting change... The LEFT JOIN is obviously better than the OUTER APPLY (and also unlocks SQLite which doesn't support APPLY), but then we get the row number inside... Looks OK in any case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid we have that in multiple places; I might eventually investigate, but seems like something that is already happening :\

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. BTW CROSS/OUTER APPLY vs. ROW_NUMBER() is something that's already tracked in #30450, it's the kind of deep performance investigations we haven't gotten around to yet. I hope that once the basic query architecture refactoring is farther ahead, all this will be much easier to tackle.

@roji roji merged commit 02d37a2 into dotnet:main Dec 24, 2024
7 checks passed
@ranma42 ranma42 deleted the simplify-collection-trivial branch December 24, 2024 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Combine multiple LIMITs
3 participants