-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query: Translate aggregate over grouping element in separate pass #27931
Merged
+1,659
−1,011
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
smitpatel
force-pushed
the
smit/groupby1
branch
2 times, most recently
from
May 2, 2022 22:20
f06a130
to
a29f7c6
Compare
smitpatel
commented
May 3, 2022
maumar
reviewed
May 4, 2022
src/EFCore.Relational/Query/SqlExpressions/SelectExpression.Helper.cs
Outdated
Show resolved
Hide resolved
maumar
reviewed
May 4, 2022
src/EFCore.Relational/Query/RelationalSqlTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
roji
reviewed
May 4, 2022
roji
reviewed
May 4, 2022
src/EFCore.Relational/Query/SqlExpressions/SqlEnumerableExpression.cs
Outdated
Show resolved
Hide resolved
maumar
approved these changes
May 4, 2022
Design: - Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings. - Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`. - To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately. - An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings. - Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins. - To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate. - For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases. - With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable. - Erase client projection when applying aggregate operation over GroupBy result. - When processing result selector in GroupBy use the updated key selector if the select expression was pushed down. Resolves #27132 Resolves #27266 Resolves #27433 Resolves #23601 Resolves #27721 Resolves #27796 Resolves #27801 Resolves #19683 Relates to #22957
smitpatel
added a commit
that referenced
this pull request
May 6, 2022
This iterates over design in #27931 - Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database - Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering - Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result - Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions Resolves #27948 Resolves #27935
smitpatel
added a commit
that referenced
this pull request
May 6, 2022
This iterates over design in #27931 - Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database - Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering - Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result - Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions Resolves #27948 Resolves #27935
Is this expected to be released to 7.0 prview5? |
smitpatel
added a commit
that referenced
this pull request
May 10, 2022
This iterates over design in #27931 - Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database - Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering - Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result - Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions Resolves #27948 Resolves #27935
smitpatel
added a commit
that referenced
this pull request
May 10, 2022
This iterates over design in #27931 - Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database - Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering - Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result - Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions Resolves #27948 Resolves #27935
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Design:
SqlEnumerableExpression
- a holder class which indicates theSqlExpression
is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about ifDistinct
is applied over grouping or if there are any orderings.DistinctExpression
has been removed. The token while used to denoteDistinct
over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it withSqlEnumerableExpression
.GroupByShaperExpression
contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only inQueryExpression.ShaperExpression
or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.GroupByAggregateChainProcessor
) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it intoSqlEnumerableExpression
which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.IMethodCallTranslator
or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683
Relates to #22957