Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor RequiresMaterializationExpressionVisitor #7695

Closed
wants to merge 1 commit into from

Conversation

tuespetre
Copy link
Contributor

This is the code at the heart of the efforts I made in #7543 and #7650.

I don't know why it didn't occur to me to make a branch that had just this in it without removing the 'training wheels' in the QueryCompilationContext (the code that brute-force flags all group joins as requiring materialization.) It actually went really easy, I just had to make a tiny little patch to SelectExpression and update some tests' SQL output (one which has a MUCH better query now, another which doesn't pull unnecessary columns anymore, and the rest which just have smaller resulting SQL text.)

Whenever the team is ready to work on the things I accomplished in those other two PRs, all that would need to be done is:

  • Uncomment lines 416 and 417 in RequiresMaterializationExpressionVisitor.cs
  • Remove the code in QueryCompilationContext that forces all group joins to be materialized
  • Have fun

@tuespetre tuespetre force-pushed the requires-materialization branch 2 times, most recently from 2b2ba5c to 62c9810 Compare February 23, 2017 06:55
@tuespetre tuespetre mentioned this pull request Feb 23, 2017
@maumar maumar self-assigned this Feb 23, 2017
@@ -170,6 +170,11 @@ public virtual void AddTable([NotNull] TableExpressionBase tableExpression)
{
Check.NotNull(tableExpression, nameof(tableExpression));

if (_tables.Count == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be needed after you rebase on current dev

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that may be the case, I saw @smitpatel's other change (ProjectStarTable)

@@ -188,11 +188,20 @@ protected override Expression VisitExtension(Expression node)
if (newCaller != nullConditionalExpression.Caller
&& newCaller.Type == typeof(ValueBuffer))
{
var newAccessOperation = Visit(nullConditionalExpression.AccessOperation);
var convertedAccessOperation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the conversion after the visit and only if needed

});
if (node.Expression.Type.IsGrouping() && node.Member.Name == "Key")
{
var groupResultOperator
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contents of this if block should be simplified to:

if (node.Expression is QuerySourceReferenceExpression qsre)
{
    DemoteQuerySource(qsre.ReferencedQuerySource);
}

{
AddQuerySource(underlyingQuerySource);
var parentStatistics = _querySourceReferences[parentQuerySource];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

}
else
{
_querySourceReferences[querySource] += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ++


if (_querySourceReferences.ContainsKey(querySource))
{
_querySourceReferences[querySource] -= 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: --

{
return newAccessOperation;
var test = ValueBufferNullComparisonCheck(newCaller);
var ifTrue = newAccessOperation;
Copy link
Contributor

@maumar maumar Feb 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var ifTrue = newAccessOperation.Type != nullConditionalExpression.Type 
  ? Expression.Convert(newAccessOperation, nullConditionalExpression.Type) 
  : nullConditionalExpression;

@@ -10,6 +10,9 @@
using Remotion.Linq.Clauses;
using Remotion.Linq.Clauses.Expressions;
using Remotion.Linq.Clauses.ResultOperators;
using Remotion.Linq.Clauses.StreamedData;
using System;
using Microsoft.EntityFrameworkCore.Query.Expressions.Internal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sort usings

@@ -107,18 +107,6 @@ public void Query_logs_DatabaseErrorLogState_during_single_async()
Query_logs_DatabaseErrorLogState(c => c.Blogs.FirstOrDefaultAsync().Wait());
}

[Fact]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these tests are removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they no longer materialize entities therefore exception is no longer thrown

var ifTrue = newAccessOperation;
var ifFalse = Expression.Default(convertedAccessOperation.Type);

return Expression.Condition(test, ifTrue, ifFalse);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For which scenarios simply returning visited AccessOperation (after we adjust for potential type difference) is not enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CompexNavigationsInMemoryTest, GroupJoin_on_right_side_being_a_subquery


if (isSubQuery && finalResultOperator is GroupResultOperator groupResultOperator)
{
//DemoteQuerySource(referencedQuerySource);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GroupResultOperator cannot be demoted until GROUP BY is being implemented. Those are the two lines I mentioned in my initial comments for this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the same in code comments? PR comments are disconnected with source (and even commit)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing

@maumar
Copy link
Contributor

maumar commented Feb 24, 2017

I hit a regression in a couple tests: ComplexNavigationsQuerySqlServerTest.Null_protection_logic_work_for_inner_key_access_of_manually_created_GroupJoin2
and
ComplexNavigationsQuerySqlServerTest.GroupJoin_on_complex_subquery_and_set_operation_on_grouping_but_nothing_from_grouping_is_projected

for the first one, with the following query model:

from Level1 l1 in DbSet<Level1>
join Level2 l2 in 
    from Level1 l1 in DbSet<Level1>
    join Level2 l1.OneToOne_Required_FK in DbSet<Level2>
    on Property(l1, "Id") equals Property(l1.OneToOne_Required_FK, "Level1_Required_Id") into l1.OneToOne_Required_FK_group
    from Level2 l1.OneToOne_Required_FK in 
        (from Level2 l1.OneToOne_Required_FK_groupItem in l1.OneToOne_Required_FK_group
        select l1.OneToOne_Required_FK_groupItem)
        .DefaultIfEmpty()
    select l1.OneToOne_Required_FK
on (Nullable<int>) l1.Id equals (?Property([l2], "Level1_Optional_Id")?) into l2s
select l1

the new algorithm marks:
from Level1 l1 in value(EntityQueryable`1[Level1]) and
join Level2 l1.OneToOne_Required_FK in value(EntityQueryable`1[Level2]) on Property([l1], "Id") equals Property([l1.OneToOne_Required_FK], "Level1_Required_Id")

for materialization.

This is incorrect, only Level1 should be marked as the result of subquery in not present in the final projection (that was the behavior before the change). The problem doesn't surface currently because the second query source gets marked for materialization anyway in QueryCompilationContext, but once we lift that excessive materialization from QCC this scenario will be blocked from being optimized

@tuespetre
Copy link
Contributor Author

That is probably because I was going off of the group join clause as the query source rather than its join clause to determine what required materialization. I will adjust for that, thanks for the example

@tuespetre tuespetre force-pushed the requires-materialization branch from 27eda26 to a8ca21f Compare February 24, 2017 02:46
@tuespetre
Copy link
Contributor Author

🆙📅

_queryModelStack.Peek().SelectClause.Selector,
referencedQuerySource);

if (traced == null || finalResultOperator is DefaultIfEmptyResultOperator)
Copy link
Contributor Author

@tuespetre tuespetre Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maumar This seems to be what was needed (the || finalResultOperator is DefaultIfEmptyResultOperator)

Copy link
Contributor Author

@tuespetre tuespetre Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Disregard that. 🐓 I tested that out on my branch that has all of the optimizations and it marked several things incorrectly. This change is right. (I tested this change on that branch as well and it improved both of the queries you pointed out @maumar.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, ran the updated code alongside my changes and the queries are better indeed

{
var choiceResultOperator = (ChoiceResultOperatorBase)finalResultOperator;

if (isSubQuery && choiceResultOperator.ReturnDefaultWhenEmpty)
Copy link
Contributor Author

@tuespetre tuespetre Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check (choiceResultOperator.ReturnDefaultWhenEmpty) really needs to be here, so this block can be collapsed with the preceding block. I've tested this idea on my branch with all of the optimizations and everything stayed the same test-wise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we are missing test coverage, there should definitely be a difference between First and FirstOrDefault when used in a subquery. FoD we can sometimes translate but first will always result in client evaluation. Wrt LastOrDefault, we can only translate it if the query also has orderby (we change the asc <-> desc and slap FirstOrDefault instead). Perhaps we can make this change as part of QueryModel optimization (before this step), so every time you encounter LoD here, you know it needs client eval.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not those result operators need to be client evaluated doesn't affect whether the underlying query source needing materialization or not though. customers.Select(c => c.Name).Single() should not materialize the whole customer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smitpatel yeah, but we can move that earlier

Copy link
Contributor

@maumar maumar Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuespetre this is because in your example the subquery that Single is wrapped around doesn't need the c materialized on its own (it's not projected by that particular subquery). If the query model was instead:

from c in customers
select
{(from c2 in customers select c2).Single()}.Name

(that thing will be optimized by relinq, but lets assume its not)

then the Single will force materialization on the inner subquery, and since result of that subquery is c2, its query source should be marked.

What is happening in the code (in AdjustForResultOperators) is, we peek into the parent selector, and if that selector can't be traced back to the result of inner query, we demote the result of that inner query. This is ok for most cases, but with Single/First etc we still need the result of that inner subquery materialized.

However one could argue that this is not a concern of the RequiresMaterialization visitor, since it shouldn't care which operators can be translated.

Copy link
Contributor Author

@tuespetre tuespetre Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maumar in the new example you gave, the customer still does not need to be materialized. But if it were to select a NotMapped property instead of Name, or call some client method on it instead of accessing the Name member, it should be materialized.

IOW: Single().Name can translate to SELECT TOP (2) c.Name FROM ..., no need for materialization. SQL example, but should hold true for any provider.

Copy link
Contributor

@maumar maumar Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuespetre yeah that's because relinq optimized the case. I gave this (simplified) case with a caveat that the QM will be processed in this form. A real life example would be: (with the complex navigations model - using required navs so that QueryCompilationContext doesn't compensate)

ctx.LevelTwo.Select(l2 => ctx.LevelTwo.First().OneToOne_Required_FK_Inverse.Name)

this gets converted into the following QM:

from Level2 l1 in DbSet<Level2>
select 
    (from Level2 <generated>_1 in DbSet<Level2>
    join Level1 <generated>_1.OneToOne_Required_FK_Inverse in DbSet<Level1>
    on Property(<generated>_1, "Level1_Required_Id") equals Property(<generated>_1.OneToOne_Required_FK_Inverse, "Id")
    select <generated>_1.OneToOne_Required_FK_Inverse)
    .First().Name

now, this will be translated into the following query plan:

TRACKED: False
(QueryContext queryContext) => IEnumerable<string> _Select(
    source: IEnumerable<ValueBuffer> _ShapedQuery(
        queryContext: queryContext, 
        shaperCommandContext: SelectExpression: 
            SELECT 1
            FROM [Level2] AS [l1]
        , 
        shaper: ValueBufferShaper
    )
    , 
    selector: (ValueBuffer l1) => Level1 First(
        source: IEnumerable<Level1> _Select(
            source: IEnumerable<TransparentIdentifier<ValueBuffer, Level1>> _ShapedQuery(
                queryContext: queryContext, 
                shaperCommandContext: SelectExpression: 
                    SELECT TOP(1) [l0].[Id], [l0].[Date], [l0].[Name], [l0].[OneToMany_Optional_Self_InverseId], [l0].[OneToMany_Required_Self_InverseId], [l0].[OneToOne_Optional_SelfId]
                    FROM [Level2] AS [l]
                    INNER JOIN [Level1] AS [l0] ON [l].[Level1_Required_Id] = [l0].[Id]
                , 
                shaper: CompositeShaper`3
            )
            , 
            selector: (TransparentIdentifier<ValueBuffer, Level1> t0) => t0.Inner
        )
    )
    .Name
)

Note that the inner query, inside First() is materialized into entity, so everything works fine.
However, if you remove the && choiceResultOperator.ReturnDefaultWhenEmpty from the line 417 this query will now not be marked for materialization and will fail to compile.

From the perspective on QM translation to query plan, First, Single, SingleOrDefault are indistinguishable from a simple client method that takes a collection and returns single element.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case though, it should not fail to compile because the MemberAccessBindingExpressionVisitor will check if the type is ValueBuffer and use BindMemberToValueBuffer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuespetre I think you are right. Currently we can't properly bind in this case, but that's a bug in the compilation rather than the RequiresMaterializationExpressionVisitor

@maumar
Copy link
Contributor

maumar commented Feb 25, 2017

@tuespetre, I found some more issues. Those are not regressions (current code is also wrong) but perhaps you might want to look into it.

Tests in question are:
Collection_select_nav_prop_single_or_default_then_nav_prop_nested
Collection_select_nav_prop_first_or_default_then_nav_prop_nested
Collection_select_nav_prop_first_or_default_then_nav_prop_nested_with_orderby

a simplified version of the query:

ctx.Customers.Select(c => ctx.Orders.SingleOrDefault(o => o.OrderID == 10643).Customer.City)

and a variation (using client-side method):

ctx.Customers.Select(c => ClientFirstOrDefault(ctx.Orders).Customer.City));

        public static T ClientFirstOrDefault<T>(IQueryable<T> source)
        {
            return source.FirstOrDefault();
        }

query model:

from Customer c in DbSet<Customer>
select 
    (from Order o in DbSet<Order>
    join Customer o.Customer in DbSet<Customer>
    on Property(o, "CustomerID") equals Property(o.Customer, "CustomerID") into o.Customer_group
    from Customer o.Customer in 
        (from Customer o.Customer_groupItem in o.Customer_group
        select o.Customer_groupItem)
        .DefaultIfEmpty()
    where o.OrderID == 10643
    select o.Customer)
    .SingleOrDefault().City

so o.Customer should be marked for materialization, because it's behind SingleOrDefault (or a client method). Like with the previous example, it's not an issue currently because we blanket-materialize all joins, but will be a problem in the future.

@maumar
Copy link
Contributor

maumar commented Feb 25, 2017

for the client-side variant, the query model is:

from Customer c in DbSet<Customer>
select Order ClientFirstOrDefault(
    source: DbSet<Order>
)
.Customer.City

and orders don't get marked for materialization either


return node;
private void AddResultOperators(QueryModel queryModel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to pass querymodel here, also this could be more general-purpose, e.g.

PromoteQuerySources(IEnumerable<IQuerySource> querySources)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really the method could go away altogether, just needs to be:

if (queryModel.ResultOperators.LastOrDefault() is IQuerySource querySource) { PromoteQuerySource(querySource); }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, foreach (var querySourceResultOperator in queryModel.ResultOperators.OfType<IQuerySource>()). I mean realistically only the GroupResultOperator is an IQuerySource (I think) but for the sake of correctness? 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlining sounds good.

@tuespetre
Copy link
Contributor Author

@maumar on your examples, with the SingleOrDefault, only the one field is selected so there is no materialization. The ClientFirstOrDefault is definitely an issue though.

@tuespetre
Copy link
Contributor Author

tuespetre commented Feb 25, 2017

@maumar I've investigated the ClientFirstOrDefault example further.

  • First, an exception is thrown when evaluating the query, because EntityQueryableExpressionVisitor tries to make a SelectExpression for it but the querySource is null because it is in the SelectClause. The EntityQueryableExpressionVisitor should skip over EntityQueryable expressions in such cases and leave them for evaluation during client evaluation of the projection. But if it were changed to ClientFirstOrDefault(from o in context.Orders select o), that would be different story because all of a sudden it can be assessed as a subquery instead of just an EntityQueryable.
  • Second, and related to that, during the first run through of determining the query sources requiring materialization (for the parent query) we would not expect to see the orders marked as requiring materialization but on the subsequent pass through RequiresMaterializationExpressionVisitor we should expect to see them marked so (happening when the projection is client-evaluated.)
  • Third, about how I mentioned ClientFirstOrDefault(from o in context.Orders select o), I've looked at that and there is indeed a problem there, because the orders are not marked when they should be.

EDIT

  • Fourth, I fixed the issue with the third point, but the query is no good anyways because of A second operation started on this context before a previous operation completed.

@tuespetre tuespetre force-pushed the requires-materialization branch from a8ca21f to b7a7e13 Compare February 25, 2017 20:35
&& (_model.FindEntityType(leftSubQueryExpression.Type) != null))
private Expression VisitBinaryOperand(Expression operand, ExpressionType comparison)
{
switch (comparison)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert this to a single if statement instead, there is only one case + default

@@ -176,91 +218,267 @@ protected override Expression VisitBinary(BinaryExpression node)
/// </summary>
protected override Expression VisitSubQuery(SubQueryExpression expression)
{
var oldParentSelector = _selector;
var oldQueryModel = _queryModel;
_queryModelStack.Push(expression.QueryModel);
Copy link
Contributor

@maumar maumar Feb 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need to add group result operators here also? (like we do for the top level query)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do that on the parent query because it absolutely has to be, but for subqueries, any group result operators should be promoted/demoted as necessary via HandleUnderlyingQuerySources

@tuespetre
Copy link
Contributor Author

🆙📅


var finalResultOperator = queryModel.ResultOperators.LastOrDefault();

var unreachableFromParentSelector =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compute this after the GroupBy check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done ✔️

@tuespetre tuespetre force-pushed the requires-materialization branch from 80129f5 to 5586c0f Compare February 27, 2017 18:50
@ksmithRenweb
Copy link

@tuespetre @maumar Thank you for working on this. When it gets released it will solve some major performance issues in our app. (It should reduce one very frequent call from 1 second to 30 ms. Blame a table with 300 columns being materialized, when we only need 3 of them in the query.)

// that reference an entity query source does not suggest that
// materialization of that entity type may be required. This is true
// whether in a join predicate, where predicate, selector, etc.
if (operand is QuerySourceReferenceExpression && isEntityTypeExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update comment that QSRE referencing entity type is re-written into key equality comparison.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure


var leftSubQueryExpression = node.Left as SubQueryExpression;
return node;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this method return same node instead of the visited node like other methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the version before my changes did the same. This visitor doesn't actually change anything, but I guess it wouldn't hurt to return node.Update(newLeft, node.Conversion, newRight).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be consistent about it. Either we return same expression in everything or we send updated. Change every overridden method to return the same expression which came via parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm changing to return node.Update(newLeft, node.Conversion, newRight)


var rightSubQueryExpression = node.Right as SubQueryExpression;
if (operand is SubQueryExpression subQueryExpression && isEntityTypeExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the different when operand is subquery & its entitytype expression? This subquery expression will be revisited in VisitSubquery when you call Visit(operand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous version did the same thing, it should be the same thing as the QuerySourceReferenceExpression case -- for key equality comparisons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comments for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you try removing this whole part and see how VisitSubquery works out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test 'Microsoft.EntityFrameworkCore.SqlServer.FunctionalTests.RowNumberPagingTest.Where_query_composition_is_not_null' failed:
	System.InvalidOperationException : Reference equality is not defined for the types 'Microsoft.EntityFrameworkCore.Storage.ValueBuffer' and 'System.Object'.

(Posted under the wrong comment at first!)


_queryModel.TransformExpressions(Visit);
AdjustForResultOperators(expression.QueryModel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is guidance on using AdjustForResultOperators?
At very top level, we add result operator before visiting query model and then adjust result operators after visiting qm. Here we did not add result operators. Few lines above we also visiting query model but we are not doing anything with result operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The top-level IQuerySource result operators are added that way because it's the top level and they won't get promoted the same way they do for subqueries (via HandleUnderlyingQuerySources.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comments for this.

{
AddQuerySource(underlyingQuerySource);
foreach (var sourceExpression in resultSetOperators)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be more than one sourceExpression matching current expression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be, the old version used a foreach loop too, a more succinct version would be:

                    if (resultSetOperators.Any(r => r.Equals(expression)) 
                        && _querySourceReferences[parentQuerySource] > 0)
                    {
                        PromoteQuerySource(referencedQuerySource);
                    }

}

var elementSelectorQuerySource
= (groupResultOperator.ElementSelector as QuerySourceReferenceExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being repeated multiple times. Make a method which try safe cast to QSRE and returns referenced query source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was written before the VS2017 conversion, and now the language makes it so much nicer 😉

if (groupResultOperator.KeySelector is QuerySourceReferenceExpression keySelectorQsre)
{
    action(keySelectorQsre.ReferencedQuerySource);
}


var elementSelectorSubQueryQuerySource
= ((groupResultOperator.ElementSelector as SubQueryExpression)
?.QueryModel.SelectClause.Selector as QuerySourceReferenceExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, function to get selector after safe casting to Subqueryexpression

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a change that will make this more terse/compact but not quite enough to justify a method, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is being repeated at multiple places in same pattern. It can be simple method TryGetSelectorFromSubquery

}

var unreachableFromParentSelector =
isSubQuery && new QuerySourceTracingExpressionVisitor()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new QuerySourceTracingExpressionVisitor() [](start = 31, length = 41)

This can (or should) be stored in static in this class so that we don't initialize it all the time. The same instance can serve all the invocations of the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not thread-safe so static is off limits. Caching a single instance per RequiresMaterializationExpressionVisitor may work.

return concatOperator.Source2;
// If not a subquery or if reachable from the parent selector
// we would not want to fall through to one of the next blocks.
if (isSubQuery && unreachableFromParentSelector)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unreachableFromParentSelector asserts isSubQuery already.

// we would not want to fall through to one of the next blocks.
if (isSubQuery && unreachableFromParentSelector)
{
DemoteQuerySourceWithUnderlying(referencedQuerySource);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DemoteQuerySourceWithUnderlying [](start = 20, length = 31)

rename function to something better. @maumar suggest a name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name either, but can't come up with anything much better. Maybe DemoteQuerySourceAndItsFromClause? Method is not very complicated, so having a bad name is not a big deal. I'm fine with either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DemoteQuerySourceAndUnderlyingFromClause?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuespetre sounds good to me

/// This API supports the Entity Framework Core infrastructure and is not intended to be used
/// directly from your code. This API may change or be removed in future releases.
/// </summary>
protected override Expression VisitExtension(Expression node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maumar to verify where and how this is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smitpatel it's needed because we are counting references and visiting each of the Caller and the NullableCaller and the AccessOperation would throw off the count (as I found when working on materialization changes.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can do a check similar to the one in binary expression here - if Caller/NullableCaller is qsre of type entity, we don't promote, but otherwise we do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need VisitMember to ensure proper counting though? What if the accessed property is NotMapped

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would visit the AccessExpression, just it's done currently, but on top of that peek into the Callers. They should be just entity qsres for all the real cases, so for all those cases, the code will behave the same way. But we would also cover our bases for the weird cases (e.g. when the user manually creates NullConditionalExpression)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we possibly revisit that later, if it becomes an issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we possibly revisit that later, if it becomes an issue

Are we removing this override fully then? If yes, then its fine to revisit later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smitpatel @maumar I'm not sure I understand.

If we just remove this override altogether, the base implementation skips visiting the NullConditionalExpression entirely. The override here is at least an improvement and makes sure we count the references as we should; we can take it out but then we may not be catching things we should (like NotMapped properties.)

And what about the other places that only work with AccessOperation, like ExpressionEqualityComparer and SqlTranslatingExpressionVisitor? That's why I'm asking if it can be revisited later: because places like that aren't doing anything special with Caller and NullableCaller, and if we're going to put that under a microscope here, shouldn't it be addressed in the other places too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the base implementation skips visiting the NullConditionalExpression entirely.

Wasn't aware of that. I assumed it would call VisitChildren in class. Since it is skipping over, lets keep this. @maumar - to file issue on usage of NullConditionalExpression (based on our discussion around caller) and link it to #7520

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we were under the assumptions that the VisitChildren would get called, which would visit all the nodes. However thats not the case (we special case NullConditional in the base visitor), so this is fine as it is. I will try to come up with some tests, see if I can break it, but this can be done later. Let's leave this method as is for now.

@smitpatel
Copy link
Contributor

Loos good 😄

@tuespetre tuespetre force-pushed the requires-materialization branch from 5586c0f to 276ebd3 Compare February 28, 2017 20:35
{
_querySources[querySourceReferenceExpression.ReferencedQuerySource]--;
action(keySelectorQsre.ReferencedQuerySource);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smitpatel It's only this block using this exact pattern, and only two times next to each other, just doesn't seem like enough to warrant a whole separate method...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is being used many times like line 156 or 237. I suppose this thing is being used at multiple places even outside of this file and a good candidate for extension method. You can leave it as is though. I will update it in future refactoring.


In reply to: 103548170 [](ancestors = 103548170)

@tuespetre
Copy link
Contributor Author

🆙📅 added more comments, made a few things more terse.

Copy link
Contributor

@maumar maumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: phew! 95 comments, I'm glad we are done here ;)

@maumar
Copy link
Contributor

maumar commented Feb 28, 2017

@tuespetre please squash-rebase on current dev and I will merge it

Improved the algorithm of RequiresMaterializationExpressionVisitor to
provide better detection of which query sources need materialized.
Currently the QueryCompilationContext forces all group joins to be
materialized, but these changes produce the correct results for when the
time comes to remove the forcing. These changes also prepare for the
possibility of providers translating the GroupBy result operator.
@tuespetre tuespetre force-pushed the requires-materialization branch from 276ebd3 to c6ae0a8 Compare February 28, 2017 22:30
@tuespetre
Copy link
Contributor Author

@maumar done 🏁

@maumar
Copy link
Contributor

maumar commented Feb 28, 2017

@tuespetre merged in 67b4d22 Thank you for the contribution!

@maumar maumar closed this Feb 28, 2017
@tuespetre tuespetre deleted the requires-materialization branch March 1, 2017 03:11
@tuespetre
Copy link
Contributor Author

Awesome, thanks @maumar! Question now, is anyone on the team already working on the group join materialization stuff?

@maumar
Copy link
Contributor

maumar commented Mar 1, 2017

I'm almost done with a partial fix (one last issue left to figure out, I hope) that only requires materialization for queries when we actually introduce client GroupJoin. This fixes optional navs case where we can eliminate the client GroupJoin because the LOJ is flattened with SelectMany.

@maumar
Copy link
Contributor

maumar commented Mar 1, 2017

Also a while back @anpete was working on a more general solution but hit some roadblocks. IIRC the problematic case was when there was nothing from the outer in the final result, so you couldn't figure out how to split the groups. We put that solution on hold for now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants