Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query compilation error or data corruption for queries with GroupJoin wrapped in a subquery when the subset of columns is being projected #7290

Closed
shaulbehr opened this issue Dec 21, 2016 · 11 comments
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@shaulbehr
Copy link

shaulbehr commented Dec 21, 2016

This is a pretty specific corner case, but I've managed to repro it several times using different tables.

You need two tables, one with a non-mandatory FK to the other, e.g.

public class Salesperson
{
    public Guid SalespersonId {get; set;}
    public string Name {get; set;}
}
public class Customer
{
    public Guid CustomerId {get;set;}
    public Guid? SalespersonId {get;set;}
    public Salesperson Salesperson {get;set;}
}

Now you write a query to get the IDs of the top 100 Customers whose salesperson is named Fred:

var query = db.Customers
    .Where(q=>q.Salesperson.Name == "Fred")
    .Take(100)
    .Select(q=>q.SalespersonId)
    .ToList();

Result: a nasty error with sharp, pointy teeth:

System.ArgumentException : Expression of type 'Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.ValueBufferShaper' cannot be used for parameter of type 'Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.IShaper1[MyNamespace.Salesperson]' of method 'System.Collections.Generic.IEnumerable1[Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor+TransparentIdentifier2[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,System.Collections.Generic.IEnumerable1[MyNamespace.Salesperson]]] _GroupJoin[ValueBuffer,BookTag,Nullable1,TransparentIdentifier2](Microsoft.EntityFrameworkCore.Query.RelationalQueryContext, System.Collections.Generic.IEnumerable1[Microsoft.EntityFrameworkCore.Storage.ValueBuffer], Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.IShaper1[Microsoft.EntityFrameworkCore.Storage.ValueBuffer], Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.IShaper1[MyNamespace.Salesperson], System.Func2[MyNamespace.Salesperson,System.Nullable1[System.Guid]], System.Func3[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,System.Collections.Generic.IEnumerable1[MyNamespace.Salesperson],Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor+TransparentIdentifier2[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,System.Collections.Generic.IEnumerable`1[MyNamespace.Salesperson]]], Microsoft.EntityFrameworkCore.Query.Internal.GroupJoinInclude, Microsoft.EntityFrameworkCore.Query.Internal.GroupJoinInclude)'

This error goes away if you change any of the following:

  • Remove the "Where" clause
  • Remove the "Take" clause
  • Remove the final "Select" clause (i.e. you get back the whole original object)
  • Move the "Select" to before the "Take"
  • Make the SalespersonId mandatory

It's a perfect storm of a corner case, but it did, nonetheless, show up in my code, and it is a bug. I discovered it in EFCore 1.1.0, but even in the latest prerelease of 1.2.0 it is still a problem.

@divega divega added this to the 2.0.0 milestone Jan 6, 2017
@divega divega added the type-bug label Jan 6, 2017
@divega divega assigned maumar and unassigned anpete Jan 6, 2017
@divega divega modified the milestones: 1.1.1, 2.0.0 Jan 6, 2017
@divega
Copy link
Contributor

divega commented Jan 6, 2017

@maumar this could be related to another 1.1.1 issue you are working on.

@maumar
Copy link
Contributor

maumar commented Jan 11, 2017

turns out to be unrelated to #7367
Problem is that we are not marking one of the groupjoin sources for materialization, because its wrapped in a subquery

@maumar maumar changed the title Where/Take/Select corner case Query compilation error for queries with GroupJoin wrapped in a subquery when the subset of columns is being projected Jan 12, 2017
@maumar
Copy link
Contributor

maumar commented Jan 20, 2017

Problem here is that our current mechanism to find the groupjoin sources that needs to be materialized only works on top level query. If the query has subqueries containing groupjoins, they will not be marked for materialization and cause the error.

The simple fix would be to introduce the visitor that walks the entire QueryModel and finds all the necessary query sources, however the problem with that approach is that we would regress perf in some scenarios - we would bring out even more data (which we currently don't do).

Better solution is to be smart about which sources we need - only mark for materialization sources that are part of the client side GroupJoin. If the GroupJoin is eliminated (flattened using SelectMany) - we don't need to materialize the main from clause. This effectively could address #6647. Problem is that there is significant risk associated with doing so, and it might not fit into 1.1.1

@maumar maumar removed this from the 1.1.1 milestone Jan 20, 2017
@maumar
Copy link
Contributor

maumar commented Jan 20, 2017

clearing milestone for re-triage

@maumar maumar added this to the 1.1.1 milestone Jan 27, 2017
@maumar
Copy link
Contributor

maumar commented Jan 27, 2017

Moving to 1.1.1. again since this bug may cause data corruption in some cases

@maumar maumar changed the title Query compilation error for queries with GroupJoin wrapped in a subquery when the subset of columns is being projected Query compilation error or data corruption for queries with GroupJoin wrapped in a subquery when the subset of columns is being projected Jan 27, 2017
maumar added a commit that referenced this issue Jan 28, 2017
… with GroupJoin wrapped in a subquery when the subset of columns is being projected

Problem was that we were incorrectly reasoning about which query sources need to be materialized in the GroupJoin scenario. Previously, when we encountered GroupJoin, we marked its main from clause and the joined clause for materialization, but we would not do that recursively. This causes problems if the GroupJoin (e.g. optional navigation) is in a subquery - we would not mark it for materialization. This would lead to compilation errors in some cases, and in others (when there was a client-evaluation component) to data corruption.

Naive implementation of the fix - marking every query source of the GroupJoin for materialization would result in performance regression (we would start to materialize more unnecessary columns - in many scenarios they are not needed). Therefore we also implemented an enhancement to the logic that chooses whether query source (only MainFromClause for now) needs to be materialized.

We don't need to materialize MainFromClause for the following case:

(
  (if we encounter GroupJoinClause followed by AdditionalFromClause, which references the group join and contains subquery with a DefaultIfEmpty - basically GroupJoin-SelectMany-DefaultIfEmpty pattern that can be flattened to LOJ and client GroupJoin will be removed)
  AND
  (reference to the group is NOT present anywhere else in the query model - if the group is referenced then we need to preserve client-side GroupJoin operation, which requires MainFromClause to be materialized)
)
AND
(
  (reference to MainFromClause is NOT projected in the selector - we need to materialize because it might be needed in the parent query model)
  OR
  (queryModel is wrapped in a result operator that returns a scalar - e.g. Count, even if we project MainFromClause, the wrapping result operator makes it inaccessible from the parent query, meaning that it's not needed after all)
)

Second part of the fix is that in LiftSubquery, when we update query sources, we would sometimes try to rewrite EntityShapers into ValueBufferShapers (if the lifted query itself is not marked for materialization). This however is not safe to do in case of GroupJoin. Fix is to only perform this conversion shapers inside ShapedQuery method.
@maumar maumar removed this from the 1.1.1 milestone Jan 30, 2017
@rowanmiller rowanmiller added this to the 1.1.1 milestone Jan 30, 2017
@rowanmiller
Copy link
Contributor

We are OK with taking a minimal fix to this in 1.1.1. A more complete fix (better perf, cleaner SQL, etc.) will go into 2.0.

maumar added a commit that referenced this issue Jan 30, 2017
… with GroupJoin wrapped in a subquery when the subset of columns is being projected

Problem was that we were incorrectly reasoning about which query sources need to be materialized in the GroupJoin scenario. Previously, when we encountered GroupJoin, we marked its main from clause and the joined clause for materialization, but we would not do that recursively. This causes problems if the GroupJoin (e.g. optional navigation) is in a subquery - we would not mark it for materialization. This would lead to compilation errors in some cases, and in others (when there was a client-evaluation component) to data corruption.

Fix is to use a QueryModelVisitor to walk the entire tree which allows all necessary query sources related to GroupJoin to be marked for materialization. This incurs a performance hit for some scenarios (groupjoin/optional navigation inside a subquery) because now we bring more columns from the database.

Second part of the fix is that in LiftSubquery, when we update query sources, we would sometimes try to rewrite EntityShapers into ValueBufferShapers (if the lifted query itself is not marked for materialization). This however is not safe to do in case of GroupJoin. Fix is to only perform this conversion shapers inside ShapedQuery method.
@maumar
Copy link
Contributor

maumar commented Jan 31, 2017

@divega
Justification to include this in 1.1.1 release:

Impact: High. Issue was reported by a customer and is a regression introduced in 1.1.0. It can also cause data corruption in specific cases.

Risk: Low/Medium. We already have the same logic that applied to GroupJoin clauses, but we were not doing it recursively. However, the fix incurs a performance hit - we will now yield more unnecessary data from database and materialize them as entities for the affected scenarios (groupjoins in a subquery). For simpler cases its just extra columns, more complicated ones can turn into N+1 queries.

maumar added a commit that referenced this issue Feb 4, 2017
… with GroupJoin wrapped in a subquery when the subset of columns is being projected

Problem was that we were incorrectly reasoning about which query sources need to be materialized in the GroupJoin scenario. Previously, when we encountered GroupJoin, we marked its main from clause and the joined clause for materialization, but we would not do that recursively. This causes problems if the GroupJoin (e.g. optional navigation) is in a subquery - we would not mark it for materialization. This would lead to compilation errors in some cases, and in others (when there was a client-evaluation component) to data corruption.

Fix is to use a QueryModelVisitor to walk the entire tree which allows all necessary query sources related to GroupJoin to be marked for materialization. This incurs a performance hit for some scenarios (groupjoin/optional navigation inside a subquery) because now we bring more columns from the database.

Second part of the fix is that in LiftSubquery, when we update query sources, we would sometimes try to rewrite EntityShapers into ValueBufferShapers (if the lifted query itself is not marked for materialization). This however is not safe to do in case of GroupJoin. Fix is to only perform this conversion shapers inside ShapedQuery method.
maumar added a commit that referenced this issue Feb 4, 2017
… with GroupJoin wrapped in a subquery when the subset of columns is being projected

Problem was that we were incorrectly reasoning about which query sources need to be materialized in the GroupJoin scenario. Previously, when we encountered GroupJoin, we marked its main from clause and the joined clause for materialization, but we would not do that recursively. This causes problems if the GroupJoin (e.g. optional navigation) is in a subquery - we would not mark it for materialization. This would lead to compilation errors in some cases, and in others (when there was a client-evaluation component) to data corruption.

Fix is to use a QueryModelVisitor to walk the entire tree which allows all necessary query sources related to GroupJoin to be marked for materialization. This incurs a performance hit for some scenarios (groupjoin/optional navigation inside a subquery) because now we bring more columns from the database.

Second part of the fix is that in LiftSubquery, when we update query sources, we would sometimes try to rewrite EntityShapers into ValueBufferShapers (if the lifted query itself is not marked for materialization). This however is not safe to do in case of GroupJoin. Fix is to only perform this conversion shapers inside ShapedQuery method.
@maumar
Copy link
Contributor

maumar commented Feb 4, 2017

Fixed in c99151d

@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Feb 4, 2017
@shaulbehr
Copy link
Author

Great, thanks! Any idea when 1.1.1 will be released?

@maumar
Copy link
Contributor

maumar commented Feb 7, 2017

We don't have exact dates yet, but most likely sometime this month

@Eilon
Copy link
Member

Eilon commented Feb 8, 2017

This patch bug is approved. Please use the normal code review process w/ a PR and make sure the fix is in the correct branch, then close the bug and mark it as done.

@maumar maumar closed this as completed Feb 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

6 participants