Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include() statement is incorrectly applying where clause to wrong instance of dataset when combind with select(... new ...) syntax. #647

Closed
Robert-Laverick opened this issue Oct 10, 2018 · 3 comments

Comments

@Robert-Laverick
Copy link

If I have two entities with a one to many relationship between them, and try and group by child records of a given type the navigation property on the parent will ONLY have the children that match the where if you try and select them into a new anonymous object.

In the minimal example I've put I've called them a Document and a MetaInfo record, where MetaInfo has a type and a Document has many MetaInfo records. I'm trying to find pairs of MetaInfo with the name "Author" and the Document they are associated with.

Here's how I first wrote the LINQ to do the select.

ctx.Configuration.LazyLoadingEnabled = false;

var DocsByCreator = ctx.Documents
    .Include(d => d.MetaInfo) // Load all the metaInfo for each object
    .SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
        .Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
    .ToList(); // Actualize the collection

This results in a List of pairs of "Author" MetaInfo objects paired with the Document they are related to. However the Document object's MetaInfo collection is limited to ONLY the MetaInfo objects which match the inner "Author" clause.

I've tried this a few different ways around, after I'd narrowed down the problem I expected this one to work, but it demonstrates the same behaviour.

ctx.Configuration.LazyLoadingEnabled = false;

var DocsByCreator = ctx.Meta
    .Where(m => m.Name == "Author")
    .Include(m => m.Document.MetaInfo) // Load all the metaInfo for Document
    .Select(m => new { Doc = m.Document, Creator = m })
    .ToList(); // Actualize the collection

It turns out that moving either the Select(...new...), or the Where() clause after the data is actualised using ToList() causes the Document to have a complete MetaInfo collection.

I've put together a test project with a bunch of different ways to try and produce the same results here:
https://github.com/Robert-Laverick/EFIncludeIssue

I should note that if you enable Lazy Loading then data returned behaves correctly, but only as it's actually doing lazy loads for all the MetaInfo calls down from the document because it knows the Include hasn't populated the value, which is how I ran into the problem (finding 900 extra queries going on when I tried to access what I thought was a set of included properties on an object was a real performance issue) which is why I'm explicitly turning it off in this instance.

I've worked around this in the production instance by simply returning the equivalent of the document.ID and then having a second where contains based LINQ statement that gets the object with the includes and then manually build the result set I want, but I'm not able to do away with this kind of structure in my real application as there are before and after stages to the process that I'd rather do as part of the EF call.

I feel like this is some kind of optimisation going on when parsing the result set not realising that Where restricted set of data isn't sufficient to fulfil the match the include clause, and another instance of the table needs to be joined to the SQL, but I'm not familiar with the internals of EF so I don't know where to start trying to trace this.

Further technical details

EF version: 6.2.0
Database Provider: EntityFramework.SqlServer
Operating system: Windows 10
IDE: Visual Studio 2017 15.8.6

@ajcvickers
Copy link
Member

@Robert-Laverick The semantics of Include are that when returning instances of the original query shape, then related instances as specified in the Include will also be returned. However, if the query makes a projection such that it is returning a different shape, then the Includes are no longer applied. The behavior is currently the same in EF Core (although we are considering changing this for 3.0) and there is a more in-depth discussion here: dotnet/efcore#12446

@Robert-Laverick
Copy link
Author

Interestingly I've just been testing the EF Core version, and while it's true that the first case behaves similarly (it in fact leaves the relevant Collection as null rather than having a partially filled IEnumerable as EF6 does) The second instance I gave does return the full set of MetaInfo for the retrieved documents despite the where clause on Name. I'm guessing because it's not actually the same ICollection from a Document in both cases?

If I'm following the other threads correctly this also implies that the planned fix is to have global rules API to set certain values to always be loaded when a parent is? And is that really better than some kind of fix for Include() either being able to apply it to projections (sorry if I'm abusing the terminology here I'm still trying to get up to speed)

Given Include fails if I'm not using a known entitry type, can I create a entity class and tell EF about it, project to that and then use the existing include syntax after the projection, or is that just the same problem in a disguise?

@ajcvickers
Copy link
Member

@Robert-Laverick "global rules API to set certain values to always be loaded". Yes, that's the direction we're going with Core, but it's not fully decided yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants