-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Translate embedded IQueryables to common table expressions (CTEs) #29918
Comments
@willnationsdev are you suggesting we generate a completely different SQL (with CTE instead of subqueries) specifically for logging SQL, and continue using the current subquery SQL for what's actually executed against the database? |
Hmmm...as far as I know, CTEs have an equivalent performance impact compared to copy/pasting subqueries, yes? Regardless, differing logged vs. executed queries would defeat the purpose of "improving debug log analysis" in the first place, so I wouldn't advocate for that. What shows up in logs should always be what is actually sent to the database to keep logs accurate, so provided the implementation work involved is the same, I think it would be better to change the query string for both logging and execution. If however including CTEs actually does have some detrimental impact on query handling for some reason, then perhaps this request should involve logging-specific query string generation on an as-requested, per-query basis. Then users can generate ad-hoc CTE-based queries when they need to review an especially complex query, but not employ the CTEs in the average use case. |
So you're asking to change EF to replace all subqueries in its SQL with CTEs? That seems like quite a request, especially if the motivation here is finding CTEs a bit more readable. In fact, since CTEs effectively extract out entire subtrees of the query tree, doing this systematically for all subqueries is very likely to degrade readability; especially since EF would have to mechanically generate the CTE identifiers, so they'd be meaningless (cte1, cte2...). Try figuring out which CTE means what in a complex query with many of these. Don't get me wrong - I think CTEs are great, and there are some specific situations where they're very helpful (I'm especially thinking of recursive CTEs, which allow expressing queries which are otherwise inexpressible). But the readability of simple subqueries vs. CTE seems largely a matter of personal taste, and I don't see us changing everything here as you suggest. (and all this before having thought or investigated any potential perf issues resulting from such a switch) |
@roji That makes sense. I was really looking for a way to more clearly distinguish/segregate which parts of a large & complex query were derived from which I just wasn't sure how hard it would be to do, and I knew that the few generated query strings I had looked at thus far included |
Yeah, but parameters are very, very different from subqueries. I think the way forward here is for us to implement explicit support for CTEs, where the user uses a LINQ operator to express SQL WITH (#26486). Once that happens, the query writer has the option to decide exactly what they want as a CTE and what name to give it. That would be generally better than just turning all subqueries into "anonymous" CTE blocks. I'm going to go ahead and close this, but please feel free to continue posting here and discussing. |
Can you explain how parameters are very very different to sub queries in terms of getting the name? If I've understood correctly, this request is not about turning all subqueries into a CTE but specifically about turning those generated from IQueryable variables into CTEs. Wouldn't the IQueryable variable name be captured in the expressions and so could be used as the CTE name? I feel there's quite good symmetry between, in the code/LINQ, defining an IQueryable separately to the main query and potentially using it multiple times in the main query, and, in the SQL, doing the same with a CTE. |
Oh, I wasn't referring specifically to the name - they're different in various other ways. Looking again at the above, I think I missed the fact that the request was specifically about when an external IQueryable variable is integrated into the query, rather than about subqueries in general (sorry). For that case we can indeed know the name and give it to a CTE, similar to how we do that for parameters. I do kinda like this idea - integrating an external IQueryable is a complex, "independent" query is being integrated into the main LINQ query, so it makes some sense to do the same on the SQL side via CTEs. But I'm still not sure doing this automatically is the right thing - at the very least a serious cross-database investigation would need to make sure the change to CTEs doesn't affect perf etc. And the gain is purely in more readable SQL. I'll reopen so we can at least discuss this on the team. |
This is a good point, and would be a possible perf improvement, as the subquery is only evaluated once rather than multiple times. On the downside, any sort of dynamic query construction (such as the integration of an IQueryable) is basically incompatible with any sort of static analysis, and so won't work with precompiled queries and AOT. Of course, whether the user integrates IQueryable or not is orthogonal to what SQL we produce from it (it's dynamic and AOT-incompatible in any case) - but it might mean it's less interesting to invest efforts into this. |
Until we would have full CTE support, would it be possible to have tagging support in embedded IQueryables? For logging purposes it would be nice if the existing
|
@joakimriedel query tags are implemented at the entire query level, and cannot be attached to one part of the query or another; so unfortunately this isn't possible at this point. |
I see, I thought it would be possible to do some rewriting magic since it would always be part of an expression tree. Alas, even stronger vote for CTE, I'm also heavily invested in sub queries like OP. |
Strongly in favour of this feature, we have queries on the hot path that would benefit from the perf gain here |
@Pepsi1x1 I may be wrong, but if you take a look at roji's comment...
...he seems to insinuate that embedding an |
@willnationsdev my comment was only about the incompatibility of IQueryable composition (which is a form of dynamic querying) with NativeAOT/precompilation - that's not something that interests everyone. There's nothing wrong with using this in regular, non-NativeAOT programs. (in fact, we may be able to support this very specific form of dynamic querying even in NativeAOT/precompilation - but that's not sure). But in any case, readability and performance align here: when you remove duplicated subqueries by introducing a CTE, you both make the SQL more readable and remove additional work for the database. |
When reusing a separate
IQueryable
in operations, e.g..Where(obj => otherQuery.Contains(obj))
, EF Core generates a subquery (as expected). However, in these cases, manually reviewing generated SQL in debug logs becomes more difficult to visually/mentally parse and damages the readability of the code, especially when a givenIQueryable
is injected multiple times and/or the generated SQL is especially large.I would like to have the ability to request that applicable queries replace all injected subqueries of this nature with common table expressions named after the injected
IQueryable
's symbol (likeotherQuery
in the previous example). This would in turn make reading the debug logs of executed SQL statements more readable.This could just be an opt-in optional behavior specifically for use in debug contexts. One could apply this strategy individually via something like
AsNoTracking()
or app-wide via theDbContextOptionsBuilder
like withEnableDetailedErrors()
.Related Issues: #26486 (CTEs via a special LINQ operator, including recursive)
The text was updated successfully, but these errors were encountered: