Query: Fix Set Operation design #17340

Merged 1 commit on Aug 21, 2019


smitpatel
Member

Remove SetOperationType enum
Add support for Intersect/Except Distinct

Resolves #16709

Member

@roji roji left a comment


LGTM but take a look particularly at the comment regarding returning a shaper.

Definitely looks better than my previous implementation. I'd like us to not drop generating OrderBy/Skip/Take on naked set operations (instead of pushdown) - or at least the possibility of implementing in providers. But not critical if really not possible for 3.0.

@smitpatel
Member Author

Definitely looks better than my previous implementation. I'd like us to not drop generating OrderBy/Skip/Take on naked set operations (instead of pushdown) - or at least the possibility of implementing in providers. But not critical if really not possible for 3.0.

Also rest of team: @ajcvickers @divega @bricelam @AndriySvyryd @maumar

While thinking about set operations and the fact that OrderBy/Skip/Take can appear directly after them, I came to the conclusion that those three operations are the exception rather than the norm when it comes to set operations and pushdown.

-- The following is valid
SELECT 'a' AS [a]
UNION
SELECT 'b' AS [a]
ORDER BY [a]

-- The following needs pushdown
SELECT *
FROM (
    SELECT 'a' AS [a]
    UNION
    SELECT 'b' AS [a]
) AS [t]
JOIN Orders AS [o] ON [t].[a] = [o].[MyProp]
  • Deferring pushdown until it was needed required us to remap properties.
  • We did not have a good projection mapping for set operations to keep referential integrity.
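
The two SQL shapes above can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than SQL Server, and the Orders table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for Orders in the example above.
conn.execute("CREATE TABLE Orders (MyProp TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?)", [("a",), ("c",)])

# ORDER BY binds to the whole compound select, so no subquery is needed.
ordered = conn.execute(
    "SELECT 'b' AS a UNION SELECT 'a' AS a ORDER BY a"
).fetchall()

# A join cannot attach to a bare UNION; the set operation has to become a
# derived table first (the "pushdown").
joined = conn.execute(
    """
    SELECT t.a, o.MyProp
    FROM (
        SELECT 'a' AS a
        UNION
        SELECT 'b' AS a
    ) AS t
    JOIN Orders AS o ON t.a = o.MyProp
    """
).fetchall()

print(ordered)
print(joined)
```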

So I arrived at the conclusion that a set operation should generate a subquery right away when it is applied. If it is not composed over, the subquery is pulled up while printing to remove the nesting. (We did something similar for non-composed FromSql to support sprocs.)
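
A toy model of "wrap immediately, pull up when printing" might look like the following. It is a sketch in Python, not EF Core's code: SetOperationSketch and SelectSketch are hypothetical names, and joins stand in for any composition over the set operation.

```python
from dataclasses import dataclass, field
from textwrap import indent
from typing import List


@dataclass
class SetOperationSketch:
    kind: str   # "UNION", "INTERSECT", "EXCEPT", ...
    left: str   # SQL text of the left operand
    right: str  # SQL text of the right operand

    def to_sql(self) -> str:
        return f"{self.left}\n{self.kind}\n{self.right}"


@dataclass
class SelectSketch:
    # The set operation becomes a subquery source as soon as it is applied.
    source: SetOperationSketch
    alias: str = "t"
    joins: List[str] = field(default_factory=list)

    def to_sql(self) -> str:
        if not self.joins:
            # Nothing was composed over the set operation: pull it up and
            # print it without the surrounding SELECT.
            return self.source.to_sql()
        lines = [
            "SELECT *",
            "FROM (",
            indent(self.source.to_sql(), "    "),
            f") AS [{self.alias}]",
        ]
        return "\n".join(lines + self.joins)


union = SetOperationSketch("UNION", "SELECT 'a' AS [a]", "SELECT 'b' AS [a]")
bare = SelectSketch(union).to_sql()
joined = SelectSketch(
    union, joins=["JOIN Orders AS [o] ON [t].[a] = [o].[MyProp]"]
).to_sql()
print(bare)
print(joined)
```

Because the subquery always exists in the tree, references to its columns never need remapping; only the printer decides whether the nesting is visible.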

Whether OrderBy/Skip/Take require pushdown is not black and white. It is true that you can generate one level of them. But if a set operation is followed by a series of skip/take/skip/take, you can apply the first set of skip/take without pushdown but not the second. And then there is a whole bunch of issues around ORDER BY printing in the non-pushdown version.
The way I see to fix it would be to have Orderings, Skip, and Take on SetOperationBase.
When applying Skip on a SelectExpression that is a non-composed SetOperation, and the SetOperation does not have any operation applied, it can set the value on SetOperationBase directly. If either condition does not hold, it is applied on the SelectExpression itself. That solves the issue of chaining. (Also, applying Distinct on the SelectExpression can set the SetOperation's IsDistinct property.)
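
The routing rule for chained Skip/Take could be sketched as below. This is a Python toy, not the proposed EF Core API: PagedSetOperation and PagedSelect are hypothetical names, and the exact "no operation applied" check is an assumption (Skip needs both slots empty, Take only needs the limit slot empty, so one skip/take pair fits on the set operation).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PagedSetOperation:
    offset: Optional[int] = None
    limit: Optional[int] = None


@dataclass
class PagedSelect:
    set_operation: PagedSetOperation
    # Operations that had to be applied on the outer select itself.
    outer_ops: List[Tuple[str, int]] = field(default_factory=list)

    @property
    def is_composed(self) -> bool:
        return bool(self.outer_ops)

    def apply_skip(self, count: int) -> None:
        s = self.set_operation
        # Assumed check: skipping after an earlier skip or take changes the
        # result, so both slots must still be empty.
        if not self.is_composed and s.offset is None and s.limit is None:
            s.offset = count
        else:
            self.outer_ops.append(("skip", count))

    def apply_take(self, count: int) -> None:
        s = self.set_operation
        # Assumed check: a take directly after a skip still fits on the
        # set operation; a second take does not.
        if not self.is_composed and s.limit is None:
            s.limit = count
        else:
            self.outer_ops.append(("take", count))


query = PagedSelect(PagedSetOperation())
query.apply_skip(5)
query.apply_take(10)
query.apply_skip(2)
query.apply_take(3)
print(query.set_operation, query.outer_ops)
```

In this sketch the first skip/take pair lands on the set operation and the second pair falls back to the outer select, which matches the chaining behavior described above.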

That is just a rough idea. So should we add API surface on SetOperation now to allow that in the future, or can we just add it later when we implement it? (The actual set operation types are concrete, so it can be done in a non-breaking way.)

As for the "possibility of implementing in providers": all providers support OrderBy/Skip/Take without pushdown. (Yes, SqlServer does too if you don't generate TOP and use OFFSET/FETCH instead.) I don't think there is any need to create a provider hook for this. Such an extension hook is low value and high cost.

@roji
Member

roji commented Aug 21, 2019

The way I see to fix it, would be to have Orderings, Skip, Take on SetOperationBase.
When applying skip on SelectExpression, which is non-composed SetOperation and SetOperation does not have any operation applied, it can call and set on SetOperationBase directly. if either of the conditions are not true then it would apply on SelectExpression itself. That solves the issue of chaining.

Sounds good to me. We can use #16244 to track this.

Also applying Distinct on SelectExpression can convert the SetOperation's IsDistinct property.

Nice, that would translate Concat().Distinct() to Union().
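
The equivalence behind that translation can be checked at the SQL level. The snippet below is only an illustration using Python's built-in sqlite3 module; tables A and B and their rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables; A contains a duplicate and overlaps with B.
conn.execute("CREATE TABLE A (x INTEGER)")
conn.execute("CREATE TABLE B (x INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(1,), (1,), (2,)])
conn.executemany("INSERT INTO B VALUES (?)", [(2,), (3,)])

# Concat().Distinct(): UNION ALL followed by DISTINCT over the result.
concat_distinct = conn.execute(
    "SELECT DISTINCT x FROM (SELECT x FROM A UNION ALL SELECT x FROM B) "
    "ORDER BY x"
).fetchall()

# Union(): the set operation removes duplicates itself.
union = conn.execute(
    "SELECT x FROM A UNION SELECT x FROM B ORDER BY x"
).fetchall()

print(concat_distinct == union)
```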

All providers support OrderBy/Skip/Take without pushdown. (Yes, SqlServer also does if you don't generate Top and use Fetch/Offset)

The point is that Fetch/Offset require OrderBy, whereas TOP doesn't. So there is a little bit of SQL Server-specific complexity there.

Note: this PR will also fix #16273, unless we want EF Core to also provide the extension methods expressing non-distinct Intersect/Except. If we don't, providers can provide them instead.

@smitpatel
Member Author

We already inject ORDER BY (SELECT 1) when needed. I believe injecting ORDER BY (SELECT 1) is better than causing a subquery.
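
As a rough illustration of that injection, the helper below builds a SQL Server style paging clause and falls back to ORDER BY (SELECT 1) when the query has no orderings. It is a Python sketch, and paging_clause is a hypothetical helper, not the EF Core SQL generator.

```python
from typing import List, Optional


def paging_clause(orderings: List[str], offset: int, fetch: Optional[int]) -> str:
    """Build an ORDER BY ... OFFSET ... FETCH clause as SQL text."""
    # OFFSET/FETCH is only valid after ORDER BY, so inject a constant
    # ordering when the query did not specify one.
    order_by = ", ".join(orderings) if orderings else "(SELECT 1)"
    lines = [f"ORDER BY {order_by}", f"OFFSET {offset} ROWS"]
    if fetch is not None:
        lines.append(f"FETCH NEXT {fetch} ROWS ONLY")
    return "\n".join(lines)


unordered_take = paging_clause([], 0, 10)
ordered_page = paging_clause(["[a]"], 5, 10)
print(unordered_take)
print(ordered_page)
```

A Take without a Skip is expressed here as OFFSET 0, since FETCH cannot appear without OFFSET.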

@smitpatel
Member Author

The last comment on #16273 talks about adding API at the relational level on IQueryable, so I am leaving that issue open to add the API in the future. The infrastructure changes are already made. If you wish, you can add the API in the PostgreSQL provider for now.

@smitpatel smitpatel merged commit ff99dc3 into release/3.0-preview9 Aug 21, 2019
@ghost ghost deleted the smit/setoperations branch August 21, 2019 16:05
@smitpatel
Member Author

Merging without additional surface on SetOperations for now. We can add it later in a non-breaking way.
