Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Consider having predicate on SqlEnumerableExpression so providers can integrate it accordingly #27935

Closed
smitpatel opened this issue May 3, 2022 · 3 comments · Fixed by #27969
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@smitpatel
Copy link
Member

Current we put a case block inside SqlExpression itself. It generates null result when predicate is not matched this could be issue with some aggregate functions and translator needs to unwrap it.

@roji
Copy link
Member

roji commented May 3, 2022

@smitpatel I'm probably missing something, but if we later add a Predicate on SqlEnumerableExpression, wouldn't the main underlying SqlExpression continue to have a CASE/WHEN, which providers would need to remove (since removing that would be a breaking change at that point)? If that's so, there's probably no point in adding Predicate...

One more thought about whether to introduce CASE/WHEN by default... If we don't introduce it, then I'm supposing translation will fail by default unless the provider specifically handles the separately-provided predicate (like for orderings). On the other hand, if we do CASE/WHEN by default, then translation will always succeed, but will return incorrect results where the function distinguishes nulls from non-nulls. It may be preferable to have "translation failure" as the default rather than "incorrect results" as the default behavior, for when a provider/plugin gets it wrong etc. But I agree it's not a super strong argument.

But to be clear, I'm fine with whatever decision is made here.

@smitpatel
Copy link
Member Author

if we later add a Predicate on SqlEnumerableExpression, wouldn't the main underlying SqlExpression continue to have a CASE/WHEN, which providers would need to remove (since removing that would be a breaking change at that point)?

If we add a Predicate later then we will remove CASE/WHEN from SqlExpression. Removing it wouldn't be a breaking change because the Predicate on grouping element is optional. (you can write g.Select(..).Max() too). So providers which are unwrapping CASE/WHEN block needs to do it in fault-tolerant way. It wouldn't break the code where predicate is not used but it will break the code where providers needs to start using the Predicate field. i.e. That would need to react either way.

From default perspective, you argument is certainly true. I think another point (where predicate vs ordering differs), is orderings for many aggregate functions is irrelevant while predicate is not. So providers will always need specific processing for separate predicate. It goes into same bucket (albeit different direction), that when we could have generated case block anyway, why force everyone to create it. (the other direction is, when providers multiple times have to unwrap case block then why don't we provide it separate only).

This is one of the hard to figure out right default kinda decision because it is totally data driven based on translations added which is not even customer facing.

@roji
Copy link
Member

roji commented May 3, 2022

[...] but it will break the code where providers needs to start using the Predicate field

Yeah, that's what I had in mind - if we remove CASE/WHEN later, any query that has a filter would stop working until the provider reacts (by handling the new Predicate).

I think another point (where predicate vs ordering differs), is orderings for many aggregate functions is irrelevant while predicate is not. So providers will always need specific processing for separate predicate.

Yeah, that's absolutely true, and I guess it's an argument in favor of doing CASE/WHEN, since filters would just work without any handling on the provider side.

It may also be worth keeping user function mappings in mind here - they would also possibly need to unwrap the case block if they want to map a function that cares about nulls.

In any case, whatever decision you make here, I'm not sure we'd ever want to reverse it (i.e. move from CASE/WHEN to separate predicate or vice versa).

@smitpatel smitpatel self-assigned this May 6, 2022
@ajcvickers ajcvickers added this to the 7.0.0 milestone May 6, 2022
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label May 6, 2022
smitpatel added a commit that referenced this issue May 6, 2022
This iterates over design in #27931

- Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database
- Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering
- Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result
- Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions

Resolves #27948
Resolves #27935
smitpatel added a commit that referenced this issue May 6, 2022
This iterates over design in #27931

- Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database
- Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering
- Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result
- Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions

Resolves #27948
Resolves #27935
smitpatel added a commit that referenced this issue May 10, 2022
This iterates over design in #27931

- Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database
- Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering
- Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result
- Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions

Resolves #27948
Resolves #27935
smitpatel added a commit that referenced this issue May 10, 2022
This iterates over design in #27931

- Bring back DistinctExpression (yes the token in invalid in some places and incorrect tree will throw invalid SQL error in database
- Introduce EnumerableExpression which is a holder/parameter object which contains facets of grouping element chain including Distinct/Predicate/Selector/Ordering
- Translators are responsible for putting together pieces from EnumerableExpression to generate a SqlExpression as a result
- Remove GroupByAggregateChainProcessor which failed to avoid double visitation. We need to refactor this code in future to avoid it when we implement public API for aggregate functions

Resolves #27948
Resolves #27935
@ajcvickers ajcvickers modified the milestones: 7.0.0, 7.0.0-preview5 May 25, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0-preview5, 7.0.0 Nov 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants