Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: consider updating select expression identifiers when projecting a subset of column and adding distinct #22049

Closed
maumar opened this issue Aug 13, 2020 · 2 comments · Fixed by #24293
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@maumar
Copy link
Contributor

maumar commented Aug 13, 2020

Currently we use keys of all entities in chain as identifiers for the select expression. This becomes a problem when applying distinct - if the keys are not part of the projection we can no longer identify it. However in case of distinct we could change identifiers to become all of the projected (at least in some cases). need to investigate whether its correct thing to do for complex scenarios (nav chains?) and what is perf impact

@ajcvickers
Copy link
Member

See also scenario in #23897

@maumar
Copy link
Contributor Author

maumar commented Feb 12, 2021

lots of people are hitting this, we should re-triage and fix it sooner if possible

@ajcvickers ajcvickers added this to the 6.0.0 milestone Feb 12, 2021
maumar added a commit that referenced this issue Feb 27, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Feb 27, 2021
maumar added a commit that referenced this issue Feb 27, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 2, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 5, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 5, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 5, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 5, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 6, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
maumar added a commit that referenced this issue Mar 9, 2021
…s when projecting a subset of column and adding distinct

As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates.

Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique.
In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key.

Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect

we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate).

Fixes #22049
Fixes #24288
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
2 participants