-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query: consider updating select expression identifiers when projecting a subset of column and adding distinct #22049
Labels
area-query
closed-fixed
The issue has been fixed and is/will be included in the release indicated by the issue milestone.
type-enhancement
Milestone
Comments
See also scenario in #23897 |
lots of people are hitting this, we should re-triage and fix it sooner if possible |
maumar
added a commit
that referenced
this issue
Feb 27, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added
the
closed-fixed
The issue has been fixed and is/will be included in the release indicated by the issue milestone.
label
Feb 27, 2021
maumar
added a commit
that referenced
this issue
Feb 27, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 2, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 5, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 5, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 5, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 5, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 6, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
maumar
added a commit
that referenced
this issue
Mar 9, 2021
…s when projecting a subset of column and adding distinct As fix to #15873 we started blocking some scenarios that used to work (by accident) - when we have a subquery using Distinct or GroupBy that doesn't happen to have any duplicates. Fix is to enable those scenarios (and others) by modifying identifier columns in case of distinct and group by, if the original identifiers are not already present. In case of distinct, the entire projection becomes unique identifier, as distinct guarantees it to be unique. In case of groupby, the grouping key becomes the identifier - since we only support grouping key or group aggregate in the projection, we are also guaranteed to have 1 row per unique grouping key. Also fix to #24288 - Query: add collection join tries to convert correlated collection from APPLY to JOIN for subqueries with Distinct and GroupBy, which is incorrect we would always try to convert subquery with groupby and distinct from apply to join, however we can only do this if the projection already contains the join key. Otherwise, adding the join key to the projection would change the meaning of operation in case of distinct and create invalid query in case of group by (projecting column that is not part of grouping key or aggregate). Fixes #22049 Fixes #24288
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area-query
closed-fixed
The issue has been fixed and is/will be included in the release indicated by the issue milestone.
type-enhancement
Currently we use keys of all entities in chain as identifiers for the select expression. This becomes a problem when applying distinct - if the keys are not part of the projection we can no longer identify it. However in case of distinct we could change identifiers to become all of the projected (at least in some cases). need to investigate whether its correct thing to do for complex scenarios (nav chains?) and what is perf impact
The text was updated successfully, but these errors were encountered: