Skip to content

Conversation

@mihailomilosevic2001
Copy link
Contributor

What changes were proposed in this pull request?

This is a followup of #43797 . GROUP BY ALIAS has the same bug and this PR applies the same fix to GROUP BY ALIAS

Why are the changes needed?

For advanced users or Spark plugins, they may manipulate the logical plans directly. We need to make the framework more reliable.

Does this PR introduce any user-facing change?

Yes, we will fix the error.

How was this patch tested?

Test added.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Mar 31, 2025
@cloud-fan
Copy link
Contributor

thanks, merging to master/4.0!

@cloud-fan cloud-fan closed this in 02db872 Mar 31, 2025
cloud-fan pushed a commit that referenced this pull request Mar 31, 2025
### What changes were proposed in this pull request?
This is a followup of #43797 . GROUP BY ALIAS has the same bug and this PR applies the same fix to GROUP BY ALIAS

### Why are the changes needed?
For advanced users or Spark plugins, they may manipulate the logical plans directly. We need to make the framework more reliable.

### Does this PR introduce _any_ user-facing change?
Yes, we will fix the error.

### How was this patch tested?
Test added.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #50461 from mihailom-db/mihailom-db/fixGroupBy.

Authored-by: Mihailo Milosevic <mihailo.milosevic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 02db872)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 14, 2025
### What changes were proposed in this pull request?
This PR reverts #50461 because it introduces a correctness issue by replacing an `Alias` with incorrect literal. A followup will be made with a different way to fix this issue.

### Why are the changes needed?
In the below example, alias `abc` is replaced with `2` resulting in 2 rows instead of correct 3.
Before erronous PR:
![image](https://github.com/user-attachments/assets/dcc98323-369d-4f5e-b0ad-de5b76ffc5c3)

After erronous PR:
![image](https://github.com/user-attachments/assets/31c24125-6654-48f8-9b55-40a6a667ed23)

### Does this PR introduce _any_ user-facing change?
User now sees an error message instead of an incorrect result.

### How was this patch tested?
Added a test case to check for this behavior in the future

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50567 from mihailotim-db/mihailotim-db/revert_group_by.

Authored-by: Mihailo Timotic <mihailo.timotic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 14, 2025
### What changes were proposed in this pull request?
This PR reverts #50461 because it introduces a correctness issue by replacing an `Alias` with incorrect literal. A followup will be made with a different way to fix this issue.

### Why are the changes needed?
In the below example, alias `abc` is replaced with `2` resulting in 2 rows instead of correct 3.
Before erronous PR:
![image](https://github.com/user-attachments/assets/dcc98323-369d-4f5e-b0ad-de5b76ffc5c3)

After erronous PR:
![image](https://github.com/user-attachments/assets/31c24125-6654-48f8-9b55-40a6a667ed23)

### Does this PR introduce _any_ user-facing change?
User now sees an error message instead of an incorrect result.

### How was this patch tested?
Added a test case to check for this behavior in the future

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50567 from mihailotim-db/mihailotim-db/revert_group_by.

Authored-by: Mihailo Timotic <mihailo.timotic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 8718eba)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 18, 2025
…sis to avoid issue with group by ordinal

### What changes were proposed in this pull request?
This is a followup to #43797 and #50461 where hacks were introduced in order to solve the issue of repeated analysis on plans that have a group by ordinal. The latter PR caused a regression so the hack needs to be removed. This PR proposed a move of `UnresolvedOrdinal` construction before Analyzer runs.

### Why are the changes needed?
We are reverting a hack introduced in the previous PRs to improve the behavior of group by ordinal and additionally fix the issue that #50461 was trying to solve.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50606 from mihailotim-db/mihailotim-db/new_group_by_ordinal.

Authored-by: Mihailo Timotic <mihailo.timotic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @mihailom-db , I just want to give you the context in order to prevent future accident like the following.

02db87282da [SPARK-47895][SQL] group by alias should be idempotent
b5bb75ca240 [SPARK-47895][SQL] group by all should be idempotent

ASF JIRA IDs have Fixed Versions field. Even for the direct follow-up PRs, we cannot reuse JIRA IDs when the original patch is released already. It's because the original PR and the follow-up PR cannot have the same Fixed Versions field. Please make it sure that ASF JIRA ID is different from any internal customer ticket systems.

Screenshot 2025-04-28 at 22 08 09

So, please be careful about JIRA IDs next time.

zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
### What changes were proposed in this pull request?
This is a followup of apache#43797 . GROUP BY ALIAS has the same bug and this PR applies the same fix to GROUP BY ALIAS

### Why are the changes needed?
For advanced users or Spark plugins, they may manipulate the logical plans directly. We need to make the framework more reliable.

### Does this PR introduce _any_ user-facing change?
Yes, we will fix the error.

### How was this patch tested?
Test added.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#50461 from mihailom-db/mihailom-db/fixGroupBy.

Authored-by: Mihailo Milosevic <mihailo.milosevic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 9cc8b18)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
### What changes were proposed in this pull request?
This PR reverts apache#50461 because it introduces a correctness issue by replacing an `Alias` with incorrect literal. A followup will be made with a different way to fix this issue.

### Why are the changes needed?
In the below example, alias `abc` is replaced with `2` resulting in 2 rows instead of correct 3.
Before erronous PR:
![image](https://github.com/user-attachments/assets/dcc98323-369d-4f5e-b0ad-de5b76ffc5c3)

After erronous PR:
![image](https://github.com/user-attachments/assets/31c24125-6654-48f8-9b55-40a6a667ed23)

### Does this PR introduce _any_ user-facing change?
User now sees an error message instead of an incorrect result.

### How was this patch tested?
Added a test case to check for this behavior in the future

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#50567 from mihailotim-db/mihailotim-db/revert_group_by.

Authored-by: Mihailo Timotic <mihailo.timotic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit f970fe2)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants