[SPARK-33989][SQL] Strip auto-generated cast when using Cast.sql #31034
The change looks quite big, but it is mostly golden files. cc @maropu @cloud-fan
(I'm neutral on this, though.) Why is the name consistency important? Is there a concrete scenario where the name inconsistency can cause an issue?
    // back to SQL query string.
    case _: ArrayType | _: MapType | _: StructType => child.sql
    case _ => s"CAST(${child.sql} AS ${dataType.sql})"
    override def sql: String = {
Let's not change the sql method.
ResolveAlias calls toPrettySQL to generate the alias name, and we should strip the auto-generated cast in toPrettySQL.
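To make the suggestion above concrete, here is a minimal sketch of the idea: the sql method keeps printing every cast, while a separate display-only rendering drops casts that the user did not write. All types here (Expr, Column, Literal, Cast, Add) and the userSpecified flag are hypothetical stand-ins for illustration, not Spark's Catalyst classes.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
object PrettySqlSketch {
  sealed trait Expr { def sql: String }

  final case class Column(name: String) extends Expr {
    def sql: String = name
  }

  final case class Literal(value: String) extends Expr {
    def sql: String = value
  }

  // `userSpecified` is false when the cast was inserted automatically (assumed flag).
  final case class Cast(child: Expr, dataType: String, userSpecified: Boolean) extends Expr {
    // The sql method is left alone: it always prints the cast.
    def sql: String = s"CAST(${child.sql} AS $dataType)"
  }

  final case class Add(left: Expr, right: Expr) extends Expr {
    def sql: String = s"(${left.sql} + ${right.sql})"
  }

  // Display-only rendering, used here to build an auto-generated alias name.
  def toPrettySQL(e: Expr): String = e match {
    case Cast(child, _, false) => toPrettySQL(child) // drop the auto-generated cast
    case Cast(child, dt, true) => s"CAST(${toPrettySQL(child)} AS $dt)"
    case Add(l, r)             => s"(${toPrettySQL(l)} + ${toPrettySQL(r)})"
    case other                 => other.sql
  }

  // The same expression, once with an automatic cast and once with a user-written one.
  val coerced: Expr  = Add(Cast(Column("id"), "DOUBLE", userSpecified = false), Literal("1.0"))
  val explicit: Expr = Add(Cast(Column("id"), "DOUBLE", userSpecified = true), Literal("1.0"))
}
```

In this toy, PrettySqlSketch.coerced.sql still yields "(CAST(id AS DOUBLE) + 1.0)", while PrettySqlSketch.toPrettySQL(PrettySqlSketch.coerced) yields "(id + 1.0)". A user-written cast survives in both renderings.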
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveAliasesSuite.scala
I like the idea of excluding auto-added casts from the auto-generated alias. This should be a safe change, as the auto-generated alias is mostly for display purposes. Let's get more feedback, though. cc @maropu @viirya @dongjoon-hyun @HyukjinKwon
How about adding a config like
    case r: RuntimeReplaceable =>
      PrettyAttribute(r.mkString(r.exprsReplaced.map(toPrettySQL)), r.dataType)
    case c: CastBase if !c.getTagValue(Cast.USER_SPECIFIED_CAST).getOrElse(false) =>
      PrettyAttribute(usePrettyExpression(c.child).sql, c.dataType)
So the purpose is to strip the Cast only in toPrettySQL? That is, it only changes the column name, and the actual Cast expression still works?
Could you explain this clearly in the PR description?
Yes, it's the only place we changed. I've updated the description.
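As a rough illustration of the point confirmed in this thread, the sketch below models a tag lookup that defaults to false when no tag is set, so an untagged cast is treated as auto-generated for naming purposes only. Evaluation never consults the tag, so the cast still runs. Tag, Node, CastNode, and UserSpecifiedCast are hypothetical names for this toy and do not reproduce Spark's actual classes or signatures.

```scala
import scala.collection.mutable

// Toy model only: `Tag`, `Node` and `CastNode` are hypothetical, for illustration.
object CastTagSketch {
  final case class Tag[T](name: String)

  class Node {
    // Tags are stored by name; a node without a tag simply has no entry.
    private val tags = mutable.Map.empty[String, Any]

    def setTagValue[T](tag: Tag[T], value: T): Unit = {
      tags(tag.name) = value
    }

    def getTagValue[T](tag: Tag[T]): Option[T] =
      tags.get(tag.name).map(_.asInstanceOf[T])
  }

  val UserSpecifiedCast: Tag[Boolean] = Tag[Boolean]("user_specified_cast")

  // A cast of an integer child to double.
  final class CastNode(val childName: String, val childValue: Int) extends Node {
    // Evaluation ignores the tag, so the cast is always applied.
    def eval: Double = childValue.toDouble

    // Only the display name depends on the tag; a missing tag counts as false.
    def displayName: String =
      if (getTagValue(UserSpecifiedCast).getOrElse(false)) s"CAST($childName AS DOUBLE)"
      else childName
  }

  val implicitCast = new CastNode("id", 3)
  val explicitCast = new CastNode("id", 3)
  explicitCast.setTagValue(UserSpecifiedCast, true)
}
```

Both nodes evaluate to the same double value. Only displayName differs: "id" for the untagged cast and "CAST(id AS DOUBLE)" for the tagged one.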
If the change is restricted to the auto-generated alias only, it sounds okay. But this still sounds like a breaking change, since the column name changes. Do we need to update the migration guide for this?
Yea, better to add it. But I do wonder whether people actually select that column by its generated name...
thanks, merging to master!
i took a close look too. LGTM.
thanks all !
What changes were proposed in this pull request?
This PR aims to strip the auto-generated cast. The main logic is:
PrettyAttribute in usePrettyExpression.
Why are the changes needed?
Make SQL consistent with the DSL. Here is an inconsistent example before this PR:
Note that we don't remove the Cast, so the auto-generated Cast still works. The only place changed is usePrettyExpression, where we use PrettyAttribute to replace the Cast and give a better SQL string.
Does this PR introduce any user-facing change?
Yes, the default field name may change.
How was this patch tested?
Added tests and passed the existing tests.