[SPARK-30568][SQL] Invalidate interval type as a field table schema #27277
Conversation
Test build #116994 has finished for PR 27277 at commit

Test build #116995 has finished for PR 27277 at commit
This does reduce the scope of d67b98e to cast only, but we need more than just keeping the behavior the same as 2.4: we need to hide the interval type from external data sources/catalogs. In 2.4, CREATE TABLE can still have an interval type column even though the parser doesn't allow it, because we have a Java API. In 3.0, the catalog becomes an API, and it's possible that we leak interval type columns to external catalog implementations. I think it's OK to allow the interval type in the parser, as we allow it in
```scala
  checkAnswer(spark.internalCreateDataFrame(rdd, table.schema), Seq.empty)
}

test("CreateTable: invalid schema if has interval type") {
```
can we also test CTAS and replace table?
Test build #117089 has finished for PR 27277 at commit

Test build #117102 has finished for PR 27277 at commit

Test build #117115 has finished for PR 27277 at commit

Test build #117119 has finished for PR 27277 at commit

Test build #117126 has finished for PR 27277 at commit
thanks, merging to master!
What changes were proposed in this pull request?
After commit d67b98e, we are able to create or alter a table with interval column types as long as the external catalog accepts them, which deviates from the interval type's intended internal-only usage. Per d67b98e's original purpose, the interval type should only surface through cast logic.
Instead of adding a type checker for the interval type command by command to work across catalogs, it is much simpler to treat interval as an invalid data type for table schemas, one that can only be produced via cast.
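To illustrate the idea of a single schema-level check rather than per-command validation, here is a minimal, self-contained sketch. It is not the actual Spark code: the `DataType` ADT, `SchemaChecker`, and the error message are simplified stand-ins, and the key point is that the check recurses into nested types (arrays, maps, structs) so an interval cannot hide inside a complex column.

```scala
// Hypothetical sketch of a schema validator that rejects interval types
// anywhere in a table schema, including nested inside complex types.
// The type hierarchy below is a simplified model, not Spark's real one.
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType
case object CalendarIntervalType extends DataType
case class ArrayType(elementType: DataType) extends DataType
case class MapType(keyType: DataType, valueType: DataType) extends DataType
case class StructField(name: String, dataType: DataType)
case class StructType(fields: Seq[StructField]) extends DataType

object SchemaChecker {
  // Returns true if an interval type appears anywhere in the type tree.
  def containsInterval(dt: DataType): Boolean = dt match {
    case CalendarIntervalType => true
    case ArrayType(et)        => containsInterval(et)
    case MapType(kt, vt)      => containsInterval(kt) || containsInterval(vt)
    case StructType(fields)   => fields.exists(f => containsInterval(f.dataType))
    case _                    => false
  }

  // Single entry point that CREATE TABLE, CTAS, REPLACE TABLE, and
  // ALTER TABLE could all share, instead of duplicating the check.
  def validate(schema: StructType): Unit =
    if (containsInterval(schema)) {
      throw new IllegalArgumentException(
        "Cannot use interval type in the table schema.")
    }
}
```

With one shared `validate`, every DDL path rejects schemas like `StructType(Seq(StructField("i", ArrayType(CalendarIntervalType))))` before anything reaches an external catalog.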
Why are the changes needed?
Enforce the interval type's internal-only usage.
Does this PR introduce any user-facing change?
No.
Additionally, this PR restores the user-facing behavior when using the interval type in a create/alter table schema, e.g. for the Hive catalog:
For 2.4,
For master after d67b98e,
Now, with this PR, we restore the type checker on the Spark side.
How was this patch tested?
Added more unit tests.