[SPARK-28443][SQL] Spark sql add exception when create field type NullType #25198
Conversation
merge master
pull master
cc @cloud-fan

ok to test

Test build #107883 has finished for PR 25198 at commit
Looks like a good idea to check this when creating a table with NullType
```scala
object CreateTableCheck extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = {
    plan match {
      case ct: CreateTable if ct.tableDesc.schema.exists { f =>
```
can we move it to the rule PreWriteCheck?
I think it's a little different between creating a table (DDL) and inserting data (DML). Maybe split it into two rules?
yea we can add a DDLCheck rule. Please follow PreWriteCheck to make it a pure-checking rule.
Also don't forget to handle DS v2 CREATE TABLE as well, which has a different logical plan.
`CreateV2Table`, `CreateTableAsSelect`, `ReplaceTable`, `ReplaceTableAsSelect`
also cc @rdblue
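To make the "pure-checking rule" idea concrete, here is a minimal self-contained sketch. The plan classes below are stand-ins for illustration only, not Spark's real API; in Spark, PreWriteCheck-style rules extend `(LogicalPlan => Unit)` and throw on a bad plan instead of rewriting it.

```scala
// Minimal stand-ins for the logical plans named above; the real Spark
// classes carry many more fields (catalog, partitioning, properties, ...).
sealed trait LogicalPlan
case class CreateTable(columnTypes: Seq[String]) extends LogicalPlan
case class CreateV2Table(columnTypes: Seq[String]) extends LogicalPlan
case class ReplaceTable(columnTypes: Seq[String]) extends LogicalPlan
case object OtherPlan extends LogicalPlan

// A pure-checking rule: LogicalPlan => Unit. It only throws, never rewrites.
object DDLCheck extends (LogicalPlan => Unit) {
  private def check(columnTypes: Seq[String]): Unit =
    if (columnTypes.contains("null")) {
      throw new IllegalArgumentException(
        "cannot create tables with null-type column/field")
    }

  override def apply(plan: LogicalPlan): Unit = plan match {
    case CreateTable(ts)   => check(ts)
    case CreateV2Table(ts) => check(ts)
    case ReplaceTable(ts)  => check(ts)
    case _                 => // OK: a checking rule leaves the plan untouched
  }
}
```

The key design point is that the rule returns `Unit` rather than a new plan, so it can be run among the analyzer's check batches without risk of changing plan shape.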
ok, I will check this later
Test build #108054 has finished for PR 25198 at commit
```scala
        failAnalysis("DataType NullType is not supported for create table")

      // DataSourceStrategy will convert CreateTable to CreateDataSourceTableCommand before check
      case cdstc: CreateDataSourceTableCommand if cdstc.table.schema.exists { f =>
```
I know this isn't much code, but I think it would be better to refactor this exists and the failAnalysis call into a method so it isn't repeated several times.
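The suggested refactor can be sketched in isolation like this. The real code uses Spark's `StructType` and `AnalysisException`; here the schema is modeled as name/type pairs purely for illustration.

```scala
// Stand-in for Spark's AnalysisException, just for this sketch.
case class AnalysisException(message: String) extends Exception(message)

def failAnalysis(msg: String): Unit = throw AnalysisException(msg)

// One shared helper replaces the repeated exists(...) + failAnalysis(...)
// pattern that would otherwise appear in every case clause.
def throwWhenExistsNullType(schema: Seq[(String, String)]): Unit =
  if (schema.exists { case (_, tpe) => tpe == "null" }) {
    failAnalysis("DataType NullType is not supported for create table")
  }

// Each checker case then collapses to a single call, e.g.:
//   case CreateTable(tableDesc, _, _) => throwWhenExistsNullType(tableDesc.schema)
```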
Thanks for your advice.
Test build #108135 has finished for PR 25198 at commit

Test build #108758 has finished for PR 25198 at commit
```scala
  }

  /**
   * SPARK-28443: Spark sql add exception when create field type NullType
```
nit: SPARK-28443: fail the DDL command if it creates a table with null-type columns
```scala
  def failAnalysis(msg: String): Unit = { throw new AnalysisException(msg) }

  def throwWhenExistsNullType(schema: StructType): Unit = {
    if (schema.exists(f => DataTypes.NullType.sameType(f.dataType))) {
```
nit: `f.dataType == NullType`
What about nested fields? Do other databases forbid it as well?
Hive also forbids nested fields with NullType.
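The nested-field check being discussed can be modeled in a self-contained way with a tiny stand-in for Spark's `DataType` hierarchy (the names mirror Spark's, but this is an illustrative sketch, not the PR's code): a `NullType` hidden inside a struct, array, or map should be rejected just like a top-level one.

```scala
// Tiny model of Spark's DataType hierarchy, for illustration only.
sealed trait DataType
case object NullType extends DataType
case object IntegerType extends DataType
case class ArrayType(elementType: DataType) extends DataType
case class MapType(keyType: DataType, valueType: DataType) extends DataType
case class StructField(name: String, dataType: DataType)
case class StructType(fields: Seq[StructField]) extends DataType

// Recursively walk the type tree so nested null types are also caught.
def containsNullType(dt: DataType): Boolean = dt match {
  case NullType           => true
  case ArrayType(et)      => containsNullType(et)
  case MapType(kt, vt)    => containsNullType(kt) || containsNullType(vt)
  case StructType(fields) => fields.exists(f => containsNullType(f.dataType))
  case _                  => false
}

// A null type nested one level down is detected:
val nested = StructType(Seq(StructField("a",
  StructType(Seq(StructField("b", NullType))))))
// containsNullType(nested) evaluates to true
```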
```scala
      case ReplaceTable(_, _, tableSchema, _, _, _) =>
        throwWhenExistsNullType(tableSchema)

      case _ => // OK
```
CreateTableAsSelect, ReplaceTableAsSelect are missing.
Spark allows CreateTableAsSelect and ReplaceTableAsSelect with NullType, doesn't it? E.g. `create table test as select null as c1`.
Test build #108791 has finished for PR 25198 at commit
```scala
      case CreateTable(tableDesc, _, _) =>
        checkSchema(tableDesc.schema)

      case CreateV2Table(_, _, tableSchema, _, _, _) =>
```
I think the rationale here is that we don't want to create tables with null-type fields. It should be the same for both CREATE TABLE and CTAS.
OK, and also add AlterTable.
Test build #108811 has finished for PR 25198 at commit
```scala
  test("SPARK-28443: fail the DDL command if it creates a table with null-type columns") {
    withTable("t") {
      val e = intercept[AnalysisException] {
        sql(s"CREATE TABLE t as SELECT NULL AS c")
```
can we also test inner fields?
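A nested-field case could look something like the sketch below. `withTable`, `sql`, and `intercept` are assumed to come from the surrounding Spark test suite; the `named_struct` query and the asserted message are illustrative, not taken from the PR diff.

```scala
// Hypothetical addition to the suite above: reject a null type nested
// inside a struct, not just at the top level.
withTable("t") {
  val e = intercept[AnalysisException] {
    sql("CREATE TABLE t AS SELECT named_struct('a', null) AS c")
  }
  assert(e.getMessage.contains("NullType"))
}
```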
```scala
        fields.foreach { field => throwWhenExistsNullType(field.dataType) }

      case other if other == NullType =>
        failAnalysis("DataType NullType is not supported for create table.")
```
how about "cannot create tables with null-type column/field"
```scala
  def failAnalysis(msg: String): Unit = { throw new AnalysisException(msg) }

  def throwWhenExistsNullType(dataType: DataType): Unit = {
```
nit: rename to `failNullType`
```scala
    }
  }

  test("SPARK-28443: fail the DDL command if it creates a table with null-type columns") {
```
BTW do we need this test? I think https://github.com/apache/spark/pull/25198/files#diff-85f240f8452fa64de6671e41801fe68fR550 is good enough.
I forgot that Spark does not support `create table t as select null as c` now. Removed this.
Test build #108833 has finished for PR 25198 at commit

Test build #108848 has finished for PR 25198 at commit

ping @cloud-fan

Test build #111595 has finished for PR 25198 at commit

Force-pushed from 6a1b4f5 to 358e2a2

Test build #111605 has finished for PR 25198 at commit
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
For more detail, see #25085.

This PR adds an exception when creating a field of type NullType. Currently it is possible to create such a table, but that is not what Spark intends.
How was this patch tested?
UT