[SPARK-14388][SQL] Implement CREATE TABLE #12271
```diff
@@ -220,14 +220,30 @@ case class CatalogTable(
     tableType: CatalogTableType,
     storage: CatalogStorageFormat,
     schema: Seq[CatalogColumn],
-    partitionColumns: Seq[CatalogColumn] = Seq.empty,
-    sortColumns: Seq[CatalogColumn] = Seq.empty,
-    numBuckets: Int = 0,
+    partitionColumnNames: Seq[String] = Seq.empty,
+    sortColumnNames: Seq[String] = Seq.empty,
```
**Contributor:** How do we maintain the ordering of the sort columns here for the clause `SORTED BY (col_name [ASC|DESC], ...)` within the `CLUSTERED BY` clause?

**Contributor:** I think we do not support `SORTED BY` and `CLUSTERED BY` for now.

**Contributor (author):** Also, that has nothing to do with this patch.
```diff
+    bucketColumnNames: Seq[String] = Seq.empty,
+    numBuckets: Int = -1,
     createTime: Long = System.currentTimeMillis,
-    lastAccessTime: Long = System.currentTimeMillis,
+    lastAccessTime: Long = -1,
     properties: Map[String, String] = Map.empty,
     viewOriginalText: Option[String] = None,
-    viewText: Option[String] = None) {
+    viewText: Option[String] = None,
+    comment: Option[String] = None) {
```
**Contributor:** Maybe renaming it to …

**Contributor:** Does this "comment" mean the comment for the table, as defined in the syntax …

**Contributor (author):** I actually prefer to keep it as `comment` because it's …

**Contributor:** ok
```diff
 
+  // Verify that the provided columns are part of the schema
+  private val colNames = schema.map(_.name).toSet
+  private def requireSubsetOfSchema(cols: Seq[String], colType: String): Unit = {
+    require(cols.toSet.subsetOf(colNames), s"$colType columns (${cols.mkString(", ")}) " +
+      s"must be a subset of schema (${colNames.mkString(", ")}) in table '$identifier'")
+  }
+  requireSubsetOfSchema(partitionColumnNames, "partition")
```
**Contributor:** I find the following example in Hive's docs … Will this CREATE TABLE statement trigger an error here?

**Contributor (author):** No, because we put the partition columns in the schema when we parse it. There's a comment in …

**Contributor:** Thanks!
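To make the answer concrete, here is a hedged sketch of the kind of Hive-style statement the thread refers to (the table and columns are hypothetical, not the elided example from Hive's docs). Per the author's answer, the parser folds `ds` into the table schema before `CatalogTable` is constructed, so the subset check passes:

```scala
// Hypothetical Hive-style DDL: `ds` is declared only in PARTITIONED BY,
// but the parser appends partition columns to the schema, so
// requireSubsetOfSchema(partitionColumnNames, "partition") does not fail.
sqlContext.sql(
  """CREATE TABLE page_view (viewTime INT, userid BIGINT)
    |PARTITIONED BY (ds STRING)
  """.stripMargin)
```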
```diff
+  requireSubsetOfSchema(sortColumnNames, "sort")
+  requireSubsetOfSchema(bucketColumnNames, "bucket")
+
+  /** Columns this table is partitioned by. */
+  def partitionColumns: Seq[CatalogColumn] =
+    schema.filter { c => partitionColumnNames.contains(c.name) }
+
   /** Return the database this table was specified to belong to, assuming it exists. */
   def database: String = identifier.database.getOrElse {
```
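As a quick illustration of the new validation, here is a minimal, self-contained sketch (with hypothetical column names) of how the subset check accepts and rejects inputs:

```scala
// Standalone sketch of the subset check above, using hypothetical columns.
val colNames = Set("id", "name", "ds")

def requireSubsetOfSchema(cols: Seq[String], colType: String): Unit = {
  require(cols.toSet.subsetOf(colNames),
    s"$colType columns (${cols.mkString(", ")}) must be a subset of schema " +
      s"(${colNames.mkString(", ")})")
}

requireSubsetOfSchema(Seq("ds"), "partition")     // ok: "ds" is in the schema
requireSubsetOfSchema(Seq("country"), "bucket")   // throws IllegalArgumentException
```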
New file (+80 lines):

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.execution.command

import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.CatalogTable

// TODO: move the rest of the table commands from ddl.scala to this file

/**
 * A command to create a table.
 *
 * Note: This is currently used only for creating Hive tables.
 * This is not intended for temporary tables.
 *
 * The syntax of using this command in SQL is:
 * {{{
 *   CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
 *   [(col1 data_type [COMMENT col_comment], ...)]
 *   [COMMENT table_comment]
 *   [PARTITIONED BY (col3 data_type [COMMENT col_comment], ...)]
 *   [CLUSTERED BY (col1, ...) [SORTED BY (col1 [ASC|DESC], ...)] INTO num_buckets BUCKETS]
 *   [SKEWED BY (col1, col2, ...) ON ((col_value, col_value, ...), ...)
```
**Contributor:** @andrewor14 I just did a quick check with our …

**Contributor (author):** ok
```scala
 *   [STORED AS DIRECTORIES]
 *   [ROW FORMAT row_format]
 *   [STORED AS file_format | STORED BY storage_handler_class [WITH SERDEPROPERTIES (...)]]
 *   [LOCATION path]
 *   [TBLPROPERTIES (property_name=property_value, ...)]
 *   [AS select_statement];
 * }}}
 */
case class CreateTable(table: CatalogTable, ifNotExists: Boolean) extends RunnableCommand {

  override def run(sqlContext: SQLContext): Seq[Row] = {
    sqlContext.sessionState.catalog.createTable(table, ifNotExists)
    Seq.empty[Row]
  }

}
```
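A usage sketch for this command (the database, table, and path are hypothetical, and a `SQLContext` whose session catalog supports Hive tables is assumed):

```scala
// Hypothetical example: exercising the CreateTable command via SQL.
sqlContext.sql(
  """CREATE EXTERNAL TABLE IF NOT EXISTS db1.events (
    |  id BIGINT COMMENT 'event id',
    |  payload STRING)
    |COMMENT 'raw event log'
    |PARTITIONED BY (ds STRING)
    |LOCATION '/tmp/events'
  """.stripMargin)
```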
```scala
/**
 * A command that renames a table/view.
 *
 * The syntax of this command is:
 * {{{
 *   ALTER TABLE table1 RENAME TO table2;
 *   ALTER VIEW view1 RENAME TO view2;
 * }}}
 */
case class AlterTableRename(
    oldName: TableIdentifier,
    newName: TableIdentifier)
  extends RunnableCommand {

  override def run(sqlContext: SQLContext): Seq[Row] = {
    val catalog = sqlContext.sessionState.catalog
    catalog.invalidateTable(oldName)
    catalog.renameTable(oldName, newName)
    Seq.empty[Row]
  }

}
```
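And a corresponding sketch for the rename command (table names hypothetical). Note that, per the `run` body above, cached metadata for the old name is invalidated before the catalog performs the rename:

```scala
// Hypothetical example: exercising the AlterTableRename command via SQL.
sqlContext.sql("ALTER TABLE db1.events RENAME TO db1.events_raw")
```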
```diff
@@ -380,8 +380,6 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
     }
   }

-  // TODO: ADD a testcase for Drop Database in Restric when we can create tables in SQLContext
-
   test("show tables") {
```
**Contributor (author):** This TODO comment is duplicated at line 197.
```diff
     withTempTable("show1a", "show2b") {
       sql(
```
**Contributor (author):** @hvanhovell I deleted the `INPUTDRIVER` and `OUTPUTDRIVER` here because Hive doesn't support them. Why was this added in the first place? Is there any supporting documentation for this somewhere?

**Contributor:** @andrewor14 I wanted to make sure we supported the same grammar as Hive, and I used their grammars as a basis. So this is defined in the following two locations: … The main idea was that I could throw better errors. But if it is not supported by Hive itself, then please remove it!

**Contributor (author):** Let's remove it. I have never seen this before, and it is not documented anywhere.