-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-31133][SQL][DOC] fix sql ref doc for DML #27891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,9 +9,9 @@ license: | | |
| The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
|
|
@@ -20,7 +20,7 @@ license: | | |
| --- | ||
|
|
||
| ### Description | ||
| `LOAD DATA` statement loads the data into a table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the target table. | ||
| `LOAD DATA` statement loads the data into a Hive serde table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the target table. | ||
|
|
||
| ### Syntax | ||
| {% highlight sql %} | ||
|
|
@@ -78,7 +78,7 @@ LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ part | |
| | Amy Smith | 123 Park Ave, San Jose | 111111 | | ||
| + -------------- + ------------------------------ + -------------- + | ||
|
|
||
| CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT); | ||
| CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT) USING HIVE; | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This must be a hive table
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi, @cloud-fan . This reminds me the following. @marmbrus and @gatorsmile . Do we need to revert SPARK-30098 due to its silent behavior change? Also, cc @rxin since he is the release manager for 3.0.0.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I filed SPARK-31136 to track the discussion and the final result.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The cost of break is mostly users can't run
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please don't get me wrong. You know that I loved that patch and tried to minimize the impact while embracing it. However, the new policy is designed to ban this kind of behavior change (SPARK-30098). Technically,
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I updated the JIRA description (SPARK-31136) with @cloud-fan 's example.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This is a completely underestimate. Please see my comment in #27894 (comment) |
||
|
|
||
| -- Assuming the students table is in '/user/hive/warehouse/' | ||
| LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load; | ||
|
|
@@ -92,7 +92,7 @@ LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ part | |
| + -------------- + ------------------------------ + -------------- + | ||
|
|
||
| -- Example with partition specification. | ||
| CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) USING HIVE PARTITIONED BY (c2, c3); | ||
| CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) PARTITIONED BY (c2, c3); | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is used as the source so doesn't need to be a hive table. |
||
|
|
||
| INSERT INTO test_partition PARTITION (c2 = 2, c3 = 3) VALUES (1); | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``LOAD DATA` only works for hive table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!