52 changes: 2 additions & 50 deletions docs/sql-ref-syntax-ddl-create-table-hiveformat.md
@@ -39,14 +39,6 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
[ LOCATION path ]
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]
[ AS select_statement ]

row_format:
: SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
| DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
[ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
[ MAP KEYS TERMINATED BY map_key_terminated_char ]
[ LINES TERMINATED BY row_terminated_char ]
[ NULL DEFINED AS null_char ]
```

Note that the clauses between the column definition clause and the AS SELECT clause can come in
@@ -82,50 +74,10 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
* **INTO num_buckets BUCKETS**

Specifies the number of buckets, which is used in the `CLUSTERED BY` clause.

* **row_format**

Use the `SERDE` clause to specify a custom SerDe for one table. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.

* **SERDE**

Specifies a custom SerDe for one table.

* **serde_class**

Specifies a fully-qualified class name of a custom SerDe.

* **SERDEPROPERTIES**

A list of key-value pairs that is used to tag the SerDe definition.

* **DELIMITED**

The `DELIMITED` clause can be used to specify the native SerDe and state the delimiter, escape character, null character and so on.

* **FIELDS TERMINATED BY**

Used to define a column separator.

* **COLLECTION ITEMS TERMINATED BY**

Used to define a collection item separator.

* **MAP KEYS TERMINATED BY**

Used to define a map key separator.

* **LINES TERMINATED BY**

Used to define a row separator.

* **NULL DEFINED AS**

Used to define the specific value for NULL.

* **ESCAPED BY**
* **row_format**

Used for escape mechanism.
Specifies the row format for input and output. See [HIVE FORMAT](sql-ref-syntax-hive-format.html) for more syntax details.

* **STORED AS**

73 changes: 73 additions & 0 deletions docs/sql-ref-syntax-hive-format.md
@@ -0,0 +1,73 @@
---
layout: global
title: Hive Row Format
displayTitle: Hive Row Format
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---

### Description

Spark supports a Hive row format in the `CREATE TABLE` and `TRANSFORM` clauses to specify a SerDe or text delimiters.
There are two ways to define a row format in the `row_format` of the `CREATE TABLE` and `TRANSFORM` clauses:
1. `SERDE` clause to specify a custom SerDe class.
2. `DELIMITED` clause to specify a delimiter, an escape character, a null character, and so on for the native SerDe.
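
As a rough illustration of the two options above, the following sketch declares the same two-column table once with each form. The table names `events_serde` and `events_delimited` are hypothetical, and the statements assume a Spark session with Hive support enabled; `LazySimpleSerDe` is used only as a familiar Hive SerDe class, not as a recommendation.

```sql
-- Option 1: name a SerDe class explicitly and pass it properties.
-- (Hypothetical table name; assumes Hive support is enabled.)
CREATE TABLE events_serde (id INT, name STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES ('field.delim' = ',')
    STORED AS TEXTFILE;

-- Option 2: keep the native SerDe and describe the text layout directly.
CREATE TABLE events_delimited (id INT, name STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;
```

Both statements describe comma-separated text files; the `DELIMITED` form is usually shorter when only separators need to change.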

### Syntax

```sql
row_format:
SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
| DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
[ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
[ MAP KEYS TERMINATED BY map_key_terminated_char ]
[ LINES TERMINATED BY row_terminated_char ]
[ NULL DEFINED AS null_char ]
```
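
The same `row_format` grammar can also appear in a `TRANSFORM` clause, once before `USING` (how rows are written to the script) and once after it (how the script output is read back). The sketch below is only an illustration: the `person` table and its rows are made up, and it assumes the `cat` command is available where the script runs.

```sql
-- Hypothetical input table, created only so the query below is self-contained.
CREATE TABLE person (zip_code INT, name STRING, age INT) USING parquet;
INSERT INTO person VALUES (94588, 'Zen Hui', 50), (94511, 'Aryan B.', 18);

-- The first ROW FORMAT controls the input to the script,
-- the second controls how the script output is parsed.
SELECT TRANSFORM(zip_code, name, age)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    NULL DEFINED AS 'NULL'
    USING 'cat' AS (a, b, c)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    NULL DEFINED AS 'NULL'
FROM person
WHERE zip_code > 94511;
```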

### Parameters

* **SERDE serde_class**

Specifies a fully-qualified class name of a custom SerDe.

* **SERDEPROPERTIES**

A list of key-value pairs that is used to tag the SerDe definition.

* **FIELDS TERMINATED BY**

Used to define a column separator.

* **COLLECTION ITEMS TERMINATED BY**

Used to define a collection item separator.

* **MAP KEYS TERMINATED BY**

Used to define a map key separator.

* **LINES TERMINATED BY**

Used to define a row separator.

* **NULL DEFINED AS**

Used to define the specific value for NULL.

* **ESCAPED BY**

Used to define the escape character.
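
To show how the `DELIMITED` parameters above fit together, here is a minimal sketch of a table with complex column types that sets every separator. The table name `family`, its columns, the `'foonull'` marker, and the `/tmp/family/` location are illustrative assumptions, and the statement assumes Hive support is enabled.

```sql
-- Hypothetical table: each clause below maps to one parameter in the list above.
CREATE EXTERNAL TABLE family (
        name STRING,
        friends ARRAY<STRING>,
        children MAP<STRING, INT>
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ',' ESCAPED BY '\\'   -- column separator and escape character
    COLLECTION ITEMS TERMINATED BY '_'         -- separates array items and map entries
    MAP KEYS TERMINATED BY ':'                 -- separates a map key from its value
    LINES TERMINATED BY '\n'                   -- row separator
    NULL DEFINED AS 'foonull'                  -- text that represents NULL
    STORED AS TEXTFILE
    LOCATION '/tmp/family/';
```

Under these settings, a data line such as `Ann,Bob_Cid,Dee:3_Eli:5` would be expected to split into a name, a two-item array, and a two-entry map, though the exact behavior depends on the SerDe in use.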
48 changes: 2 additions & 46 deletions docs/sql-ref-syntax-qry-select-transform.md
@@ -33,14 +33,6 @@ SELECT TRANSFORM ( expression [ , ... ] )
USING command_or_script [ AS ( [ col_name [ col_type ] ] [ , ... ] ) ]
[ ROW FORMAT row_format ]
[ RECORDREADER record_reader_class ]

row_format:
SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
| DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
[ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
[ MAP KEYS TERMINATED BY map_key_terminated_char ]
[ LINES TERMINATED BY row_terminated_char ]
[ NULL DEFINED AS null_char ]
```

### Parameters
@@ -49,45 +41,9 @@ row_format:

Specifies a combination of one or more values, operators and SQL functions that results in a value.

* **row_format**

Otherwise, uses the `DELIMITED` clause to specify the native SerDe and state the delimiter, escape character, null character and so on.

* **SERDE**

Specifies a custom SerDe for one table.

* **serde_class**

Specifies a fully-qualified class name of a custom SerDe.

* **DELIMITED**

The `DELIMITED` clause can be used to specify the native SerDe and state the delimiter, escape character, null character and so on.

* **FIELDS TERMINATED BY**

Used to define a column separator.

* **COLLECTION ITEMS TERMINATED BY**

Used to define a collection item separator.

* **MAP KEYS TERMINATED BY**

Used to define a map key separator.

* **LINES TERMINATED BY**

Used to define a row separator.

* **NULL DEFINED AS**

Used to define the specific value for NULL.

* **ESCAPED BY**
* **row_format**

Used for escape mechanism.
Specifies the row format for input and output. See [HIVE FORMAT](sql-ref-syntax-hive-format.html) for more syntax details.

* **RECORDWRITER**
