Support SHOW COLUMNS #1027

Closed
allisonport-db opened this issue Mar 24, 2022 · 4 comments

Comments

@allisonport-db
Collaborator

Support SQL SHOW COLUMNS to display information about the columns in a given Delta table, since currently SHOW COLUMNS is not supported in Spark for V2 tables.

Example usage:

spark.createDataFrame([[0, "Mike"], [1, "Mel"]], ["id", "name"]).write.mode("overwrite").format("delta").save("/tmp/showcolumns")
spark.sql("SHOW COLUMNS IN delta.`/tmp/showcolumns`").show(truncate=False)

  +---------+
  |col_name |
  +---------+
  |id       |
  |name     |
  +---------+

Syntax details: SHOW COLUMNS

@allisonport-db allisonport-db added the enhancement New feature or request label Mar 24, 2022
@yifeng-chen

Hi @allisonport-db, have you tried this statement on Databricks?
The SHOW COLUMNS IN ${table} statement works well on Databricks, so I'm wondering whether this is a business strategy for promoting Delta Lake?

@zsxwing
Member

zsxwing commented Mar 29, 2022

This is an oversight in Delta. We will fix it. But if you have free time to work on this, feel free to open a PR.

@6a0juu
Contributor

6a0juu commented Jun 2, 2022

Hi @yifeng-chen , I'm currently working on this.

@harry19023

Chiming in from #1163: I'm requesting that nullable, is_partition, and is_bucket be included in this output.

mmengarelli pushed a commit to mmengarelli/delta.io that referenced this issue Jul 26, 2022
Resolves delta-io#1027 .
```
SHOW COLUMNS (FROM | IN) table_identifier [(FROM | IN) database];
```
Compared with the [Spark SQL syntax](https://spark.apache.org/docs/3.0.0/sql-ref-syntax-aux-show-columns.html), this command adds support for identifying the table by its file path. The Delta command `DESCRIBE DETAIL` adds a similar path-based table extension to Apache Spark.

```
SHOW COLUMNS (FROM | IN) ${schema_name}.${table_name}
SHOW COLUMNS (FROM | IN) ${table_name} (FROM | IN) ${schema_name}
```
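As a rough illustration of the forms above, the following Python sketch builds the three statement variants as strings: schema-qualified, separate schema, and path-based. The helper names (`quote_identifier`, `show_columns_qualified`, `show_columns_separate_schema`, `show_columns_by_path`) are hypothetical and not part of Delta or Spark; the sketch only assumes that backticks quote identifiers and that an embedded backtick is escaped by doubling it.

```python
def quote_identifier(identifier: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def show_columns_qualified(schema_name: str, table_name: str) -> str:
    """SHOW COLUMNS IN ${schema_name}.${table_name}"""
    return (
        f"SHOW COLUMNS IN {quote_identifier(schema_name)}."
        f"{quote_identifier(table_name)}"
    )


def show_columns_separate_schema(schema_name: str, table_name: str) -> str:
    """SHOW COLUMNS IN ${table_name} IN ${schema_name}"""
    return (
        f"SHOW COLUMNS IN {quote_identifier(table_name)} "
        f"IN {quote_identifier(schema_name)}"
    )


def show_columns_by_path(path: str) -> str:
    """Path-based form: the table is addressed as delta.`<path>`."""
    return f"SHOW COLUMNS IN delta.{quote_identifier(path)}"
```

Each returned string could then be passed to `spark.sql(...)` in a session where Delta is configured, for example `spark.sql(show_columns_by_path("/tmp/showcolumns")).show()`.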

This feature was tested with 8 cases, including:
- Delta tables and non-Delta tables.
- Tables with an incorrect table identifier.
- Tables identified with a separate schema name.

Some other edge cases are also covered. See [ShowTableColumnsSuite.scala](https://github.com/6a0juu/delta/blob/1f77fae9dce98441dee43eade932d985272b41be/core/src/test/scala/org/apache/spark/sql/delta/ShowTableColumnsSuite.scala) for details.

Yes. Before this PR, running a `SHOW COLUMNS` query such as:
```
spark.sql(s"SHOW COLUMNS IN delta.`test_table`").show()
```
failed with:
```
org.apache.spark.sql.AnalysisException: SHOW COLUMNS is not supported for v2 tables.
```
With this PR, the output looks like:
```
+----------+
|  col_name|
+----------+
|   column1|
|   column2|
+----------+
```

Closes delta-io#1203

Signed-off-by: Jiawei Bao <jiawei.bao@databricks.com>
GitOrigin-RevId: f68947004bced59a4fcbce693b462604df63a39e
@allisonport-db allisonport-db added this to the 2.1.0 milestone Aug 28, 2022