[SPARK-16959] [SQL] Rebuild Table Comment when Retrieving Metadata from Hive Metastore #14550
Test build #63392 has finished for PR 14550 at commit
cc @cloud-fan. This is a tiny fix. Could you review this?
val tableMetadata = catalog.getTableMetadata(TableIdentifier(tabName, Some("default")))
val viewMetadata = catalog.getTableMetadata(TableIdentifier(viewName, Some("default")))
assert(tableMetadata.comment == Option("BLABLA"))
assert(tableMetadata.properties.get("comment") == Option("BLABLA"))
I think we should also remove the comment from table properties, to not surprise users.
As explained below, Hive keeps comment in the table properties. Should we keep it too?
Thanks!
I don't think so. HiveClient should be symmetrical: the table metadata read back should be the same as what users saved. We don't need to follow Hive here.
Ok. I see. Let me fix it now.
Do you know why we put the comment in table properties instead of the table comment field? Some Hive trick?
After an investigation, Hive does not have such a field in For example, this is the Hive output of one table with table comment |
.map(_.asScala.toMap).orNull
),
properties = properties,
properties = properties.filter(kv => kv._1 != "path"),
why not use filterKeys?
The implementation of filterKeys does not build a new map; it wraps the original one in a new FilteredKeys object:
override def filterKeys(p: A => Boolean): Map[A, B] = new FilteredKeys(p) with DefaultMap[A, B]
That wrapper is not serializable. Thus, I got the following error:
org.apache.spark.SparkException: Task not serializable
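For illustration only, here is a minimal standalone sketch of the difference, assuming plain Java serialization outside of Spark. `FilterKeysDemo` and `isJavaSerializable` are hypothetical names made up for this example, and the property values are placeholders. On the Scala versions where `filterKeys` returns a wrapper or view, the second check is expected to report `false`, while the `filter` result is a regular immutable map.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

object FilterKeysDemo {
  // Hypothetical helper: tries Java serialization and reports whether it succeeded.
  def isJavaSerializable(value: AnyRef): Boolean =
    Try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(value) finally out.close()
    }.isSuccess

  // Placeholder table properties, not taken from a real metastore.
  val properties: Map[String, String] =
    Map("comment" -> "BLABLA", "owner" -> "spark")

  // filter copies the surviving entries into a new strict immutable Map.
  val strict: Map[String, String] =
    properties.filter(kv => kv._1 != "comment")

  def main(args: Array[String]): Unit = {
    println(s"filter result serializable: ${isJavaSerializable(strict)}")
    // filterKeys wraps the original map instead of copying it; the outcome
    // of this check depends on the Scala version in use.
    val wrapped = properties.filterKeys(_ != "comment")
    println(s"filterKeys result serializable: ${isJavaSerializable(wrapped)}")
  }
}
```

Using `filter` costs one extra copy of a small map, which seems a fair price for a value that can safely travel inside a task closure.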
ah i see
hey, we should filter out comment here, not path
haha, sure. Working on multiple PRs at the same time. Will open a new PR tomorrow regarding path...
Test build #63508 has finished for PR 14550 at commit
thanks, merging to master!
Test build #63517 has finished for PR 14550 at commit
@cloud-fan To backport #14531, we need to backport this. Do you want me to do it?
…m Hive Metastore

### What changes were proposed in this pull request?
The `comment` in `CatalogTable` returned from Hive is always empty. We store it in the table property when creating a table. However, when we try to retrieve the table metadata from Hive metastore, we do not rebuild it. The `comment` is always empty.

This PR is to fix the issue.

### How was this patch tested?
Fixed the test case to verify the change.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #14550 from gatorsmile/tableComment.

(cherry picked from commit bdd5371)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
backported to 2.0
Thank you! Let me fix the test case failure in Branch 2.0 |
What changes were proposed in this pull request?
The `comment` in `CatalogTable` returned from Hive is always empty. We store it in the table property when creating a table. However, when we try to retrieve the table metadata from Hive metastore, we do not rebuild it. The `comment` is always empty.

This PR is to fix the issue.
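The round trip described above can be sketched roughly as follows. This is not the actual HiveClient code: `TableMeta` is a hypothetical, simplified stand-in for `CatalogTable`, and `toStoredProperties` / `fromStoredProperties` are made-up helper names. The only detail taken from the discussion is that the comment lives under the `comment` key of the table properties and should be moved back into the `comment` field on read.

```scala
// Hypothetical, simplified stand-in for Spark's CatalogTable.
final case class TableMeta(
    comment: Option[String],
    properties: Map[String, String])

object CommentRoundTrip {
  // Write side: the comment is stored as a "comment" entry in the properties.
  def toStoredProperties(table: TableMeta): Map[String, String] =
    table.comment.fold(table.properties) { c =>
      table.properties + ("comment" -> c)
    }

  // Read side: rebuild the comment field and drop the entry from the
  // properties, so the metadata read back equals what was saved.
  def fromStoredProperties(stored: Map[String, String]): TableMeta =
    TableMeta(
      comment = stored.get("comment"),
      properties = stored.filter(kv => kv._1 != "comment"))
}
```

With these helpers, saving a table whose comment is `Some("BLABLA")` and reading it back yields an equal `TableMeta`, which is the symmetry asked for in the review.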
How was this patch tested?
Fixed the test case to verify the change.