[SPARK-16803] [SQL] SaveAsTable does not work when source DataFrame is built on a Hive Table #14410
Test build #63019 has finished for PR 14410 at commit
Test build #63025 has finished for PR 14410 at commit
existingSchema = Some(DDLUtils.getSchemaFromTableProperties(s.metadata))
// When the source table is a Hive table, the `saveAsTable` API of DataFrameWriter
// does not work. Instead, use the `insertInto` API.
case o if o.getClass.getName == "org.apache.spark.sql.hive.MetastoreRelation" =>
It's really hacky to compare against the class name string... What was the error message before?
Agreed! Do you have any idea how to improve it?
Below is the current error message:
org.apache.spark.sql.AnalysisException: Saving data in MetastoreRelation sample, sample
is not supported.;
`case o: MetastoreRelation` doesn't work?
`MetastoreRelation` is in another package (`hive`), so it cannot be referenced by type here.
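For readers following the thread, here is a minimal sketch of the pattern under discussion: matching on a fully qualified class name when the class itself is not visible at compile time. The `LogicalPlan` and `LocalRelation` types and the `ClassNameMatch` object are hypothetical stand-ins for illustration, not Spark's real classes.

```scala
// Hypothetical stand-ins for illustration; these are NOT Spark's real classes.
trait LogicalPlan
case class LocalRelation(name: String) extends LogicalPlan

object ClassNameMatch {
  // Compare by fully qualified class name so that this code needs no
  // compile-time dependency on the module that defines the target class.
  def hasClassName(plan: AnyRef, fqcn: String): Boolean =
    plan.getClass.getName == fqcn

  def describe(plan: AnyRef): String = plan match {
    // A typed pattern (`case o: MetastoreRelation`) would not compile here,
    // because the class is not on this module's compile classpath.
    case o if hasClassName(o, "org.apache.spark.sql.hive.MetastoreRelation") =>
      "Hive table: use insertInto"
    case _: LocalRelation => "local relation"
    case _ => "other"
  }
}
```

The trade-off is the one raised in the review: the string comparison compiles without the dependency, but it silently stops matching if the class is renamed or moved.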
@cloud-fan @yhuai https://github.com/apache/spark/compare/master...gatorsmile:saveAsTableFix?expand=1 This is the prototype for supporting this missing function. We can fix it with minor changes. To avoid too many calls on Let me know if we should do it in 2.0.1 or just issue an exception in 2.0.1. Thanks!
Hi @gatorsmile can you send out a PR first? We definitely need it for 2.1, and we can decide if we should backport it to 2.0 later.
@cloud-fan Sure, will do it soon. Thanks!
We can support it in another PR: #14612. This can be closed
What changes were proposed in this pull request?
In Spark 2.0, `saveAsTable` does not work when the source DataFrame is built on a Hive table, although it works in Spark 1.6.
Spark 1.6
Spark 2.0
So far, we do not plan to support it in Spark 2.0. Spark 1.6 works because it internally uses `insertInto`. However, changing it back would break the semantics of `saveAsTable`, which resolves columns by name rather than by position as `insertInto` does.
Instead, users should use the `insertInto` API. This PR corrects the error message so that users can understand how to work around the limitation until it is supported in a separate PR.
How was this patch tested?
Test cases were added.
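As a side note on the resolution semantics mentioned in the description, the following is a minimal sketch in plain Scala (no Spark dependency) of how by-name and by-position column resolution can give different results when the source and target column orders differ. The `ResolutionSketch` object, its `Row` alias, and both helper functions are hypothetical and only model the idea; they are not Spark's implementation.

```scala
// Hypothetical model of the two resolution strategies; not Spark code.
object ResolutionSketch {
  // A source row keyed by column name.
  type Row = Map[String, Any]

  // By-position: the i-th source column's value goes to the i-th target
  // column, regardless of what the columns are called.
  def byPosition(sourceCols: Seq[String], row: Row, targetCols: Seq[String]): Row =
    targetCols.zip(sourceCols.map(row)).toMap

  // By-name: each target column takes the value of the source column
  // with the same name.
  def byName(row: Row, targetCols: Seq[String]): Row =
    targetCols.map(c => c -> row(c)).toMap
}

// Source columns are ordered (b, a); the target table is ordered (a, b).
val sourceCols = Seq("b", "a")
val row: ResolutionSketch.Row = Map("b" -> 1, "a" -> 2)
val targetCols = Seq("a", "b")

val positional = ResolutionSketch.byPosition(sourceCols, row, targetCols)
val named = ResolutionSketch.byName(row, targetCols)
```

In this model, `positional` maps `a` to the value that came from source column `b`, while `named` keeps each value under its own column name. That difference is why silently routing `saveAsTable` through `insertInto` would change behavior for users.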