[SPARK-13738][SQL] Cleanup Data Source resolution #11572
Conversation
Test build #52624 has finished for PR 11572 at commit
Test build #52625 has finished for PR 11572 at commit
Test build #52637 has finished for PR 11572 at commit
Test build #52639 has finished for PR 11572 at commit
Test build #52634 has finished for PR 11572 at commit
* The main class responsible for representing a pluggable Data Source in Spark SQL. In addition to
* acting as the canonical set of parameters that can describe a Data Source, this class is used to
* resolve a description to a concrete implementation that can be used in a query plan
* (either batch or streaming) or to write out out data using an external library.
nit: write out ...
While taking a look, I also saw some nits in the previous PR, #11509.
Test build #52674 has finished for PR 11572 at commit

LGTM - merging in master.
Follow-up to #11509 that simply refactors the interface we use when resolving a pluggable `DataSource`.

- Multiple functions share the same set of arguments, so we make this a case class called `DataSource`. Actual resolution is now done by calling a function on this class.
- Instead of having multiple methods named `apply` (some of which do writing, some of which do reading), we now explicitly have `resolveRelation()` and `write(mode, df)`.
- Get rid of `Array[String]`, since this is an internal API and was forcing us to awkwardly call `toArray` in a bunch of places.

Author: Michael Armbrust <michael@databricks.com>

Closes #11572 from marmbrus/dataSourceResolution.
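For illustration only, here is a minimal sketch of the interface shape this refactoring describes: a single case class holding the arguments the old `apply` overloads shared, explicit `resolveRelation()` and `write(mode, df)` methods, and `Seq[String]` in place of `Array[String]`. The type and parameter names below are simplified stand-ins, not the real classes in `sql/core`, which carry additional parameters (user-specified schema, bucket spec, etc.).

```scala
// Simplified stand-ins for the real Spark SQL types (hypothetical, for illustration).
sealed trait SaveMode
object SaveMode {
  case object Append extends SaveMode
  case object Overwrite extends SaveMode
  case object ErrorIfExists extends SaveMode
}

trait BaseRelation   // what a read resolves to
trait DataFrame      // placeholder for the data being written

// One case class now holds the shared parameters, and resolution is an
// explicit method call rather than an overloaded apply.
case class DataSource(
    className: String,                        // provider class or short name, e.g. "parquet"
    paths: Seq[String] = Nil,                 // Seq rather than Array, per this PR
    partitionColumns: Seq[String] = Nil,
    options: Map[String, String] = Map.empty) {

  /** Resolve this description to a concrete relation usable in a query plan. */
  def resolveRelation(): BaseRelation = new BaseRelation {}

  /** Write `data` out through the underlying provider using the given mode. */
  def write(mode: SaveMode, data: DataFrame): BaseRelation = new BaseRelation {}
}

object Example extends App {
  // Build the description once, then read or write explicitly.
  val ds = DataSource(className = "parquet", paths = Seq("/tmp/events"))
  val relation: BaseRelation = ds.resolveRelation()
  // ds.write(SaveMode.Overwrite, someDataFrame)  // writing follows the same pattern
}
```

Reading and writing go through the same description object, which is the main point of the cleanup: callers construct one `DataSource` and then choose the operation, instead of picking among differently shaped `apply` overloads.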
DataSource.DataSource. Actual resolution is now done by calling a function on this class.apply(some of which do writing some of which do reading) we now explicitly haveresolveRelation()andwrite(mode, df).Array[String]since this is an internal API and was forcing us to awkwardly calltoArrayin a bunch of places.