-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 #25771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc: @cloud-fan @rdblue |
|
@cloud-fan / @rdblue This will probably be refactored after #25747 is merged, but I wanted to send this out to get some feedback on the usage of current catalog. |
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceResolution.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
Show resolved
Hide resolved
|
Test build #110501 has finished for PR 25771 at commit
|
|
|
||
| test("UseCatalog: use catalog with v2 catalog") { | ||
| val catalogManager = spark.sessionState.catalogManager | ||
| assert(catalogManager.currentCatalog.name() == "session") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be updated to "spark_session" right?
sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala
Outdated
Show resolved
Hide resolved
imback82
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update this when #25747 is merged.
|
Test build #110708 has finished for PR 25771 at commit
|
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
Outdated
Show resolved
Hide resolved
|
Test build #110797 has started for PR 25771 at commit |
|
test this please |
|
Test build #110806 has finished for PR 25771 at commit
|
|
Test build #110822 has finished for PR 25771 at commit
|
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/BaseSessionCatalog.java
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceResolution.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
Outdated
Show resolved
Hide resolved
| assert(catalogManager.currentCatalog.name() == "session") | ||
| assert(catalogManager.currentNamespace === Array("default")) | ||
|
|
||
| // The following implicitly creates namespaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this reminds that we should have a CREATE NAMESPACE command. We can do it in followup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I intend to do it right after this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a JIRA to track this along with SHOW CURRENT CATALOG and SHOW CURRENT NAMESPACE as you suggested.
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
Show resolved
Hide resolved
|
Test build #111474 has finished for PR 25771 at commit
|
|
Test build #111498 has finished for PR 25771 at commit
|
|
Test build #111519 has finished for PR 25771 at commit
|
|
@cloud-fan this is ready for another round of review. Thanks! |
...alyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
Outdated
Show resolved
Hide resolved
| namespace: Option[Seq[String]]): Unit = { | ||
| namespace.map { ns => | ||
| val nsArray = ns.toArray | ||
| if (!catalog.asNamespaceCatalog.namespaceExists(nsArray)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that we can't set current namespace for a v2 catalog that doesn't implement SupportsNamespace.
This is a little counter-intuitive as TableCatalog.loadTable does accept a namespace parameter. Maybe we shouldn't do the namespace check when setting current namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this check. For V1, database check will be performed in SessionCatalog.setCurrentDatabase (so the behavior remains the same) and for V2, there will be no checks performed.
cloud-fan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only a few minor comments
|
Test build #111609 has started for PR 25771 at commit |
|
I think it's ready to go. @imback82 can you fix the conflict? thanks! |
|
I just resolved the conflicts. Thanks @cloud-fan! |
|
Test build #111670 has finished for PR 25771 at commit
|
|
Test build #111681 has finished for PR 25771 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
This PR exposes USE CATALOG/USE SQL commands as described in this SPIP
It also exposes
currentCataloginCatalogManager.Finally, it changes
SHOW NAMESPACESandSHOW TABLESto use the current catalog if no catalog is specified (instead of default catalog).Why are the changes needed?
There is currently no mechanism to change current catalog/namespace thru SQL commands.
Does this PR introduce any user-facing change?
Yes, you can perform the following:
How was this patch tested?
Added new unit tests.