[SPARK-49436][Connect][SQL] Common interface for SQLContext #48958
Conversation
Resolved review threads (outdated):
- connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
- connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala
- connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SQLContext.scala (two threads)
```diff
@@ -105,7 +105,7 @@ class RocksDBStateStoreSuite extends StateStoreSuiteBase[RocksDBStateStoreProvid

     // Create state store in a task and get the RocksDBConf from the instantiated RocksDB instance
     val rocksDBConfInTask: RocksDBConf = testRDD.mapPartitionsWithStateStore[RocksDBConf](
-      spark.sqlContext, testStateInfo, testSchema, testSchema,
+      spark, testStateInfo, testSchema, testSchema,
```
Perhaps a follow-up should be to weed out all existing uses of SQLContext in the code base. There are a couple of user-facing APIs that we can't change, but most of it should go.
For now I reverted this change. Let's do it in another PR.
Overall approach LGTM
WIP: to be done in this PR after we agree on the interface implementation.
Do you mean the interface implementation as it stands in the current changes, or are there other decisions to make?
Resolved review threads:
- connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/UnsupportedFeaturesSuite.scala
- connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SQLContext.scala (two threads, outdated)
```scala
/**
 * An interface to register custom QueryExecutionListener that listen for execution metrics.
 */
def listenerManager: ExecutionListenerManager
```
I think this can be implemented. The same applies for udf, streams, experimental, ... you will have to override them, but that should be ok.
Feel free to push back.
Agree and I consider them low priority. Do we have tickets for them?
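For context, a minimal sketch of the delegation pattern being discussed, assuming the Connect-side SQLContext simply wraps its SparkSession; the class name and shape here are illustrative, not the PR's actual code:

```scala
import org.apache.spark.sql.{RuntimeConfig, SparkSession, UDFRegistration}
import org.apache.spark.sql.streaming.StreamingQueryManager
import org.apache.spark.sql.util.ExecutionListenerManager

// Illustrative wrapper (the class name is made up, not the PR's actual class):
// each accessor is overridden by forwarding to the wrapped SparkSession.
class SQLContextSketch(val sparkSession: SparkSession) {
  def listenerManager: ExecutionListenerManager = sparkSession.listenerManager
  def udf: UDFRegistration = sparkSession.udf
  def streams: StreamingQueryManager = sparkSession.streams
  def conf: RuntimeConfig = sparkSession.conf
}
```

In the actual PR the common interface lives in spark-api and both Classic and Connect provide their own implementation, as described in the PR summary below.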
A couple of small comments.
I was considering testing APIs that have a new implementation, such as …
```scala
def listenerManager: ExecutionListenerManager = sparkSession.listenerManager

/** @inheritdoc */
def setConf(props: Properties): Unit = sparkSession.conf.synchronized {
```
What are you trying to accomplish with this lock?
I copied this behaviour from spark/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala (lines 6290 to 6293 in 1d6932c):

```scala
/** Set Spark SQL configuration properties. */
def setConf(props: Properties): Unit = settings.synchronized {
  props.asScala.foreach { case (k, v) => setConfString(k, v) }
}
```
I believe it's there to avoid having two setConf calls with multiple KVs running at the same time. The underlying Map won't have any issue without the lock.
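For illustration, a sketch of how the truncated setConf snippet above could be completed, assuming sparkSession is in scope and each property is forwarded to the session's RuntimeConfig; the body is an assumption, not the exact PR code:

```scala
import java.util.Properties
import scala.jdk.CollectionConverters._

// Sketch only: forward every Properties entry to the session's RuntimeConfig.
// The lock mirrors SQLConf.setConf, so two concurrent multi-key setConf
// calls cannot interleave their writes.
def setConf(props: Properties): Unit = sparkSession.conf.synchronized {
  props.asScala.foreach { case (k, v) => sparkSession.conf.set(k, v) }
}
```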
```diff
@@ -99,6 +99,28 @@ class SQLContextSuite extends SparkFunSuite with SharedSparkContext {
     assert(sqlContext.tables().filter("tableName = 'listtablessuitetable'").count() === 0)
   }

+  test("get tables from a database") {
```
This was not tested????
Yeah at least I didn't see any...
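For reference, a sketch of what such a test could look like inside SQLContextSuite; the database/table names and assertions are made up here and are not the PR's actual test body:

```scala
test("get tables from a database") {
  // Sketch only: create a database with one table, then check that
  // tables(databaseName) can list it.
  sqlContext.sql("CREATE DATABASE IF NOT EXISTS testdb")
  try {
    sqlContext.sql(
      "CREATE TABLE testdb.listtablessuitetable (key INT, value STRING) USING parquet")
    assert(
      sqlContext.tables("testdb")
        .filter("tableName = 'listtablessuitetable'")
        .count() === 1)
  } finally {
    sqlContext.sql("DROP DATABASE IF EXISTS testdb CASCADE")
  }
}
```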
One small question, otherwise LGTM.
Resolved review thread (outdated): connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SQLContext.scala
Merged to master.
What changes were proposed in this pull request?
This PR adds an abstraction for `SQLContext` in the `spark-api` package. Both sides (Classic and Connect) maintain their own implementation.

Why are the changes needed?
To unify the API interface and make `SQLContext` available to Spark Connect.

Does this PR introduce any user-facing change?
Yes. Connect users are now able to call `sparkSession.sqlContext` and the APIs it provides.
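For illustration, a minimal Connect-side usage of the new API; the connection string and query are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Placeholder Spark Connect endpoint.
val spark = SparkSession.builder().remote("sc://localhost:15002").getOrCreate()

// sqlContext is now available on the Connect SparkSession as well.
val sqlContext = spark.sqlContext
sqlContext.sql("SELECT 1 AS id").show()
```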
How was this patch tested?
Not needed. All new methods are mirrored from SparkSession except `tables()`, which is covered by existing tests in `ListTablesSuite`.

Was this patch authored or co-authored using generative AI tooling?
No.