[SPARK-25680][SQL] SQL execution listener shouldn't happen on execution thread #22674
Changes from all commits: a456226, 436197b, 642ddd3, a25524b, 3ffa536, 0bfc240, 6e3a345, c42b499
```diff
@@ -17,15 +17,15 @@
 package org.apache.spark.sql.execution.ui

+import com.fasterxml.jackson.annotation.JsonIgnore
 import com.fasterxml.jackson.databind.JavaType
 import com.fasterxml.jackson.databind.`type`.TypeFactory
 import com.fasterxml.jackson.databind.annotation.JsonDeserialize
 import com.fasterxml.jackson.databind.util.Converter

 import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.scheduler._
-import org.apache.spark.sql.execution.SparkPlanInfo
 import org.apache.spark.sql.execution.metric._
+import org.apache.spark.sql.execution.{QueryExecution, SparkPlanInfo}

 @DeveloperApi
 case class SparkListenerSQLExecutionStart(
```
```diff
@@ -39,7 +39,22 @@ case class SparkListenerSQLExecutionStart(

 @DeveloperApi
 case class SparkListenerSQLExecutionEnd(executionId: Long, time: Long)
-  extends SparkListenerEvent
+  extends SparkListenerEvent {
+
+  // The name of the execution, e.g. `df.collect` will trigger a SQL execution with name "collect".
+  @JsonIgnore private[sql] var executionName: Option[String] = None
+
+  // The following 3 fields are only accessed when `executionName` is defined.
+
+  // The duration of the SQL execution, in nanoseconds.
+  @JsonIgnore private[sql] var duration: Long = 0L
```
Contributor: Did you verify that the `JsonIgnore` annotation actually works? For some reason, I actually needed to annotate the class as

```scala
@JsonIgnoreProperties(Array("a", "b", "c"))
class SomeClass {
  @JsonProperty("a") val a: ...
  @JsonProperty("b") val b: ...
}
```

the reason being that Json4s understands that API better. I believe we use Json4s for all of these events.

Author: There is a test to verify it: https://github.com/apache/spark/pull/22674/files#diff-6fa1d00d1cb20554dda238f2a3bc3ecbR55 I also used …
```diff
+
+  // The `QueryExecution` instance that represents the SQL execution
+  @JsonIgnore private[sql] var qe: QueryExecution = null
+
+  // The exception object that caused this execution to fail. None if the execution doesn't fail.
+  @JsonIgnore private[sql] var executionFailure: Option[Exception] = None
+}

 /**
  * A message used to update SQL metric value for driver-side updates (which doesn't get reflected
```
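The fields added above follow a pattern worth spelling out: the end event itself carries driver-side-only metadata (`executionName`, `duration`, `qe`, `executionFailure`), marked `@JsonIgnore` so it is never serialized into the event log, which lets listeners consume everything off the execution thread. A minimal, self-contained sketch of the pattern — `ExecEndEvent`, `EnrichThenPost`, and `describe` are illustrative stand-ins, not Spark's actual classes:

```scala
// Hypothetical mimic of SparkListenerSQLExecutionEnd's enrichment pattern:
// immutable event identity, plus mutable driver-side-only extras that are
// filled in before the event is posted to the bus.
final case class ExecEndEvent(executionId: Long, time: Long) {
  // In the real diff these are `@JsonIgnore private[sql] var` fields,
  // so they never appear in the serialized event.
  var executionName: Option[String] = None
  var duration: Long = 0L // nanoseconds
  var executionFailure: Option[Exception] = None
}

object EnrichThenPost {
  // What a listener can reconstruct purely from the enriched event.
  def describe(e: ExecEndEvent): String = e.executionName match {
    case Some(name) if e.executionFailure.isEmpty =>
      s"$name succeeded in ${e.duration} ns"
    case Some(name) => s"$name failed"
    case None => "untracked execution"
  }

  def main(args: Array[String]): Unit = {
    val event = ExecEndEvent(executionId = 1L, time = System.currentTimeMillis())
    // The execution thread enriches the event before posting it:
    event.executionName = Some("collect")
    event.duration = 42L
    println(describe(event)) // collect succeeded in 42 ns
  }
}
```

The point of the mutable fields is that the listener thread never has to call back into the session that ran the query; everything it needs travels with the event.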
```diff
@@ -17,17 +17,16 @@
 package org.apache.spark.sql.util

-import java.util.concurrent.locks.ReentrantReadWriteLock
+import scala.collection.JavaConverters._

-import scala.collection.mutable.ListBuffer
-import scala.util.control.NonFatal
-
-import org.apache.spark.SparkConf
 import org.apache.spark.annotation.{DeveloperApi, Experimental, InterfaceStability}
 import org.apache.spark.internal.Logging
+import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
+import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.execution.QueryExecution
+import org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd
 import org.apache.spark.sql.internal.StaticSQLConf._
-import org.apache.spark.util.Utils
+import org.apache.spark.util.{ListenerBus, Utils}

 /**
  * :: Experimental ::
@@ -75,10 +74,18 @@ trait QueryExecutionListener {
  */
 @Experimental
 @InterfaceStability.Evolving
-class ExecutionListenerManager private extends Logging {
-
-  private[sql] def this(conf: SparkConf) = {
-    this()
+// The `session` is used to indicate which session carries this listener manager, and we only
```
Contributor: Why is this not a class doc?

Author: The constructor is private, so we should not make it visible in the class doc.
```diff
+// catch SQL executions which are launched by the same session.
+// The `loadExtensions` flag is used to indicate whether we should load the pre-defined,
+// user-specified listeners during construction. We should not do it when cloning this listener
+// manager, as we will copy all listeners to the cloned listener manager.
+class ExecutionListenerManager private[sql](session: SparkSession, loadExtensions: Boolean)
```

Contributor: nit: we shall add param comments.
```diff
+  extends Logging {
+
+  private val listenerBus = new ExecutionListenerBus(session)
+
+  if (loadExtensions) {
+    val conf = session.sparkContext.conf
+    conf.get(QUERY_EXECUTION_LISTENERS).foreach { classNames =>
+      Utils.loadExtensions(classOf[QueryExecutionListener], classNames, conf).foreach(register)
+    }
+  }
```
```diff
@@ -88,82 +95,63 @@ class ExecutionListenerManager private extends Logging {
    * Registers the specified [[QueryExecutionListener]].
    */
   @DeveloperApi
-  def register(listener: QueryExecutionListener): Unit = writeLock {
-    listeners += listener
+  def register(listener: QueryExecutionListener): Unit = {
+    listenerBus.addListener(listener)
   }

   /**
    * Unregisters the specified [[QueryExecutionListener]].
    */
   @DeveloperApi
-  def unregister(listener: QueryExecutionListener): Unit = writeLock {
-    listeners -= listener
+  def unregister(listener: QueryExecutionListener): Unit = {
+    listenerBus.removeListener(listener)
   }

   /**
    * Removes all the registered [[QueryExecutionListener]].
    */
   @DeveloperApi
-  def clear(): Unit = writeLock {
-    listeners.clear()
+  def clear(): Unit = {
+    listenerBus.removeAllListeners()
   }

   /**
    * Get an identical copy of this listener manager.
    */
-  @DeveloperApi
-  override def clone(): ExecutionListenerManager = writeLock {
-    val newListenerManager = new ExecutionListenerManager
-    listeners.foreach(newListenerManager.register)
+  private[sql] def clone(session: SparkSession): ExecutionListenerManager = {
+    val newListenerManager = new ExecutionListenerManager(session, loadExtensions = false)
+    listenerBus.listeners.asScala.foreach(newListenerManager.register)
     newListenerManager
   }
+}

-  private[sql] def onSuccess(funcName: String, qe: QueryExecution, duration: Long): Unit = {
-    readLock {
-      withErrorHandling { listener =>
-        listener.onSuccess(funcName, qe, duration)
-      }
-    }
-  }
-
-  private[sql] def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = {
-    readLock {
-      withErrorHandling { listener =>
-        listener.onFailure(funcName, qe, exception)
-      }
-    }
-  }
-
-  private[this] val listeners = ListBuffer.empty[QueryExecutionListener]
+private[sql] class ExecutionListenerBus(session: SparkSession)
+  extends SparkListener with ListenerBus[QueryExecutionListener, SparkListenerSQLExecutionEnd] {

-  /** A lock to prevent updating the list of listeners while we are traversing through them. */
-  private[this] val lock = new ReentrantReadWriteLock()
+  session.sparkContext.listenerBus.addToSharedQueue(this)

-  private def withErrorHandling(f: QueryExecutionListener => Unit): Unit = {
-    for (listener <- listeners) {
-      try {
-        f(listener)
-      } catch {
-        case NonFatal(e) => logWarning("Error executing query execution listener", e)
-      }
-    }
-  }
+  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
+    case e: SparkListenerSQLExecutionEnd => postToAll(e)
+    case _ =>
+  }

-  /** Acquires a read lock on the cache for the duration of `f`. */
-  private def readLock[A](f: => A): A = {
-    val rl = lock.readLock()
-    rl.lock()
-    try f finally {
-      rl.unlock()
-    }
-  }
+  override protected def doPostEvent(
+      listener: QueryExecutionListener,
+      event: SparkListenerSQLExecutionEnd): Unit = {
+    if (shouldReport(event)) {
+      val funcName = event.executionName.get
+      event.executionFailure match {
+        case Some(ex) =>
+          listener.onFailure(funcName, event.qe, ex)
+        case _ =>
+          listener.onSuccess(funcName, event.qe, event.duration)
+      }
+    }
+  }

-  /** Acquires a write lock on the cache for the duration of `f`. */
-  private def writeLock[A](f: => A): A = {
-    val wl = lock.writeLock()
-    wl.lock()
-    try f finally {
-      wl.unlock()
-    }
-  }
+  private def shouldReport(e: SparkListenerSQLExecutionEnd): Boolean = {
+    // Only catch SQL execution with a name, and triggered by the same spark session that this
```
Contributor: So this is what bugs me. You are adding separation between the SparkSession and its listeners, only to undo that here. It seems like a bit of a hassle to go through, because you basically need async execution.

Author: Yea. Assume we have many Spark sessions running queries at the same time. Each session sends query execution events to the central event bus, and sets up a listener to watch its own query execution events asynchronously. To make this work, the most straightforward way is to carry the session identifier in the events, and have each listener watch only the events with the expected session identifier. A better way might be to introduce sessions in Spark core, so that the listener framework can dispatch events w.r.t. the session automatically, but that's a lot of work.

Contributor: We had the same problem in the StreamingQueryListener. You can check how we solved it in …

Author: @brkyvz thanks for the information! It seems the …
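The design discussed here — a shared, asynchronous bus with per-session filtering — can be sketched in a few lines of plain Scala. Everything below (`Session`, `ExecutionEnd`, `deliver`) is an illustrative stand-in for the real `SparkSession`/`ExecutionListenerBus` machinery; the filter is the same two-part check as `shouldReport`, with reference equality standing in for the session identity comparison:

```scala
object SessionFilterSketch {
  // Stand-in for SparkSession: identity (eq) is what matters, not equality.
  final class Session(val name: String)

  // The end event carries the session that produced it, mirroring how the
  // real event reaches its session via `event.qe.sparkSession`.
  final case class ExecutionEnd(session: Session, executionName: Option[String])

  // Deliver only the events this session's bus should report: named
  // executions triggered by the very same session instance.
  def deliver(owner: Session, shared: Seq[ExecutionEnd]): Seq[String] =
    shared.collect {
      case e if e.executionName.isDefined && (e.session eq owner) =>
        e.executionName.get
    }

  def main(args: Array[String]): Unit = {
    val s1 = new Session("s1")
    val s2 = new Session("s2")
    val shared = Seq(
      ExecutionEnd(s1, Some("collect")), // reported to s1's listeners
      ExecutionEnd(s2, Some("count")),   // other session: filtered out
      ExecutionEnd(s1, None))            // unnamed execution: filtered out
    println(deliver(s1, shared)) // List(collect)
  }
}
```

Using `eq` rather than `==` means two sessions with identical configuration are still distinct subscribers, which is exactly the behavior the per-session listener manager needs.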
```diff
+    // listener manager belongs.
+    e.executionName.isDefined && e.qe.sparkSession.eq(this.session)
+  }
+}
```
Contributor: `duration` used to be reported in nanos. Now it's millis. I would still report it as nanos if possible.

Author: ah, good catch!
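On the nanos-vs-millis point: measuring with the monotonic nanosecond clock and converting only at display time keeps full precision in the stored value. A tiny sketch — the `time` helper is hypothetical, not a Spark API:

```scala
object DurationSketch {
  // Measure elapsed wall time in nanoseconds with System.nanoTime (monotonic),
  // matching the unit the event's `duration` field is documented to hold.
  def time[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }

  def main(args: Array[String]): Unit = {
    val (sum, nanos) = time((1 to 1000).sum)
    // Convert to millis only when presenting the number to humans.
    println(s"sum=$sum in ${nanos / 1000000} ms")
  }
}
```

Note that `System.nanoTime` is suitable only for measuring intervals; unlike `System.currentTimeMillis` it has no relation to wall-clock time.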