-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-44776][CONNECT] Add ProducedRowCount to SparkListenerConnectOperationFinished #42454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala
Show resolved
Hide resolved
...onnect/server/src/main/scala/org/apache/spark/sql/connect/service/ExecuteEventsManager.scala
Outdated
Show resolved
Hide resolved
...ct/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectServiceSuite.scala
Show resolved
Hide resolved
...server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala
Outdated
Show resolved
Hide resolved
juliuszsompolski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run
./build/mvn -Pscala-2.12 scalafmt:format -Dscalafmt.skip=false -Dscalafmt.validateOnly=false -Dscalafmt.changedOnly=false -pl connector/connect/common -pl connector/connect/server -pl connector/connect/client/jvm
to lint
...server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala
Outdated
Show resolved
Hide resolved
...onnect/server/src/main/scala/org/apache/spark/sql/connect/service/ExecuteEventsManager.scala
Show resolved
Hide resolved
juliuszsompolski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, two nits
...server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala
Show resolved
Hide resolved
...onnect/server/src/main/scala/org/apache/spark/sql/connect/service/ExecuteEventsManager.scala
Show resolved
Hide resolved
| /** | ||
| * Post @link org.apache.spark.sql.connect.service.SparkListenerConnectOperationFinished. | ||
| * @param producedRowsCountOpt | ||
| * Number of rows that are returned to the user. None is expected when the operation does not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QQ: why not use 0 if not rows are returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think returning None would make it clear to us whether the corresponding query should have produced row or not, for example for a SELECT statement that has no result, we would put 0 here. And for query Like INSERT INTO, we would put a None instead of 0 here. This could better tell us whether a produced row is expected or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with gjxdxh that it's clearer with None that no rows are expected for this operation.
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I would like @gengliangwang 's comment to be addressed before merging it :-).
|
Merged to master and branch-3.5. |
…erationFinished ### What changes were proposed in this pull request? Add ProducedRowCount field to SparkListenerConnectOperationFinished ### Why are the changes needed? Needed for showing number of rows getting produced ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added Unit test Closes #42454 from gjxdxh/SPARK-44776. Authored-by: Lingkai Kong <lingkai.kong@databricks.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org> (cherry picked from commit 4646991) Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
…erationFinished ### What changes were proposed in this pull request? Add ProducedRowCount field to SparkListenerConnectOperationFinished ### Why are the changes needed? Needed for showing number of rows getting produced ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added Unit test Closes apache#42454 from gjxdxh/SPARK-44776. Authored-by: Lingkai Kong <lingkai.kong@databricks.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
…-43923: commands send events - get_resources_command` ### What changes were proposed in this pull request? This PR aims to disable a flaky test, `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`, temporarily. To re-enable this, SPARK-48164 is created as a blocker issue for 4.0.0. ### Why are the changes needed? This test case was added at `Apache Spark 3.5.0`, but it has been flaky and causes many re-tries in our GitHub Action CI environment. - #42454 - https://github.com/apache/spark/actions/runs/8979348499/job/24661200052 ``` [info] - SPARK-43923: commands send events ((get_resources_command { [info] } [info] ,None)) *** FAILED *** (35 milliseconds) [info] VerifyEvents.this.listener.executeHolder.isDefined was false (SparkConnectServiceSuite.scala:873) ``` This PR aims to stabilize CI first and to focus this flaky issue as a blocker level before going on `Spark Connect GA` in SPARK-48164 before Apache Spark 4.0.0. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46425 from dongjoon-hyun/SPARK-48163. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
Add ProducedRowCount field to SparkListenerConnectOperationFinished
Why are the changes needed?
Needed for showing number of rows getting produced
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added Unit test