
[SPARK-50511][PYTHON][FOLLOWUP] Avoid wrapping streaming Python data source error messages #49532

Open · wants to merge 4 commits into master

Conversation

allisonwang-db (Contributor):

What changes were proposed in this pull request?

This PR is a follow-up to #49092. It removes the extra try/except during streaming Python data source execution.

Why are the changes needed?

To make the error message more user-friendly and avoid nested error messages:

```
error1

During handling of the above exception, another exception occurred:

error2
```
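A minimal, self-contained sketch (not the actual Spark runner code; the function names here are hypothetical) of why an extra try/except wrapper produces the nested traceback above, and how letting the original error propagate avoids it:

```python
import traceback


def read_partition():
    # Stand-in for user-defined streaming data source code that fails.
    raise ValueError("error1")


def run_wrapped():
    # Old behavior: re-raising a new error inside the handler chains the
    # exceptions, so the traceback shows "During handling of the above
    # exception, another exception occurred:" between error1 and error2.
    try:
        read_partition()
    except Exception:
        raise RuntimeError("error2")


def run_unwrapped():
    # New behavior: no wrapping, so only the original error is reported.
    read_partition()


try:
    run_wrapped()
except Exception:
    print(traceback.format_exc())  # shows error1 chained into error2
```

Removing the wrapper (or using `raise ... from None` when re-raising is unavoidable) keeps the traceback focused on the user's original error.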

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests

Was this patch authored or co-authored using generative AI tooling?

no

allisonwang-db (Contributor, Author):

cc @HyukjinKwon

dongjoon-hyun (Member) commented:

+1, LGTM. Thank you, @allisonwang-db .

dongjoon-hyun (Member) commented:

Oh, the removal of PySparkRuntimeError causes two Python linter errors. Could you fix them, @allisonwang-db?

```
./python/pyspark/sql/streaming/python_streaming_source_runner.py:24:1: F401 'pyspark.errors.PySparkRuntimeError' imported but unused
from pyspark.errors import IllegalArgumentException, PySparkAssertionError, PySparkRuntimeError
^
./python/pyspark/sql/worker/python_streaming_sink_runner.py:23:1: F401 'pyspark.errors.PySparkRuntimeError' imported but unused
from pyspark.errors import PySparkAssertionError, PySparkRuntimeError
^
2     F401 'pyspark.errors.PySparkRuntimeError' imported but unused
```

dongjoon-hyun previously approved these changes on Jan 17, 2025.

dongjoon-hyun (Member) commented:
Thank you for the fix. Pending CIs.

dongjoon-hyun (Member) commented:

To @allisonwang-db, please rebase this PR once more and fix the unit test.

```
[info] *** 4 TESTS FAILED ***
[error] Failed: Total 12090, Failed 4, Errors 0, Passed 12086, Ignored 33, Canceled 1
[error] Failed tests:
[error] 	org.apache.spark.sql.execution.python.PythonStreamingDataSourceSimpleSuite
[error] 	org.apache.spark.sql.execution.python.PythonStreamingDataSourceSuite
[error] (sql / Test / test) sbt.TestsFailedException: Tests unsuccessful
```

@dongjoon-hyun dismissed their stale review on January 24, 2025: "Stale review."

@allisonwang-db force-pushed the spark-50511-streaming-pyds-err branch from f6a66c1 to 2d0a638 on March 5, 2025.
3 participants