[SPARK-42342][PYTHON][CONNECT] Introduce base hierarchy to exceptions #39882
Conversation
dongjoon-hyun
left a comment
+1, LGTM from my side. Thank you, @ueshin.
itholic
left a comment
Looks pretty good. Just left a nit question for my understanding.
    msg = je.toString().split(": ", 1)[1]  # Drop the Java StreamingQueryException type info
    stackTrace = "\n\t at ".join(map(lambda x: x.toString(), je.getStackTrace()))
-   return StreamingQueryException(msg, stackTrace, je.getCause())
+   return CapturedStreamingQueryException(msg, stackTrace, je.getCause())
Just for my understanding: it will still raise StreamingQueryException in user space, right?
I just want to clarify because we don't use such an alias for the connect exceptions, for example:
from pyspark.errors.exceptions.connect import (
AnalysisException as ConnectAnalysisException
)
The name StreamingQueryException is already imported for the return type annotation, so we just need to use another name.
spark/python/pyspark/sql/streaming/query.py
Lines 24 to 27 in 673d2de
from pyspark.errors import StreamingQueryException
from pyspark.errors.exceptions.captured import (
    StreamingQueryException as CapturedStreamingQueryException,
)
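For context, a minimal sketch of why the alias is needed. The class bodies below are illustrative stand-ins; in the real code both classes are literally named StreamingQueryException, one in pyspark.errors (the user-facing base) and one in pyspark.errors.exceptions.captured (the concrete subclass), which is why the call site imports the latter under a different name:

```python
# Stand-in for the user-facing base class exported from pyspark.errors.
class StreamingQueryException(Exception):
    """Public base class that users catch."""

# Stand-in for the concrete class; in the real captured module it has the
# same public name, hence the import alias at the call site.
class CapturedStreamingQueryException(StreamingQueryException):
    """Concrete subclass constructed from the captured JVM exception."""

# The return type annotation refers to the base name, while the value
# actually constructed is the captured subclass:
def current_exception() -> StreamingQueryException:
    return CapturedStreamingQueryException("query terminated with exception")
```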
Oh, I missed it.
LGTM!
class ColumnParityTests(ColumnTestsMixin, ReusedConnectTestCase):
    # TODO(SPARK-42017): Different error type AnalysisException vs SparkConnectAnalysisException
    # TODO(SPARK-42017): df["bad_key"] does not raise AnalysisException
+1
    finally:
        sq.stop()
-   self.assertTrue(type(sq.exception()) is StreamingQueryException)
+   self.assertIsInstance(sq.exception(), StreamingQueryException)
+1
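The switch from an exact type check to assertIsInstance is what the new hierarchy requires, since the object actually raised is a subclass of the public class. A minimal illustration with stand-in classes (not the real implementations):

```python
import unittest

class StreamingQueryException(Exception):
    """Stand-in for the user-facing base class."""

class CapturedStreamingQueryException(StreamingQueryException):
    """Stand-in for the concrete subclass raised at runtime."""

class ExceptionTypeTest(unittest.TestCase):
    def test_exception_type(self):
        exc = CapturedStreamingQueryException("boom")
        # The old exact-type assertion fails once a subclass is what gets raised:
        self.assertFalse(type(exc) is StreamingQueryException)
        # assertIsInstance accepts any subclass of the base class:
        self.assertIsInstance(exc, StreamingQueryException)

if __name__ == "__main__":
    unittest.main()
```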
Merged to master and branch-3.4.
…ise_error to call the proper test

What changes were proposed in this pull request?
This is a follow-up of #39882. Fixes `FunctionsParityTests.test_raise_error` to call the proper test.

Why are the changes needed?
`FunctionsParityTests.test_raise_error` should've called `check_raise_error`.

Does this PR introduce any user-facing change?
No.

How was this patch tested?
The fixed test.

Closes #40021 from ueshin/issues/SPARK-42342/test.
Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
Introduces a base hierarchy to exceptions.
As a common hierarchy for users, the base exception classes are subclasses of PySparkException. The concrete classes for both PySpark and Spark Connect inherit these base classes; the concrete classes themselves should not be exposed to users.
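A rough sketch of the intended shape, taking AnalysisException as the example. The Captured/Connect prefixes are added here only for readability; in the actual modules the concrete classes reuse the base names:

```python
class PySparkException(Exception):
    """Common root of all PySpark exceptions."""

# User-facing base classes subclass the common root...
class AnalysisException(PySparkException):
    """Failed to analyze a SQL query plan."""

# ...and each backend contributes a concrete subclass that users should
# not need to reference directly.
class CapturedAnalysisException(AnalysisException):
    """Classic PySpark: wraps the exception captured from the JVM."""

class ConnectAnalysisException(AnalysisException):
    """Spark Connect: built from the error details in the server response."""
```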
Why are the changes needed?
Currently, the exception class hierarchies for PySpark and Spark Connect are separate.
If users want to check an exception's type, they have to switch between error classes depending on whether they are running PySpark or Spark Connect, which is not ideal. See the example below for how the shared base class removes that switch.
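A hedged example of the payoff in user code, assuming an active spark session and a table name that does not exist:

```python
from pyspark.errors import AnalysisException  # the shared base class

try:
    spark.sql("SELECT * FROM nonexistent_table")  # hypothetical failing query
except AnalysisException as e:
    # Catches the error whether the session is classic PySpark or Spark
    # Connect, because both concrete exceptions inherit this base class.
    print("Analysis failed:", e)
```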
Does this PR introduce any user-facing change?
No. Users can still use the existing exception classes to check the exception type.
How was this patch tested?
Updated tests.