[SPARK-40453][SPARK-41715][CONNECT] Take super class into account when throwing an exception #39947

HyukjinKwon · 2023-02-09T03:20:41Z

What changes were proposed in this pull request?

This PR proposes to take super classes into account when throwing an exception from the server to the Python side, by adding more metadata about the classes, causes and traceback in the JVM.

In addition, this PR matches the exceptions being thrown to the regular PySpark exceptions defined:

spark/python/pyspark/errors/exceptions/captured.py

Lines 108 to 147 in 04550ed

    
    def convert_exception(e: Py4JJavaError) -> CapturedException:
        assert e is not None
        assert SparkContext._jvm is not None
        assert SparkContext._gateway is not None

        jvm = SparkContext._jvm
        gw = SparkContext._gateway

        if is_instance_of(gw, e, "org.apache.spark.sql.catalyst.parser.ParseException"):
            return ParseException(origin=e)
        # Order matters. ParseException inherits AnalysisException.
        elif is_instance_of(gw, e, "org.apache.spark.sql.AnalysisException"):
            return AnalysisException(origin=e)
        elif is_instance_of(gw, e, "org.apache.spark.sql.streaming.StreamingQueryException"):
            return StreamingQueryException(origin=e)
        elif is_instance_of(gw, e, "org.apache.spark.sql.execution.QueryExecutionException"):
            return QueryExecutionException(origin=e)
        elif is_instance_of(gw, e, "java.lang.IllegalArgumentException"):
            return IllegalArgumentException(origin=e)
        elif is_instance_of(gw, e, "org.apache.spark.SparkUpgradeException"):
            return SparkUpgradeException(origin=e)

        c: Py4JJavaError = e.getCause()
        stacktrace: str = jvm.org.apache.spark.util.Utils.exceptionString(e)
        if c is not None and (
            is_instance_of(gw, c, "org.apache.spark.api.python.PythonException")
            # To make sure this only catches Python UDFs.
            and any(
                map(
                    lambda v: "org.apache.spark.sql.execution.python" in v.toString(), c.getStackTrace()
                )
            )
        ):
            msg = (
                "\n  An exception was thrown from the Python worker. "
                "Please see the stack trace below.\n%s" % c.getMessage()
            )
            return PythonException(msg, stacktrace)

        return UnknownException(desc=e.toString(), stackTrace=stacktrace, cause=c)

Why are the changes needed?

Right now, many exceptions (e.g., NoSuchDatabaseException, which inherits AnalysisException) cannot be handled on the Python side.
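To illustrate why the super classes matter, here is a minimal, self-contained sketch of client-side matching. It is not the code from this PR: the exception classes are stand-ins rather than the real PySpark ones, and the `classes` metadata key, the `KNOWN` table and the `convert` helper are assumptions made for the example. The idea is that the server sends the thrown class together with its super classes, and the client picks the most specific exception it knows about.

```python
import json
from typing import Dict, List, Tuple, Type


class SparkError(Exception):
    """Stand-in base class for this sketch (not a PySpark class)."""


class AnalysisError(SparkError):
    """Stand-in for an analysis error."""


class ParseError(AnalysisError):
    """Stand-in for a parse error; it is a subclass of AnalysisError."""


class UnknownError(SparkError):
    """Fallback when no known class name is found."""


# Most specific first: order matters because a parse error is also an
# analysis error, mirroring the comment in convert_exception above.
KNOWN: List[Tuple[str, Type[SparkError]]] = [
    ("org.apache.spark.sql.catalyst.parser.ParseException", ParseError),
    ("org.apache.spark.sql.AnalysisException", AnalysisError),
]


def convert(metadata: Dict[str, str], message: str) -> SparkError:
    """Map server-provided class names to a client-side exception.

    `metadata["classes"]` is assumed to hold a JSON list with the thrown
    class followed by its super classes (hypothetical key for this sketch).
    """
    classes = json.loads(metadata.get("classes", "[]"))
    for jvm_name, py_cls in KNOWN:
        if jvm_name in classes:
            return py_cls(message)
    return UnknownError(message)


# The leaf class is unknown to the client, but its super class is not.
meta = {
    "classes": json.dumps(
        [
            "org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException",
            "org.apache.spark.sql.AnalysisException",
            "java.lang.Exception",
        ]
    )
}
err = convert(meta, "database not found")
```

With only the leaf class name available, `convert` would fall through to `UnknownError`; with the super classes included, the same input resolves to `AnalysisError`, so an `except AnalysisError` block on the caller's side can handle it.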

Does this PR introduce any user-facing change?

No to end users.
Yes, it matches the exceptions to the regular PySpark exceptions.

How was this patch tested?

Unit tests were fixed.

HyukjinKwon · 2023-02-09T05:22:18Z

...connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala

Otherwise, it complains about the header length (8 KiB limit). It can be configured below via NettyChannelBuilder.maxInboundMessageSize, but I didn't change it here. See also https://stackoverflow.com/a/686243

@HyukjinKwon Do I understand correctly that the header limit is for the HTTP header? Can we put it in the body instead? Is there still a ~4K limit?

Yeah, I think we could put it in the body.

Why is this needed? We have the abbreviated message already in the body

Eh, it was already there in the metadata, to print out the stacktrace from the server on the client. The exception was thrown when the stacktrace was too big, so I truncated the stacktrace string.
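As a rough illustration of that truncation, the sketch below caps a string at a byte budget before it would be placed in a size-limited header. It is an assumption-laden example rather than the server code: the helper name `truncate_utf8`, the `...` marker and the 2048-byte budget are all hypothetical.

```python
def truncate_utf8(text: str, max_bytes: int, marker: str = "...") -> str:
    """Return `text` cut so that its UTF-8 encoding fits in `max_bytes`.

    A marker is appended when truncation happens, so the reader can tell
    that the value is incomplete.
    """
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    marker_bytes = marker.encode("utf-8")
    if max_bytes < len(marker_bytes):
        raise ValueError("max_bytes is smaller than the truncation marker")
    budget = max_bytes - len(marker_bytes)
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    head = raw[:budget].decode("utf-8", errors="ignore")
    return head + marker


# Hypothetical budget, chosen to stay well below an 8 KiB header limit.
long_trace = "at org.example.Frame.call(Frame.scala:1)\n" * 500
short_trace = truncate_utf8(long_trace, 2048)
```

Counting bytes rather than characters matters here because header limits apply to the encoded size, and a stack trace may contain non-ASCII text.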

HyukjinKwon · 2023-02-09T05:23:47Z

python/pyspark/errors/exceptions/connect.py

The original AnalysisException.getMessage contains the string representation of the underlying plan.

Example:

spark.range(1).select("a").show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 776, in show
    print(self._show_string(n, truncate, vertical))
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 619, in _show_string
    pdf = DataFrame.withPlan(
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 1325, in toPandas
    return self._session.client.to_pandas(query)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 449, in to_pandas
    table, metrics = self._execute_and_fetch(req)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 636, in _execute_and_fetch
    self._handle_error(rpc_error)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 670, in _handle_error
    raise convert_exception(info, status.message) from None
pyspark.errors.exceptions.connect.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `a` cannot be resolved. Did you mean one of the following? [`id`].;
'Project ['a]
+- Range (0, 1, step=1, splits=Some(16))

What's the captured one's stack trace like?

Captured one:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/dataframe.py", line 2987, in select
    jdf = self._jdf.select(self._jcols(*cols))
  File "/.../spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
  File "/.../spark/python/pyspark/errors/exceptions/captured.py", line 159, in deco
    raise converted from None
pyspark.errors.exceptions.captured.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `a` cannot be resolved. Did you mean one of the following? [`id`].;
'Project ['a]
+- Range (0, 1, step=1, splits=Some(16))

In that case, should we still show the plan to be consistent?

Eh, yeah. It still shows the plan; it is part of getMessage.

Ah, got it. 👍

HyukjinKwon · 2023-02-09T07:37:45Z

...connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala

Example:

from pyspark.sql.functions import udf

@udf
def aa(a):
    1/0

spark.range(1).select(aa("id")).show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 776, in show
    print(self._show_string(n, truncate, vertical))
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 619, in _show_string
    pdf = DataFrame.withPlan(
  File "/.../spark/python/pyspark/sql/connect/dataframe.py", line 1325, in toPandas
    return self._session.client.to_pandas(query)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 449, in to_pandas
    table, metrics = self._execute_and_fetch(req)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 636, in _execute_and_fetch
    self._handle_error(rpc_error)
  File "/.../spark/python/pyspark/sql/connect/client.py", line 670, in _handle_error
    raise convert_exception(info, status.message) from None
pyspark.errors.exceptions.connect.PythonException:
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "<stdin>", line 3, in aa
ZeroDivisionError: division by zero

HyukjinKwon · 2023-02-09T12:11:50Z

...connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala

getLocalizedMessage is not used in our codebase.

and the doc of setMessage mentions that it's fine to send non-localized errors (and expect the client to localize it).

HyukjinKwon · 2023-02-09T13:07:31Z

python/pyspark/sql/tests/pandas/test_pandas_udf.py

The error message is too long, so it gets abbreviated in the Connect case.

HyukjinKwon · 2023-02-09T13:08:17Z

cc @xinrong-meng @grundprinzip @ueshin @zhengruifeng PTAL

...connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala

ueshin · 2023-02-09T23:38:29Z

python/pyspark/errors/exceptions/connect.py

What's the captured one's stack trace like?

...connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala

ueshin

LGTM, pending tests.

HyukjinKwon · 2023-02-10T01:49:35Z

Merged to master and branch-3.4.

…n throwing an exception

### What changes were proposed in this pull request?

This PR proposes to take the super classes into account when throwing an exception from the server to Python side by adding more metadata of classes, causes and traceback in JVM.

In addition, this PR matches the exceptions being thrown to the regular PySpark exceptions defined:

https://github.com/apache/spark/blob/04550edd49ee587656d215e59d6a072772d7d5ec/python/pyspark/errors/exceptions/captured.py#L108-L147

### Why are the changes needed?

Right now, many exceptions cannot be handled (e.g., `NoSuchDatabaseException` that inherits `AnalysisException`) in Python side.

### Does this PR introduce _any_ user-facing change?

No to end users.
Yes, it matches the exceptions to the regular PySpark exceptions.

### How was this patch tested?

Unittests fixed.

Closes #39947 from HyukjinKwon/SPARK-41715.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit c5230e4)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

dongjoon-hyun · 2023-02-13T00:41:27Z

python/pyspark/sql/dataframe.py

        --------
        >>> df = spark.createDataFrame([(1, 11), (1, 11), (3, 10), (4, 8), (4, 8)], ["c1", "c2"])
-        >>> df.freqItems(["c1", "c2"]).show()  # doctest: +SKIP
+        >>> df.freqItems(["c1", "c2"]).show()


Hi, @HyukjinKwon and @ueshin . Unfortunately, this broke Scala 2.13 CI. I made a followup.

[SPARK-40453][SPARK-41715][CONNECT][TESTS][FOLLOWUP] Skip freqItems doctest due to Scala 2.13 failure #39983

…octest due to Scala 2.13 failure

### What changes were proposed in this pull request?

This is a follow-up of #39947 to ignore `freqItems` doctest back.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #39983 from dongjoon-hyun/SPARK-40453.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

…octest due to Scala 2.13 failure

### What changes were proposed in this pull request?

This is a follow-up of #39947 to ignore `freqItems` doctest back.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #39983 from dongjoon-hyun/SPARK-40453.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit d703808)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

HyukjinKwon marked this pull request as draft February 9, 2023 03:20

github-actions bot added CONNECT CORE PYTHON SQL labels Feb 9, 2023

HyukjinKwon force-pushed the SPARK-41715 branch 6 times, most recently from d7ecbd5 to b47faf7 Compare February 9, 2023 05:16

HyukjinKwon force-pushed the SPARK-41715 branch 4 times, most recently from 21fa863 to eb13838 Compare February 9, 2023 07:34

HyukjinKwon marked this pull request as ready for review February 9, 2023 07:37

HyukjinKwon force-pushed the SPARK-41715 branch 2 times, most recently from 2435bb9 to fed524b Compare February 9, 2023 12:11

HyukjinKwon force-pushed the SPARK-41715 branch 7 times, most recently from f4d4cf6 to 041458c Compare February 9, 2023 13:06

HyukjinKwon force-pushed the SPARK-41715 branch from 041458c to 58a1a9f Compare February 9, 2023 13:14

ueshin reviewed Feb 9, 2023

HyukjinKwon force-pushed the SPARK-41715 branch from 58a1a9f to 155eb78 Compare February 9, 2023 23:52

Better exception

25ab26b

HyukjinKwon force-pushed the SPARK-41715 branch from 155eb78 to 25ab26b Compare February 9, 2023 23:54

ueshin approved these changes Feb 10, 2023

HyukjinKwon closed this in c5230e4 Feb 10, 2023

dongjoon-hyun reviewed Feb 13, 2023

dongjoon-hyun mentioned this pull request Feb 13, 2023

[SPARK-40453][SPARK-41715][CONNECT][TESTS][FOLLOWUP] Skip freqItems doctest due to Scala 2.13 failure #39983

Closed

HyukjinKwon deleted the SPARK-41715 branch January 15, 2024 00:48
